Commit effcc0d

Pushing the docs to dev/ for branch: master, commit ea169b596ca5913ccd02cacfc09a6ec0d3492702
1 parent 99eaeb8 commit effcc0d

File tree: 1,210 files changed (+6957, -5024 lines)
dev/_downloads/…/plot_roc_curve_visualization_api.ipynb

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# ROC Curve with Visualization API\n\nScikit-learn defines a simple API for creating visualizations for machine\nlearning. The key feature of this API is to allow for quick plotting and\nvisual adjustments without recalculation. In this example, we will demonstrate\nhow to use the visualization API by comparing ROC curves.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(__doc__)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Load Data and Train an SVC\n--------------------------\nFirst, we load the wine dataset and convert it to a binary classification\nproblem. Then, we train a support vector classifier on a training dataset.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import matplotlib.pyplot as plt\nfrom sklearn.svm import SVC\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.metrics import plot_roc_curve\nfrom sklearn.datasets import load_wine\nfrom sklearn.model_selection import train_test_split\n\nX, y = load_wine(return_X_y=True)\ny = y == 2\n\nX_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)\nsvc = SVC(random_state=42)\nsvc.fit(X_train, y_train)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Plotting the ROC Curve\n----------------------\nNext, we plot the ROC curve with a single call to\n:func:`sklearn.metrics.plot_roc_curve`. The returned `svc_disp` object allows\nus to continue using the already computed ROC curve for the SVC in future\nplots.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "svc_disp = plot_roc_curve(svc, X_test, y_test)\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Training a Random Forest and Plotting the ROC Curve\n----------------------------------------------------\nWe train a random forest classifier and create a plot comparing it to the SVC\nROC curve. Notice how `svc_disp` uses\n:func:`~sklearn.metrics.RocCurveDisplay.plot` to plot the SVC ROC curve\nwithout recomputing the values of the ROC curve itself. Furthermore, we\npass `alpha=0.8` to the plot functions to adjust the alpha values of the\ncurves.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "rfc = RandomForestClassifier(n_estimators=10, random_state=42)\nrfc.fit(X_train, y_train)\nax = plt.gca()\nrfc_disp = plot_roc_curve(rfc, X_test, y_test, ax=ax, alpha=0.8)\nsvc_disp.plot(ax=ax, alpha=0.8)\nplt.show()"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
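A note on the pattern this notebook demonstrates: the object returned by plot_roc_curve is a RocCurveDisplay, which stores the computed curve in its fpr, tpr, and roc_auc attributes, so the curve can be inspected or redrawn without re-scoring the estimator. A minimal sketch, assuming scikit-learn >= 0.22 (where plot_roc_curve and RocCurveDisplay first appear):

import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.metrics import plot_roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
y = y == 2  # same binarization as the example: class 2 vs. the rest
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

svc = SVC(random_state=42).fit(X_train, y_train)
svc_disp = plot_roc_curve(svc, X_test, y_test)  # the curve is computed here, once

# The display caches everything needed to redraw or inspect the curve.
print("AUC:", svc_disp.roc_auc)
print("points on the curve:", len(svc_disp.fpr))

# Redraw the cached curve on a fresh axes, with a different style.
fig, ax = plt.subplots()
svc_disp.plot(ax=ax, alpha=0.5)
plt.show()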

dev/_downloads/d33f7865941f1e2c2c62fcc641599cc5/plot_roc_crossval.ipynb

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
 },
 "outputs": [],
 "source": [
-"print(__doc__)\n\nimport numpy as np\nfrom scipy import interp\nimport matplotlib.pyplot as plt\n\nfrom sklearn import svm, datasets\nfrom sklearn.metrics import roc_curve, auc\nfrom sklearn.model_selection import StratifiedKFold\n\n# #############################################################################\n# Data IO and generation\n\n# Import some data to play with\niris = datasets.load_iris()\nX = iris.data\ny = iris.target\nX, y = X[y != 2], y[y != 2]\nn_samples, n_features = X.shape\n\n# Add noisy features\nrandom_state = np.random.RandomState(0)\nX = np.c_[X, random_state.randn(n_samples, 200 * n_features)]\n\n# #############################################################################\n# Classification and ROC analysis\n\n# Run classifier with cross-validation and plot ROC curves\ncv = StratifiedKFold(n_splits=6)\nclassifier = svm.SVC(kernel='linear', probability=True,\n                     random_state=random_state)\n\ntprs = []\naucs = []\nmean_fpr = np.linspace(0, 1, 100)\n\ni = 0\nfor train, test in cv.split(X, y):\n    probas_ = classifier.fit(X[train], y[train]).predict_proba(X[test])\n    # Compute ROC curve and area the curve\n    fpr, tpr, thresholds = roc_curve(y[test], probas_[:, 1])\n    tprs.append(interp(mean_fpr, fpr, tpr))\n    tprs[-1][0] = 0.0\n    roc_auc = auc(fpr, tpr)\n    aucs.append(roc_auc)\n    plt.plot(fpr, tpr, lw=1, alpha=0.3,\n             label='ROC fold %d (AUC = %0.2f)' % (i, roc_auc))\n\n    i += 1\nplt.plot([0, 1], [0, 1], linestyle='--', lw=2, color='r',\n         label='Chance', alpha=.8)\n\nmean_tpr = np.mean(tprs, axis=0)\nmean_tpr[-1] = 1.0\nmean_auc = auc(mean_fpr, mean_tpr)\nstd_auc = np.std(aucs)\nplt.plot(mean_fpr, mean_tpr, color='b',\n         label=r'Mean ROC (AUC = %0.2f $\\pm$ %0.2f)' % (mean_auc, std_auc),\n         lw=2, alpha=.8)\n\nstd_tpr = np.std(tprs, axis=0)\ntprs_upper = np.minimum(mean_tpr + std_tpr, 1)\ntprs_lower = np.maximum(mean_tpr - std_tpr, 0)\nplt.fill_between(mean_fpr, tprs_lower, tprs_upper, color='grey', alpha=.2,\n                 label=r'$\\pm$ 1 std. dev.')\n\nplt.xlim([-0.05, 1.05])\nplt.ylim([-0.05, 1.05])\nplt.xlabel('False Positive Rate')\nplt.ylabel('True Positive Rate')\nplt.title('Receiver operating characteristic example')\nplt.legend(loc=\"lower right\")\nplt.show()"
+"print(__doc__)\n\nimport numpy as np\nfrom scipy import interp\nimport matplotlib.pyplot as plt\n\nfrom sklearn import svm, datasets\nfrom sklearn.metrics import auc\nfrom sklearn.metrics import plot_roc_curve\nfrom sklearn.model_selection import StratifiedKFold\n\n# #############################################################################\n# Data IO and generation\n\n# Import some data to play with\niris = datasets.load_iris()\nX = iris.data\ny = iris.target\nX, y = X[y != 2], y[y != 2]\nn_samples, n_features = X.shape\n\n# Add noisy features\nrandom_state = np.random.RandomState(0)\nX = np.c_[X, random_state.randn(n_samples, 200 * n_features)]\n\n# #############################################################################\n# Classification and ROC analysis\n\n# Run classifier with cross-validation and plot ROC curves\ncv = StratifiedKFold(n_splits=6)\nclassifier = svm.SVC(kernel='linear', probability=True,\n                     random_state=random_state)\n\ntprs = []\naucs = []\nmean_fpr = np.linspace(0, 1, 100)\n\nfig, ax = plt.subplots()\nfor i, (train, test) in enumerate(cv.split(X, y)):\n    classifier.fit(X[train], y[train])\n    viz = plot_roc_curve(classifier, X[test], y[test],\n                         name='ROC fold {}'.format(i),\n                         alpha=0.3, lw=1, ax=ax)\n    interp_tpr = interp(mean_fpr, viz.fpr, viz.tpr)\n    interp_tpr[0] = 0.0\n    tprs.append(interp_tpr)\n    aucs.append(viz.roc_auc)\n\nax.plot([0, 1], [0, 1], linestyle='--', lw=2, color='r',\n        label='Chance', alpha=.8)\n\nmean_tpr = np.mean(tprs, axis=0)\nmean_tpr[-1] = 1.0\nmean_auc = auc(mean_fpr, mean_tpr)\nstd_auc = np.std(aucs)\nax.plot(mean_fpr, mean_tpr, color='b',\n        label=r'Mean ROC (AUC = %0.2f $\\pm$ %0.2f)' % (mean_auc, std_auc),\n        lw=2, alpha=.8)\n\nstd_tpr = np.std(tprs, axis=0)\ntprs_upper = np.minimum(mean_tpr + std_tpr, 1)\ntprs_lower = np.maximum(mean_tpr - std_tpr, 0)\nax.fill_between(mean_fpr, tprs_lower, tprs_upper, color='grey', alpha=.2,\n                label=r'$\\pm$ 1 std. dev.')\n\nax.set(xlim=[-0.05, 1.05], ylim=[-0.05, 1.05],\n       title=\"Receiver operating characteristic example\")\nax.legend(loc=\"lower right\")\nplt.show()"
 ]
 }
],
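One detail worth flagging in both versions of plot_roc_crossval: the interpolation step still uses `from scipy import interp`. As far as I know, scipy.interp is simply a re-export of numpy.interp (and recent SciPy releases deprecate it), so the per-fold resampling onto a common FPR grid can be written with NumPy alone. A small self-contained sketch with made-up fold values:

import numpy as np

# A coarse ROC curve, as one CV fold might produce (illustrative values only).
fpr = np.array([0.0, 0.2, 0.5, 1.0])
tpr = np.array([0.0, 0.6, 0.9, 1.0])

# Resample this fold's TPR onto the shared 100-point FPR grid used above.
mean_fpr = np.linspace(0, 1, 100)
interp_tpr = np.interp(mean_fpr, fpr, tpr)  # equivalent to scipy's interp
interp_tpr[0] = 0.0  # pin the curve to the origin, as the example does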
dev/_downloads/…/plot_roc_curve_visualization_api.py

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
"""
================================
ROC Curve with Visualization API
================================
Scikit-learn defines a simple API for creating visualizations for machine
learning. The key feature of this API is to allow for quick plotting and
visual adjustments without recalculation. In this example, we will demonstrate
how to use the visualization API by comparing ROC curves.
"""
print(__doc__)

##############################################################################
# Load Data and Train an SVC
# --------------------------
# First, we load the wine dataset and convert it to a binary classification
# problem. Then, we train a support vector classifier on a training dataset.
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import plot_roc_curve
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
y = y == 2

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
svc = SVC(random_state=42)
svc.fit(X_train, y_train)

##############################################################################
# Plotting the ROC Curve
# ----------------------
# Next, we plot the ROC curve with a single call to
# :func:`sklearn.metrics.plot_roc_curve`. The returned `svc_disp` object allows
# us to continue using the already computed ROC curve for the SVC in future
# plots.
svc_disp = plot_roc_curve(svc, X_test, y_test)
plt.show()

##############################################################################
# Training a Random Forest and Plotting the ROC Curve
# ---------------------------------------------------
# We train a random forest classifier and create a plot comparing it to the
# SVC ROC curve. Notice how `svc_disp` uses
# :func:`~sklearn.metrics.RocCurveDisplay.plot` to plot the SVC ROC curve
# without recomputing the values of the ROC curve itself. Furthermore, we
# pass `alpha=0.8` to the plot functions to adjust the alpha values of the
# curves.
rfc = RandomForestClassifier(n_estimators=10, random_state=42)
rfc.fit(X_train, y_train)
ax = plt.gca()
rfc_disp = plot_roc_curve(rfc, X_test, y_test, ax=ax, alpha=0.8)
svc_disp.plot(ax=ax, alpha=0.8)
plt.show()
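Because the display objects carry their own data, the figure layout can also be changed after the fact without retraining or re-scoring either model. A sketch of that idea, self-contained here for convenience; passing styling keyword arguments such as linestyle through RocCurveDisplay.plot to matplotlib is an assumption that holds in 0.22:

import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import plot_roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
y = y == 2
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

svc = SVC(random_state=42).fit(X_train, y_train)
rfc = RandomForestClassifier(n_estimators=10,
                             random_state=42).fit(X_train, y_train)

# Each ROC curve is computed exactly once.
svc_disp = plot_roc_curve(svc, X_test, y_test)
rfc_disp = plot_roc_curve(rfc, X_test, y_test)

# Reuse the cached curves: overlaid on one axes, restyled alone on another.
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 4))
svc_disp.plot(ax=ax1, alpha=0.8)
rfc_disp.plot(ax=ax1, alpha=0.8)
svc_disp.plot(ax=ax2, linestyle='--')
plt.show()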

dev/_downloads/f314befaab67f267a90ac9b4fd21b13e/plot_roc_crossval.py

Lines changed: 24 additions & 28 deletions
@@ -36,7 +36,8 @@
 import matplotlib.pyplot as plt
 
 from sklearn import svm, datasets
-from sklearn.metrics import roc_curve, auc
+from sklearn.metrics import auc
+from sklearn.metrics import plot_roc_curve
 from sklearn.model_selection import StratifiedKFold
 
 # #############################################################################
@@ -65,40 +66,35 @@
 aucs = []
 mean_fpr = np.linspace(0, 1, 100)
 
-i = 0
-for train, test in cv.split(X, y):
-    probas_ = classifier.fit(X[train], y[train]).predict_proba(X[test])
-    # Compute ROC curve and area the curve
-    fpr, tpr, thresholds = roc_curve(y[test], probas_[:, 1])
-    tprs.append(interp(mean_fpr, fpr, tpr))
-    tprs[-1][0] = 0.0
-    roc_auc = auc(fpr, tpr)
-    aucs.append(roc_auc)
-    plt.plot(fpr, tpr, lw=1, alpha=0.3,
-             label='ROC fold %d (AUC = %0.2f)' % (i, roc_auc))
-
-    i += 1
-plt.plot([0, 1], [0, 1], linestyle='--', lw=2, color='r',
-         label='Chance', alpha=.8)
+fig, ax = plt.subplots()
+for i, (train, test) in enumerate(cv.split(X, y)):
+    classifier.fit(X[train], y[train])
+    viz = plot_roc_curve(classifier, X[test], y[test],
+                         name='ROC fold {}'.format(i),
+                         alpha=0.3, lw=1, ax=ax)
+    interp_tpr = interp(mean_fpr, viz.fpr, viz.tpr)
+    interp_tpr[0] = 0.0
+    tprs.append(interp_tpr)
+    aucs.append(viz.roc_auc)
+
+ax.plot([0, 1], [0, 1], linestyle='--', lw=2, color='r',
+        label='Chance', alpha=.8)
 
 mean_tpr = np.mean(tprs, axis=0)
 mean_tpr[-1] = 1.0
 mean_auc = auc(mean_fpr, mean_tpr)
 std_auc = np.std(aucs)
-plt.plot(mean_fpr, mean_tpr, color='b',
-         label=r'Mean ROC (AUC = %0.2f $\pm$ %0.2f)' % (mean_auc, std_auc),
-         lw=2, alpha=.8)
+ax.plot(mean_fpr, mean_tpr, color='b',
+        label=r'Mean ROC (AUC = %0.2f $\pm$ %0.2f)' % (mean_auc, std_auc),
+        lw=2, alpha=.8)
 
 std_tpr = np.std(tprs, axis=0)
 tprs_upper = np.minimum(mean_tpr + std_tpr, 1)
 tprs_lower = np.maximum(mean_tpr - std_tpr, 0)
-plt.fill_between(mean_fpr, tprs_lower, tprs_upper, color='grey', alpha=.2,
-                 label=r'$\pm$ 1 std. dev.')
-
-plt.xlim([-0.05, 1.05])
-plt.ylim([-0.05, 1.05])
-plt.xlabel('False Positive Rate')
-plt.ylabel('True Positive Rate')
-plt.title('Receiver operating characteristic example')
-plt.legend(loc="lower right")
+ax.fill_between(mean_fpr, tprs_lower, tprs_upper, color='grey', alpha=.2,
+                label=r'$\pm$ 1 std. dev.')
+
+ax.set(xlim=[-0.05, 1.05], ylim=[-0.05, 1.05],
+       title="Receiver operating characteristic example")
+ax.legend(loc="lower right")
 plt.show()
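For orientation, the one-call plot_roc_curve(classifier, X[test], y[test]) in the new loop computes roughly what the removed lines spelled out by hand, then stores the results on the returned display as viz.fpr, viz.tpr, and viz.roc_auc. A rough sketch of that correspondence; the real implementation also handles estimators that expose decision_function and picks sensible legend names:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = SVC(kernel='linear', probability=True,
          random_state=0).fit(X_train, y_train)

# Hand-rolled equivalent of what the display computes and stores:
probas = clf.predict_proba(X_test)
fpr, tpr, _ = roc_curve(y_test, probas[:, 1])  # becomes viz.fpr, viz.tpr
roc_auc = auc(fpr, tpr)                        # becomes viz.roc_auc
print('AUC = %0.2f' % roc_auc)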

dev/_downloads/scikit-learn-docs.pdf (49.6 KB): Binary file not shown.

dev/_images/iris.png (0 Bytes)
