Commit 7ff5690

Pushing the docs to dev/ for branch: master, commit e087f8a395f9db97145654c2429b7bdd107fd440
1 parent e60f120 commit 7ff5690

File tree

1,204 files changed: +4128 / −3741 lines


dev/_downloads/47fc6db2ef595b75bbf0d40b3b00b7b0/plot_permutation_test_for_classification.py

Lines changed: 118 additions & 51 deletions
@@ -3,67 +3,134 @@
 Test with permutations the significance of a classification score
 =================================================================
 
-In order to test if a classification score is significative a technique
-in repeating the classification procedure after randomizing, permuting,
-the labels. The p-value is then given by the percentage of runs for
-which the score obtained is greater than the classification score
-obtained in the first place.
-
+This example demonstrates the use of
+:func:`~sklearn.model_selection.permutation_test_score` to evaluate the
+significance of a cross-validated score using permutations.
 """
 
-# Author: Alexandre Gramfort <[email protected]>
+# Authors: Alexandre Gramfort <[email protected]>
+# Lucy Liu
 # License: BSD 3 clause
+#
+# Dataset
+# -------
+#
+# We will use the :ref:`iris_dataset`, which consists of measurements taken
+# from 3 types of irises.
+
+from sklearn.datasets import load_iris
+
+iris = load_iris()
+X = iris.data
+y = iris.target
 
-print(__doc__)
+# %%
+# We will also generate some random feature data (i.e., 2200 features),
+# uncorrelated with the class labels in the iris dataset.
 
 import numpy as np
-import matplotlib.pyplot as plt
+
+n_uncorrelated_features = 2200
+rng = np.random.RandomState(seed=0)
+# Use same number of samples as in iris and 2200 features
+X_rand = rng.normal(size=(X.shape[0], n_uncorrelated_features))
+
+# %%
+# Permutation test score
+# ----------------------
+#
+# Next, we calculate the
+# :func:`~sklearn.model_selection.permutation_test_score` using the original
+# iris dataset, which strongly predicts the labels, and
+# the randomly generated features and iris labels, which should have
+# no dependency between features and labels. We use the
+# :class:`~sklearn.svm.SVC` classifier and :ref:`accuracy_score` to evaluate
+# the model at each round.
+#
+# :func:`~sklearn.model_selection.permutation_test_score` generates a null
+# distribution by calculating the accuracy of the classifier
+# on 1000 different permutations of the dataset, where features
+# remain the same but labels undergo different permutations. This is the
+# distribution for the null hypothesis, which states there is no dependency
+# between the features and labels. An empirical p-value is then calculated as
+# the percentage of permutations for which the score obtained is greater
+# than the score obtained using the original data.
 
 from sklearn.svm import SVC
 from sklearn.model_selection import StratifiedKFold
 from sklearn.model_selection import permutation_test_score
-from sklearn import datasets
 
+clf = SVC(kernel='linear', random_state=7)
+cv = StratifiedKFold(2, shuffle=True, random_state=0)
 
-# #############################################################################
-# Loading a dataset
-iris = datasets.load_iris()
-X = iris.data
-y = iris.target
-n_classes = np.unique(y).size
-
-# Some noisy data not correlated
-random = np.random.RandomState(seed=0)
-E = random.normal(size=(len(X), 2200))
-
-# Add noisy data to the informative features for make the task harder
-X = np.c_[X, E]
-
-svm = SVC(kernel='linear')
-cv = StratifiedKFold(2)
-
-score, permutation_scores, pvalue = permutation_test_score(
-    svm, X, y, scoring="accuracy", cv=cv, n_permutations=100, n_jobs=1)
-
-print("Classification score %s (pvalue : %s)" % (score, pvalue))
-
-# #############################################################################
-# View histogram of permutation scores
-plt.hist(permutation_scores, 20, label='Permutation scores',
-         edgecolor='black')
-ylim = plt.ylim()
-# BUG: vlines(..., linestyle='--') fails on older versions of matplotlib
-# plt.vlines(score, ylim[0], ylim[1], linestyle='--',
-#            color='g', linewidth=3, label='Classification Score'
-#            ' (pvalue %s)' % pvalue)
-# plt.vlines(1.0 / n_classes, ylim[0], ylim[1], linestyle='--',
-#            color='k', linewidth=3, label='Luck')
-plt.plot(2 * [score], ylim, '--g', linewidth=3,
-         label='Classification Score'
-         ' (pvalue %s)' % pvalue)
-plt.plot(2 * [1. / n_classes], ylim, '--k', linewidth=3, label='Luck')
-
-plt.ylim(ylim)
-plt.legend()
-plt.xlabel('Score')
+score_iris, perm_scores_iris, pvalue_iris = permutation_test_score(
+    clf, X, y, scoring="accuracy", cv=cv, n_permutations=1000)
+
+score_rand, perm_scores_rand, pvalue_rand = permutation_test_score(
+    clf, X_rand, y, scoring="accuracy", cv=cv, n_permutations=1000)
+
+# %%
+# Original data
+# ^^^^^^^^^^^^^
+#
+# Below we plot a histogram of the permutation scores (the null
+# distribution). The red line indicates the score obtained by the classifier
+# on the original data. The score is much better than those obtained by
+# using permuted data, and the p-value is thus very low. This indicates that
+# there is a low likelihood that this good score would be obtained by chance
+# alone. It provides evidence that the iris dataset contains real dependency
+# between features and labels, and that the classifier was able to utilize this
+# to obtain good results.
+
+import matplotlib.pyplot as plt
+
+fig, ax = plt.subplots()
+
+ax.hist(perm_scores_iris, bins=20, density=True)
+ax.axvline(score_iris, ls='--', color='r')
+score_label = (f"Score on original\ndata: {score_iris:.2f}\n"
+               f"(p-value: {pvalue_iris:.3f})")
+ax.text(0.7, 260, score_label, fontsize=12)
+ax.set_xlabel("Accuracy score")
+_ = ax.set_ylabel("Probability")
+
+# %%
+# Random data
+# ^^^^^^^^^^^
+#
+# Below we plot the null distribution for the randomized data. The permutation
+# scores are similar to those obtained using the original iris dataset
+# because the permutation always destroys any feature-label dependency present.
+# The score obtained on the original randomized data, however, is
+# very poor. This results in a large p-value, confirming that there was no
+# feature-label dependency in the original data.
+
+fig, ax = plt.subplots()
+
+ax.hist(perm_scores_rand, bins=20, density=True)
+ax.set_xlim(0.13)
+ax.axvline(score_rand, ls='--', color='r')
+score_label = (f"Score on original\ndata: {score_rand:.2f}\n"
+               f"(p-value: {pvalue_rand:.3f})")
+ax.text(0.14, 125, score_label, fontsize=12)
+ax.set_xlabel("Accuracy score")
+ax.set_ylabel("Probability")
 plt.show()
+
+# %%
+# Another possible reason for obtaining a high p-value is that the classifier
+# was not able to use the structure in the data. In this case, the p-value
+# would only be low for classifiers that are able to utilize the dependency
+# present. In our case above, where the data is random, all classifiers would
+# have a high p-value as there is no structure present in the data.
+#
+# Finally, note that this test has been shown to produce low p-values even
+# if there is only weak structure in the data [1]_.
+#
+# .. topic:: References:
+#
+#    .. [1] Ojala and Garriga. `Permutation Tests for Studying Classifier
+#           Performance
+#           <http://www.jmlr.org/papers/volume11/ojala10a/ojala10a.pdf>`_. The
+#           Journal of Machine Learning Research (2010) vol. 11
+#
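The empirical p-value the new example describes (score the true labels once, rescore under many label permutations, report the share of permutations that match or beat the true score) can be sketched end to end in a few lines. This is a minimal illustration, not scikit-learn's implementation: the nearest-class-mean scorer and the two-class toy dataset below are invented for this sketch, and there is no cross-validation.

```python
import numpy as np

def permutation_pvalue(X, y, score_fn, n_permutations=99, seed=0):
    # Score the real labels once, then score many permuted copies of y.
    # The (C + 1) / (n_permutations + 1) correction keeps the p-value
    # strictly positive, as in sklearn's permutation_test_score.
    rng = np.random.RandomState(seed)
    true_score = score_fn(X, y)
    perm_scores = np.array(
        [score_fn(X, rng.permutation(y)) for _ in range(n_permutations)]
    )
    pvalue = (np.sum(perm_scores >= true_score) + 1.0) / (n_permutations + 1)
    return true_score, perm_scores, pvalue

def nearest_mean_accuracy(X, y):
    # Toy scorer: assign each sample to the nearest class mean and report
    # training accuracy (fit and scored on the same data; illustration only).
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    dists = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    return float(np.mean(classes[np.argmin(dists, axis=1)] == y))

# Well-separated toy data: the true-label score should beat every permutation.
rng = np.random.RandomState(42)
X = np.vstack([rng.normal(0.0, 1.0, (30, 4)), rng.normal(3.0, 1.0, (30, 4))])
y = np.repeat([0, 1], 30)

score, perm_scores, p = permutation_pvalue(X, y, nearest_mean_accuracy)
print(f"score={score:.2f}, p-value={p:.3f}")
```

Because the classes are strongly separated, the true-label score sits far in the tail of the null distribution and the p-value lands at its floor of 1 / (n_permutations + 1); on random labels the same loop would give a p-value near 1.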

dev/_downloads/64e850d00f3e594b7bf9079d7b796fcb/plot_permutation_test_for_classification.ipynb

Lines changed: 81 additions & 2 deletions
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Test with permutations the significance of a classification score\n\n\nIn order to test if a classification score is significative a technique\nin repeating the classification procedure after randomizing, permuting,\nthe labels. The p-value is then given by the percentage of runs for\nwhich the score obtained is greater than the classification score\nobtained in the first place.\n"
+"\n# Test with permutations the significance of a classification score\n\n\nThis example demonstrates the use of\n:func:`~sklearn.model_selection.permutation_test_score` to evaluate the\nsignificance of a cross-validated score using permutations.\n"
 ]
 },
 {
@@ -26,7 +26,86 @@
 },
 "outputs": [],
 "source": [
-"# Author: Alexandre Gramfort <[email protected]>\n# License: BSD 3 clause\n\nprint(__doc__)\n\nimport numpy as np\nimport matplotlib.pyplot as plt\n\nfrom sklearn.svm import SVC\nfrom sklearn.model_selection import StratifiedKFold\nfrom sklearn.model_selection import permutation_test_score\nfrom sklearn import datasets\n\n\n# #############################################################################\n# Loading a dataset\niris = datasets.load_iris()\nX = iris.data\ny = iris.target\nn_classes = np.unique(y).size\n\n# Some noisy data not correlated\nrandom = np.random.RandomState(seed=0)\nE = random.normal(size=(len(X), 2200))\n\n# Add noisy data to the informative features for make the task harder\nX = np.c_[X, E]\n\nsvm = SVC(kernel='linear')\ncv = StratifiedKFold(2)\n\nscore, permutation_scores, pvalue = permutation_test_score(\n    svm, X, y, scoring=\"accuracy\", cv=cv, n_permutations=100, n_jobs=1)\n\nprint(\"Classification score %s (pvalue : %s)\" % (score, pvalue))\n\n# #############################################################################\n# View histogram of permutation scores\nplt.hist(permutation_scores, 20, label='Permutation scores',\n         edgecolor='black')\nylim = plt.ylim()\n# BUG: vlines(..., linestyle='--') fails on older versions of matplotlib\n# plt.vlines(score, ylim[0], ylim[1], linestyle='--',\n#            color='g', linewidth=3, label='Classification Score'\n#            ' (pvalue %s)' % pvalue)\n# plt.vlines(1.0 / n_classes, ylim[0], ylim[1], linestyle='--',\n#            color='k', linewidth=3, label='Luck')\nplt.plot(2 * [score], ylim, '--g', linewidth=3,\n         label='Classification Score'\n         ' (pvalue %s)' % pvalue)\nplt.plot(2 * [1. / n_classes], ylim, '--k', linewidth=3, label='Luck')\n\nplt.ylim(ylim)\nplt.legend()\nplt.xlabel('Score')\nplt.show()"
+"# Authors: Alexandre Gramfort <[email protected]>\n# Lucy Liu\n# License: BSD 3 clause\n#\n# Dataset\n# -------\n#\n# We will use the :ref:`iris_dataset`, which consists of measurements taken\n# from 3 types of irises.\n\nfrom sklearn.datasets import load_iris\n\niris = load_iris()\nX = iris.data\ny = iris.target"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"We will also generate some random feature data (i.e., 2200 features),\nuncorrelated with the class labels in the iris dataset.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"import numpy as np\n\nn_uncorrelated_features = 2200\nrng = np.random.RandomState(seed=0)\n# Use same number of samples as in iris and 2200 features\nX_rand = rng.normal(size=(X.shape[0], n_uncorrelated_features))"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Permutation test score\n----------------------\n\nNext, we calculate the\n:func:`~sklearn.model_selection.permutation_test_score` using the original\niris dataset, which strongly predicts the labels, and\nthe randomly generated features and iris labels, which should have\nno dependency between features and labels. We use the\n:class:`~sklearn.svm.SVC` classifier and `accuracy_score` to evaluate\nthe model at each round.\n\n:func:`~sklearn.model_selection.permutation_test_score` generates a null\ndistribution by calculating the accuracy of the classifier\non 1000 different permutations of the dataset, where features\nremain the same but labels undergo different permutations. This is the\ndistribution for the null hypothesis, which states there is no dependency\nbetween the features and labels. An empirical p-value is then calculated as\nthe percentage of permutations for which the score obtained is greater\nthan the score obtained using the original data.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"from sklearn.svm import SVC\nfrom sklearn.model_selection import StratifiedKFold\nfrom sklearn.model_selection import permutation_test_score\n\nclf = SVC(kernel='linear', random_state=7)\ncv = StratifiedKFold(2, shuffle=True, random_state=0)\n\nscore_iris, perm_scores_iris, pvalue_iris = permutation_test_score(\n    clf, X, y, scoring=\"accuracy\", cv=cv, n_permutations=1000)\n\nscore_rand, perm_scores_rand, pvalue_rand = permutation_test_score(\n    clf, X_rand, y, scoring=\"accuracy\", cv=cv, n_permutations=1000)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Original data\n^^^^^^^^^^^^^\n\nBelow we plot a histogram of the permutation scores (the null\ndistribution). The red line indicates the score obtained by the classifier\non the original data. The score is much better than those obtained by\nusing permuted data, and the p-value is thus very low. This indicates that\nthere is a low likelihood that this good score would be obtained by chance\nalone. It provides evidence that the iris dataset contains real dependency\nbetween features and labels, and that the classifier was able to utilize this\nto obtain good results.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"import matplotlib.pyplot as plt\n\nfig, ax = plt.subplots()\n\nax.hist(perm_scores_iris, bins=20, density=True)\nax.axvline(score_iris, ls='--', color='r')\nscore_label = (f\"Score on original\\ndata: {score_iris:.2f}\\n\"\n               f\"(p-value: {pvalue_iris:.3f})\")\nax.text(0.7, 260, score_label, fontsize=12)\nax.set_xlabel(\"Accuracy score\")\n_ = ax.set_ylabel(\"Probability\")"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Random data\n^^^^^^^^^^^\n\nBelow we plot the null distribution for the randomized data. The permutation\nscores are similar to those obtained using the original iris dataset\nbecause the permutation always destroys any feature-label dependency present.\nThe score obtained on the original randomized data, however, is\nvery poor. This results in a large p-value, confirming that there was no\nfeature-label dependency in the original data.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"fig, ax = plt.subplots()\n\nax.hist(perm_scores_rand, bins=20, density=True)\nax.set_xlim(0.13)\nax.axvline(score_rand, ls='--', color='r')\nscore_label = (f\"Score on original\\ndata: {score_rand:.2f}\\n\"\n               f\"(p-value: {pvalue_rand:.3f})\")\nax.text(0.14, 125, score_label, fontsize=12)\nax.set_xlabel(\"Accuracy score\")\nax.set_ylabel(\"Probability\")\nplt.show()"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Another possible reason for obtaining a high p-value is that the classifier\nwas not able to use the structure in the data. In this case, the p-value\nwould only be low for classifiers that are able to utilize the dependency\npresent. In our case above, where the data is random, all classifiers would\nhave a high p-value as there is no structure present in the data.\n\nFinally, note that this test has been shown to produce low p-values even\nif there is only weak structure in the data [1]_.\n\n.. topic:: References:\n\n   .. [1] Ojala and Garriga. `Permutation Tests for Studying Classifier\n          Performance\n          <http://www.jmlr.org/papers/volume11/ojala10a/ojala10a.pdf>`_. The\n          Journal of Machine Learning Research (2010) vol. 11\n\n\n"
 ]
 }
],
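The updated example also swaps `StratifiedKFold(2)` for `StratifiedKFold(2, shuffle=True, random_state=0)`. The point of stratification is that every fold preserves the class proportions of `y`. A minimal two-fold sketch of that idea (a simplification for illustration, not scikit-learn's implementation, which handles odd class counts and any number of folds):

```python
import numpy as np

def stratified_two_fold(y, seed=0):
    # Shuffle each class's indices separately, then send half of each
    # class to fold A and half to fold B, so both folds keep the class
    # proportions of y.
    rng = np.random.RandomState(seed)
    folds = ([], [])
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        half = len(idx) // 2
        folds[0].extend(idx[:half])
        folds[1].extend(idx[half:])
    return tuple(np.sort(np.asarray(f)) for f in folds)

y = np.repeat([0, 1, 2], 50)   # 150 samples, 3 balanced classes (like iris)
fold_a, fold_b = stratified_two_fold(y)
print(np.bincount(y[fold_a]))  # -> [25 25 25]
print(np.bincount(y[fold_b]))  # -> [25 25 25]
```

With only 2 folds on a 150-sample dataset, shuffling before splitting also matters: without it, samples that are ordered by class (as in iris) would be split along their stored order, which is why the example fixes `random_state` for reproducibility.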
Binary files not shown:

dev/_downloads/scikit-learn-docs.pdf (6.75 KB)
dev/_images/iris.png (0 bytes; −219 bytes; −239 bytes)

0 commit comments