Commit 8e387f2

Pushing the docs for revision for branch: master, commit 3d0c8f8b68584efdf65ba3f1bd21925e65aad3c3
1 parent 9bba2a6 commit 8e387f2

780 files changed: +4019 -2893 lines changed

dev/_downloads/plot_f_test_vs_mi.py

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+"""
+===========================================
+Comparison of F-test and mutual information
+===========================================
+
+This example illustrates the differences between univariate F-test statistics
+and mutual information.
+
+We consider 3 features x_1, x_2, x_3 distributed uniformly over [0, 1]; the
+target depends on them as follows:
+
+y = x_1 + sin(6 * pi * x_2) + 0.1 * N(0, 1), that is, the third feature is completely irrelevant.
+
+The code below plots the dependency of y against individual x_i and normalized
+values of univariate F-test statistics and mutual information.
+
+As F-test captures only linear dependency, it rates x_1 as the most
+discriminative feature. On the other hand, mutual information can capture any
+kind of dependency between variables and it rates x_2 as the most
+discriminative feature, which probably agrees better with our intuitive
+perception for this example. Both methods correctly mark x_3 as irrelevant.
+"""
+print(__doc__)
+
+import numpy as np
+import matplotlib.pyplot as plt
+from sklearn.feature_selection import f_regression, mutual_info_regression
+
+np.random.seed(0)
+X = np.random.rand(1000, 3)
+y = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * np.random.randn(1000)
+
+f_test, _ = f_regression(X, y)
+f_test /= np.max(f_test)
+
+mi = mutual_info_regression(X, y)
+mi /= np.max(mi)
+
+plt.figure(figsize=(15, 5))
+for i in range(3):
+    plt.subplot(1, 3, i + 1)
+    plt.scatter(X[:, i], y)
+    plt.xlabel("$x_{}$".format(i + 1), fontsize=14)
+    if i == 0:
+        plt.ylabel("$y$", fontsize=14)
+    plt.title("F-test={:.2f}, MI={:.2f}".format(f_test[i], mi[i]),
+              fontsize=16)
+plt.show()
+
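
The docstring's claim that the F-test only sees linear dependency can be checked by hand. The sketch below is not part of the commit. It rebuilds the same synthetic data with NumPy only and computes two scores per feature: an F-statistic derived from the Pearson correlation (the same quantity a univariate linear regression F-test is based on), and a plug-in mutual information estimate from a 2-D histogram. The histogram estimator is a deliberately simple stand-in chosen for illustration; it is not the estimator used by mutual_info_regression, so the absolute values will differ from the plot titles. The helper names f_statistic and binned_mutual_info and the choice of 16 bins are assumptions made for this sketch.

```python
import numpy as np

# Same construction as the example: three uniform features, a linear term in
# x_1, a sinusoidal term in x_2, Gaussian noise, and an unused x_3.
rng = np.random.RandomState(0)
X = rng.rand(1000, 3)
y = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * rng.randn(1000)


def f_statistic(x, y):
    """F-statistic of a univariate linear fit, derived from Pearson's r."""
    r = np.corrcoef(x, y)[0, 1]
    dof = len(y) - 2  # degrees of freedom for a fit with an intercept
    return r ** 2 / (1 - r ** 2) * dof


def binned_mutual_info(x, y, bins=16):
    """Plug-in mutual information estimate (nats) from a 2-D histogram.

    Hypothetical helper for illustration; biased upward for small samples.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)  # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, bins)
    nonzero = p_xy > 0  # skip empty cells to avoid log(0)
    ratio = p_xy[nonzero] / (p_x @ p_y)[nonzero]
    return float(np.sum(p_xy[nonzero] * np.log(ratio)))


f_scores = np.array([f_statistic(X[:, i], y) for i in range(3)])
mi_scores = np.array([binned_mutual_info(X[:, i], y) for i in range(3)])

# Normalize by the maximum, as the example does before plotting.
print("F-statistic (normalized):", np.round(f_scores / f_scores.max(), 2))
print("Binned MI (normalized):  ", np.round(mi_scores / mi_scores.max(), 2))
```

Because sin(6 * pi * x_2) completes three full periods over [0, 1], its correlation with x_2 is weak even though y depends on x_2 strongly, which is why the correlation-based score ranks x_1 first while the histogram-based score ranks x_2 first.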

dev/_downloads/plot_rfe_digits.py

Lines changed: 1 addition & 1 deletion
@@ -33,4 +33,4 @@
 plt.matshow(ranking, cmap=plt.cm.Blues)
 plt.colorbar()
 plt.title("Ranking of pixels with RFE")
-plt.show()
+plt.show()
