Commit 152f478

Pushing the docs to dev/ for branch: master, commit 3e92edbab8b1d210523e775d2050039055dda1d7
1 parent 8ecee42 commit 152f478

1,205 files changed, +4402 -4138 lines changed

Binary file not shown.

dev/_downloads/a9a92784a7617f5a14aa93d32f95dff7/plot_voting_regressor.ipynb

Lines changed: 56 additions & 2 deletions
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Plot individual and voting regression predictions\n\n\n.. currentmodule:: sklearn\n\nPlot individual and averaged regression predictions for Boston dataset.\n\nFirst, three exemplary regressors are initialized\n(:class:`~ensemble.GradientBoostingRegressor`,\n:class:`~ensemble.RandomForestRegressor`, and\n:class:`~linear_model.LinearRegression`) and used to initialize a\n:class:`~ensemble.VotingRegressor`.\n\nThe red starred dots are the averaged predictions.\n"
+"\n# Plot individual and voting regression predictions\n\n\n.. currentmodule:: sklearn\n\nA voting regressor is an ensemble meta-estimator that fits base regressors each\non the whole dataset. It, then, averages the individual predictions to form a\nfinal prediction.\nWe will use three different regressors to predict the data:\n:class:`~ensemble.GradientBoostingRegressor`,\n:class:`~ensemble.RandomForestRegressor`, and\n:class:`~linear_model.LinearRegression`).\nThen, using them we will make voting regressor\n:class:`~ensemble.VotingRegressor`.\n\nFinally, we will plot all of them for comparison.\n\nWe will work with the diabetes dataset which consists of the 10 features\ncollected from a cohort of diabetes patients. The target is the disease\nprogression after one year from the baseline.\n"
 ]
 },
 {
@@ -26,7 +26,61 @@
 },
 "outputs": [],
 "source": [
-"print(__doc__)\n\nimport matplotlib.pyplot as plt\n\nfrom sklearn import datasets\nfrom sklearn.ensemble import GradientBoostingRegressor\nfrom sklearn.ensemble import RandomForestRegressor\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.ensemble import VotingRegressor\n\n# Loading some example data\nX, y = datasets.load_boston(return_X_y=True)\n\n# Training classifiers\nreg1 = GradientBoostingRegressor(random_state=1, n_estimators=10)\nreg2 = RandomForestRegressor(random_state=1, n_estimators=10)\nreg3 = LinearRegression()\nereg = VotingRegressor([('gb', reg1), ('rf', reg2), ('lr', reg3)])\nreg1.fit(X, y)\nreg2.fit(X, y)\nreg3.fit(X, y)\nereg.fit(X, y)\n\nxt = X[:20]\n\nplt.figure()\nplt.plot(reg1.predict(xt), 'gd', label='GradientBoostingRegressor')\nplt.plot(reg2.predict(xt), 'b^', label='RandomForestRegressor')\nplt.plot(reg3.predict(xt), 'ys', label='LinearRegression')\nplt.plot(ereg.predict(xt), 'r*', label='VotingRegressor')\nplt.tick_params(axis='x', which='both', bottom=False, top=False,\n labelbottom=False)\nplt.ylabel('predicted')\nplt.xlabel('training samples')\nplt.legend(loc=\"best\")\nplt.title('Comparison of individual predictions with averaged')\nplt.show()"
+"print(__doc__)\n\nimport matplotlib.pyplot as plt\n\nfrom sklearn import datasets\nfrom sklearn.ensemble import GradientBoostingRegressor\nfrom sklearn.ensemble import RandomForestRegressor\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.ensemble import VotingRegressor"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Training classifiers\n--------------------------------\n\nFirst, we are going to load diabetes dataset and initiate gradient boosting\nregressor, random forest regressor and linear regression. Next, we are going\nto use each of them to build the voting regressor:\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"X, y = datasets.load_diabetes(return_X_y=True)\n\n# Train classifiers\nreg1 = GradientBoostingRegressor(random_state=1)\nreg2 = RandomForestRegressor(random_state=1)\nreg3 = LinearRegression()\n\nreg1.fit(X, y)\nreg2.fit(X, y)\nreg3.fit(X, y)\n\nereg = VotingRegressor([('gb', reg1), ('rf', reg2), ('lr', reg3)])\nereg.fit(X, y)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Making predictions\n--------------------------------\n\nNow we will use each of the regressors to make 20 first predictions about the\ndiabetes dataset.\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"xt = X[:20]\n\npred1 = reg1.predict(xt)\npred2 = reg2.predict(xt)\npred3 = reg3.predict(xt)\npred4 = ereg.predict(xt)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Plot the results\n--------------------------------\n\nFinally, we will visualize the 20 predictions. The red stars show the average\nprediction\n\n"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"collapsed": false
+},
+"outputs": [],
+"source": [
+"plt.figure()\nplt.plot(pred1, 'gd', label='GradientBoostingRegressor')\nplt.plot(pred2, 'b^', label='RandomForestRegressor')\nplt.plot(pred3, 'ys', label='LinearRegression')\nplt.plot(pred4, 'r*', ms=10, label='VotingRegressor')\n\nplt.tick_params(axis='x', which='both', bottom=False, top=False,\n labelbottom=False)\nplt.ylabel('predicted')\nplt.xlabel('training samples')\nplt.legend(loc=\"best\")\nplt.title('Regressor predictions and their average')\n\nplt.show()"
 ]
 }
 ],
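
The docstring added in this diff describes a voting regressor as averaging the individual predictions into a final prediction. A minimal sketch of that arithmetic in plain Python follows. The helper `average_predictions` and the toy numbers are hypothetical and exist only for illustration; this is not scikit-learn's implementation.

```python
from statistics import fmean


def average_predictions(per_model_predictions):
    """Return the unweighted mean prediction for each sample.

    `per_model_predictions` is a list with one inner list per model; every
    inner list holds that model's predictions for the same samples, in the
    same order. Hypothetical helper for illustration only.
    """
    if not per_model_predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(per_model_predictions[0])
    if any(len(p) != n_samples for p in per_model_predictions):
        raise ValueError("all models must predict the same number of samples")
    # zip(*...) regroups the values so that each tuple holds one sample's
    # predictions across all models.
    return [fmean(sample) for sample in zip(*per_model_predictions)]


# Made-up predictions from three models for two samples.
toy = [[150.0, 90.0], [180.0, 60.0], [120.0, 120.0]]
averaged = average_predictions(toy)
```

Each averaged value sits between the smallest and largest individual prediction for that sample, which is why the red stars in the plot fall among the other markers.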

dev/_downloads/acb1430b51f399d6660add7428cadb67/plot_voting_regressor.py

Lines changed: 53 additions & 17 deletions
@@ -5,15 +5,21 @@
 
 .. currentmodule:: sklearn
 
-Plot individual and averaged regression predictions for Boston dataset.
-
-First, three exemplary regressors are initialized
-(:class:`~ensemble.GradientBoostingRegressor`,
+A voting regressor is an ensemble meta-estimator that fits base regressors each
+on the whole dataset. It, then, averages the individual predictions to form a
+final prediction.
+We will use three different regressors to predict the data:
+:class:`~ensemble.GradientBoostingRegressor`,
 :class:`~ensemble.RandomForestRegressor`, and
-:class:`~linear_model.LinearRegression`) and used to initialize a
+:class:`~linear_model.LinearRegression`).
+Then, using them we will make voting regressor
 :class:`~ensemble.VotingRegressor`.
 
-The red starred dots are the averaged predictions.
+Finally, we will plot all of them for comparison.
+
+We will work with the diabetes dataset which consists of the 10 features
+collected from a cohort of diabetes patients. The target is the disease
+progression after one year from the baseline.
 
 """
 print(__doc__)
@@ -26,30 +32,60 @@
 from sklearn.linear_model import LinearRegression
 from sklearn.ensemble import VotingRegressor
 
-# Loading some example data
-X, y = datasets.load_boston(return_X_y=True)
-
+##############################################################################
 # Training classifiers
-reg1 = GradientBoostingRegressor(random_state=1, n_estimators=10)
-reg2 = RandomForestRegressor(random_state=1, n_estimators=10)
+# --------------------------------
+#
+# First, we are going to load diabetes dataset and initiate gradient boosting
+# regressor, random forest regressor and linear regression. Next, we are going
+# to use each of them to build the voting regressor:
+
+X, y = datasets.load_diabetes(return_X_y=True)
+
+# Train classifiers
+reg1 = GradientBoostingRegressor(random_state=1)
+reg2 = RandomForestRegressor(random_state=1)
 reg3 = LinearRegression()
-ereg = VotingRegressor([('gb', reg1), ('rf', reg2), ('lr', reg3)])
+
 reg1.fit(X, y)
 reg2.fit(X, y)
 reg3.fit(X, y)
+
+ereg = VotingRegressor([('gb', reg1), ('rf', reg2), ('lr', reg3)])
 ereg.fit(X, y)
 
+##############################################################################
+# Making predictions
+# --------------------------------
+#
+# Now we will use each of the regressors to make 20 first predictions about the
+# diabetes dataset.
+
 xt = X[:20]
 
+pred1 = reg1.predict(xt)
+pred2 = reg2.predict(xt)
+pred3 = reg3.predict(xt)
+pred4 = ereg.predict(xt)
+
+##############################################################################
+# Plot the results
+# --------------------------------
+#
+# Finally, we will visualize the 20 predictions. The red stars show the average
+# prediction
+
 plt.figure()
-plt.plot(reg1.predict(xt), 'gd', label='GradientBoostingRegressor')
-plt.plot(reg2.predict(xt), 'b^', label='RandomForestRegressor')
-plt.plot(reg3.predict(xt), 'ys', label='LinearRegression')
-plt.plot(ereg.predict(xt), 'r*', label='VotingRegressor')
+plt.plot(pred1, 'gd', label='GradientBoostingRegressor')
+plt.plot(pred2, 'b^', label='RandomForestRegressor')
+plt.plot(pred3, 'ys', label='LinearRegression')
+plt.plot(pred4, 'r*', ms=10, label='VotingRegressor')
+
 plt.tick_params(axis='x', which='both', bottom=False, top=False,
                 labelbottom=False)
 plt.ylabel('predicted')
 plt.xlabel('training samples')
 plt.legend(loc="best")
-plt.title('Comparison of individual predictions with averaged')
+plt.title('Regressor predictions and their average')
+
 plt.show()
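
As a hedged check of what the updated example plots, the sketch below compares `VotingRegressor.predict` against a manual mean over the fitted sub-estimators. It assumes default (uniform) weights and sets `n_estimators=10` only to keep the run short; that value is an assumption of this sketch, not a setting from the new version of the example.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import (GradientBoostingRegressor,
                              RandomForestRegressor, VotingRegressor)
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

# n_estimators=10 is chosen here only for speed (assumption of this sketch).
ereg = VotingRegressor([
    ('gb', GradientBoostingRegressor(random_state=1, n_estimators=10)),
    ('rf', RandomForestRegressor(random_state=1, n_estimators=10)),
    ('lr', LinearRegression()),
])
ereg.fit(X, y)

xt = X[:20]

# estimators_ holds the fitted clones that VotingRegressor trained itself.
individual = np.array([est.predict(xt) for est in ereg.estimators_])
manual_average = individual.mean(axis=0)
voting_prediction = ereg.predict(xt)

# With uniform weights the two should agree up to floating-point error.
matches = np.allclose(manual_average, voting_prediction)
```

`VotingRegressor.fit` trains clones of the estimators it receives, so the separate `reg1.fit`, `reg2.fit` and `reg3.fit` calls in the example are needed only to obtain the individual predictions for the plot.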
Binary file not shown.

dev/_downloads/scikit-learn-docs.pdf

1.39 KB
Binary file not shown.

dev/_images/iris.png

0 Bytes
