
Commit ee58ee6

Pushing the docs to dev/ for branch: master, commit 04be1a97993342dcae7ff2736f85c5ab4eeb1266
1 parent 9976d26 commit ee58ee6

982 files changed (+3487 / -3474 lines)


Two binary files changed (11 Bytes each; contents not shown).

dev/_downloads/plot_adaboost_hastie_10_2.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Discrete versus Real AdaBoost\n\n\nThis example is based on Figure 10.2 from Hastie et al 2009 [1] and illustrates\nthe difference in performance between the discrete SAMME [2] boosting\nalgorithm and real SAMME.R boosting algorithm. Both algorithms are evaluated\non a binary classification task where the target Y is a non-linear function\nof 10 input features.\n\nDiscrete SAMME AdaBoost adapts based on errors in predicted class labels\nwhereas real SAMME.R uses the predicted class probabilities.\n\n.. [1] T. Hastie, R. Tibshirani and J. Friedman, \"Elements of Statistical\n Learning Ed. 2\", Springer, 2009.\n\n.. [2] J. Zhu, H. Zou, S. Rosset, T. Hastie, \"Multi-class AdaBoost\", 2009.\n\n\n"
+"\n# Discrete versus Real AdaBoost\n\n\nThis example is based on Figure 10.2 from Hastie et al 2009 [1]_ and\nillustrates the difference in performance between the discrete SAMME [2]_\nboosting algorithm and real SAMME.R boosting algorithm. Both algorithms are\nevaluated on a binary classification task where the target Y is a non-linear\nfunction of 10 input features.\n\nDiscrete SAMME AdaBoost adapts based on errors in predicted class labels\nwhereas real SAMME.R uses the predicted class probabilities.\n\n.. [1] T. Hastie, R. Tibshirani and J. Friedman, \"Elements of Statistical\n Learning Ed. 2\", Springer, 2009.\n\n.. [2] J. Zhu, H. Zou, S. Rosset, T. Hastie, \"Multi-class AdaBoost\", 2009.\n\n\n"
 ]
 },
 {

dev/_downloads/plot_adaboost_hastie_10_2.py

Lines changed: 5 additions & 5 deletions
@@ -3,11 +3,11 @@
 Discrete versus Real AdaBoost
 =============================
 
-This example is based on Figure 10.2 from Hastie et al 2009 [1] and illustrates
-the difference in performance between the discrete SAMME [2] boosting
-algorithm and real SAMME.R boosting algorithm. Both algorithms are evaluated
-on a binary classification task where the target Y is a non-linear function
-of 10 input features.
+This example is based on Figure 10.2 from Hastie et al 2009 [1]_ and
+illustrates the difference in performance between the discrete SAMME [2]_
+boosting algorithm and real SAMME.R boosting algorithm. Both algorithms are
+evaluated on a binary classification task where the target Y is a non-linear
+function of 10 input features.
 
 Discrete SAMME AdaBoost adapts based on errors in predicted class labels
 whereas real SAMME.R uses the predicted class probabilities.
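
For context, the comparison this example documents can be reproduced roughly as follows. This is a minimal sketch, not the example's actual code; the sample counts and number of estimators are assumptions, and it presumes a scikit-learn version in which AdaBoostClassifier still accepts algorithm="SAMME.R" (later releases deprecate it). The default depth-1 decision stump is used as the base learner.

from sklearn.datasets import make_hastie_10_2
from sklearn.ensemble import AdaBoostClassifier

# Hastie et al. (2009) Figure 10.2 setup: 10 Gaussian features, binary target.
X, y = make_hastie_10_2(n_samples=12000, random_state=1)
X_train, y_train = X[:2000], y[:2000]
X_test, y_test = X[2000:], y[2000:]

for algo in ("SAMME", "SAMME.R"):
    clf = AdaBoostClassifier(n_estimators=400, algorithm=algo)
    clf.fit(X_train, y_train)
    # SAMME.R typically reaches a lower test error in fewer iterations.
    print(algo, "test error:", 1.0 - clf.score(X_test, y_test))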

dev/_downloads/plot_adaboost_multiclass.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Multi-class AdaBoosted Decision Trees\n\n\nThis example reproduces Figure 1 of Zhu et al [1] and shows how boosting can\nimprove prediction accuracy on a multi-class problem. The classification\ndataset is constructed by taking a ten-dimensional standard normal distribution\nand defining three classes separated by nested concentric ten-dimensional\nspheres such that roughly equal numbers of samples are in each class (quantiles\nof the $\\chi^2$ distribution).\n\nThe performance of the SAMME and SAMME.R [1] algorithms are compared. SAMME.R\nuses the probability estimates to update the additive model, while SAMME uses\nthe classifications only. As the example illustrates, the SAMME.R algorithm\ntypically converges faster than SAMME, achieving a lower test error with fewer\nboosting iterations. The error of each algorithm on the test set after each\nboosting iteration is shown on the left, the classification error on the test\nset of each tree is shown in the middle, and the boost weight of each tree is\nshown on the right. All trees have a weight of one in the SAMME.R algorithm and\ntherefore are not shown.\n\n.. [1] J. Zhu, H. Zou, S. Rosset, T. Hastie, \"Multi-class AdaBoost\", 2009.\n\n\n"
+"\n# Multi-class AdaBoosted Decision Trees\n\n\nThis example reproduces Figure 1 of Zhu et al [1]_ and shows how boosting can\nimprove prediction accuracy on a multi-class problem. The classification\ndataset is constructed by taking a ten-dimensional standard normal distribution\nand defining three classes separated by nested concentric ten-dimensional\nspheres such that roughly equal numbers of samples are in each class (quantiles\nof the $\\chi^2$ distribution).\n\nThe performance of the SAMME and SAMME.R [1]_ algorithms are compared. SAMME.R\nuses the probability estimates to update the additive model, while SAMME uses\nthe classifications only. As the example illustrates, the SAMME.R algorithm\ntypically converges faster than SAMME, achieving a lower test error with fewer\nboosting iterations. The error of each algorithm on the test set after each\nboosting iteration is shown on the left, the classification error on the test\nset of each tree is shown in the middle, and the boost weight of each tree is\nshown on the right. All trees have a weight of one in the SAMME.R algorithm and\ntherefore are not shown.\n\n.. [1] J. Zhu, H. Zou, S. Rosset, T. Hastie, \"Multi-class AdaBoost\", 2009.\n\n\n"
 ]
 },
 {

dev/_downloads/plot_adaboost_multiclass.py

Lines changed: 2 additions & 2 deletions
@@ -3,14 +3,14 @@
 Multi-class AdaBoosted Decision Trees
 =====================================
 
-This example reproduces Figure 1 of Zhu et al [1] and shows how boosting can
+This example reproduces Figure 1 of Zhu et al [1]_ and shows how boosting can
 improve prediction accuracy on a multi-class problem. The classification
 dataset is constructed by taking a ten-dimensional standard normal distribution
 and defining three classes separated by nested concentric ten-dimensional
 spheres such that roughly equal numbers of samples are in each class (quantiles
 of the :math:`\chi^2` distribution).
 
-The performance of the SAMME and SAMME.R [1] algorithms are compared. SAMME.R
+The performance of the SAMME and SAMME.R [1]_ algorithms are compared. SAMME.R
 uses the probability estimates to update the additive model, while SAMME uses
 the classifications only. As the example illustrates, the SAMME.R algorithm
 typically converges faster than SAMME, achieving a lower test error with fewer
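
The per-iteration test error that this example plots can be obtained with staged_predict. The snippet below is a minimal sketch of that idea, not the example's verbatim code; the dataset sizes and tree depth are assumptions, and it uses algorithm="SAMME" only, so it also runs on scikit-learn versions where SAMME.R is no longer available (newer releases may deprecate the algorithm parameter entirely, in which case it can simply be omitted).

from sklearn.datasets import make_gaussian_quantiles
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import zero_one_loss
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Three classes separated by nested concentric spheres in 10 dimensions.
X, y = make_gaussian_quantiles(n_samples=13000, n_features=10,
                               n_classes=3, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=3000, random_state=1)

clf = AdaBoostClassifier(DecisionTreeClassifier(max_depth=2),
                         n_estimators=300, algorithm="SAMME")
clf.fit(X_train, y_train)

# Test error after each boosting iteration.
for i, y_pred in enumerate(clf.staged_predict(X_test), start=1):
    if i % 100 == 0:
        print(f"iteration {i}: test error = {zero_one_loss(y_test, y_pred):.3f}")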

dev/_downloads/plot_adaboost_regression.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Decision Tree Regression with AdaBoost\n\n\nA decision tree is boosted using the AdaBoost.R2 [1] algorithm on a 1D\nsinusoidal dataset with a small amount of Gaussian noise.\n299 boosts (300 decision trees) is compared with a single decision tree\nregressor. As the number of boosts is increased the regressor can fit more\ndetail.\n\n.. [1] H. Drucker, \"Improving Regressors using Boosting Techniques\", 1997.\n\n\n"
+"\n# Decision Tree Regression with AdaBoost\n\n\nA decision tree is boosted using the AdaBoost.R2 [1]_ algorithm on a 1D\nsinusoidal dataset with a small amount of Gaussian noise.\n299 boosts (300 decision trees) is compared with a single decision tree\nregressor. As the number of boosts is increased the regressor can fit more\ndetail.\n\n.. [1] H. Drucker, \"Improving Regressors using Boosting Techniques\", 1997.\n\n\n"
 ]
 },
 {

dev/_downloads/plot_adaboost_regression.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 Decision Tree Regression with AdaBoost
 ======================================
 
-A decision tree is boosted using the AdaBoost.R2 [1] algorithm on a 1D
+A decision tree is boosted using the AdaBoost.R2 [1]_ algorithm on a 1D
 sinusoidal dataset with a small amount of Gaussian noise.
 299 boosts (300 decision trees) is compared with a single decision tree
 regressor. As the number of boosts is increased the regressor can fit more
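
The comparison described here boils down to fitting one decision tree and one AdaBoost.R2 ensemble of 300 trees on the same noisy sinusoid. The following is a minimal sketch under assumed parameters (signal shape, tree depth), not the example's verbatim code.

import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# Noisy 1D sinusoidal target.
rng = np.random.RandomState(1)
X = np.linspace(0, 6, 100)[:, np.newaxis]
y = np.sin(X).ravel() + np.sin(6 * X).ravel() + rng.normal(0, 0.1, X.shape[0])

tree = DecisionTreeRegressor(max_depth=4).fit(X, y)
boosted = AdaBoostRegressor(DecisionTreeRegressor(max_depth=4),
                            n_estimators=300, random_state=rng).fit(X, y)

# The boosted ensemble captures more detail of the sinusoid than a single tree.
print("single tree R^2:", tree.score(X, y))
print("boosted R^2:   ", boosted.score(X, y))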

dev/_downloads/plot_ensemble_oob.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# OOB Errors for Random Forests\n\n\nThe ``RandomForestClassifier`` is trained using *bootstrap aggregation*, where\neach new tree is fit from a bootstrap sample of the training observations\n$z_i = (x_i, y_i)$. The *out-of-bag* (OOB) error is the average error for\neach $z_i$ calculated using predictions from the trees that do not\ncontain $z_i$ in their respective bootstrap sample. This allows the\n``RandomForestClassifier`` to be fit and validated whilst being trained [1].\n\nThe example below demonstrates how the OOB error can be measured at the\naddition of each new tree during training. The resulting plot allows a\npractitioner to approximate a suitable value of ``n_estimators`` at which the\nerror stabilizes.\n\n.. [1] T. Hastie, R. Tibshirani and J. Friedman, \"Elements of Statistical\n Learning Ed. 2\", p592-593, Springer, 2009.\n\n\n"
+"\n# OOB Errors for Random Forests\n\n\nThe ``RandomForestClassifier`` is trained using *bootstrap aggregation*, where\neach new tree is fit from a bootstrap sample of the training observations\n$z_i = (x_i, y_i)$. The *out-of-bag* (OOB) error is the average error for\neach $z_i$ calculated using predictions from the trees that do not\ncontain $z_i$ in their respective bootstrap sample. This allows the\n``RandomForestClassifier`` to be fit and validated whilst being trained [1]_.\n\nThe example below demonstrates how the OOB error can be measured at the\naddition of each new tree during training. The resulting plot allows a\npractitioner to approximate a suitable value of ``n_estimators`` at which the\nerror stabilizes.\n\n.. [1] T. Hastie, R. Tibshirani and J. Friedman, \"Elements of Statistical\n Learning Ed. 2\", p592-593, Springer, 2009.\n\n\n"
 ]
 },
 {

dev/_downloads/plot_ensemble_oob.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 :math:`z_i = (x_i, y_i)`. The *out-of-bag* (OOB) error is the average error for
 each :math:`z_i` calculated using predictions from the trees that do not
 contain :math:`z_i` in their respective bootstrap sample. This allows the
-``RandomForestClassifier`` to be fit and validated whilst being trained [1].
+``RandomForestClassifier`` to be fit and validated whilst being trained [1]_.
 
 The example below demonstrates how the OOB error can be measured at the
 addition of each new tree during training. The resulting plot allows a
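
A minimal sketch of the measurement this example plots (the dataset and parameter values are assumptions, not the example's verbatim code): grow one forest incrementally with warm_start and record the OOB error rate as trees are added.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=25,
                           n_informative=15, random_state=0)

# warm_start=True reuses the already-fitted trees when n_estimators grows;
# oob_score=True requires bootstrap sampling (the default).
clf = RandomForestClassifier(warm_start=True, oob_score=True, random_state=0)
for n_estimators in range(15, 176, 20):
    clf.set_params(n_estimators=n_estimators)
    clf.fit(X, y)
    print(f"n_estimators={n_estimators:3d}  OOB error rate={1 - clf.oob_score_:.3f}")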
