marcelobeckmann
diff --git a/0.19/.buildinfo b/0.19/.buildinfo
Lines changed: 1 addition & 1 deletion
diff --git a/0.19/_downloads/auto_examples_jupyter.zip b/0.19/_downloads/auto_examples_jupyter.zip
1022 Bytes
diff --git a/0.19/_downloads/auto_examples_python.zip b/0.19/_downloads/auto_examples_python.zip
996 Bytes
diff --git a/0.19/_downloads/plot_adaboost_hastie_10_2.ipynb b/0.19/_downloads/plot_adaboost_hastie_10_2.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/0.19/_downloads/plot_adaboost_hastie_10_2.py b/0.19/_downloads/plot_adaboost_hastie_10_2.py
Lines changed: 5 additions & 5 deletions
diff --git a/0.19/_downloads/plot_adaboost_multiclass.ipynb b/0.19/_downloads/plot_adaboost_multiclass.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/0.19/_downloads/plot_adaboost_multiclass.py b/0.19/_downloads/plot_adaboost_multiclass.py
Lines changed: 2 additions & 2 deletions
diff --git a/0.19/_downloads/plot_adaboost_regression.ipynb b/0.19/_downloads/plot_adaboost_regression.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/0.19/_downloads/plot_adaboost_regression.py b/0.19/_downloads/plot_adaboost_regression.py
Lines changed: 1 addition & 1 deletion
diff --git a/0.19/_downloads/plot_classifier_chain_yeast.ipynb b/0.19/_downloads/plot_classifier_chain_yeast.ipynb
Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: e99d4a45a43b31c63e0f3a38d1b81704
+config: feae17352b9a1e879fecaeccb69c70be
 tags: 645f666f9bcd5a90fca523b33c5a78b7
@@ -15,7 +15,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "\n# Discrete versus Real AdaBoost\n\n\nThis example is based on Figure 10.2 from Hastie et al 2009 [1] and illustrates\nthe difference in performance between the discrete SAMME [2] boosting\nalgorithm and real SAMME.R boosting algorithm. Both algorithms are evaluated\non a binary classification task where the target Y is a non-linear function\nof 10 input features.\n\nDiscrete SAMME AdaBoost adapts based on errors in predicted class labels\nwhereas real SAMME.R uses the predicted class probabilities.\n\n.. [1] T. Hastie, R. Tibshirani and J. Friedman, \"Elements of Statistical\n    Learning Ed. 2\", Springer, 2009.\n\n.. [2] J. Zhu, H. Zou, S. Rosset, T. Hastie, \"Multi-class AdaBoost\", 2009.\n\n\n"
+        "\n# Discrete versus Real AdaBoost\n\n\nThis example is based on Figure 10.2 from Hastie et al 2009 [1]_ and\nillustrates the difference in performance between the discrete SAMME [2]_\nboosting algorithm and real SAMME.R boosting algorithm. Both algorithms are\nevaluated on a binary classification task where the target Y is a non-linear\nfunction of 10 input features.\n\nDiscrete SAMME AdaBoost adapts based on errors in predicted class labels\nwhereas real SAMME.R uses the predicted class probabilities.\n\n.. [1] T. Hastie, R. Tibshirani and J. Friedman, \"Elements of Statistical\n    Learning Ed. 2\", Springer, 2009.\n\n.. [2] J. Zhu, H. Zou, S. Rosset, T. Hastie, \"Multi-class AdaBoost\", 2009.\n\n\n"
       ]
     },
     {
 
@@ -3,11 +3,11 @@
 Discrete versus Real AdaBoost
 =============================
 
-This example is based on Figure 10.2 from Hastie et al 2009 [1] and illustrates
-the difference in performance between the discrete SAMME [2] boosting
-algorithm and real SAMME.R boosting algorithm. Both algorithms are evaluated
-on a binary classification task where the target Y is a non-linear function
-of 10 input features.
+This example is based on Figure 10.2 from Hastie et al 2009 [1]_ and
+illustrates the difference in performance between the discrete SAMME [2]_
+boosting algorithm and real SAMME.R boosting algorithm. Both algorithms are
+evaluated on a binary classification task where the target Y is a non-linear
+function of 10 input features.
 
 Discrete SAMME AdaBoost adapts based on errors in predicted class labels
 whereas real SAMME.R uses the predicted class probabilities.
 
@@ -15,7 +15,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "\n# Multi-class AdaBoosted Decision Trees\n\n\nThis example reproduces Figure 1 of Zhu et al [1] and shows how boosting can\nimprove prediction accuracy on a multi-class problem. The classification\ndataset is constructed by taking a ten-dimensional standard normal distribution\nand defining three classes separated by nested concentric ten-dimensional\nspheres such that roughly equal numbers of samples are in each class (quantiles\nof the $\\chi^2$ distribution).\n\nThe performance of the SAMME and SAMME.R [1] algorithms are compared. SAMME.R\nuses the probability estimates to update the additive model, while SAMME  uses\nthe classifications only. As the example illustrates, the SAMME.R algorithm\ntypically converges faster than SAMME, achieving a lower test error with fewer\nboosting iterations. The error of each algorithm on the test set after each\nboosting iteration is shown on the left, the classification error on the test\nset of each tree is shown in the middle, and the boost weight of each tree is\nshown on the right. All trees have a weight of one in the SAMME.R algorithm and\ntherefore are not shown.\n\n.. [1] J. Zhu, H. Zou, S. Rosset, T. Hastie, \"Multi-class AdaBoost\", 2009.\n\n\n"
+        "\n# Multi-class AdaBoosted Decision Trees\n\n\nThis example reproduces Figure 1 of Zhu et al [1]_ and shows how boosting can\nimprove prediction accuracy on a multi-class problem. The classification\ndataset is constructed by taking a ten-dimensional standard normal distribution\nand defining three classes separated by nested concentric ten-dimensional\nspheres such that roughly equal numbers of samples are in each class (quantiles\nof the $\\chi^2$ distribution).\n\nThe performance of the SAMME and SAMME.R [1]_ algorithms are compared. SAMME.R\nuses the probability estimates to update the additive model, while SAMME  uses\nthe classifications only. As the example illustrates, the SAMME.R algorithm\ntypically converges faster than SAMME, achieving a lower test error with fewer\nboosting iterations. The error of each algorithm on the test set after each\nboosting iteration is shown on the left, the classification error on the test\nset of each tree is shown in the middle, and the boost weight of each tree is\nshown on the right. All trees have a weight of one in the SAMME.R algorithm and\ntherefore are not shown.\n\n.. [1] J. Zhu, H. Zou, S. Rosset, T. Hastie, \"Multi-class AdaBoost\", 2009.\n\n\n"
       ]
     },
     {
 
@@ -3,14 +3,14 @@
 Multi-class AdaBoosted Decision Trees
 =====================================
 
-This example reproduces Figure 1 of Zhu et al [1] and shows how boosting can
+This example reproduces Figure 1 of Zhu et al [1]_ and shows how boosting can
 improve prediction accuracy on a multi-class problem. The classification
 dataset is constructed by taking a ten-dimensional standard normal distribution
 and defining three classes separated by nested concentric ten-dimensional
 spheres such that roughly equal numbers of samples are in each class (quantiles
 of the :math:`\chi^2` distribution).
 
-The performance of the SAMME and SAMME.R [1] algorithms are compared. SAMME.R
+The performance of the SAMME and SAMME.R [1]_ algorithms are compared. SAMME.R
 uses the probability estimates to update the additive model, while SAMME  uses
 the classifications only. As the example illustrates, the SAMME.R algorithm
 typically converges faster than SAMME, achieving a lower test error with fewer
 
@@ -15,7 +15,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "\n# Decision Tree Regression with AdaBoost\n\n\nA decision tree is boosted using the AdaBoost.R2 [1] algorithm on a 1D\nsinusoidal dataset with a small amount of Gaussian noise.\n299 boosts (300 decision trees) is compared with a single decision tree\nregressor. As the number of boosts is increased the regressor can fit more\ndetail.\n\n.. [1] H. Drucker, \"Improving Regressors using Boosting Techniques\", 1997.\n\n\n"
+        "\n# Decision Tree Regression with AdaBoost\n\n\nA decision tree is boosted using the AdaBoost.R2 [1]_ algorithm on a 1D\nsinusoidal dataset with a small amount of Gaussian noise.\n299 boosts (300 decision trees) is compared with a single decision tree\nregressor. As the number of boosts is increased the regressor can fit more\ndetail.\n\n.. [1] H. Drucker, \"Improving Regressors using Boosting Techniques\", 1997.\n\n\n"
       ]
     },
     {
 
@@ -3,7 +3,7 @@
 Decision Tree Regression with AdaBoost
 ======================================
 
-A decision tree is boosted using the AdaBoost.R2 [1] algorithm on a 1D
+A decision tree is boosted using the AdaBoost.R2 [1]_ algorithm on a 1D
 sinusoidal dataset with a small amount of Gaussian noise.
 299 boosts (300 decision trees) is compared with a single decision tree
 regressor. As the number of boosts is increased the regressor can fit more
 
@@ -15,7 +15,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "\n# Classifier Chain\n\nExample of using classifier chain on a multilabel dataset.\n\nFor this example we will use the `yeast\nhttp://mldata.org/repository/data/viewslug/yeast/`_ dataset which\ncontains 2417 datapoints each with 103 features and 14 possible labels. Each\ndatapoint has at least one label. As a baseline we first train a logistic\nregression classifier for each of the 14 labels. To evaluate the performance\nof these classifiers we predict on a held-out test set and calculate the\n`User Guide <jaccard_similarity_score>`.\n\nNext we create 10 classifier chains. Each classifier chain contains a\nlogistic regression model for each of the 14 labels. The models in each\nchain are ordered randomly. In addition to the 103 features in the dataset,\neach model gets the predictions of the preceding models in the chain as\nfeatures (note that by default at training time each model gets the true\nlabels as features). These additional features allow each chain to exploit\ncorrelations among the classes. The Jaccard similarity score for each chain\ntends to be greater than that of the set independent logistic models.\n\nBecause the models in each chain are arranged randomly there is significant\nvariation in performance among the chains. Presumably there is an optimal\nordering of the classes in a chain that will yield the best performance.\nHowever we do not know that ordering a priori. Instead we can construct an\nvoting ensemble of classifier chains by averaging the binary predictions of\nthe chains and apply a threshold of 0.5. The Jaccard similarity score of the\nensemble is greater than that of the independent models and tends to exceed\nthe score of each chain in the ensemble (although this is not guaranteed\nwith randomly ordered chains).\n\n"
+        "\n# Classifier Chain\n\nExample of using classifier chain on a multilabel dataset.\n\nFor this example we will use the `yeast\n<http://mldata.org/repository/data/viewslug/yeast>`_ dataset which contains\n2417 datapoints each with 103 features and 14 possible labels. Each\ndata point has at least one label. As a baseline we first train a logistic\nregression classifier for each of the 14 labels. To evaluate the performance of\nthese classifiers we predict on a held-out test set and calculate the\n`jaccard similarity score <jaccard_similarity_score>`.\n\nNext we create 10 classifier chains. Each classifier chain contains a\nlogistic regression model for each of the 14 labels. The models in each\nchain are ordered randomly. In addition to the 103 features in the dataset,\neach model gets the predictions of the preceding models in the chain as\nfeatures (note that by default at training time each model gets the true\nlabels as features). These additional features allow each chain to exploit\ncorrelations among the classes. The Jaccard similarity score for each chain\ntends to be greater than that of the set independent logistic models.\n\nBecause the models in each chain are arranged randomly there is significant\nvariation in performance among the chains. Presumably there is an optimal\nordering of the classes in a chain that will yield the best performance.\nHowever we do not know that ordering a priori. Instead we can construct an\nvoting ensemble of classifier chains by averaging the binary predictions of\nthe chains and apply a threshold of 0.5. The Jaccard similarity score of the\nensemble is greater than that of the independent models and tends to exceed\nthe score of each chain in the ensemble (although this is not guaranteed\nwith randomly ordered chains).\n\n"
       ]
     },
     {
@@ -26,7 +26,7 @@
       },
       "outputs": [],
       "source": [
-        "print(__doc__)\n\n# Author: Adam Kleczewski\n# License: BSD 3 clause\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn.multioutput import ClassifierChain\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.multiclass import OneVsRestClassifier\nfrom sklearn.metrics import jaccard_similarity_score\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.datasets import fetch_mldata\n\n# Load a multi-label dataset\nyeast = fetch_mldata('yeast')\nX = yeast['data']\nY = yeast['target'].transpose().toarray()\nX_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=.2,\n                                                    random_state=0)\n\n# Fit an independent logistic regression model for each class using the\n# OneVsRestClassifier wrapper.\novr = OneVsRestClassifier(LogisticRegression())\novr.fit(X_train, Y_train)\nY_pred_ovr = ovr.predict(X_test)\novr_jaccard_score = jaccard_similarity_score(Y_test, Y_pred_ovr)\n\n# Fit an ensemble of logistic regression classifier chains and take the\n# take the average prediction of all the chains.\nchains = [ClassifierChain(LogisticRegression(), order='random', random_state=i)\n          for i in range(10)]\nfor chain in chains:\n    chain.fit(X_train, Y_train)\n\nY_pred_chains = np.array([chain.predict(X_test) for chain in\n                          chains])\nchain_jaccard_scores = [jaccard_similarity_score(Y_test, Y_pred_chain >= .5)\n                        for Y_pred_chain in Y_pred_chains]\n\nY_pred_ensemble = Y_pred_chains.mean(axis=0)\nensemble_jaccard_score = jaccard_similarity_score(Y_test,\n                                                  Y_pred_ensemble >= .5)\n\nmodel_scores = [ovr_jaccard_score] + chain_jaccard_scores\nmodel_scores.append(ensemble_jaccard_score)\n\nmodel_names = ('Independent Models',\n               'Chain 1',\n               'Chain 2',\n               'Chain 3',\n               'Chain 4',\n               'Chain 5',\n               
'Chain 6',\n               'Chain 7',\n               'Chain 8',\n               'Chain 9',\n               'Chain 10',\n               'Ensemble Average')\n\ny_pos = np.arange(len(model_names))\ny_pos[1:] += 1\ny_pos[-1] += 1\n\n# Plot the Jaccard similarity scores for the independent model, each of the\n# chains, and the ensemble (note that the vertical axis on this plot does\n# not begin at 0).\n\nfig = plt.figure(figsize=(7, 4))\nplt.title('Classifier Chain Ensemble')\nplt.xticks(y_pos, model_names, rotation='vertical')\nplt.ylabel('Jaccard Similarity Score')\nplt.ylim([min(model_scores) * .9, max(model_scores) * 1.1])\ncolors = ['r'] + ['b'] * len(chain_jaccard_scores) + ['g']\nplt.bar(y_pos, model_scores, align='center', alpha=0.5, color=colors)\nplt.show()"
+        "print(__doc__)\n\n# Author: Adam Kleczewski\n# License: BSD 3 clause\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn.multioutput import ClassifierChain\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.multiclass import OneVsRestClassifier\nfrom sklearn.metrics import jaccard_similarity_score\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.datasets import fetch_mldata\n\n# Load a multi-label dataset\nyeast = fetch_mldata('yeast')\nX = yeast['data']\nY = yeast['target'].transpose().toarray()\nX_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=.2,\n                                                    random_state=0)\n\n# Fit an independent logistic regression model for each class using the\n# OneVsRestClassifier wrapper.\novr = OneVsRestClassifier(LogisticRegression())\novr.fit(X_train, Y_train)\nY_pred_ovr = ovr.predict(X_test)\novr_jaccard_score = jaccard_similarity_score(Y_test, Y_pred_ovr)\n\n# Fit an ensemble of logistic regression classifier chains and take the\n# take the average prediction of all the chains.\nchains = [ClassifierChain(LogisticRegression(), order='random', random_state=i)\n          for i in range(10)]\nfor chain in chains:\n    chain.fit(X_train, Y_train)\n\nY_pred_chains = np.array([chain.predict(X_test) for chain in\n                          chains])\nchain_jaccard_scores = [jaccard_similarity_score(Y_test, Y_pred_chain >= .5)\n                        for Y_pred_chain in Y_pred_chains]\n\nY_pred_ensemble = Y_pred_chains.mean(axis=0)\nensemble_jaccard_score = jaccard_similarity_score(Y_test,\n                                                  Y_pred_ensemble >= .5)\n\nmodel_scores = [ovr_jaccard_score] + chain_jaccard_scores\nmodel_scores.append(ensemble_jaccard_score)\n\nmodel_names = ('Independent',\n               'Chain 1',\n               'Chain 2',\n               'Chain 3',\n               'Chain 4',\n               'Chain 5',\n               'Chain 
6',\n               'Chain 7',\n               'Chain 8',\n               'Chain 9',\n               'Chain 10',\n               'Ensemble')\n\nx_pos = np.arange(len(model_names))\n\n# Plot the Jaccard similarity scores for the independent model, each of the\n# chains, and the ensemble (note that the vertical axis on this plot does\n# not begin at 0).\n\nfig, ax = plt.subplots(figsize=(7, 4))\nax.grid(True)\nax.set_title('Classifier Chain Ensemble Performance Comparison')\nax.set_xticks(x_pos)\nax.set_xticklabels(model_names, rotation='vertical')\nax.set_ylabel('Jaccard Similarity Score')\nax.set_ylim([min(model_scores) * .9, max(model_scores) * 1.1])\ncolors = ['r'] + ['b'] * len(chain_jaccard_scores) + ['g']\nax.bar(x_pos, model_scores, alpha=0.5, color=colors)\nplt.tight_layout()\nplt.show()"
       ]
     }
   ],
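The classifier chain docstring in the hunk above describes building a voting ensemble by averaging the binary predictions of the chains and applying a threshold of 0.5. A minimal sketch of that step follows, using NumPy only and a tiny hand-made label matrix instead of the yeast dataset. The helper names `samplewise_jaccard` and `vote` are hypothetical and not part of scikit-learn, and the per-sample averaged Jaccard index is an assumption about how the score is computed for multilabel data.

```python
import numpy as np


def samplewise_jaccard(y_true, y_pred):
    """Mean per-sample Jaccard index for binary indicator matrices.

    Hypothetical helper: assumes the score is averaged over samples.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    intersection = np.logical_and(y_true, y_pred).sum(axis=1)
    union = np.logical_or(y_true, y_pred).sum(axis=1)
    # A sample with no true and no predicted labels counts as a perfect match.
    scores = np.where(union == 0, 1.0, intersection / np.maximum(union, 1))
    return float(scores.mean())


def vote(chain_predictions, threshold=0.5):
    """Average binary predictions across chains, then apply the threshold."""
    # Shape: (n_chains, n_samples, n_labels)
    stacked = np.asarray(chain_predictions, dtype=float)
    return stacked.mean(axis=0) >= threshold


# Toy data: 2 samples, 3 labels, 3 chains (made up for illustration).
Y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])
chain_preds = np.array([
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 1], [0, 1, 1]],
    [[1, 0, 1], [1, 1, 0]],
])

chain_scores = [samplewise_jaccard(Y_true, p) for p in chain_preds]
Y_ensemble = vote(chain_preds)
ensemble_score = samplewise_jaccard(Y_true, Y_ensemble)

print("per-chain scores:", chain_scores)
print("ensemble score:", ensemble_score)
```

In this toy case each chain makes a different mistake, so the majority vote recovers every label. As the docstring notes, that outcome is not guaranteed with randomly ordered chains.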
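Most of the text changes in this commit turn a bare `[1]` into `[1]_`, which in reStructuredText makes the marker a footnote reference that resolves to the matching `.. [1]` target. The sketch below is a rough way to spot bare markers that have a matching target. It is a regex heuristic, not part of Sphinx or docutils; the function name `unlinked_footnotes` is hypothetical, and only numeric labels are handled.

```python
import re

# A footnote target starts a line as ".. [N]"; a reference is written "[N]_".
TARGET = re.compile(r"^\.\. \[(\d+)\]", re.MULTILINE)
# A bare marker: "[N]" that is neither a target nor followed by "_".
BARE_REF = re.compile(r"(?<!\.\. )\[(\d+)\](?!_)")


def unlinked_footnotes(text):
    """Return labels that have a footnote target but also appear as bare "[N]"."""
    targets = set(TARGET.findall(text))
    bare = set(BARE_REF.findall(text))
    return sorted(targets & bare)


before = (
    "This example is based on Figure 10.2 from Hastie et al 2009 [1] and\n"
    "illustrates the discrete SAMME [2] boosting algorithm.\n"
    "\n"
    ".. [1] T. Hastie, R. Tibshirani and J. Friedman.\n"
    "\n"
    ".. [2] J. Zhu, H. Zou, S. Rosset, T. Hastie.\n"
)
after = before.replace("[1] and", "[1]_ and").replace("[2] boosting", "[2]_ boosting")

print(unlinked_footnotes(before))
print(unlinked_footnotes(after))
```

The first call reports both labels and the second reports none, which matches the before and after sides of the docstring hunks.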