
Commit a756f4f

Pushing the docs to dev/ for branch: master, commit 6e98eec04319ba5e4e250b9af33ade31a18e4a91
1 parent c3dac6f commit a756f4f

907 files changed: +2693 −2688 lines


dev/_downloads/plot_nested_cross_validation_iris.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 },
 {
 "source": [
-"\n# Nested versus non-nested cross-validation\n\n\nThis example compares non-nested and nested cross-validation strategies on a\nclassifier of the iris data set. Nested cross-validation (CV) is often used to\ntrain a model in which hyperparameters also need to be optimized. Nested CV\nestimates the generalization error of the underlying model and its\n(hyper)parameter search. Choosing the parameters that maximize non-nested CV\nbiases the model to the dataset, yielding an overly-optimistic score.\n\nModel selection without nested CV uses the same data to tune model parameters\nand evaluate model performance. Information may thus \"leak\" into the model\nand overfit the data. The magnitude of this effect is primarily dependent on\nthe size of the dataset and the stability of the model. See Cawley and Talbot\n[1]_ for an analysis of these issues.\n\nTo avoid this problem, nested CV effectively uses a series of\ntrain/validation/test set splits. In the inner loop, the score is approximately\nmaximized by fitting a model to each training set, and then directly maximized\nin selecting (hyper)parameters over the validation set. In the outer loop,\ngeneralization error is estimated by averaging test set scores over several\ndataset splits.\n\nThe example below uses a support vector classifier with a non-linear kernel to\nbuild a model with optimized hyperparameters by grid search. We compare the\nperformance of non-nested and nested CV strategies by taking the difference\nbetween their scores.\n\n.. topic:: See Also:\n\n - `cross_validation`\n - `grid_search`\n\n.. topic:: References:\n\n .. [1] `Cawley, G.C.; Talbot, N.L.C. On over-fitting in model selection and\n subsequent selection bias in performance evaluation.\n J. Mach. Learn. Res 2010,11, 2079-2107.\n <http://jmlr.csail.mit.edu/papers/volume11/cawley10a/cawley10a.pdf>`_\n\n\n"
+"\n# Nested versus non-nested cross-validation\n\n\nThis example compares non-nested and nested cross-validation strategies on a\nclassifier of the iris data set. Nested cross-validation (CV) is often used to\ntrain a model in which hyperparameters also need to be optimized. Nested CV\nestimates the generalization error of the underlying model and its\n(hyper)parameter search. Choosing the parameters that maximize non-nested CV\nbiases the model to the dataset, yielding an overly-optimistic score.\n\nModel selection without nested CV uses the same data to tune model parameters\nand evaluate model performance. Information may thus \"leak\" into the model\nand overfit the data. The magnitude of this effect is primarily dependent on\nthe size of the dataset and the stability of the model. See Cawley and Talbot\n[1]_ for an analysis of these issues.\n\nTo avoid this problem, nested CV effectively uses a series of\ntrain/validation/test set splits. In the inner loop (here executed by\n:class:`GridSearchCV <sklearn.model_selection.GridSearchCV>`), the score is\napproximately maximized by fitting a model to each training set, and then\ndirectly maximized in selecting (hyper)parameters over the validation set. In\nthe outer loop (here in :func:`cross_val_score\n<sklearn.model_selection.cross_val_score>`), generalization error is estimated\nby averaging test set scores over several dataset splits.\n\nThe example below uses a support vector classifier with a non-linear kernel to\nbuild a model with optimized hyperparameters by grid search. We compare the\nperformance of non-nested and nested CV strategies by taking the difference\nbetween their scores.\n\n.. topic:: See Also:\n\n - `cross_validation`\n - `grid_search`\n\n.. topic:: References:\n\n .. [1] `Cawley, G.C.; Talbot, N.L.C. On over-fitting in model selection and\n subsequent selection bias in performance evaluation.\n J. Mach. Learn. Res 2010,11, 2079-2107.\n <http://jmlr.csail.mit.edu/papers/volume11/cawley10a/cawley10a.pdf>`_\n\n\n"
 ],
 "cell_type": "markdown",
 "metadata": {}

dev/_downloads/plot_nested_cross_validation_iris.py

Lines changed: 7 additions & 5 deletions
@@ -17,11 +17,13 @@
 [1]_ for an analysis of these issues.
 
 To avoid this problem, nested CV effectively uses a series of
-train/validation/test set splits. In the inner loop, the score is approximately
-maximized by fitting a model to each training set, and then directly maximized
-in selecting (hyper)parameters over the validation set. In the outer loop,
-generalization error is estimated by averaging test set scores over several
-dataset splits.
+train/validation/test set splits. In the inner loop (here executed by
+:class:`GridSearchCV <sklearn.model_selection.GridSearchCV>`), the score is
+approximately maximized by fitting a model to each training set, and then
+directly maximized in selecting (hyper)parameters over the validation set. In
+the outer loop (here in :func:`cross_val_score
+<sklearn.model_selection.cross_val_score>`), generalization error is estimated
+by averaging test set scores over several dataset splits.
 
 The example below uses a support vector classifier with a non-linear kernel to
 build a model with optimized hyperparameters by grid search. We compare the
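The inner/outer-loop wording added above maps directly onto scikit-learn's API: GridSearchCV runs the inner loop and cross_val_score the outer loop. The following is a minimal sketch of that pattern, not code taken from this commit; the parameter grid, fold counts, and random_state values are illustrative assumptions.

```python
# Minimal sketch of nested vs. non-nested CV on the iris data.
# Grid values, fold counts, and random_state are illustrative only.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}

inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)

# Inner loop: GridSearchCV selects (hyper)parameters on train/validation splits.
clf = GridSearchCV(SVC(kernel="rbf"), param_grid=param_grid, cv=inner_cv)
clf.fit(X, y)
non_nested_score = clf.best_score_  # tuned and scored on the same data

# Outer loop: cross_val_score estimates the generalization error of the
# whole search by averaging test-set scores over the outer splits.
nested_score = cross_val_score(clf, X, y, cv=outer_cv).mean()

print(non_nested_score, nested_score)
```

Because clf is a GridSearchCV estimator, passing it to cross_val_score re-runs the full (hyper)parameter search inside every outer training fold; the difference between the two scores is the optimism that non-nested CV introduces.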

dev/_downloads/scikit-learn-docs.pdf

Binary file not shown (−6.08 KB).
