
Commit 7f3580e

Pushing the docs to dev/ for branch: master, commit f4e7d2b19a9432f66ed22b01bae76f31af1db00f
1 parent 662021d · commit 7f3580e


1,107 files changed: 3,392 additions and 3,392 deletions

Binary file (3 Bytes) not shown.
Binary file (3 Bytes) not shown.

dev/_downloads/plot_all_scaling.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Compare the effect of different scalers on data with outliers\n\n\nFeature 0 (median income in a block) and feature 5 (number of households) of\nthe `California housing dataset\n<http://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html>`_ have very\ndifferent scales and contain some very large outliers. These two\ncharacteristics lead to difficulties to visualize the data and, more\nimportantly, they can degrade the predictive performance of many machine\nlearning algorithms. Unscaled data can also slow down or even prevent the\nconvergence of many gradient-based estimators.\n\nIndeed many estimators are designed with the assumption that each feature takes\nvalues close to zero or more importantly that all features vary on comparable\nscales. In particular, metric-based and gradient-based estimators often assume\napproximately standardized data (centered features with unit variances). A\nnotable exception are decision tree-based estimators that are robust to\narbitrary scaling of the data.\n\nThis example uses different scalers, transformers, and normalizers to bring the\ndata within a pre-defined range.\n\nScalers are linear (or more precisely affine) transformers and differ from each\nother in the way to estimate the parameters used to shift and scale each\nfeature.\n\n``QuantileTransformer`` provides non-linear transformations in which distances\nbetween marginal outliers and inliers are shrunk. ``PowerTransformer`` provides\nnon-linear transformations in which data is mapped to a normal distribution to\nstabilize variance and minimize skewness.\n\nUnlike the previous transformations, normalization refers to a per sample\ntransformation instead of a per feature transformation.\n\nThe following code is a bit verbose, feel free to jump directly to the analysis\nof the results_.\n\n\n"
+"\n# Compare the effect of different scalers on data with outliers\n\n\nFeature 0 (median income in a block) and feature 5 (number of households) of\nthe `California housing dataset\n<https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html>`_ have very\ndifferent scales and contain some very large outliers. These two\ncharacteristics lead to difficulties to visualize the data and, more\nimportantly, they can degrade the predictive performance of many machine\nlearning algorithms. Unscaled data can also slow down or even prevent the\nconvergence of many gradient-based estimators.\n\nIndeed many estimators are designed with the assumption that each feature takes\nvalues close to zero or more importantly that all features vary on comparable\nscales. In particular, metric-based and gradient-based estimators often assume\napproximately standardized data (centered features with unit variances). A\nnotable exception are decision tree-based estimators that are robust to\narbitrary scaling of the data.\n\nThis example uses different scalers, transformers, and normalizers to bring the\ndata within a pre-defined range.\n\nScalers are linear (or more precisely affine) transformers and differ from each\nother in the way to estimate the parameters used to shift and scale each\nfeature.\n\n``QuantileTransformer`` provides non-linear transformations in which distances\nbetween marginal outliers and inliers are shrunk. ``PowerTransformer`` provides\nnon-linear transformations in which data is mapped to a normal distribution to\nstabilize variance and minimize skewness.\n\nUnlike the previous transformations, normalization refers to a per sample\ntransformation instead of a per feature transformation.\n\nThe following code is a bit verbose, feel free to jump directly to the analysis\nof the results_.\n\n\n"
 ]
 },
 {

dev/_downloads/plot_all_scaling.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 
 Feature 0 (median income in a block) and feature 5 (number of households) of
 the `California housing dataset
-<http://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html>`_ have very
+<https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html>`_ have very
 different scales and contain some very large outliers. These two
 characteristics lead to difficulties to visualize the data and, more
 importantly, they can degrade the predictive performance of many machine
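The docstring above distinguishes affine scalers, the non-linear ``QuantileTransformer`` and ``PowerTransformer``, and per-sample normalization. A minimal sketch of that distinction, using synthetic skewed data with injected outliers as a stand-in for the example's California housing features (the stand-in data and the printed summary are assumptions, not part of the example itself):

    # Minimal sketch: behaviour of the transformers named in the docstring on
    # skewed data with large outliers. The synthetic data is an assumption;
    # the actual example uses the California housing features.
    import numpy as np
    from sklearn.preprocessing import (MinMaxScaler, Normalizer,
                                       PowerTransformer, QuantileTransformer,
                                       RobustScaler, StandardScaler)

    rng = np.random.RandomState(0)
    X = rng.lognormal(size=(1000, 2))  # skewed, strictly positive features
    X[:10] *= 100                      # inject a few very large outliers

    transformers = [
        StandardScaler(),  # affine: center, scale by std (outlier-sensitive)
        MinMaxScaler(),    # affine: map observed min/max to [0, 1]
        RobustScaler(),    # affine: median/IQR, robust to outliers
        QuantileTransformer(output_distribution="normal"),  # shrinks outlier distances
        PowerTransformer(method="box-cox"),  # maps toward a normal distribution
        Normalizer(),      # per-sample: scale each row to unit norm
    ]
    for tf in transformers:
        Xt = tf.fit_transform(X)
        print(type(tf).__name__, Xt.min(axis=0).round(2), Xt.max(axis=0).round(2))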

dev/_downloads/plot_digits_last_image.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# The Digit Dataset\n\n\nThis dataset is made up of 1797 8x8 images. Each image,\nlike the one shown below, is of a hand-written digit.\nIn order to utilize an 8x8 figure like this, we'd have to\nfirst transform it into a feature vector with length 64.\n\nSee `here\n<http://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits>`_\nfor more information about this dataset.\n\n"
+"\n# The Digit Dataset\n\n\nThis dataset is made up of 1797 8x8 images. Each image,\nlike the one shown below, is of a hand-written digit.\nIn order to utilize an 8x8 figure like this, we'd have to\nfirst transform it into a feature vector with length 64.\n\nSee `here\n<https://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits>`_\nfor more information about this dataset.\n\n"
 ]
 },
 {

dev/_downloads/plot_digits_last_image.py

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
 first transform it into a feature vector with length 64.
 
 See `here
-<http://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits>`_
+<https://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits>`_
 for more information about this dataset.
 """
 print(__doc__)
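The docstring describes flattening each 8x8 image into a 64-element feature vector. A short sketch of that step (``load_digits`` exposes both the raw images and the flattened vectors, so the reshape can be checked directly):

    # Sketch of the flattening step the docstring describes.
    import numpy as np
    from sklearn.datasets import load_digits

    digits = load_digits()
    print(digits.images.shape)  # (1797, 8, 8): raw images
    print(digits.data.shape)    # (1797, 64): flattened feature vectors

    # Flattening the last image by hand reproduces the precomputed row.
    assert np.array_equal(digits.images[-1].reshape(64), digits.data[-1])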

dev/_downloads/plot_t_sne_perplexity.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n=============================================================================\nt-SNE: The effect of various perplexity values on the shape\n=============================================================================\n\nAn illustration of t-SNE on the two concentric circles and the S-curve\ndatasets for different perplexity values.\n\nWe observe a tendency towards clearer shapes as the preplexity value increases.\n\nThe size, the distance and the shape of clusters may vary upon initialization,\nperplexity values and does not always convey a meaning.\n\nAs shown below, t-SNE for higher perplexities finds meaningful topology of\ntwo concentric circles, however the size and the distance of the circles varies\nslightly from the original. Contrary to the two circles dataset, the shapes\nvisually diverge from S-curve topology on the S-curve dataset even for\nlarger perplexity values.\n\nFor further details, \"How to Use t-SNE Effectively\"\nhttp://distill.pub/2016/misread-tsne/ provides a good discussion of the\neffects of various parameters, as well as interactive plots to explore\nthose effects.\n\n"
+"\n=============================================================================\nt-SNE: The effect of various perplexity values on the shape\n=============================================================================\n\nAn illustration of t-SNE on the two concentric circles and the S-curve\ndatasets for different perplexity values.\n\nWe observe a tendency towards clearer shapes as the preplexity value increases.\n\nThe size, the distance and the shape of clusters may vary upon initialization,\nperplexity values and does not always convey a meaning.\n\nAs shown below, t-SNE for higher perplexities finds meaningful topology of\ntwo concentric circles, however the size and the distance of the circles varies\nslightly from the original. Contrary to the two circles dataset, the shapes\nvisually diverge from S-curve topology on the S-curve dataset even for\nlarger perplexity values.\n\nFor further details, \"How to Use t-SNE Effectively\"\nhttps://distill.pub/2016/misread-tsne/ provides a good discussion of the\neffects of various parameters, as well as interactive plots to explore\nthose effects.\n\n"
 ]
 },
 {

dev/_downloads/plot_t_sne_perplexity.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
 larger perplexity values.
 
 For further details, "How to Use t-SNE Effectively"
-http://distill.pub/2016/misread-tsne/ provides a good discussion of the
+https://distill.pub/2016/misread-tsne/ provides a good discussion of the
 effects of various parameters, as well as interactive plots to explore
 those effects.
 """

dev/_downloads/scikit-learn-docs.pdf

Binary file (-1.04 KB) not shown.

dev/_images/iris.png

0 Bytes
