
Commit f6e96bc

Pushing the docs to dev/ for branch: master, commit b9403f62ac65e7e6575168ef74b43fb012010599
1 parent 5c7354c commit f6e96bc

File tree

1,194 files changed: +3717 -3690 lines


dev/_downloads/0ca65f327d0d82be7fdda748f857d5b4/plot_poisson_regression_non_normal_loss.ipynb

Lines changed: 3 additions & 3 deletions
@@ -123,7 +123,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"(Generalized) Linear models\n---------------------------\n\nWe start by modeling the target variable with the (l2 penalized) least\nsquares linear regression model, more commonly known as Ridge regression. We\nuse a low penalization `alpha`, as we expect such a linear model to under-fit\non such a large dataset.\n\n"
+"(Generalized) linear models\n---------------------------\n\nWe start by modeling the target variable with the (l2 penalized) least\nsquares linear regression model, more commonly known as Ridge regression. We\nuse a low penalization `alpha`, as we expect such a linear model to under-fit\non such a large dataset.\n\n"
 ]
 },
 {
@@ -159,7 +159,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Next we fit the Poisson regressor on the target variable. We set the\nregularization strength ``alpha`` to approximately 1e-6 over number of\nsamples (i.e. `1e-12`) in order to mimic the Ridge regressor whose L2 penalty\nterm scales differently with the number of samples.\n\n"
+"Next we fit the Poisson regressor on the target variable. We set the\nregularization strength ``alpha`` to approximately 1e-6 over number of\nsamples (i.e. `1e-12`) in order to mimic the Ridge regressor whose L2 penalty\nterm scales differently with the number of samples.\n\nSince the Poisson regressor internally models the log of the expected target\nvalue instead of the expected value directly (log vs identity link function),\nthe relationship between X and y is not exactly linear anymore. Therefore the\nPoisson regressor is called a Generalized Linear Model (GLM) rather than a\nvanilla linear model as is the case for Ridge regression.\n\n"
 ]
 },
 {
@@ -177,7 +177,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Finally, we will consider a non-linear model, namely Gradient Boosting\nRegression Trees. Tree-based models do not require the categorical data to be\none-hot encoded: instead, we can encode each category label with an arbitrary\ninteger using :class:`~sklearn.preprocessing.OrdinalEncoder`. With this\nencoding, the trees will treat the categorical features as ordered features,\nwhich might not always be a desired behavior. However this effect is limited\nfor deep enough trees which are able to recover the categorical nature of the\nfeatures. The main advantage of the\n:class:`~sklearn.preprocessing.OrdinalEncoder` over the\n:class:`~sklearn.preprocessing.OneHotEncoder` is that it will make training\nfaster.\n\nGradient Boosting also gives the possibility to fit the trees with a Poisson\nloss (with an implicit log-link function) instead of the default\nleast-squares loss. Here we only fit trees with the Poisson loss to keep this\nexample concise.\n\n"
+"Gradient Boosting Regression Trees for Poisson regression\n---------------------------------------------------------\n\nFinally, we will consider a non-linear model, namely Gradient Boosting\nRegression Trees. Tree-based models do not require the categorical data to be\none-hot encoded: instead, we can encode each category label with an arbitrary\ninteger using :class:`~sklearn.preprocessing.OrdinalEncoder`. With this\nencoding, the trees will treat the categorical features as ordered features,\nwhich might not always be a desired behavior. However this effect is limited\nfor deep enough trees which are able to recover the categorical nature of the\nfeatures. The main advantage of the\n:class:`~sklearn.preprocessing.OrdinalEncoder` over the\n:class:`~sklearn.preprocessing.OneHotEncoder` is that it will make training\nfaster.\n\nGradient Boosting also gives the possibility to fit the trees with a Poisson\nloss (with an implicit log-link function) instead of the default\nleast-squares loss. Here we only fit trees with the Poisson loss to keep this\nexample concise.\n\n"
 ]
 },
 {
Binary file not shown.
Binary file not shown.

dev/_downloads/f686bae9e47a0517ddbf86ced97151b6/plot_poisson_regression_non_normal_loss.py

Lines changed: 10 additions & 1 deletion
@@ -184,7 +184,7 @@ def score_estimator(estimator, df_test):
 score_estimator(dummy, df_test)
 
 ##############################################################################
-# (Generalized) Linear models
+# (Generalized) linear models
 # ---------------------------
 #
 # We start by modeling the target variable with the (l2 penalized) least
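The Ridge baseline this hunk refers to can be sketched as follows. This is a minimal, hypothetical stand-in: the toy features and target below replace the example's actual insurance dataset, and only the low-`alpha` l2-penalized least-squares fit is illustrated.

```python
# Minimal sketch of the l2-penalized least-squares baseline (Ridge
# regression) with low penalization `alpha`. The toy data is a
# hypothetical stand-in for the dataset used in the example.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.uniform(size=(100, 3))
# A non-negative, roughly linear target.
y = np.maximum(0.0, X @ np.array([1.0, 2.0, 0.5]) + rng.normal(scale=0.1, size=100))

ridge = Ridge(alpha=1e-6)  # weak l2 penalty, as discussed in the diff
ridge.fit(X, y)
print(ridge.coef_)  # one coefficient per feature
```

A low `alpha` barely constrains the coefficients; the example expects the linear model to under-fit regardless, so heavier regularization would not help.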
@@ -217,6 +217,12 @@ def score_estimator(estimator, df_test):
 # regularization strength ``alpha`` to approximately 1e-6 over number of
 # samples (i.e. `1e-12`) in order to mimic the Ridge regressor whose L2 penalty
 # term scales differently with the number of samples.
+#
+# Since the Poisson regressor internally models the log of the expected target
+# value instead of the expected value directly (log vs identity link function),
+# the relationship between X and y is not exactly linear anymore. Therefore the
+# Poisson regressor is called a Generalized Linear Model (GLM) rather than a
+# vanilla linear model as is the case for Ridge regression.
 
 from sklearn.linear_model import PoissonRegressor
 
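The Poisson GLM fit described above can be sketched like this. The synthetic count target (whose log-mean is linear in X, matching the log link) is an illustrative assumption, not the example's data; the tiny `alpha=1e-12` mirrors the value discussed in the diff.

```python
# Minimal sketch of the Poisson GLM fit: a log link means the model
# learns log(E[y]) as a linear function of X. The synthetic counts are
# a hypothetical stand-in for the claims data.
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 2))
# Counts drawn so that log(E[y]) is linear in X, matching the log link.
y = rng.poisson(lam=np.exp(X @ np.array([0.5, 1.0])))

glm = PoissonRegressor(alpha=1e-12, max_iter=300)
glm.fit(X, y)
pred = glm.predict(X)  # exp of the linear term, hence strictly positive
```

Because predictions pass through `exp`, they are always positive, which is what makes the Poisson GLM a natural fit for count targets where Ridge can predict negative values.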
@@ -233,6 +239,9 @@ def score_estimator(estimator, df_test):
 score_estimator(poisson_glm, df_test)
 
 ##############################################################################
+# Gradient Boosting Regression Trees for Poisson regression
+# ---------------------------------------------------------
+#
 # Finally, we will consider a non-linear model, namely Gradient Boosting
 # Regression Trees. Tree-based models do not require the categorical data to be
 # one-hot encoded: instead, we can encode each category label with an arbitrary

dev/_downloads/scikit-learn-docs.pdf

31 KB
Binary file not shown.

dev/_images/iris.png

0 Bytes
-276 Bytes
-4 Bytes

0 commit comments
