krishnatray
diff --git a/dev/_downloads/face_recognition.py b/dev/_downloads/face_recognition.py
Lines changed: 3 additions & 1 deletion
diff --git a/dev/_downloads/plot_gpr_co2.py b/dev/_downloads/plot_gpr_co2.py
Lines changed: 27 additions & 21 deletions
diff --git a/dev/_sources/auto_examples/applications/face_recognition.txt b/dev/_sources/auto_examples/applications/face_recognition.txt
Lines changed: 4 additions & 2 deletions
diff --git a/dev/_sources/auto_examples/gaussian_process/plot_gpr_co2.txt b/dev/_sources/auto_examples/gaussian_process/plot_gpr_co2.txt
Lines changed: 30 additions & 24 deletions
diff --git a/dev/_sources/datasets/index.txt b/dev/_sources/datasets/index.txt
Lines changed: 6 additions & 6 deletions
diff --git a/dev/_sources/datasets/rcv1.txt b/dev/_sources/datasets/rcv1.txt
Lines changed: 2 additions & 2 deletions
diff --git a/dev/_sources/modules/decomposition.txt b/dev/_sources/modules/decomposition.txt
Lines changed: 1 addition & 0 deletions
diff --git a/dev/_sources/modules/feature_selection.txt b/dev/_sources/modules/feature_selection.txt
Lines changed: 2 additions & 0 deletions
diff --git a/dev/_sources/modules/gaussian_process.txt b/dev/_sources/modules/gaussian_process.txt
Lines changed: 33 additions & 26 deletions
diff --git a/dev/_sources/modules/multiclass.txt b/dev/_sources/modules/multiclass.txt
Lines changed: 4 additions & 4 deletions
@@ -12,8 +12,9 @@
 
 Expected results for the top 5 most represented people in the dataset::
 
+================== ============ ======= ========== =======
                    precision    recall  f1-score   support
-
+================== ============ ======= ========== =======
      Ariel Sharon       0.67      0.92      0.77        13
      Colin Powell       0.75      0.78      0.76        60
   Donald Rumsfeld       0.78      0.67      0.72        27
@@ -23,6 +24,7 @@
        Tony Blair       0.81      0.69      0.75        36
 
       avg / total       0.80      0.80      0.80       322
+================== ============ ======= ========== =======
 
 """
 from __future__ import print_function
 
@@ -13,34 +13,40 @@
 
 The kernel is composed of several terms that are responsible for explaining
 different properties of the signal:
- - a long term, smooth rising trend is to be explained by an RBF kernel. The
-   RBF kernel with a large length-scale enforces this component to be smooth;
-   it is not enforced that the trend is rising which leaves this choice to the
-   GP. The specific length-scale and the amplitude are free hyperparameters.
- - a seasonal component, which is to be explained by the periodic
-   ExpSineSquared kernel with a fixed periodicity of 1 year. The length-scale
-   of this periodic component, controlling its smoothness, is a free parameter.
-   In order to allow decaying away from exact periodicity, the product with an
-   RBF kernel is taken. The length-scale of this RBF component controls the
-   decay time and is a further free parameter.
- - smaller, medium term irregularities are to be explained by a
-   RationalQuadratic kernel component, whose length-scale and alpha parameter,
-   which determines the diffuseness of the length-scales, are to be determined.
-   According to [RW2006], these irregularities can better be explained by
-   a RationalQuadratic than an RBF kernel component, probably because it can
-   accommodate several length-scales.
- - a "noise" term, consisting of an RBF kernel contribution, which shall
-   explain the correlated noise components such as local weather phenomena,
-   and a WhiteKernel contribution for the white noise. The relative amplitudes
-   and the RBF's length scale are further free parameters.
+
+- a long term, smooth rising trend is to be explained by an RBF kernel. The
+  RBF kernel with a large length-scale enforces this component to be smooth;
+  it is not enforced that the trend is rising which leaves this choice to the
+  GP. The specific length-scale and the amplitude are free hyperparameters.
+
+- a seasonal component, which is to be explained by the periodic
+  ExpSineSquared kernel with a fixed periodicity of 1 year. The length-scale
+  of this periodic component, controlling its smoothness, is a free parameter.
+  In order to allow decaying away from exact periodicity, the product with an
+  RBF kernel is taken. The length-scale of this RBF component controls the
+  decay time and is a further free parameter.
+
+- smaller, medium term irregularities are to be explained by a
+  RationalQuadratic kernel component, whose length-scale and alpha parameter,
+  which determines the diffuseness of the length-scales, are to be determined.
+  According to [RW2006], these irregularities can better be explained by
+  a RationalQuadratic than an RBF kernel component, probably because it can
+  accommodate several length-scales.
+
+- a "noise" term, consisting of an RBF kernel contribution, which shall
+  explain the correlated noise components such as local weather phenomena,
+  and a WhiteKernel contribution for the white noise. The relative amplitudes
+  and the RBF's length scale are further free parameters.
 
 Maximizing the log-marginal-likelihood after subtracting the target's mean
-yields the following kernel with an LML of -83.214:
+yields the following kernel with an LML of -83.214::
+
    34.4**2 * RBF(length_scale=41.8)
    + 3.27**2 * RBF(length_scale=180) * ExpSineSquared(length_scale=1.44,
                                                       periodicity=1)
    + 0.446**2 * RationalQuadratic(alpha=17.7, length_scale=0.957)
    + 0.197**2 * RBF(length_scale=0.138) + WhiteKernel(noise_level=0.0336)
+
 Thus, most of the target signal (34.4ppm) is explained by a long-term rising
 trend (length-scale 41.8 years). The periodic component has an amplitude of
 3.27ppm, a decay time of 180 years and a length-scale of 1.44. The long decay
 
@@ -16,8 +16,9 @@ The dataset used in this example is a preprocessed excerpt of the
 
 Expected results for the top 5 most represented people in the dataset::
 
+================== ============ ======= ========== =======
                    precision    recall  f1-score   support
-
+================== ============ ======= ========== =======
      Ariel Sharon       0.67      0.92      0.77        13
      Colin Powell       0.75      0.78      0.76        60
   Donald Rumsfeld       0.78      0.67      0.72        27
@@ -27,11 +28,12 @@ Gerhard Schroeder       0.76      0.76      0.76        25
        Tony Blair       0.81      0.69      0.75        36
 
       avg / total       0.80      0.80      0.80       322
+================== ============ ======= ========== =======
 
 
 
 **Python source code:** :download:`face_recognition.py <face_recognition.py>`
 
 .. literalinclude:: face_recognition.py
-    :lines: 28-
+    :lines: 30-
 
@@ -17,34 +17,40 @@ model the CO2 concentration as a function of the time t.
 
 The kernel is composed of several terms that are responsible for explaining
 different properties of the signal:
- - a long term, smooth rising trend is to be explained by an RBF kernel. The
-   RBF kernel with a large length-scale enforces this component to be smooth;
-   it is not enforced that the trend is rising which leaves this choice to the
-   GP. The specific length-scale and the amplitude are free hyperparameters.
- - a seasonal component, which is to be explained by the periodic
-   ExpSineSquared kernel with a fixed periodicity of 1 year. The length-scale
-   of this periodic component, controlling its smoothness, is a free parameter.
-   In order to allow decaying away from exact periodicity, the product with an
-   RBF kernel is taken. The length-scale of this RBF component controls the
-   decay time and is a further free parameter.
- - smaller, medium term irregularities are to be explained by a
-   RationalQuadratic kernel component, whose length-scale and alpha parameter,
-   which determines the diffuseness of the length-scales, are to be determined.
-   According to [RW2006], these irregularities can better be explained by
-   a RationalQuadratic than an RBF kernel component, probably because it can
-   accommodate several length-scales.
- - a "noise" term, consisting of an RBF kernel contribution, which shall
-   explain the correlated noise components such as local weather phenomena,
-   and a WhiteKernel contribution for the white noise. The relative amplitudes
-   and the RBF's length scale are further free parameters.
+
+- a long term, smooth rising trend is to be explained by an RBF kernel. The
+  RBF kernel with a large length-scale enforces this component to be smooth;
+  it is not enforced that the trend is rising which leaves this choice to the
+  GP. The specific length-scale and the amplitude are free hyperparameters.
+
+- a seasonal component, which is to be explained by the periodic
+  ExpSineSquared kernel with a fixed periodicity of 1 year. The length-scale
+  of this periodic component, controlling its smoothness, is a free parameter.
+  In order to allow decaying away from exact periodicity, the product with an
+  RBF kernel is taken. The length-scale of this RBF component controls the
+  decay time and is a further free parameter.
+
+- smaller, medium term irregularities are to be explained by a
+  RationalQuadratic kernel component, whose length-scale and alpha parameter,
+  which determines the diffuseness of the length-scales, are to be determined.
+  According to [RW2006], these irregularities can better be explained by
+  a RationalQuadratic than an RBF kernel component, probably because it can
+  accommodate several length-scales.
+
+- a "noise" term, consisting of an RBF kernel contribution, which shall
+  explain the correlated noise components such as local weather phenomena,
+  and a WhiteKernel contribution for the white noise. The relative amplitudes
+  and the RBF's length scale are further free parameters.
 
 Maximizing the log-marginal-likelihood after subtracting the target's mean
-yields the following kernel with an LML of -83.214:
+yields the following kernel with an LML of -83.214::
+
    34.4**2 * RBF(length_scale=41.8)
    + 3.27**2 * RBF(length_scale=180) * ExpSineSquared(length_scale=1.44,
                                                       periodicity=1)
    + 0.446**2 * RationalQuadratic(alpha=17.7, length_scale=0.957)
    + 0.197**2 * RBF(length_scale=0.138) + WhiteKernel(noise_level=0.0336)
+
 Thus, most of the target signal (34.4ppm) is explained by a long-term rising
 trend (length-scale 41.8 years). The periodic component has an amplitude of
 3.27ppm, a decay time of 180 years and a length-scale of 1.44. The long decay
@@ -74,8 +80,8 @@ confident predictions until around 2015.
 **Python source code:** :download:`plot_gpr_co2.py <plot_gpr_co2.py>`
 
 .. literalinclude:: plot_gpr_co2.py
-    :lines: 54-
+    :lines: 60-
 
-**Total running time of the example:**  31.99 seconds
-( 0 minutes  31.99 seconds)
+**Total running time of the example:**  33.09 seconds
+( 0 minutes  33.09 seconds)
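
Reviewer note on the plot_gpr_co2 hunks above: the fitted kernel quoted in the docstring can be evaluated by hand to see how the four terms combine. The sketch below is not taken from the example itself. It uses only the standard library, assumes the usual textbook definitions of the RBF, ExpSineSquared, RationalQuadratic and white-noise kernels, and assumes time is measured in years. The helper names (``rbf``, ``exp_sine_squared``, ``rational_quadratic``, ``white``, ``co2_kernel``) are hypothetical.

```python
import math


def rbf(d, length_scale):
    """RBF kernel value for two inputs a distance d apart."""
    return math.exp(-d ** 2 / (2.0 * length_scale ** 2))


def exp_sine_squared(d, length_scale, periodicity):
    """Periodic kernel: equals 1 whenever d is a multiple of the period."""
    s = math.sin(math.pi * d / periodicity)
    return math.exp(-2.0 * s ** 2 / length_scale ** 2)


def rational_quadratic(d, alpha, length_scale):
    """Rational quadratic kernel: a scale mixture of RBF kernels."""
    return (1.0 + d ** 2 / (2.0 * alpha * length_scale ** 2)) ** (-alpha)


def white(t1, t2, noise_level):
    """White-noise term: contributes only when both inputs coincide."""
    return noise_level if t1 == t2 else 0.0


def co2_kernel(t1, t2):
    """Sum of the four terms, with the hyperparameters quoted in the docstring."""
    d = abs(t1 - t2)
    trend = 34.4 ** 2 * rbf(d, 41.8)
    seasonal = 3.27 ** 2 * rbf(d, 180) * exp_sine_squared(d, 1.44, 1)
    medium_term = 0.446 ** 2 * rational_quadratic(d, 17.7, 0.957)
    noise = 0.197 ** 2 * rbf(d, 0.138) + white(t1, t2, 0.0336)
    return trend + seasonal + medium_term + noise


if __name__ == "__main__":
    # Covariance with the same month one year later vs. half a year later.
    print(co2_kernel(2000.0, 2001.0))
    print(co2_kernel(2000.0, 2000.5))
```

Under these assumptions, two points exactly one year apart are more strongly correlated than two points half a year apart, which is the effect of the seasonal term.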
 
@@ -267,26 +267,26 @@ features::
 
 .. include:: rcv1.rst
 
-.. _boston_house_prices
+.. _boston_house_prices:
 
 .. include:: ../../sklearn/datasets/descr/boston_house_prices.rst
 
-.. _breast_cancer
+.. _breast_cancer:
 
 .. include:: ../../sklearn/datasets/descr/breast_cancer.rst
 
-.. _diabetes
+.. _diabetes:
 
 .. include:: ../../sklearn/datasets/descr/diabetes.rst
 
-.. _digits
+.. _digits:
 
 .. include:: ../../sklearn/datasets/descr/digits.rst
 
-.. _iris
+.. _iris:
 
 .. include:: ../../sklearn/datasets/descr/iris.rst
 
-.. _linnerud
+.. _linnerud:
 
 .. include:: ../../sklearn/datasets/descr/linnerud.rst
@@ -41,10 +41,10 @@ There are 103 topics, each represented by a string. Their corpus frequencies spa
     >>> rcv1.target_names[:3].tolist()  # doctest: +SKIP
     ['E11', 'ECAT', 'M11']
 
-The dataset will be downloaded from the `dataset's homepage`_ if necessary.
+The dataset will be downloaded from the `rcv1 homepage`_ if necessary.
 The compressed size is about 656 MB.
 
-.. _dataset's homepage: http://jmlr.csail.mit.edu/papers/volume5/lewis04a/
+.. _rcv1 homepage: http://jmlr.csail.mit.edu/papers/volume5/lewis04a/
 
 
 .. topic:: References
 
@@ -776,6 +776,7 @@ a corpus with :math:`D` documents and :math:`K` topics:
   2. For each document :math:`d`, draw :math:`\theta_d \sim Dirichlet(\alpha), \: d=1...D`
 
   3. For each word :math:`i` in document :math:`d`:
+
     a. Draw a topic index :math:`z_{di} \sim Multinomial(\theta_d)`
     b. Draw the observed word :math:`w_{ij} \sim Multinomial(beta_{z_{di}}.)`
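
Reviewer note on the decomposition.txt hunk above: the generative steps in the list can be sketched in plain Python. This is an illustration, not the library's implementation. The hunk does not show step 1, so the sketch assumes that each topic's word distribution is drawn from a symmetric Dirichlet with a parameter called ``eta`` here. The function names, the fixed document length and the use of integer word ids are also assumptions.

```python
import random


def dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [x / total for x in draws]


def generate_corpus(n_docs, n_topics, vocab_size, doc_length,
                    alpha, eta, seed=0):
    """Sample word ids for n_docs documents following the listed steps."""
    rng = random.Random(seed)
    # Step 1 (assumed, not visible in the hunk): one word distribution
    # per topic.
    beta = [dirichlet([eta] * vocab_size, rng) for _ in range(n_topics)]
    corpus, thetas = [], []
    for _ in range(n_docs):
        # Step 2: topic proportions theta_d for this document.
        theta = dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_length):
            # Step 3a: draw a topic index from Multinomial(theta_d).
            z = rng.choices(range(n_topics), weights=theta)[0]
            # Step 3b: draw the observed word from that topic's distribution.
            w = rng.choices(range(vocab_size), weights=beta[z])[0]
            words.append(w)
        corpus.append(words)
        thetas.append(theta)
    return corpus, thetas


if __name__ == "__main__":
    docs, _ = generate_corpus(n_docs=3, n_topics=2, vocab_size=5,
                              doc_length=4, alpha=0.5, eta=0.5, seed=1)
    for doc in docs:
        print(doc)
```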
 
 
@@ -153,6 +153,8 @@ For examples on how it is to be used refer to the sections below.
       most important features from the Boston dataset without knowing the
       threshold beforehand.
 
+.. _l1_feature_selection:
+
 L1-based feature selection
 --------------------------
 
 
@@ -67,12 +67,15 @@ level from the data (see example below).
 
 The implementation is based on Algorithm 2.1 of [RW2006]_. In addition to
 the API of standard sklearn estimators, GaussianProcessRegressor:
-     * allows prediction without prior fitting (based on the GP prior)
-     * provides an additional method ``sample_y(X)``, which evaluates samples
-       drawn from the GPR (prior or posterior) at given inputs
-     * exposes a method ``log_marginal_likelihood(theta)``, which can be used
-       externally for other ways of selecting hyperparameters, e.g., via
-       Markov chain Monte Carlo.
+
+* allows prediction without prior fitting (based on the GP prior)
+
+* provides an additional method ``sample_y(X)``, which evaluates samples
+  drawn from the GPR (prior or posterior) at given inputs
+
+* exposes a method ``log_marginal_likelihood(theta)``, which can be used
+  externally for other ways of selecting hyperparameters, e.g., via
+  Markov chain Monte Carlo.
 
 
 GPR examples
@@ -171,26 +174,30 @@ model the CO2 concentration as a function of the time t.
 
 The kernel is composed of several terms that are responsible for explaining
 different properties of the signal:
- - a long term, smooth rising trend is to be explained by an RBF kernel. The
-   RBF kernel with a large length-scale enforces this component to be smooth;
-   it is not enforced that the trend is rising which leaves this choice to the
-   GP. The specific length-scale and the amplitude are free hyperparameters.
- - a seasonal component, which is to be explained by the periodic
-   ExpSineSquared kernel with a fixed periodicity of 1 year. The length-scale
-   of this periodic component, controlling its smoothness, is a free parameter.
-   In order to allow decaying away from exact periodicity, the product with an
-   RBF kernel is taken. The length-scale of this RBF component controls the
-   decay time and is a further free parameter.
- - smaller, medium term irregularities are to be explained by a
-   RationalQuadratic kernel component, whose length-scale and alpha parameter,
-   which determines the diffuseness of the length-scales, are to be determined.
-   According to [RW2006]_, these irregularities can better be explained by
-   a RationalQuadratic than an RBF kernel component, probably because it can
-   accommodate several length-scales.
- - a "noise" term, consisting of an RBF kernel contribution, which shall
-   explain the correlated noise components such as local weather phenomena,
-   and a WhiteKernel contribution for the white noise. The relative amplitudes
-   and the RBF's length scale are further free parameters.
+
+- a long term, smooth rising trend is to be explained by an RBF kernel. The
+  RBF kernel with a large length-scale enforces this component to be smooth;
+  it is not enforced that the trend is rising which leaves this choice to the
+  GP. The specific length-scale and the amplitude are free hyperparameters.
+
+- a seasonal component, which is to be explained by the periodic
+  ExpSineSquared kernel with a fixed periodicity of 1 year. The length-scale
+  of this periodic component, controlling its smoothness, is a free parameter.
+  In order to allow decaying away from exact periodicity, the product with an
+  RBF kernel is taken. The length-scale of this RBF component controls the
+  decay time and is a further free parameter.
+
+- smaller, medium term irregularities are to be explained by a
+  RationalQuadratic kernel component, whose length-scale and alpha parameter,
+  which determines the diffuseness of the length-scales, are to be determined.
+  According to [RW2006]_, these irregularities can better be explained by
+  a RationalQuadratic than an RBF kernel component, probably because it can
+  accommodate several length-scales.
+
+- a "noise" term, consisting of an RBF kernel contribution, which shall
+  explain the correlated noise components such as local weather phenomena,
+  and a WhiteKernel contribution for the white noise. The relative amplitudes
+  and the RBF's length scale are further free parameters.
 
 Maximizing the log-marginal-likelihood after subtracting the target's mean
 yields the following kernel with an LML of -83.214:
 
@@ -214,7 +214,7 @@ code book. The code size is the dimensionality of the aforementioned space.
 Intuitively, each class should be represented by a code as unique as
 possible and a good code book should be designed to optimize classification
 accuracy. In this implementation, we simply use a randomly-generated code
-book as advocated in [2]_ although more elaborate methods may be added in the
+book as advocated in [3]_ although more elaborate methods may be added in the
 future.
 
 At fitting time, one binary classifier per bit in the code book is fitted.
@@ -261,16 +261,16 @@ Below is an example of multiclass learning using Output-Codes::
 
 .. topic:: References:
 
-    .. [1] "Solving multiclass learning problems via error-correcting output codes",
+    .. [2] "Solving multiclass learning problems via error-correcting output codes",
         Dietterich T., Bakiri G.,
         Journal of Artificial Intelligence Research 2,
         1995.
 
-    .. [2] "The error coding method and PICTs",
+    .. [3] "The error coding method and PICTs",
         James G., Hastie T.,
         Journal of Computational and Graphical statistics 7,
         1998.
 
-    .. [3] "The Elements of Statistical Learning",
+    .. [4] "The Elements of Statistical Learning",
         Hastie T., Tibshirani R., Friedman J., page 606 (second-edition)
         2008.