scikit-learn
diff --git a/‎dev/.buildinfo
Lines changed: 1 addition & 1 deletion b/‎dev/.buildinfo
Lines changed: 1 addition & 1 deletion
diff --git a/‎dev/_downloads/07fcc19ba03226cd3d83d4e40ec44385/auto_examples_python.zip
59 Bytes b/‎dev/_downloads/07fcc19ba03226cd3d83d4e40ec44385/auto_examples_python.zip
59 Bytes
diff --git a/‎dev/_downloads/6f1e7a639e0699d6164445b55e6c116d/auto_examples_jupyter.zip
54 Bytes b/‎dev/_downloads/6f1e7a639e0699d6164445b55e6c116d/auto_examples_jupyter.zip
54 Bytes
diff --git a/‎dev/_downloads/764d061a261a2e06ad21ec9133361b2d/plot_precision_recall.ipynb
Lines changed: 1 addition & 1 deletion b/‎dev/_downloads/764d061a261a2e06ad21ec9133361b2d/plot_precision_recall.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/‎dev/_downloads/98161c8b335acb98de356229c1005819/plot_precision_recall.py
Lines changed: 4 additions & 3 deletions b/‎dev/_downloads/98161c8b335acb98de356229c1005819/plot_precision_recall.py
Lines changed: 4 additions & 3 deletions
diff --git a/‎dev/_downloads/scikit-learn-docs.zip
-15.5 KB b/‎dev/_downloads/scikit-learn-docs.zip
-15.5 KB
diff --git a/‎dev/_images/sphx_glr_plot_agglomerative_clustering_003.png
166 Bytes b/‎dev/_images/sphx_glr_plot_agglomerative_clustering_003.png
166 Bytes
diff --git a/‎dev/_images/sphx_glr_plot_agglomerative_clustering_004.png
-176 Bytes b/‎dev/_images/sphx_glr_plot_agglomerative_clustering_004.png
-176 Bytes
diff --git a/‎dev/_images/sphx_glr_plot_anomaly_comparison_001.png
-247 Bytes b/‎dev/_images/sphx_glr_plot_anomaly_comparison_001.png
-247 Bytes
diff --git a/‎dev/_images/sphx_glr_plot_anomaly_comparison_thumb.png
-14 Bytes b/‎dev/_images/sphx_glr_plot_anomaly_comparison_thumb.png
-14 Bytes
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 5a526b24b98f10b90bee64a93afd2a98
+config: a728e4d6f5e9004cd02b486e8d35ddfe
 tags: 645f666f9bcd5a90fca523b33c5a78b7
@@ -4,7 +4,7 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "\n# Precision-Recall\n\nExample of Precision-Recall metric to evaluate classifier output quality.\n\nPrecision-Recall is a useful measure of success of prediction when the\nclasses are very imbalanced. In information retrieval, precision is a\nmeasure of result relevancy, while recall is a measure of how many truly\nrelevant results are returned.\n\nThe precision-recall curve shows the tradeoff between precision and\nrecall for different threshold. A high area under the curve represents\nboth high recall and high precision, where high precision relates to a\nlow false positive rate, and high recall relates to a low false negative\nrate. High scores for both show that the classifier is returning accurate\nresults (high precision), as well as returning a majority of all positive\nresults (high recall).\n\nA system with high recall but low precision returns many results, but most of\nits predicted labels are incorrect when compared to the training labels. A\nsystem with high precision but low recall is just the opposite, returning very\nfew results, but most of its predicted labels are correct when compared to the\ntraining labels. An ideal system with high precision and high recall will\nreturn many results, with all results labeled correctly.\n\nPrecision ($P$) is defined as the number of true positives ($T_p$)\nover the number of true positives plus the number of false positives\n($F_p$).\n\n$P = \\frac{T_p}{T_p+F_p}$\n\nRecall ($R$) is defined as the number of true positives ($T_p$)\nover the number of true positives plus the number of false negatives\n($F_n$).\n\n$R = \\frac{T_p}{T_p + F_n}$\n\nThese quantities are also related to the ($F_1$) score, which is defined\nas the harmonic mean of precision and recall.\n\n$F1 = 2\\frac{P \\times R}{P+R}$\n\nNote that the precision may not decrease with recall. The\ndefinition of precision ($\\frac{T_p}{T_p + F_p}$) shows that lowering\nthe threshold of a classifier may increase the denominator, by increasing the\nnumber of results returned. If the threshold was previously set too high, the\nnew results may all be true positives, which will increase precision. If the\nprevious threshold was about right or too low, further lowering the threshold\nwill introduce false positives, decreasing precision.\n\nRecall is defined as $\\frac{T_p}{T_p+F_n}$, where $T_p+F_n$ does\nnot depend on the classifier threshold. This means that lowering the classifier\nthreshold may increase recall, by increasing the number of true positive\nresults. It is also possible that lowering the threshold may leave recall\nunchanged, while the precision fluctuates.\n\nThe relationship between recall and precision can be observed in the\nstairstep area of the plot - at the edges of these steps a small change\nin the threshold considerably reduces precision, with only a minor gain in\nrecall.\n\n**Average precision** (AP) summarizes such a plot as the weighted mean of\nprecisions achieved at each threshold, with the increase in recall from the\nprevious threshold used as the weight:\n\n$\\text{AP} = \\sum_n (R_n - R_{n-1}) P_n$\n\nwhere $P_n$ and $R_n$ are the precision and recall at the\nnth threshold. A pair $(R_k, P_k)$ is referred to as an\n*operating point*.\n\nAP and the trapezoidal area under the operating points\n(:func:`sklearn.metrics.auc`) are common ways to summarize a precision-recall\ncurve that lead to different results. Read more in the\n`User Guide <precision_recall_f_measure_metrics>`.\n\nPrecision-recall curves are typically used in binary classification to study\nthe output of a classifier. In order to extend the precision-recall curve and\naverage precision to multi-class or multi-label classification, it is necessary\nto binarize the output. One curve can be drawn per label, but one can also draw\na precision-recall curve by considering each element of the label indicator\nmatrix as a binary prediction (micro-averaging).\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>See also :func:`sklearn.metrics.average_precision_score`,\n             :func:`sklearn.metrics.recall_score`,\n             :func:`sklearn.metrics.precision_score`,\n             :func:`sklearn.metrics.f1_score`</p></div>\n"
+        "\n# Precision-Recall\n\nExample of Precision-Recall metric to evaluate classifier output quality.\n\nPrecision-Recall is a useful measure of success of prediction when the\nclasses are very imbalanced. In information retrieval, precision is a\nmeasure of result relevancy, while recall is a measure of how many truly\nrelevant results are returned.\n\nThe precision-recall curve shows the tradeoff between precision and\nrecall for different threshold. A high area under the curve represents\nboth high recall and high precision, where high precision relates to a\nlow false positive rate, and high recall relates to a low false negative\nrate. High scores for both show that the classifier is returning accurate\nresults (high precision), as well as returning a majority of all positive\nresults (high recall).\n\nA system with high recall but low precision returns many results, but most of\nits predicted labels are incorrect when compared to the training labels. A\nsystem with high precision but low recall is just the opposite, returning very\nfew results, but most of its predicted labels are correct when compared to the\ntraining labels. An ideal system with high precision and high recall will\nreturn many results, with all results labeled correctly.\n\nPrecision ($P$) is defined as the number of true positives ($T_p$)\nover the number of true positives plus the number of false positives\n($F_p$).\n\n$P = \\frac{T_p}{T_p+F_p}$\n\nRecall ($R$) is defined as the number of true positives ($T_p$)\nover the number of true positives plus the number of false negatives\n($F_n$).\n\n$R = \\frac{T_p}{T_p + F_n}$\n\nThese quantities are also related to the $F_1$ score, which is the\nharmonic mean of precision and recall. Thus, we can compute the $F_1$\nusing the following formula:\n\n$F_1 = \\frac{2T_p}{2T_p + F_p + F_n}$\n\nNote that the precision may not decrease with recall. The\ndefinition of precision ($\\frac{T_p}{T_p + F_p}$) shows that lowering\nthe threshold of a classifier may increase the denominator, by increasing the\nnumber of results returned. If the threshold was previously set too high, the\nnew results may all be true positives, which will increase precision. If the\nprevious threshold was about right or too low, further lowering the threshold\nwill introduce false positives, decreasing precision.\n\nRecall is defined as $\\frac{T_p}{T_p+F_n}$, where $T_p+F_n$ does\nnot depend on the classifier threshold. This means that lowering the classifier\nthreshold may increase recall, by increasing the number of true positive\nresults. It is also possible that lowering the threshold may leave recall\nunchanged, while the precision fluctuates.\n\nThe relationship between recall and precision can be observed in the\nstairstep area of the plot - at the edges of these steps a small change\nin the threshold considerably reduces precision, with only a minor gain in\nrecall.\n\n**Average precision** (AP) summarizes such a plot as the weighted mean of\nprecisions achieved at each threshold, with the increase in recall from the\nprevious threshold used as the weight:\n\n$\\text{AP} = \\sum_n (R_n - R_{n-1}) P_n$\n\nwhere $P_n$ and $R_n$ are the precision and recall at the\nnth threshold. A pair $(R_k, P_k)$ is referred to as an\n*operating point*.\n\nAP and the trapezoidal area under the operating points\n(:func:`sklearn.metrics.auc`) are common ways to summarize a precision-recall\ncurve that lead to different results. Read more in the\n`User Guide <precision_recall_f_measure_metrics>`.\n\nPrecision-recall curves are typically used in binary classification to study\nthe output of a classifier. In order to extend the precision-recall curve and\naverage precision to multi-class or multi-label classification, it is necessary\nto binarize the output. One curve can be drawn per label, but one can also draw\na precision-recall curve by considering each element of the label indicator\nmatrix as a binary prediction (micro-averaging).\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>See also :func:`sklearn.metrics.average_precision_score`,\n             :func:`sklearn.metrics.recall_score`,\n             :func:`sklearn.metrics.precision_score`,\n             :func:`sklearn.metrics.f1_score`</p></div>\n"
       ]
     },
     {
 
@@ -37,10 +37,11 @@
 
 :math:`R = \\frac{T_p}{T_p + F_n}`
 
-These quantities are also related to the (:math:`F_1`) score, which is defined
-as the harmonic mean of precision and recall.
+These quantities are also related to the :math:`F_1` score, which is the
+harmonic mean of precision and recall. Thus, we can compute the :math:`F_1`
+using the following formula:
 
-:math:`F1 = 2\\frac{P \\times R}{P+R}`
+:math:`F_1 = \\frac{2T_p}{2T_p + F_p + F_n}`
 
 Note that the precision may not decrease with recall. The
 definition of precision (:math:`\\frac{T_p}{T_p + F_p}`) shows that lowering
Original file line number	Diff line number	Diff line change
`@@ -4,7 +4,7 @@`
`4`	`4`	`"cell_type": "markdown",`
`5`	`5`	`"metadata": {},`
`6`	`6`	`"source": [`
`7`		- "\n# Precision-Recall\n\nExample of Precision-Recall metric to evaluate classifier output quality.\n\nPrecision-Recall is a useful measure of success of prediction when the\nclasses are very imbalanced. In information retrieval, precision is a\nmeasure of result relevancy, while recall is a measure of how many truly\nrelevant results are returned.\n\nThe precision-recall curve shows the tradeoff between precision and\nrecall for different threshold. A high area under the curve represents\nboth high recall and high precision, where high precision relates to a\nlow false positive rate, and high recall relates to a low false negative\nrate. High scores for both show that the classifier is returning accurate\nresults (high precision), as well as returning a majority of all positive\nresults (high recall).\n\nA system with high recall but low precision returns many results, but most of\nits predicted labels are incorrect when compared to the training labels. A\nsystem with high precision but low recall is just the opposite, returning very\nfew results, but most of its predicted labels are correct when compared to the\ntraining labels. An ideal system with high precision and high recall will\nreturn many results, with all results labeled correctly.\n\nPrecision ($P$) is defined as the number of true positives ($T_p$)\nover the number of true positives plus the number of false positives\n($F_p$).\n\n$P = \\frac{T_p}{T_p+F_p}$\n\nRecall ($R$) is defined as the number of true positives ($T_p$)\nover the number of true positives plus the number of false negatives\n($F_n$).\n\n$R = \\frac{T_p}{T_p + F_n}$\n\nThese quantities are also related to the ($F_1$) score, which is defined\nas the harmonic mean of precision and recall.\n\n$F1 = 2\\frac{P \\times R}{P+R}$\n\nNote that the precision may not decrease with recall. The\ndefinition of precision ($\\frac{T_p}{T_p + F_p}$) shows that lowering\nthe threshold of a classifier may increase the denominator, by increasing the\nnumber of results returned. If the threshold was previously set too high, the\nnew results may all be true positives, which will increase precision. If the\nprevious threshold was about right or too low, further lowering the threshold\nwill introduce false positives, decreasing precision.\n\nRecall is defined as $\\frac{T_p}{T_p+F_n}$, where $T_p+F_n$ does\nnot depend on the classifier threshold. This means that lowering the classifier\nthreshold may increase recall, by increasing the number of true positive\nresults. It is also possible that lowering the threshold may leave recall\nunchanged, while the precision fluctuates.\n\nThe relationship between recall and precision can be observed in the\nstairstep area of the plot - at the edges of these steps a small change\nin the threshold considerably reduces precision, with only a minor gain in\nrecall.\n\nAverage precision (AP) summarizes such a plot as the weighted mean of\nprecisions achieved at each threshold, with the increase in recall from the\nprevious threshold used as the weight:\n\n$\\text{AP} = \\sum_n (R_n - R_{n-1}) P_n$\n\nwhere $P_n$ and $R_n$ are the precision and recall at the\nnth threshold. A pair $(R_k, P_k)$ is referred to as an\noperating point.\n\nAP and the trapezoidal area under the operating points\n(:func:`sklearn.metrics.auc`) are common ways to summarize a precision-recall\ncurve that lead to different results. Read more in the\n`User Guide <precision_recall_f_measure_metrics>`.\n\nPrecision-recall curves are typically used in binary classification to study\nthe output of a classifier. In order to extend the precision-recall curve and\naverage precision to multi-class or multi-label classification, it is necessary\nto binarize the output. One curve can be drawn per label, but one can also draw\na precision-recall curve by considering each element of the label indicator\nmatrix as a binary prediction (micro-averaging).\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>See also :func:`sklearn.metrics.average_precision_score`,\n :func:`sklearn.metrics.recall_score`,\n :func:`sklearn.metrics.precision_score`,\n :func:`sklearn.metrics.f1_score`</p></div>\n"
	`7`	+ "\n# Precision-Recall\n\nExample of Precision-Recall metric to evaluate classifier output quality.\n\nPrecision-Recall is a useful measure of success of prediction when the\nclasses are very imbalanced. In information retrieval, precision is a\nmeasure of result relevancy, while recall is a measure of how many truly\nrelevant results are returned.\n\nThe precision-recall curve shows the tradeoff between precision and\nrecall for different threshold. A high area under the curve represents\nboth high recall and high precision, where high precision relates to a\nlow false positive rate, and high recall relates to a low false negative\nrate. High scores for both show that the classifier is returning accurate\nresults (high precision), as well as returning a majority of all positive\nresults (high recall).\n\nA system with high recall but low precision returns many results, but most of\nits predicted labels are incorrect when compared to the training labels. A\nsystem with high precision but low recall is just the opposite, returning very\nfew results, but most of its predicted labels are correct when compared to the\ntraining labels. An ideal system with high precision and high recall will\nreturn many results, with all results labeled correctly.\n\nPrecision ($P$) is defined as the number of true positives ($T_p$)\nover the number of true positives plus the number of false positives\n($F_p$).\n\n$P = \\frac{T_p}{T_p+F_p}$\n\nRecall ($R$) is defined as the number of true positives ($T_p$)\nover the number of true positives plus the number of false negatives\n($F_n$).\n\n$R = \\frac{T_p}{T_p + F_n}$\n\nThese quantities are also related to the $F_1$ score, which is the\nharmonic mean of precision and recall. Thus, we can compute the $F_1$\nusing the following formula:\n\n$F_1 = \\frac{2T_p}{2T_p + F_p + F_n}$\n\nNote that the precision may not decrease with recall. The\ndefinition of precision ($\\frac{T_p}{T_p + F_p}$) shows that lowering\nthe threshold of a classifier may increase the denominator, by increasing the\nnumber of results returned. If the threshold was previously set too high, the\nnew results may all be true positives, which will increase precision. If the\nprevious threshold was about right or too low, further lowering the threshold\nwill introduce false positives, decreasing precision.\n\nRecall is defined as $\\frac{T_p}{T_p+F_n}$, where $T_p+F_n$ does\nnot depend on the classifier threshold. This means that lowering the classifier\nthreshold may increase recall, by increasing the number of true positive\nresults. It is also possible that lowering the threshold may leave recall\nunchanged, while the precision fluctuates.\n\nThe relationship between recall and precision can be observed in the\nstairstep area of the plot - at the edges of these steps a small change\nin the threshold considerably reduces precision, with only a minor gain in\nrecall.\n\nAverage precision (AP) summarizes such a plot as the weighted mean of\nprecisions achieved at each threshold, with the increase in recall from the\nprevious threshold used as the weight:\n\n$\\text{AP} = \\sum_n (R_n - R_{n-1}) P_n$\n\nwhere $P_n$ and $R_n$ are the precision and recall at the\nnth threshold. A pair $(R_k, P_k)$ is referred to as an\noperating point.\n\nAP and the trapezoidal area under the operating points\n(:func:`sklearn.metrics.auc`) are common ways to summarize a precision-recall\ncurve that lead to different results. Read more in the\n`User Guide <precision_recall_f_measure_metrics>`.\n\nPrecision-recall curves are typically used in binary classification to study\nthe output of a classifier. In order to extend the precision-recall curve and\naverage precision to multi-class or multi-label classification, it is necessary\nto binarize the output. One curve can be drawn per label, but one can also draw\na precision-recall curve by considering each element of the label indicator\nmatrix as a binary prediction (micro-averaging).\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>See also :func:`sklearn.metrics.average_precision_score`,\n :func:`sklearn.metrics.recall_score`,\n :func:`sklearn.metrics.precision_score`,\n :func:`sklearn.metrics.f1_score`</p></div>\n"
`8`	`8`	`]`
`9`	`9`	`},`
`10`	`10`	`{`