|
15 | 15 | "cell_type": "markdown",
|
16 | 16 | "metadata": {},
|
17 | 17 | "source": [
|
18 |
| - "\n# Adjustment for chance in clustering performance evaluation\n\nThis notebook explores the impact of uniformly-distributed random labeling on\nthe behavior of some clustering evaluation metrics. For such purpose, the\nmetrics are computed with a fixed number of samples and as a function of the number\nof clusters assigned by the estimator. The example is divided into two\nexperiments:\n\n- a first experiment with fixed \"ground truth labels\" (and therefore fixed\n number of classes) and randomly \"predicted labels\";\n- a second experiment with varying \"ground truth labels\", randomly \"predicted\n labels\". The \"predicted labels\" have the same number of classes and clusters\n as the \"ground truth labels\".\n" |
| 18 | + "\n# Adjustment for chance in clustering performance evaluation\nThis notebook explores the impact of uniformly distributed random labeling on\nthe behavior of some clustering evaluation metrics. For this purpose, the\nmetrics are computed with a fixed number of samples and as a function of the number\nof clusters assigned by the estimator. The example is divided into two\nexperiments:\n\n- a first experiment with fixed \"ground truth labels\" (and therefore a fixed\n number of classes) and randomly generated \"predicted labels\";\n- a second experiment with varying \"ground truth labels\" and randomly\n generated \"predicted labels\". The \"predicted labels\" have the same number of\n classes and clusters as the \"ground truth labels\".\n" |
19 | 19 | ]
|
20 | 20 | },
|
21 | 21 | {
|
|
98 | 98 | },
|
99 | 99 | "outputs": [],
|
100 | 100 | "source": [
|
101 |
| - "import matplotlib.pyplot as plt\nimport matplotlib.style as style\n\nn_samples = 1000\nn_classes = 10\nn_clusters_range = np.linspace(2, 100, 10).astype(int)\nplots = []\nnames = []\n\nstyle.use(\"seaborn-colorblind\")\nplt.figure(1)\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n\n scores = fixed_classes_uniform_labelings_scores(\n score_func, n_samples, n_clusters_range, n_classes=n_classes\n )\n plots.append(\n plt.errorbar(\n n_clusters_range,\n scores.mean(axis=1),\n scores.std(axis=1),\n alpha=0.8,\n linewidth=1,\n marker=marker,\n )[0]\n )\n names.append(score_name)\n\nplt.title(\n \"Clustering measures for random uniform labeling\\n\"\n f\"against reference assignment with {n_classes} classes\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.ylim(bottom=-0.05, top=1.05)\nplt.legend(plots, names)\nplt.show()" |
| 101 | + "import matplotlib.pyplot as plt\nimport seaborn as sns\n\nn_samples = 1000\nn_classes = 10\nn_clusters_range = np.linspace(2, 100, 10).astype(int)\nplots = []\nnames = []\n\nsns.set_palette(\"colorblind\")\nplt.figure(1)\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n scores = fixed_classes_uniform_labelings_scores(\n score_func, n_samples, n_clusters_range, n_classes=n_classes\n )\n plots.append(\n plt.errorbar(\n n_clusters_range,\n scores.mean(axis=1),\n scores.std(axis=1),\n alpha=0.8,\n linewidth=1,\n marker=marker,\n )[0]\n )\n names.append(score_name)\n\nplt.title(\n \"Clustering measures for random uniform labeling\\n\"\n f\"against reference assignment with {n_classes} classes\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.ylim(bottom=-0.05, top=1.05)\nplt.legend(plots, names, bbox_to_anchor=(0.5, 0.5))\nplt.show()" |
102 | 102 | ]
|
103 | 103 | },
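For context, `fixed_classes_uniform_labelings_scores` is defined in an earlier cell that this diff does not show. A minimal sketch of what it plausibly computes — uniformly random labelings scored against one fixed ground truth, over several runs per cluster count — could look like the following (the `n_runs` parameter and the seed are assumptions, not taken from the notebook):

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score

def fixed_classes_uniform_labelings_scores(
    score_func, n_samples, n_clusters_range, n_classes, n_runs=5, seed=42
):
    # Score uniformly random labelings (with a varying number of clusters)
    # against one fixed "ground truth" labeling with n_classes classes.
    rng = np.random.RandomState(seed)
    scores = np.zeros((len(n_clusters_range), n_runs))
    labels_a = rng.randint(low=0, high=n_classes, size=n_samples)
    for i, n_clusters in enumerate(n_clusters_range):
        for j in range(n_runs):
            labels_b = rng.randint(low=0, high=n_clusters, size=n_samples)
            scores[i, j] = score_func(labels_a, labels_b)
    return scores

# One row per cluster count, one column per random run.
scores = fixed_classes_uniform_labelings_scores(
    adjusted_rand_score, n_samples=200, n_clusters_range=[2, 10, 50], n_classes=10
)
print(scores.shape)  # (3, 5)
```

With an adjusted metric such as ARI, every row of `scores` averages out near zero, which is exactly the flat curve the plot in this cell produces.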
|
104 | 104 | {
|
|
134 | 134 | },
|
135 | 135 | "outputs": [],
|
136 | 136 | "source": [
|
137 |
| - "n_samples = 100\nn_clusters_range = np.linspace(2, n_samples, 10).astype(int)\n\nplt.figure(2)\n\nplots = []\nnames = []\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n\n scores = uniform_labelings_scores(score_func, n_samples, n_clusters_range)\n plots.append(\n plt.errorbar(\n n_clusters_range,\n np.median(scores, axis=1),\n scores.std(axis=1),\n alpha=0.8,\n linewidth=2,\n marker=marker,\n )[0]\n )\n names.append(score_name)\n\nplt.title(\n \"Clustering measures for 2 random uniform labelings\\nwith equal number of clusters\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.legend(plots, names)\nplt.ylim(bottom=-0.05, top=1.05)\nplt.show()" |
| 137 | + "n_samples = 100\nn_clusters_range = np.linspace(2, n_samples, 10).astype(int)\n\nplt.figure(2)\n\nplots = []\nnames = []\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n scores = uniform_labelings_scores(score_func, n_samples, n_clusters_range)\n plots.append(\n plt.errorbar(\n n_clusters_range,\n np.median(scores, axis=1),\n scores.std(axis=1),\n alpha=0.8,\n linewidth=2,\n marker=marker,\n )[0]\n )\n names.append(score_name)\n\nplt.title(\n \"Clustering measures for 2 random uniform labelings\\n\"\n \"with equal number of clusters\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.legend(plots, names)\nplt.ylim(bottom=-0.05, top=1.05)\nplt.show()" |
138 | 138 | ]
|
139 | 139 | },
|
140 | 140 | {
|
141 | 141 | "cell_type": "markdown",
|
142 | 142 | "metadata": {},
|
143 | 143 | "source": [
|
144 |
| - "We observe similar results as for the first experiment: adjusted for chance\nmetrics stay constantly near zero while other metrics tend to get larger with\nfiner-grained labelings. The mean V-measure of random labeling increases\nsignificantly as the number of clusters is closer to the total number of\nsamples used to compute the measure. Furthermore, raw Mutual Information is\nunbounded from above and its scale depends on the dimensions of the clustering\nproblem and the cardinality of the ground truth classes.\n\nOnly adjusted measures can hence be safely used as a consensus index to\nevaluate the average stability of clustering algorithms for a given value of k\non various overlapping sub-samples of the dataset.\n\nNon-adjusted clustering evaluation metric can therefore be misleading as they\noutput large values for fine-grained labelings, one could be lead to think\nthat the labeling has captured meaningful groups while they can be totally\nrandom. In particular, such non-adjusted metrics should not be used to compare\nthe results of different clustering algorithms that output a different number\nof clusters.\n\n" |
| 144 | + "We observe similar results as for the first experiment: adjusted-for-chance\nmetrics stay consistently near zero while other metrics tend to get larger\nwith finer-grained labelings. The mean V-measure of random labeling increases\nsignificantly as the number of clusters gets closer to the total number of\nsamples used to compute the measure. Furthermore, raw Mutual Information is\nunbounded from above and its scale depends on the dimensions of the clustering\nproblem and the cardinality of the ground truth classes. This is why its\ncurve goes off the chart.\n\nOnly adjusted measures can hence be safely used as a consensus index to\nevaluate the average stability of clustering algorithms for a given value of k\non various overlapping sub-samples of the dataset.\n\nNon-adjusted clustering evaluation metrics can therefore be misleading, as\nthey output large values for fine-grained labelings: one could be led to think\nthat the labeling has captured meaningful groups while it may be totally\nrandom. In particular, such non-adjusted metrics should not be used to compare\nthe results of different clustering algorithms that output different numbers\nof clusters.\n\n" |
145 | 145 | ]
|
146 | 146 | }
|
147 | 147 | ],
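The closing cell's warning about non-adjusted metrics can be checked directly. A small sketch (the Rand index pair and the sample counts here are illustrative choices, not taken from the notebook) shows a raw score rewarding a purely random fine-grained labeling while its chance-adjusted counterpart stays near zero:

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score, rand_score

rng = np.random.RandomState(0)
n_samples = 1000
labels_true = rng.randint(0, 10, size=n_samples)     # 10 "ground truth" classes
labels_random = rng.randint(0, 100, size=n_samples)  # 100 purely random clusters

# The raw Rand index comes out high even though the labeling is pure noise;
# the adjusted Rand index stays near zero, as a chance-corrected score should.
ri = rand_score(labels_true, labels_random)
ari = adjusted_rand_score(labels_true, labels_random)
print(f"RI = {ri:.3f}, ARI = {ari:.3f}")
```

Comparing two clusterings with different cluster counts by raw RI would favor the finer one for no meaningful reason, which is the failure mode the cell describes.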
|
|