
Commit c756a67

Pushing the docs to dev/ for branch: main, commit ca0862a9dbc5dadab2ccd30828de6de0c6f1f69d
1 parent: 340c8dc

1,218 files changed: +4300 −4320 lines


dev/_downloads/6d4f620ec6653356eb970c2a6ed62081/plot_calibration_curve.ipynb

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Uncalibrated :class:`~sklearn.naive_bayes.GaussianNB` is poorly calibrated\nbecause of\nthe redundant features which violate the assumption of feature-independence\nand result in an overly confident classifier, which is indicated by the\ntypical transposed-sigmoid curve. Calibration of the probabilities of\n:class:`~sklearn.naive_bayes.GaussianNB` with `isotonic` can fix\nthis issue as can be seen from the nearly diagonal calibration curve.\n:ref:sigmoid regression `<sigmoid_regressor>` also improves calibration\nslightly,\nalbeit not as strongly as the non-parametric isotonic regression. This can be\nattributed to the fact that we have plenty of calibration data such that the\ngreater flexibility of the non-parametric model can be exploited.\n\nBelow we will make a quantitative analysis considering several classification\nmetrics: `brier_score_loss`, `log_loss`,\n`precision, recall, F1 score <precision_recall_f_measure_metrics>` and\n`ROC AUC <roc_metrics>`.\n\n"
+    "Uncalibrated :class:`~sklearn.naive_bayes.GaussianNB` is poorly calibrated\nbecause of\nthe redundant features which violate the assumption of feature-independence\nand result in an overly confident classifier, which is indicated by the\ntypical transposed-sigmoid curve. Calibration of the probabilities of\n:class:`~sklearn.naive_bayes.GaussianNB` with `isotonic` can fix\nthis issue as can be seen from the nearly diagonal calibration curve.\n`Sigmoid regression <sigmoid_regressor>` also improves calibration\nslightly,\nalbeit not as strongly as the non-parametric isotonic regression. This can be\nattributed to the fact that we have plenty of calibration data such that the\ngreater flexibility of the non-parametric model can be exploited.\n\nBelow we will make a quantitative analysis considering several classification\nmetrics: `brier_score_loss`, `log_loss`,\n`precision, recall, F1 score <precision_recall_f_measure_metrics>` and\n`ROC AUC <roc_metrics>`.\n\n"
    ]
   },
   {
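
The changed cell above points at calibrating GaussianNB with isotonic or sigmoid regression. For context, here is a minimal sketch of that calibration step (not the notebook's own code; the synthetic dataset below, with redundant features that break the feature-independence assumption, is an assumption of this sketch):

    # Sketch: calibrating GaussianNB with isotonic vs. sigmoid regression.
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    # Redundant features violate GaussianNB's feature-independence
    # assumption, which is what makes the raw classifier overconfident.
    X, y = make_classification(
        n_samples=10_000, n_features=20, n_informative=2, n_redundant=10, random_state=42
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=42
    )

    for method in ("isotonic", "sigmoid"):
        # CalibratedClassifierCV fits GaussianNB on cross-validation folds
        # and learns a mapping from raw scores to calibrated probabilities.
        clf = CalibratedClassifierCV(GaussianNB(), method=method)
        clf.fit(X_train, y_train)
        print(method, clf.predict_proba(X_test)[:3, 1])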

dev/_downloads/7c06490f380b1e20e9558c6c5fde70ed/plot_adjusted_for_chance_measures.ipynb

Lines changed: 4 additions & 4 deletions
@@ -15,7 +15,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "\n# Adjustment for chance in clustering performance evaluation\n\nThis notebook explores the impact of uniformly-distributed random labeling on\nthe behavior of some clustering evaluation metrics. For such purpose, the\nmetrics are computed with a fixed number of samples and as a function of the number\nof clusters assigned by the estimator. The example is divided into two\nexperiments:\n\n- a first experiment with fixed \"ground truth labels\" (and therefore fixed\n number of classes) and randomly \"predicted labels\";\n- a second experiment with varying \"ground truth labels\", randomly \"predicted\n labels\". The \"predicted labels\" have the same number of classes and clusters\n as the \"ground truth labels\".\n"
+    "\n# Adjustment for chance in clustering performance evaluation\nThis notebook explores the impact of uniformly-distributed random labeling on\nthe behavior of some clustering evaluation metrics. For such purpose, the\nmetrics are computed with a fixed number of samples and as a function of the number\nof clusters assigned by the estimator. The example is divided into two\nexperiments:\n\n- a first experiment with fixed \"ground truth labels\" (and therefore fixed\n number of classes) and randomly \"predicted labels\";\n- a second experiment with varying \"ground truth labels\", randomly \"predicted\n labels\". The \"predicted labels\" have the same number of classes and clusters\n as the \"ground truth labels\".\n"
    ]
   },
   {
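
The two code cells in this diff call fixed_classes_uniform_labelings_scores and uniform_labelings_scores, and iterate over score_funcs (a list of (name, callable) metric pairs); all three are defined in notebook cells outside the hunks shown here. A hedged sketch of what such helpers plausibly look like (the signatures match the calls below, but the bodies and the n_runs=5 default are assumptions):

    import numpy as np

    rng = np.random.RandomState(0)

    def random_labels(n_samples, n_classes):
        # Uniformly-distributed random labeling over n_classes.
        return rng.randint(low=0, high=n_classes, size=n_samples)

    def fixed_classes_uniform_labelings_scores(
        score_func, n_samples, n_clusters_range, n_classes, n_runs=5
    ):
        # Score random "predicted" labelings against one fixed "ground truth".
        scores = np.zeros((len(n_clusters_range), n_runs))
        labels_a = random_labels(n_samples, n_classes)
        for i, n_clusters in enumerate(n_clusters_range):
            for j in range(n_runs):
                labels_b = random_labels(n_samples, n_clusters)
                scores[i, j] = score_func(labels_a, labels_b)
        return scores

    def uniform_labelings_scores(score_func, n_samples, n_clusters_range, n_runs=5):
        # Score two independent random labelings with equal cluster counts.
        scores = np.zeros((len(n_clusters_range), n_runs))
        for i, n_clusters in enumerate(n_clusters_range):
            for j in range(n_runs):
                labels_a = random_labels(n_samples, n_clusters)
                labels_b = random_labels(n_samples, n_clusters)
                scores[i, j] = score_func(labels_a, labels_b)
        return scores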
@@ -98,7 +98,7 @@
    },
    "outputs": [],
    "source": [
-    "import matplotlib.pyplot as plt\nimport matplotlib.style as style\n\nn_samples = 1000\nn_classes = 10\nn_clusters_range = np.linspace(2, 100, 10).astype(int)\nplots = []\nnames = []\n\nstyle.use(\"seaborn-colorblind\")\nplt.figure(1)\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n\n scores = fixed_classes_uniform_labelings_scores(\n score_func, n_samples, n_clusters_range, n_classes=n_classes\n )\n plots.append(\n plt.errorbar(\n n_clusters_range,\n scores.mean(axis=1),\n scores.std(axis=1),\n alpha=0.8,\n linewidth=1,\n marker=marker,\n )[0]\n )\n names.append(score_name)\n\nplt.title(\n \"Clustering measures for random uniform labeling\\n\"\n f\"against reference assignment with {n_classes} classes\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.ylim(bottom=-0.05, top=1.05)\nplt.legend(plots, names)\nplt.show()"
+    "import matplotlib.pyplot as plt\nimport seaborn as sns\n\nn_samples = 1000\nn_classes = 10\nn_clusters_range = np.linspace(2, 100, 10).astype(int)\nplots = []\nnames = []\n\nsns.color_palette(\"colorblind\")\nplt.figure(1)\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n scores = fixed_classes_uniform_labelings_scores(\n score_func, n_samples, n_clusters_range, n_classes=n_classes\n )\n plots.append(\n plt.errorbar(\n n_clusters_range,\n scores.mean(axis=1),\n scores.std(axis=1),\n alpha=0.8,\n linewidth=1,\n marker=marker,\n )[0]\n )\n names.append(score_name)\n\nplt.title(\n \"Clustering measures for random uniform labeling\\n\"\n f\"against reference assignment with {n_classes} classes\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.ylim(bottom=-0.05, top=1.05)\nplt.legend(plots, names, bbox_to_anchor=(0.5, 0.5))\nplt.show()"
    ]
   },
   {
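
One caveat worth flagging on the styling swap above: unlike style.use("seaborn-colorblind"), a bare sns.color_palette("colorblind") call returns the palette without applying it to matplotlib's default color cycle. If the goal is to actually restyle the plots, either of these documented seaborn idioms would do it:

    import matplotlib.pyplot as plt
    import seaborn as sns

    # Apply the colorblind palette globally ...
    sns.set_palette("colorblind")

    # ... or only within a block, using the palette as a context manager.
    with sns.color_palette("colorblind"):
        plt.plot([0, 1], [0, 1])
    plt.show()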
@@ -134,14 +134,14 @@
    },
    "outputs": [],
    "source": [
-    "n_samples = 100\nn_clusters_range = np.linspace(2, n_samples, 10).astype(int)\n\nplt.figure(2)\n\nplots = []\nnames = []\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n\n scores = uniform_labelings_scores(score_func, n_samples, n_clusters_range)\n plots.append(\n plt.errorbar(\n n_clusters_range,\n np.median(scores, axis=1),\n scores.std(axis=1),\n alpha=0.8,\n linewidth=2,\n marker=marker,\n )[0]\n )\n names.append(score_name)\n\nplt.title(\n \"Clustering measures for 2 random uniform labelings\\nwith equal number of clusters\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.legend(plots, names)\nplt.ylim(bottom=-0.05, top=1.05)\nplt.show()"
+    "n_samples = 100\nn_clusters_range = np.linspace(2, n_samples, 10).astype(int)\n\nplt.figure(2)\n\nplots = []\nnames = []\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n scores = uniform_labelings_scores(score_func, n_samples, n_clusters_range)\n plots.append(\n plt.errorbar(\n n_clusters_range,\n np.median(scores, axis=1),\n scores.std(axis=1),\n alpha=0.8,\n linewidth=2,\n marker=marker,\n )[0]\n )\n names.append(score_name)\n\nplt.title(\n \"Clustering measures for 2 random uniform labelings\\nwith equal number of clusters\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.legend(plots, names)\nplt.ylim(bottom=-0.05, top=1.05)\nplt.show()"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We observe similar results as for the first experiment: adjusted for chance\nmetrics stay constantly near zero while other metrics tend to get larger with\nfiner-grained labelings. The mean V-measure of random labeling increases\nsignificantly as the number of clusters is closer to the total number of\nsamples used to compute the measure. Furthermore, raw Mutual Information is\nunbounded from above and its scale depends on the dimensions of the clustering\nproblem and the cardinality of the ground truth classes.\n\nOnly adjusted measures can hence be safely used as a consensus index to\nevaluate the average stability of clustering algorithms for a given value of k\non various overlapping sub-samples of the dataset.\n\nNon-adjusted clustering evaluation metric can therefore be misleading as they\noutput large values for fine-grained labelings, one could be lead to think\nthat the labeling has captured meaningful groups while they can be totally\nrandom. In particular, such non-adjusted metrics should not be used to compare\nthe results of different clustering algorithms that output a different number\nof clusters.\n\n"
+    "We observe similar results as for the first experiment: adjusted for chance\nmetrics stay constantly near zero while other metrics tend to get larger with\nfiner-grained labelings. The mean V-measure of random labeling increases\nsignificantly as the number of clusters is closer to the total number of\nsamples used to compute the measure. Furthermore, raw Mutual Information is\nunbounded from above and its scale depends on the dimensions of the clustering\nproblem and the cardinality of the ground truth classes. This is why the\ncurve goes off the chart.\n\nOnly adjusted measures can hence be safely used as a consensus index to\nevaluate the average stability of clustering algorithms for a given value of k\non various overlapping sub-samples of the dataset.\n\nNon-adjusted clustering evaluation metrics can therefore be misleading, as they\noutput large values for fine-grained labelings; one could be led to think\nthat the labeling has captured meaningful groups while it may be totally\nrandom. In particular, such non-adjusted metrics should not be used to compare\nthe results of different clustering algorithms that output a different number\nof clusters.\n\n"
    ]
   }
  ],
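
To make the closing observation concrete, here is a small self-contained sketch (synthetic labels, not the notebook's data) of raw versus adjusted scores under random labelings: the raw Rand index and V-measure inflate as the number of clusters grows, while the adjusted Rand index stays near zero.

    import numpy as np
    from sklearn.metrics import adjusted_rand_score, rand_score, v_measure_score

    rng = np.random.RandomState(0)
    n_samples = 100
    labels_true = rng.randint(10, size=n_samples)  # fixed "ground truth"

    for n_clusters in (10, 50, 100):
        labels_pred = rng.randint(n_clusters, size=n_samples)  # random labeling
        print(
            f"k={n_clusters:3d}  "
            f"RI={rand_score(labels_true, labels_pred):.2f}  "
            f"ARI={adjusted_rand_score(labels_true, labels_pred):+.2f}  "
            f"V={v_measure_score(labels_true, labels_pred):.2f}"
        )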
