Commit c756a67

Pushing the docs to dev/ for branch: main, commit ca0862a9dbc5dadab2ccd30828de6de0c6f1f69d
1 parent 340c8dc commit c756a67

1,218 files changed (+4,300 / -4,320 lines)

dev/_downloads/6d4f620ec6653356eb970c2a6ed62081/plot_calibration_curve.ipynb

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Uncalibrated :class:`~sklearn.naive_bayes.GaussianNB` is poorly calibrated\nbecause of\nthe redundant features which violate the assumption of feature-independence\nand result in an overly confident classifier, which is indicated by the\ntypical transposed-sigmoid curve. Calibration of the probabilities of\n:class:`~sklearn.naive_bayes.GaussianNB` with `isotonic` can fix\nthis issue as can be seen from the nearly diagonal calibration curve.\n:ref:sigmoid regression `<sigmoid_regressor>` also improves calibration\nslightly,\nalbeit not as strongly as the non-parametric isotonic regression. This can be\nattributed to the fact that we have plenty of calibration data such that the\ngreater flexibility of the non-parametric model can be exploited.\n\nBelow we will make a quantitative analysis considering several classification\nmetrics: `brier_score_loss`, `log_loss`,\n`precision, recall, F1 score <precision_recall_f_measure_metrics>` and\n`ROC AUC <roc_metrics>`.\n\n"
+"Uncalibrated :class:`~sklearn.naive_bayes.GaussianNB` is poorly calibrated\nbecause of\nthe redundant features which violate the assumption of feature-independence\nand result in an overly confident classifier, which is indicated by the\ntypical transposed-sigmoid curve. Calibration of the probabilities of\n:class:`~sklearn.naive_bayes.GaussianNB` with `isotonic` can fix\nthis issue as can be seen from the nearly diagonal calibration curve.\n`Sigmoid regression <sigmoid_regressor>` also improves calibration\nslightly,\nalbeit not as strongly as the non-parametric isotonic regression. This can be\nattributed to the fact that we have plenty of calibration data such that the\ngreater flexibility of the non-parametric model can be exploited.\n\nBelow we will make a quantitative analysis considering several classification\nmetrics: `brier_score_loss`, `log_loss`,\n`precision, recall, F1 score <precision_recall_f_measure_metrics>` and\n`ROC AUC <roc_metrics>`.\n\n"
 ]
 },
 {
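
As background for the calibration behavior described in the cell above, here is a minimal sketch (not part of this commit) of wrapping an over-confident GaussianNB in CalibratedClassifierCV with both calibration methods. The dataset, split, and parameter values are illustrative assumptions, not taken from the example file.

# Hypothetical sketch: calibrate GaussianNB with isotonic and sigmoid regression.
from sklearn.calibration import CalibratedClassifierCV, CalibrationDisplay
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Redundant features violate GaussianNB's feature-independence assumption,
# which is what produces the over-confident, transposed-sigmoid curve.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=2, n_redundant=10, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

naive = GaussianNB().fit(X_train, y_train)
isotonic = CalibratedClassifierCV(GaussianNB(), method="isotonic").fit(X_train, y_train)
sigmoid = CalibratedClassifierCV(GaussianNB(), method="sigmoid").fit(X_train, y_train)

# The isotonic model's reliability curve should come out nearly diagonal.
for name, clf in [("uncalibrated", naive), ("isotonic", isotonic), ("sigmoid", sigmoid)]:
    CalibrationDisplay.from_estimator(clf, X_test, y_test, name=name)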

dev/_downloads/7c06490f380b1e20e9558c6c5fde70ed/plot_adjusted_for_chance_measures.ipynb

Lines changed: 4 additions & 4 deletions
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Adjustment for chance in clustering performance evaluation\n\nThis notebook explores the impact of uniformly-distributed random labeling on\nthe behavior of some clustering evaluation metrics. For such purpose, the\nmetrics are computed with a fixed number of samples and as a function of the number\nof clusters assigned by the estimator. The example is divided into two\nexperiments:\n\n- a first experiment with fixed \"ground truth labels\" (and therefore fixed\n number of classes) and randomly \"predicted labels\";\n- a second experiment with varying \"ground truth labels\", randomly \"predicted\n labels\". The \"predicted labels\" have the same number of classes and clusters\n as the \"ground truth labels\".\n"
+"\n# Adjustment for chance in clustering performance evaluation\nThis notebook explores the impact of uniformly-distributed random labeling on\nthe behavior of some clustering evaluation metrics. For such purpose, the\nmetrics are computed with a fixed number of samples and as a function of the number\nof clusters assigned by the estimator. The example is divided into two\nexperiments:\n\n- a first experiment with fixed \"ground truth labels\" (and therefore fixed\n number of classes) and randomly \"predicted labels\";\n- a second experiment with varying \"ground truth labels\", randomly \"predicted\n labels\". The \"predicted labels\" have the same number of classes and clusters\n as the \"ground truth labels\".\n"
 ]
 },
 {
@@ -98,7 +98,7 @@
 },
 "outputs": [],
 "source": [
-"import matplotlib.pyplot as plt\nimport matplotlib.style as style\n\nn_samples = 1000\nn_classes = 10\nn_clusters_range = np.linspace(2, 100, 10).astype(int)\nplots = []\nnames = []\n\nstyle.use(\"seaborn-colorblind\")\nplt.figure(1)\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n\n    scores = fixed_classes_uniform_labelings_scores(\n        score_func, n_samples, n_clusters_range, n_classes=n_classes\n    )\n    plots.append(\n        plt.errorbar(\n            n_clusters_range,\n            scores.mean(axis=1),\n            scores.std(axis=1),\n            alpha=0.8,\n            linewidth=1,\n            marker=marker,\n        )[0]\n    )\n    names.append(score_name)\n\nplt.title(\n    \"Clustering measures for random uniform labeling\\n\"\n    f\"against reference assignment with {n_classes} classes\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.ylim(bottom=-0.05, top=1.05)\nplt.legend(plots, names)\nplt.show()"
+"import matplotlib.pyplot as plt\nimport seaborn as sns\n\nn_samples = 1000\nn_classes = 10\nn_clusters_range = np.linspace(2, 100, 10).astype(int)\nplots = []\nnames = []\n\nsns.color_palette(\"colorblind\")\nplt.figure(1)\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n    scores = fixed_classes_uniform_labelings_scores(\n        score_func, n_samples, n_clusters_range, n_classes=n_classes\n    )\n    plots.append(\n        plt.errorbar(\n            n_clusters_range,\n            scores.mean(axis=1),\n            scores.std(axis=1),\n            alpha=0.8,\n            linewidth=1,\n            marker=marker,\n        )[0]\n    )\n    names.append(score_name)\n\nplt.title(\n    \"Clustering measures for random uniform labeling\\n\"\n    f\"against reference assignment with {n_classes} classes\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.ylim(bottom=-0.05, top=1.05)\nplt.legend(plots, names, bbox_to_anchor=(0.5, 0.5))\nplt.show()"
 ]
 },
 {
@@ -134,14 +134,14 @@
 },
 "outputs": [],
 "source": [
-"n_samples = 100\nn_clusters_range = np.linspace(2, n_samples, 10).astype(int)\n\nplt.figure(2)\n\nplots = []\nnames = []\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n\n    scores = uniform_labelings_scores(score_func, n_samples, n_clusters_range)\n    plots.append(\n        plt.errorbar(\n            n_clusters_range,\n            np.median(scores, axis=1),\n            scores.std(axis=1),\n            alpha=0.8,\n            linewidth=2,\n            marker=marker,\n        )[0]\n    )\n    names.append(score_name)\n\nplt.title(\n    \"Clustering measures for 2 random uniform labelings\\nwith equal number of clusters\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.legend(plots, names)\nplt.ylim(bottom=-0.05, top=1.05)\nplt.show()"
+"n_samples = 100\nn_clusters_range = np.linspace(2, n_samples, 10).astype(int)\n\nplt.figure(2)\n\nplots = []\nnames = []\n\nfor marker, (score_name, score_func) in zip(\"d^vx.,\", score_funcs):\n    scores = uniform_labelings_scores(score_func, n_samples, n_clusters_range)\n    plots.append(\n        plt.errorbar(\n            n_clusters_range,\n            np.median(scores, axis=1),\n            scores.std(axis=1),\n            alpha=0.8,\n            linewidth=2,\n            marker=marker,\n        )[0]\n    )\n    names.append(score_name)\n\nplt.title(\n    \"Clustering measures for 2 random uniform labelings\\nwith equal number of clusters\"\n)\nplt.xlabel(f\"Number of clusters (Number of samples is fixed to {n_samples})\")\nplt.ylabel(\"Score value\")\nplt.legend(plots, names)\nplt.ylim(bottom=-0.05, top=1.05)\nplt.show()"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We observe similar results as for the first experiment: adjusted for chance\nmetrics stay constantly near zero while other metrics tend to get larger with\nfiner-grained labelings. The mean V-measure of random labeling increases\nsignificantly as the number of clusters is closer to the total number of\nsamples used to compute the measure. Furthermore, raw Mutual Information is\nunbounded from above and its scale depends on the dimensions of the clustering\nproblem and the cardinality of the ground truth classes.\n\nOnly adjusted measures can hence be safely used as a consensus index to\nevaluate the average stability of clustering algorithms for a given value of k\non various overlapping sub-samples of the dataset.\n\nNon-adjusted clustering evaluation metric can therefore be misleading as they\noutput large values for fine-grained labelings, one could be lead to think\nthat the labeling has captured meaningful groups while they can be totally\nrandom. In particular, such non-adjusted metrics should not be used to compare\nthe results of different clustering algorithms that output a different number\nof clusters.\n\n"
+"We observe similar results as for the first experiment: adjusted for chance\nmetrics stay constantly near zero while other metrics tend to get larger with\nfiner-grained labelings. The mean V-measure of random labeling increases\nsignificantly as the number of clusters is closer to the total number of\nsamples used to compute the measure. Furthermore, raw Mutual Information is\nunbounded from above and its scale depends on the dimensions of the clustering\nproblem and the cardinality of the ground truth classes. This is why the\ncurve goes off the chart.\n\nOnly adjusted measures can hence be safely used as a consensus index to\nevaluate the average stability of clustering algorithms for a given value of k\non various overlapping sub-samples of the dataset.\n\nNon-adjusted clustering evaluation metric can therefore be misleading as they\noutput large values for fine-grained labelings, one could be lead to think\nthat the labeling has captured meaningful groups while they can be totally\nrandom. In particular, such non-adjusted metrics should not be used to compare\nthe results of different clustering algorithms that output a different number\nof clusters.\n\n"
 ]
 }
 ],
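
To make the final cell's argument concrete, here is a small sketch (not part of this commit) comparing an unadjusted score with its chance-adjusted counterpart under uniformly random labelings; the sample counts below are arbitrary assumptions.

# Hypothetical sketch: unadjusted scores drift upward as the number of random
# clusters grows, while chance-adjusted scores stay near zero.
import numpy as np
from sklearn.metrics import adjusted_rand_score, rand_score

rng = np.random.default_rng(0)
n_samples, n_classes = 1000, 10
labels_true = rng.integers(0, n_classes, size=n_samples)  # fixed "ground truth"

for n_clusters in (10, 100, 500):
    labels_pred = rng.integers(0, n_clusters, size=n_samples)  # random labeling
    print(
        f"k={n_clusters:>3}  "
        f"rand={rand_score(labels_true, labels_pred):.3f}  "
        f"adjusted_rand={adjusted_rand_score(labels_true, labels_pred):+.3f}"
    )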

dev/_downloads/85db957603c93bd3e0a4265ea6565b13/plot_calibration_curve.py

Lines changed: 1 addition & 1 deletion
@@ -124,7 +124,7 @@
 # typical transposed-sigmoid curve. Calibration of the probabilities of
 # :class:`~sklearn.naive_bayes.GaussianNB` with :ref:`isotonic` can fix
 # this issue as can be seen from the nearly diagonal calibration curve.
-# :ref:sigmoid regression `<sigmoid_regressor>` also improves calibration
+# :ref:`Sigmoid regression <sigmoid_regressor>` also improves calibration
 # slightly,
 # albeit not as strongly as the non-parametric isotonic regression. This can be
 # attributed to the fact that we have plenty of calibration data such that the
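
The fix above is pure reST syntax: in a Sphinx :ref: role, the link text and the <target> must sit together inside a single pair of backticks. Since this example file carries its narrative in Python comments, the two forms side by side:

# Broken: the role text falls outside the backticks, so Sphinx cannot
# resolve the cross-reference.
# :ref:sigmoid regression `<sigmoid_regressor>`
#
# Fixed: link text and target inside one pair of backticks.
# :ref:`Sigmoid regression <sigmoid_regressor>`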

dev/_downloads/f641388d3d28570ff40709693f9cb7ca/plot_adjusted_for_chance_measures.py

Lines changed: 5 additions & 8 deletions
@@ -2,7 +2,6 @@
 ==========================================================
 Adjustment for chance in clustering performance evaluation
 ==========================================================
-
 This notebook explores the impact of uniformly-distributed random labeling on
 the behavior of some clustering evaluation metrics. For such purpose, the
 metrics are computed with a fixed number of samples and as a function of the number
@@ -14,7 +13,6 @@
 - a second experiment with varying "ground truth labels", randomly "predicted
   labels". The "predicted labels" have the same number of classes and clusters
   as the "ground truth labels".
-
 """
 
 # Author: Olivier Grisel <[email protected]>
@@ -109,19 +107,18 @@ def fixed_classes_uniform_labelings_scores(
 # `n_clusters_range`.
 
 import matplotlib.pyplot as plt
-import matplotlib.style as style
+import seaborn as sns
 
 n_samples = 1000
 n_classes = 10
 n_clusters_range = np.linspace(2, 100, 10).astype(int)
 plots = []
 names = []
 
-style.use("seaborn-colorblind")
+sns.color_palette("colorblind")
 plt.figure(1)
 
 for marker, (score_name, score_func) in zip("d^vx.,", score_funcs):
-
     scores = fixed_classes_uniform_labelings_scores(
         score_func, n_samples, n_clusters_range, n_classes=n_classes
     )
@@ -144,7 +141,7 @@ def fixed_classes_uniform_labelings_scores(
 plt.xlabel(f"Number of clusters (Number of samples is fixed to {n_samples})")
 plt.ylabel("Score value")
 plt.ylim(bottom=-0.05, top=1.05)
-plt.legend(plots, names)
+plt.legend(plots, names, bbox_to_anchor=(0.5, 0.5))
 plt.show()
 
 # %%
@@ -189,7 +186,6 @@ def uniform_labelings_scores(score_func, n_samples, n_clusters_range, n_runs=5):
 names = []
 
 for marker, (score_name, score_func) in zip("d^vx.,", score_funcs):
-
     scores = uniform_labelings_scores(score_func, n_samples, n_clusters_range)
     plots.append(
         plt.errorbar(
@@ -219,7 +215,8 @@ def uniform_labelings_scores(score_func, n_samples, n_clusters_range, n_runs=5):
 # significantly as the number of clusters is closer to the total number of
 # samples used to compute the measure. Furthermore, raw Mutual Information is
 # unbounded from above and its scale depends on the dimensions of the clustering
-# problem and the cardinality of the ground truth classes.
+# problem and the cardinality of the ground truth classes. This is why the
+# curve goes off the chart.
 #
 # Only adjusted measures can hence be safely used as a consensus index to
 # evaluate the average stability of clustering algorithms for a given value of k
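
One caveat on the styling change above, offered as an observation rather than a confirmed rationale: recent matplotlib releases deprecated the bare "seaborn-colorblind" style name (renamed to "seaborn-v0_8-colorblind" around matplotlib 3.6), which presumably motivated the switch. Note, however, that sns.color_palette("colorblind") by itself only returns a list of colors; applying the palette to subsequent plots is normally done with sns.set_palette. A minimal sketch of the distinction, assuming seaborn is installed:

# Hypothetical sketch: getting a palette vs. actually applying it.
import matplotlib.pyplot as plt
import seaborn as sns

palette = sns.color_palette("colorblind")  # returns colors; global state unchanged
sns.set_palette("colorblind")              # applies the palette to later plots

# Matplotlib-only alternative on versions where the old alias is deprecated:
plt.style.use("seaborn-v0_8-colorblind")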
