|
51 | 51 | "cell_type": "markdown",
|
52 | 52 | "metadata": {},
|
53 | 53 | "source": [
|
54 |
| - "Thus the image is a 2D array of 768 pixels in height and 1024 pixels in width. Each\nvalue is a 8-bit unsigned integer, which means that the image is encoded using 8\nbits per pixel. The total memory usage of the image is 786 kilobytes (1 bytes equals\n8 bits).\n\nUsing 8-bit unsigned integer means that the image is encoded using 256 different\nshades of gray, at most. We can check the distribution of these values.\n\n" |
| 54 | + "Thus the image is a 2D array of 768 pixels in height and 1024 pixels in width. Each\nvalue is a 8-bit unsigned integer, which means that the image is encoded using 8\nbits per pixel. The total memory usage of the image is 786 kilobytes (1 byte equals\n8 bits).\n\nUsing 8-bit unsigned integer means that the image is encoded using 256 different\nshades of gray, at most. We can check the distribution of these values.\n\n" |
55 | 55 | ]
|
56 | 56 | },
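As a quick illustration of the figures in the cell above, here is a minimal sketch (not part of the diff) of the shape, dtype, and memory bookkeeping. It assumes the grayscale raccoon image is loaded with `scipy.datasets.face(gray=True)`; the actual loading cell is not shown in this hunk.

```python
# Sketch of the memory bookkeeping described above. Assumes the grayscale
# raccoon image is loaded with scipy.datasets.face (loading cell not shown here).
import matplotlib.pyplot as plt
from scipy.datasets import face

raccoon_face = face(gray=True)

print(f"shape: {raccoon_face.shape}")  # (768, 1024): 768 rows, 1024 columns
print(f"dtype: {raccoon_face.dtype}")  # uint8: 8 bits (1 byte) per pixel
print(f"memory: {raccoon_face.nbytes / 1e3:.0f} kB")  # 768 * 1024 bytes ~= 786 kB

# At most 256 shades of gray: inspect their distribution.
fig, ax = plt.subplots()
ax.hist(raccoon_face.ravel(), bins=256)
ax.set_xlabel("Pixel value")
ax.set_ylabel("Count of pixels")
plt.show()
```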
|
57 | 57 | {
|
|
69 | 69 | "cell_type": "markdown",
|
70 | 70 | "metadata": {},
|
71 | 71 | "source": [
|
72 |
| - "## Compression via vector quantization\n\nThe idea behind compression via vector quantization is to reduce the number of\ngray levels to represent an image. For instance, we can use 8 values instead\nof 256 values. Therefore, it means that we could efficiently use 1 bit instead\nof 8 bits to encode a single pixel and therefore reduce the memory usage by a\nfactor of 8. We will later discuss about this memory usage.\n\n### Encoding strategy\n\nThe compression can be done using a\n:class:`~sklearn.preprocessing.KBinsDiscretizer`. We need to choose a strategy\nto define the 8 gray values to sub-sample. The simplest strategy is to define\nthem equally spaced, which correspond to setting `strategy=\"uniform\"`. From\nthe previous histogram, we know that this strategy is certainly not optimal.\n\n" |
| 72 | + "## Compression via vector quantization\n\nThe idea behind compression via vector quantization is to reduce the number of\ngray levels to represent an image. For instance, we can use 8 values instead\nof 256 values. Therefore, it means that we could efficiently use 3 bits instead\nof 8 bits to encode a single pixel and therefore reduce the memory usage by a\nfactor of approximately 2.5. We will later discuss about this memory usage.\n\n### Encoding strategy\n\nThe compression can be done using a\n:class:`~sklearn.preprocessing.KBinsDiscretizer`. We need to choose a strategy\nto define the 8 gray values to sub-sample. The simplest strategy is to define\nthem equally spaced, which correspond to setting `strategy=\"uniform\"`. From\nthe previous histogram, we know that this strategy is certainly not optimal.\n\n" |
73 | 73 | ]
|
74 | 74 | },
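A back-of-the-envelope check of the compression factor mentioned above (a sketch, not part of the notebook): with 8 gray levels, a pixel needs ceil(log2(8)) = 3 bits, so the theoretical saving over the original 8 bits is 8 / 3.

```python
# Theoretical bit budget for n_bins gray levels versus the original 8 bits.
import math

n_bins = 8
bits_per_pixel = math.ceil(math.log2(n_bins))  # 3 bits are enough for 8 levels
print(f"bits per pixel: {bits_per_pixel}")
print(f"compression factor: {8 / bits_per_pixel:.2f}")  # ~2.67
```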
|
75 | 75 | {
|
|
80 | 80 | },
|
81 | 81 | "outputs": [],
|
82 | 82 | "source": [
|
83 |
| - "from sklearn.preprocessing import KBinsDiscretizer\n\nn_bins = 8\nencoder = KBinsDiscretizer(\n n_bins=n_bins, encode=\"ordinal\", strategy=\"uniform\", random_state=0\n)\ncompressed_raccoon_uniform = encoder.fit_transform(raccoon_face.reshape(-1, 1)).reshape(\n raccoon_face.shape\n)\n\nfig, ax = plt.subplots(ncols=2, figsize=(12, 4))\nax[0].imshow(compressed_raccoon_uniform, cmap=plt.cm.gray)\nax[0].axis(\"off\")\nax[0].set_title(\"Rendering of the image\")\nax[1].hist(compressed_raccoon_uniform.ravel(), bins=256)\nax[1].set_xlabel(\"Pixel value\")\nax[1].set_ylabel(\"Count of pixels\")\nax[1].set_title(\"Sub-sampled distribution of the pixel values\")\n_ = fig.suptitle(\"Raccoon face compressed using 1-bit and a uniform strategy\")" |
| 83 | + "from sklearn.preprocessing import KBinsDiscretizer\n\nn_bins = 8\nencoder = KBinsDiscretizer(\n n_bins=n_bins, encode=\"ordinal\", strategy=\"uniform\", random_state=0\n)\ncompressed_raccoon_uniform = encoder.fit_transform(raccoon_face.reshape(-1, 1)).reshape(\n raccoon_face.shape\n)\n\nfig, ax = plt.subplots(ncols=2, figsize=(12, 4))\nax[0].imshow(compressed_raccoon_uniform, cmap=plt.cm.gray)\nax[0].axis(\"off\")\nax[0].set_title(\"Rendering of the image\")\nax[1].hist(compressed_raccoon_uniform.ravel(), bins=256)\nax[1].set_xlabel(\"Pixel value\")\nax[1].set_ylabel(\"Count of pixels\")\nax[1].set_title(\"Sub-sampled distribution of the pixel values\")\n_ = fig.suptitle(\"Raccoon face compressed using 3 bits and a uniform strategy\")" |
84 | 84 | ]
|
85 | 85 | },
|
86 | 86 | {
|
|
127 | 127 | },
|
128 | 128 | "outputs": [],
|
129 | 129 | "source": [
|
130 |
| - "encoder = KBinsDiscretizer(\n n_bins=n_bins, encode=\"ordinal\", strategy=\"kmeans\", random_state=0\n)\ncompressed_raccoon_kmeans = encoder.fit_transform(raccoon_face.reshape(-1, 1)).reshape(\n raccoon_face.shape\n)\n\nfig, ax = plt.subplots(ncols=2, figsize=(12, 4))\nax[0].imshow(compressed_raccoon_kmeans, cmap=plt.cm.gray)\nax[0].axis(\"off\")\nax[0].set_title(\"Rendering of the image\")\nax[1].hist(compressed_raccoon_kmeans.ravel(), bins=256)\nax[1].set_xlabel(\"Pixel value\")\nax[1].set_ylabel(\"Number of pixels\")\nax[1].set_title(\"Distribution of the pixel values\")\n_ = fig.suptitle(\"Raccoon face compressed using 1-bit and a K-means strategy\")" |
| 130 | + "encoder = KBinsDiscretizer(\n n_bins=n_bins, encode=\"ordinal\", strategy=\"kmeans\", random_state=0\n)\ncompressed_raccoon_kmeans = encoder.fit_transform(raccoon_face.reshape(-1, 1)).reshape(\n raccoon_face.shape\n)\n\nfig, ax = plt.subplots(ncols=2, figsize=(12, 4))\nax[0].imshow(compressed_raccoon_kmeans, cmap=plt.cm.gray)\nax[0].axis(\"off\")\nax[0].set_title(\"Rendering of the image\")\nax[1].hist(compressed_raccoon_kmeans.ravel(), bins=256)\nax[1].set_xlabel(\"Pixel value\")\nax[1].set_ylabel(\"Number of pixels\")\nax[1].set_title(\"Distribution of the pixel values\")\n_ = fig.suptitle(\"Raccoon face compressed using 3 bits and a K-means strategy\")" |
131 | 131 | ]
|
132 | 132 | },
|
133 | 133 | {
|
|
192 | 192 | "cell_type": "markdown",
|
193 | 193 | "metadata": {},
|
194 | 194 | "source": [
|
195 |
| - "Indeed, the output of the :class:`~sklearn.preprocessing.KBinsDiscretizer` is\nan array of 64-bit float. It means that it takes x8 more memory. However, we\nuse this 64-bit float representation to encode 8 values. Indeed, we will save\nmemory only if we cast the compressed image into an array of 1-bit integer. We\ncould use the method `numpy.ndarray.astype`. However, a 1-bit integer\nrepresentation does not exist and to encode the 8 values, we would need to use\nthe 8-bit unsigned integer representation as well.\n\nIn practice, observing a memory gain would require the original image to be in\na 64-bit float representation.\n\n" |
| 195 | + "Indeed, the output of the :class:`~sklearn.preprocessing.KBinsDiscretizer` is\nan array of 64-bit float. It means that it takes x8 more memory. However, we\nuse this 64-bit float representation to encode 8 values. Indeed, we will save\nmemory only if we cast the compressed image into an array of 3-bits integers. We\ncould use the method `numpy.ndarray.astype`. However, a 3-bits integer\nrepresentation does not exist and to encode the 8 values, we would need to use\nthe 8-bit unsigned integer representation as well.\n\nIn practice, observing a memory gain would require the original image to be in\na 64-bit float representation.\n\n" |
196 | 196 | ]
|
197 | 197 | }
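To make the memory discussion above concrete, a hypothetical follow-up cell could compare the footprints directly. It assumes `compressed_raccoon_kmeans` and the original `raccoon_face` from the cells above are available.

```python
# Compare memory footprints: the discretizer output is float64 (8 bytes per
# pixel); without a 3-bit dtype, the best we can do is cast back to uint8.
import numpy as np

print(compressed_raccoon_kmeans.dtype)  # float64
print(f"float64: {compressed_raccoon_kmeans.nbytes / 1e3:.0f} kB")  # ~6291 kB, 8x the original

compressed_as_uint8 = compressed_raccoon_kmeans.astype(np.uint8)
print(f"uint8:   {compressed_as_uint8.nbytes / 1e3:.0f} kB")  # ~786 kB, same as the original image
```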
|
198 | 198 | ],
|
|