Commit a869c19

Pushing the docs to dev/ for branch: master, commit 3424f72469c0ece8cb71011c61860c1945904a2a
1 parent 2582daf commit a869c19

1,245 files changed, with 4,322 additions and 4,322 deletions

dev/_downloads/303b136a5deb71b87475c966bb50d80d/plot_gmm_sin.ipynb

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"\n# Gaussian Mixture Model Sine Curve\n\n\nThis example demonstrates the behavior of Gaussian mixture models fit on data\nthat was not sampled from a mixture of Gaussian random variables. The dataset\nis formed by 100 points loosely spaced following a noisy sine curve. There is\ntherefore no ground truth value for the number of Gaussian components.\n\nThe first model is a classical Gaussian Mixture Model with 10 components fit\nwith the Expectation-Maximization algorithm.\n\nThe second model is a Bayesian Gaussian Mixture Model with a Dirichlet process\nprior fit with variational inference. The low value of the concentration prior\nmakes the model favor a lower number of active components. This models\n\"decides\" to focus its modeling power on the big picture of the structure of\nthe dataset: groups of points with alternating directions modeled by\nnon-diagonal covariance matrices. Those alternating directions roughly capture\nthe alternating nature of the original sine signal.\n\nThe third model is also a Bayesian Gaussian mixture model with a Dirichlet\nprocess prior but this time the value of the concentration prior is higher\ngiving the model more liberty to model the fine-grained structure of the data.\nThe result is a mixture with a larger number of active components that is\nsimilar to the first model where we arbitrarily decided to fix the number of\ncomponents to 10.\n\nWhich model is the best is a matter of subjective judgement: do we want to\nfavor models that only capture the big picture to summarize and explain most of\nthe structure of the data while ignoring the details or do we prefer models\nthat closely follow the high density regions of the signal?\n\nThe last two panels show how we can sample from the last two models. The\nresulting samples distributions do not look exactly like the original data\ndistribution. The difference primarily stems from the approximation error we\nmade by using a model that assumes that the data was generated by a finite\nnumber of Gaussian components instead of a continuous noisy sine curve.\n"
+"\n# Gaussian Mixture Model Sine Curve\n\n\nThis example demonstrates the behavior of Gaussian mixture models fit on data\nthat was not sampled from a mixture of Gaussian random variables. The dataset\nis formed by 100 points loosely spaced following a noisy sine curve. There is\ntherefore no ground truth value for the number of Gaussian components.\n\nThe first model is a classical Gaussian Mixture Model with 10 components fit\nwith the Expectation-Maximization algorithm.\n\nThe second model is a Bayesian Gaussian Mixture Model with a Dirichlet process\nprior fit with variational inference. The low value of the concentration prior\nmakes the model favor a lower number of active components. This models\n\"decides\" to focus its modeling power on the big picture of the structure of\nthe dataset: groups of points with alternating directions modeled by\nnon-diagonal covariance matrices. Those alternating directions roughly capture\nthe alternating nature of the original sine signal.\n\nThe third model is also a Bayesian Gaussian mixture model with a Dirichlet\nprocess prior but this time the value of the concentration prior is higher\ngiving the model more liberty to model the fine-grained structure of the data.\nThe result is a mixture with a larger number of active components that is\nsimilar to the first model where we arbitrarily decided to fix the number of\ncomponents to 10.\n\nWhich model is the best is a matter of subjective judgment: do we want to\nfavor models that only capture the big picture to summarize and explain most of\nthe structure of the data while ignoring the details or do we prefer models\nthat closely follow the high density regions of the signal?\n\nThe last two panels show how we can sample from the last two models. The\nresulting samples distributions do not look exactly like the original data\ndistribution. The difference primarily stems from the approximation error we\nmade by using a model that assumes that the data was generated by a finite\nnumber of Gaussian components instead of a continuous noisy sine curve.\n"
 ]
 },
 {
Binary file not shown.

dev/_downloads/71d339c5f1e3408e8d01066ccfa20f3a/plot_gmm_sin.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
 similar to the first model where we arbitrarily decided to fix the number of
 components to 10.
 
-Which model is the best is a matter of subjective judgement: do we want to
+Which model is the best is a matter of subjective judgment: do we want to
 favor models that only capture the big picture to summarize and explain most of
 the structure of the data while ignoring the details or do we prefer models
 that closely follow the high density regions of the signal?
Binary file not shown.

dev/_downloads/scikit-learn-docs.pdf

-12.3 KB
Binary file not shown.

dev/_images/iris.png

0 Bytes
