
Commit 01897ef (parent 19f63e7)

Pushing the docs to dev/ for branch: master, commit 63a2f0a02bf155b161704da57a961c348f29fd7b

1,070 files changed: +4887 / -3283 lines

Two binary files (34 Bytes each): contents not shown.

dev/_downloads/plot_sparse_cov.ipynb

Lines changed: 2 additions & 2 deletions
@@ -15,7 +15,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-
"\n# Sparse inverse covariance estimation\n\n\nUsing the GraphLasso estimator to learn a covariance and sparse precision\nfrom a small number of samples.\n\nTo estimate a probabilistic model (e.g. a Gaussian model), estimating the\nprecision matrix, that is the inverse covariance matrix, is as important\nas estimating the covariance matrix. Indeed a Gaussian model is\nparametrized by the precision matrix.\n\nTo be in favorable recovery conditions, we sample the data from a model\nwith a sparse inverse covariance matrix. In addition, we ensure that the\ndata is not too much correlated (limiting the largest coefficient of the\nprecision matrix) and that there a no small coefficients in the\nprecision matrix that cannot be recovered. In addition, with a small\nnumber of observations, it is easier to recover a correlation matrix\nrather than a covariance, thus we scale the time series.\n\nHere, the number of samples is slightly larger than the number of\ndimensions, thus the empirical covariance is still invertible. However,\nas the observations are strongly correlated, the empirical covariance\nmatrix is ill-conditioned and as a result its inverse --the empirical\nprecision matrix-- is very far from the ground truth.\n\nIf we use l2 shrinkage, as with the Ledoit-Wolf estimator, as the number\nof samples is small, we need to shrink a lot. As a result, the\nLedoit-Wolf precision is fairly close to the ground truth precision, that\nis not far from being diagonal, but the off-diagonal structure is lost.\n\nThe l1-penalized estimator can recover part of this off-diagonal\nstructure. It learns a sparse precision. It is not able to\nrecover the exact sparsity pattern: it detects too many non-zero\ncoefficients. However, the highest non-zero coefficients of the l1\nestimated correspond to the non-zero coefficients in the ground truth.\nFinally, the coefficients of the l1 precision estimate are biased toward\nzero: because of the penalty, they are all smaller than the corresponding\nground truth value, as can be seen on the figure.\n\nNote that, the color range of the precision matrices is tweaked to\nimprove readability of the figure. The full range of values of the\nempirical precision is not displayed.\n\nThe alpha parameter of the GraphLasso setting the sparsity of the model is\nset by internal cross-validation in the GraphLassoCV. As can be\nseen on figure 2, the grid to compute the cross-validation score is\niteratively refined in the neighborhood of the maximum.\n\n"
+
"\n# Sparse inverse covariance estimation\n\n\nUsing the GraphicalLasso estimator to learn a covariance and sparse precision\nfrom a small number of samples.\n\nTo estimate a probabilistic model (e.g. a Gaussian model), estimating the\nprecision matrix, that is the inverse covariance matrix, is as important\nas estimating the covariance matrix. Indeed a Gaussian model is\nparametrized by the precision matrix.\n\nTo be in favorable recovery conditions, we sample the data from a model\nwith a sparse inverse covariance matrix. In addition, we ensure that the\ndata is not too much correlated (limiting the largest coefficient of the\nprecision matrix) and that there a no small coefficients in the\nprecision matrix that cannot be recovered. In addition, with a small\nnumber of observations, it is easier to recover a correlation matrix\nrather than a covariance, thus we scale the time series.\n\nHere, the number of samples is slightly larger than the number of\ndimensions, thus the empirical covariance is still invertible. However,\nas the observations are strongly correlated, the empirical covariance\nmatrix is ill-conditioned and as a result its inverse --the empirical\nprecision matrix-- is very far from the ground truth.\n\nIf we use l2 shrinkage, as with the Ledoit-Wolf estimator, as the number\nof samples is small, we need to shrink a lot. As a result, the\nLedoit-Wolf precision is fairly close to the ground truth precision, that\nis not far from being diagonal, but the off-diagonal structure is lost.\n\nThe l1-penalized estimator can recover part of this off-diagonal\nstructure. It learns a sparse precision. It is not able to\nrecover the exact sparsity pattern: it detects too many non-zero\ncoefficients. However, the highest non-zero coefficients of the l1\nestimated correspond to the non-zero coefficients in the ground truth.\nFinally, the coefficients of the l1 precision estimate are biased toward\nzero: because of the penalty, they are all smaller than the corresponding\nground truth value, as can be seen on the figure.\n\nNote that, the color range of the precision matrices is tweaked to\nimprove readability of the figure. The full range of values of the\nempirical precision is not displayed.\n\nThe alpha parameter of the GraphicalLasso setting the sparsity of the model is\nset by internal cross-validation in the GraphicalLassoCV. As can be\nseen on figure 2, the grid to compute the cross-validation score is\niteratively refined in the neighborhood of the maximum.\n\n"
    ]
   },
   {
@@ -26,7 +26,7 @@
   },
   "outputs": [],
   "source": [
-
"print(__doc__)\n# author: Gael Varoquaux <[email protected]>\n# License: BSD 3 clause\n# Copyright: INRIA\n\nimport numpy as np\nfrom scipy import linalg\nfrom sklearn.datasets import make_sparse_spd_matrix\nfrom sklearn.covariance import GraphLassoCV, ledoit_wolf\nimport matplotlib.pyplot as plt\n\n# #############################################################################\n# Generate the data\nn_samples = 60\nn_features = 20\n\nprng = np.random.RandomState(1)\nprec = make_sparse_spd_matrix(n_features, alpha=.98,\n smallest_coef=.4,\n largest_coef=.7,\n random_state=prng)\ncov = linalg.inv(prec)\nd = np.sqrt(np.diag(cov))\ncov /= d\ncov /= d[:, np.newaxis]\nprec *= d\nprec *= d[:, np.newaxis]\nX = prng.multivariate_normal(np.zeros(n_features), cov, size=n_samples)\nX -= X.mean(axis=0)\nX /= X.std(axis=0)\n\n# #############################################################################\n# Estimate the covariance\nemp_cov = np.dot(X.T, X) / n_samples\n\nmodel = GraphLassoCV()\nmodel.fit(X)\ncov_ = model.covariance_\nprec_ = model.precision_\n\nlw_cov_, _ = ledoit_wolf(X)\nlw_prec_ = linalg.inv(lw_cov_)\n\n# #############################################################################\n# Plot the results\nplt.figure(figsize=(10, 6))\nplt.subplots_adjust(left=0.02, right=0.98)\n\n# plot the covariances\ncovs = [('Empirical', emp_cov), ('Ledoit-Wolf', lw_cov_),\n ('GraphLasso', cov_), ('True', cov)]\nvmax = cov_.max()\nfor i, (name, this_cov) in enumerate(covs):\n plt.subplot(2, 4, i + 1)\n plt.imshow(this_cov, interpolation='nearest', vmin=-vmax, vmax=vmax,\n cmap=plt.cm.RdBu_r)\n plt.xticks(())\n plt.yticks(())\n plt.title('%s covariance' % name)\n\n\n# plot the precisions\nprecs = [('Empirical', linalg.inv(emp_cov)), ('Ledoit-Wolf', lw_prec_),\n ('GraphLasso', prec_), ('True', prec)]\nvmax = .9 * prec_.max()\nfor i, (name, this_prec) in enumerate(precs):\n ax = plt.subplot(2, 4, i + 5)\n plt.imshow(np.ma.masked_equal(this_prec, 0),\n interpolation='nearest', vmin=-vmax, vmax=vmax,\n cmap=plt.cm.RdBu_r)\n plt.xticks(())\n plt.yticks(())\n plt.title('%s precision' % name)\n ax.set_axis_bgcolor('.7')\n\n# plot the model selection metric\nplt.figure(figsize=(4, 3))\nplt.axes([.2, .15, .75, .7])\nplt.plot(model.cv_alphas_, np.mean(model.grid_scores_, axis=1), 'o-')\nplt.axvline(model.alpha_, color='.5')\nplt.title('Model selection')\nplt.ylabel('Cross-validation score')\nplt.xlabel('alpha')\n\nplt.show()"
+
"print(__doc__)\n# author: Gael Varoquaux <[email protected]>\n# License: BSD 3 clause\n# Copyright: INRIA\n\nimport numpy as np\nfrom scipy import linalg\nfrom sklearn.datasets import make_sparse_spd_matrix\nfrom sklearn.covariance import GraphicalLassoCV, ledoit_wolf\nimport matplotlib.pyplot as plt\n\n# #############################################################################\n# Generate the data\nn_samples = 60\nn_features = 20\n\nprng = np.random.RandomState(1)\nprec = make_sparse_spd_matrix(n_features, alpha=.98,\n smallest_coef=.4,\n largest_coef=.7,\n random_state=prng)\ncov = linalg.inv(prec)\nd = np.sqrt(np.diag(cov))\ncov /= d\ncov /= d[:, np.newaxis]\nprec *= d\nprec *= d[:, np.newaxis]\nX = prng.multivariate_normal(np.zeros(n_features), cov, size=n_samples)\nX -= X.mean(axis=0)\nX /= X.std(axis=0)\n\n# #############################################################################\n# Estimate the covariance\nemp_cov = np.dot(X.T, X) / n_samples\n\nmodel = GraphicalLassoCV()\nmodel.fit(X)\ncov_ = model.covariance_\nprec_ = model.precision_\n\nlw_cov_, _ = ledoit_wolf(X)\nlw_prec_ = linalg.inv(lw_cov_)\n\n# #############################################################################\n# Plot the results\nplt.figure(figsize=(10, 6))\nplt.subplots_adjust(left=0.02, right=0.98)\n\n# plot the covariances\ncovs = [('Empirical', emp_cov), ('Ledoit-Wolf', lw_cov_),\n ('GraphicalLassoCV', cov_), ('True', cov)]\nvmax = cov_.max()\nfor i, (name, this_cov) in enumerate(covs):\n plt.subplot(2, 4, i + 1)\n plt.imshow(this_cov, interpolation='nearest', vmin=-vmax, vmax=vmax,\n cmap=plt.cm.RdBu_r)\n plt.xticks(())\n plt.yticks(())\n plt.title('%s covariance' % name)\n\n\n# plot the precisions\nprecs = [('Empirical', linalg.inv(emp_cov)), ('Ledoit-Wolf', lw_prec_),\n ('GraphicalLasso', prec_), ('True', prec)]\nvmax = .9 * prec_.max()\nfor i, (name, this_prec) in enumerate(precs):\n ax = plt.subplot(2, 4, i + 5)\n plt.imshow(np.ma.masked_equal(this_prec, 0),\n interpolation='nearest', vmin=-vmax, vmax=vmax,\n cmap=plt.cm.RdBu_r)\n plt.xticks(())\n plt.yticks(())\n plt.title('%s precision' % name)\n ax.set_axis_bgcolor('.7')\n\n# plot the model selection metric\nplt.figure(figsize=(4, 3))\nplt.axes([.2, .15, .75, .7])\nplt.plot(model.cv_alphas_, np.mean(model.grid_scores_, axis=1), 'o-')\nplt.axvline(model.alpha_, color='.5')\nplt.title('Model selection')\nplt.ylabel('Cross-validation score')\nplt.xlabel('alpha')\n\nplt.show()"
    ]
   }
  ],
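
The notebook diff above tracks the scikit-learn rename of GraphLasso/GraphLassoCV to GraphicalLasso/GraphicalLassoCV. A minimal sketch of what that rename means for downstream code, assuming scikit-learn >= 0.20 where the new names are available (the toy data below is illustrative only, not taken from the example):

import numpy as np
from sklearn.covariance import GraphicalLassoCV  # formerly GraphLassoCV

rng = np.random.RandomState(0)
X = rng.randn(60, 20)              # toy data: 60 samples, 20 features
model = GraphicalLassoCV().fit(X)  # alpha chosen by internal cross-validation
print(model.alpha_)                # selected regularization strength
print(model.precision_.shape)      # estimated sparse precision matrix, (20, 20)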

dev/_downloads/plot_sparse_cov.py

Lines changed: 7 additions & 7 deletions
@@ -3,7 +3,7 @@
 Sparse inverse covariance estimation
 ======================================
 
-Using the GraphLasso estimator to learn a covariance and sparse precision
+Using the GraphicalLasso estimator to learn a covariance and sparse precision
 from a small number of samples.
 
 To estimate a probabilistic model (e.g. a Gaussian model), estimating the
@@ -43,8 +43,8 @@
 improve readability of the figure. The full range of values of the
 empirical precision is not displayed.
 
-The alpha parameter of the GraphLasso setting the sparsity of the model is
-set by internal cross-validation in the GraphLassoCV. As can be
+The alpha parameter of the GraphicalLasso setting the sparsity of the model is
+set by internal cross-validation in the GraphicalLassoCV. As can be
 seen on figure 2, the grid to compute the cross-validation score is
 iteratively refined in the neighborhood of the maximum.
 """
@@ -56,7 +56,7 @@
 import numpy as np
 from scipy import linalg
 from sklearn.datasets import make_sparse_spd_matrix
-from sklearn.covariance import GraphLassoCV, ledoit_wolf
+from sklearn.covariance import GraphicalLassoCV, ledoit_wolf
 import matplotlib.pyplot as plt
 
 # #############################################################################
@@ -83,7 +83,7 @@
 # Estimate the covariance
 emp_cov = np.dot(X.T, X) / n_samples
 
-model = GraphLassoCV()
+model = GraphicalLassoCV()
 model.fit(X)
 cov_ = model.covariance_
 prec_ = model.precision_
@@ -98,7 +98,7 @@
 
 # plot the covariances
 covs = [('Empirical', emp_cov), ('Ledoit-Wolf', lw_cov_),
-        ('GraphLasso', cov_), ('True', cov)]
+        ('GraphicalLassoCV', cov_), ('True', cov)]
 vmax = cov_.max()
 for i, (name, this_cov) in enumerate(covs):
     plt.subplot(2, 4, i + 1)
@@ -111,7 +111,7 @@
 
 # plot the precisions
 precs = [('Empirical', linalg.inv(emp_cov)), ('Ledoit-Wolf', lw_prec_),
-         ('GraphLasso', prec_), ('True', prec)]
+         ('GraphicalLasso', prec_), ('True', prec)]
 vmax = .9 * prec_.max()
 for i, (name, this_prec) in enumerate(precs):
     ax = plt.subplot(2, 4, i + 5)
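
The precision-matrix hunk above keeps the docstring's caveat that the l1-penalized estimator detects too many non-zero coefficients. A hypothetical follow-up check, not part of the shipped example, that compares the recovered off-diagonal sparsity pattern against the ground-truth precision matrix prec built earlier in the script:

import numpy as np

def sparsity_report(prec_true, prec_est, tol=1e-8):
    # Hypothetical helper: count off-diagonal non-zeros in the true and
    # estimated precision matrices, and how many true non-zeros are recovered.
    off_diag = ~np.eye(prec_true.shape[0], dtype=bool)
    true_nz = (np.abs(prec_true) > tol) & off_diag
    est_nz = (np.abs(prec_est) > tol) & off_diag
    return {
        'true_nonzeros': int(true_nz.sum()),
        'estimated_nonzeros': int(est_nz.sum()),
        'recovered': int((true_nz & est_nz).sum()),
    }

# Example usage after running plot_sparse_cov.py:
#   sparsity_report(prec, model.precision_)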
