Sparse covariance estimation for Gaussian variables

% Joëlle Skaf - 04/24/08
% (a figure is generated)
%
% Suppose y \in\reals^n is a Gaussian random variable with zero mean and
% covariance matrix R = \Expect(yy^T), with sparse inverse S = R^{-1}
% (S_ij = 0 means that y_i and y_j are conditionally independent).
% We want to estimate the covariance matrix R based on N independent
% samples y1,...,yN drawn from the distribution, and using prior knowledge
% that S is sparse
% A good heuristic for estimating R is to solve the problem
%           maximize    logdet(S) - tr(SY) - lambda*sum(sum(abs(S)))
%           subject to  S >= 0
% where Y is the sample covariance of y1,...,yN, and lambda is a sparsity
% parameter to be chosen or tuned.
% A figure showing the sparsity (number of nonzeros) of S versus lambda
% is generated.

% Input data
randn('state',0);
n = 10;
N = 100;
Strue = sprandsym(n,0.5,0.01,1);
nnz_true = sum(Strue(:)>1e-4);
R = inv(full(Strue));
y_sample = sqrtm(R)*randn(n,N);
Y = cov(y_sample');
Nlambda = 20;
lambda = logspace(-2, 3, Nlambda);
nnz = zeros(1,Nlambda);

for i=1:Nlambda
    disp(['i = ' num2str(i) ', lambda(i) = ' num2str(lambda(i))]);
    % Maximum likelihood estimate of R^{-1}
    cvx_begin sdp
        variable S(n,n) symmetric
        maximize log_det(S) - trace(S*Y) - lambda(i)*sum(sum(abs(S)))
        S >= 0
    cvx_end
    nnz(i) = sum(S(:)>1e-4);
end

figure;
semilogx(lambda, nnz);
hold on;
semilogx(lambda, nnz_true*ones(1,Nlambda),'r');
xlabel('\lambda');
legend('nonzeros in S', 'nonzeros in R^{-1}');
i = 1, lambda(i) = 0.01
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   2.739e+00  7.007e-01  Solved
1   7.415e-02  4.502e-04  Solved
1   9.096e-03  6.756e-06  Solved
1   1.134e-03  1.244e-07  Solved
1   1.317e-04  0.000e+00  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -31.3012
i = 2, lambda(i) = 0.01833
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   2.743e+00  7.030e-01  Solved
1   7.377e-02  4.458e-04  Solved
1   9.051e-03  6.737e-06  Solved
1   1.129e-03  1.595e-07  Solved
1   1.410e-04  5.814e-08  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -31.3504
i = 3, lambda(i) = 0.033598
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   2.751e+00  7.072e-01  Solved
1   7.312e-02  4.379e-04  Solved
1   8.964e-03  6.563e-06  Solved
1   1.118e-03  1.101e-07  Solved
1   1.394e-04  1.829e-09  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -31.4368
i = 4, lambda(i) = 0.061585
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   2.764e+00  7.143e-01  Solved
1   7.201e-02  4.247e-04  Solved
1   8.839e-03  6.407e-06  Solved
1   1.102e-03  1.364e-07  Solved
1   1.320e-04  2.596e-09  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -31.5845
i = 5, lambda(i) = 0.11288
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   2.785e+00  7.261e-01  Solved
1   7.014e-02  4.030e-04  Solved
1   8.614e-03  6.130e-06  Solved
1   1.074e-03  1.724e-07  Solved
1   1.311e-04  3.342e-09  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -31.8272
i = 6, lambda(i) = 0.20691
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   2.818e+00  7.450e-01  Solved
1   6.711e-02  3.687e-04  Solved
1   8.248e-03  5.534e-06  Solved
1   1.029e-03  8.373e-08  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -32.2113
i = 7, lambda(i) = 0.37927
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   2.870e+00  7.746e-01  Solved
1   6.224e-02  3.171e-04  Solved
1   7.659e-03  4.794e-06  Solved
1   9.556e-04  8.354e-08  Solved
1   1.195e-04  1.073e-08  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -32.8011
i = 8, lambda(i) = 0.69519
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   2.946e+00  8.191e-01  Solved
1   5.465e-02  2.444e-04  Solved
1   6.738e-03  3.821e-06  Solved
1   8.408e-04  1.898e-07  Solved
1   9.273e-05  6.843e-09  Solved
1   8.694e-06  0.000e+00  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -33.6657
i = 9, lambda(i) = 1.2743
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   3.052e+00  8.839e-01  Solved
1   4.310e-02  1.518e-04  Solved
1   5.330e-03  2.308e-06  Solved
1   6.655e-04  4.051e-08  Solved
1   7.645e-05  0.000e+00  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -34.876
i = 10, lambda(i) = 2.3357
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   3.194e+00  9.750e-01  Solved
1   2.591e-02  5.477e-05  Solved
1   3.217e-03  8.400e-07  Solved
1   4.018e-04  1.823e-08  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -36.4981
i = 11, lambda(i) = 4.2813
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   3.374e+00  1.099e+00  Solved
1   1.193e-03  1.111e-07  Solved
1   1.490e-04  1.188e-09  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -38.5618
i = 12, lambda(i) = 7.8476
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   3.596e+00  1.262e+00  Solved
1   3.359e-02  9.264e-05  Solved
1   4.243e-03  1.481e-06  Solved
1   5.312e-04  6.969e-09  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -41.092
i = 13, lambda(i) = 14.3845
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   3.859e+00  1.474e+00  Solved
1   8.126e-02  5.454e-04  Solved
1   1.044e-02  9.122e-06  Solved
1   1.311e-03  1.343e-07  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -44.1061
i = 14, lambda(i) = 26.3665
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   4.176e+00  1.755e+00  Solved
1   1.482e-01  1.822e-03  Solved
1   1.940e-02  3.103e-05  Solved
1   2.440e-03  4.918e-07  Solved
1   3.049e-04  0.000e+00  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -47.7294
i = 15, lambda(i) = 48.3293
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   4.548e+00  2.123e+00  Solved
1   2.397e-01  4.805e-03  Solved
1   3.217e-02  8.560e-05  Solved
1   4.072e-03  1.423e-06  Solved
1   5.102e-04  0.000e+00  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -51.9812
i = 16, lambda(i) = 88.5867
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   4.969e+00  2.591e+00  Solved
1   3.611e-01  1.103e-02  Solved
1   5.034e-02  2.085e-04  Solved
1   6.393e-03  3.359e-06  Solved
1   4.940e-04  0.000e+00  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -56.7913
i = 17, lambda(i) = 162.3777
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   5.429e+00  3.170e+00  Solved
1   5.146e-01  2.278e-02  Solved
1   7.513e-02  4.704e-04  Solved
1   9.608e-03  8.311e-06  Solved
1   1.204e-03  1.684e-07  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -62.0441
i = 18, lambda(i) = 297.6351
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   5.917e+00  3.865e+00  Solved
0   0.000e+00  0.000e+00  Failed
0   0.000e+00S 0.000e+00S Failed
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -2.05492e+07
i = 19, lambda(i) = 545.5595
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   6.421e+00  4.682e+00  Solved
1   5.551e-01  2.662e-02  Solved
1   8.148e-02  5.597e-04  Solved
1   1.057e-02  1.048e-05  Solved
1   1.338e-03  1.376e-07  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -73.3838
i = 20, lambda(i) = 1000
 
Successive approximation method to be employed.
   For improved efficiency, sedumi is solving the dual problem.
   sedumi will be called several times to refine the solution.
   Original size: 501 variables, 222 equality constraints
   1 exponentials add 8 variables, 5 equality constraints
-----------------------------------------------------------------
          Errors   
Act Centering    Conic    Status
-----------------------------------
1   6.938e+00  5.623e+00  Solved
1   6.186e-02  3.225e-04  Solved
1   7.926e-03  5.818e-06  Solved
1   9.939e-04  0.000e+00  Solved
-----------------------------------------------------------------
Status: Solved
Optimal value (cvx_optval): -79.28