The Nearest Correlation Matrix

A correlation matrix is a symmetric matrix with unit diagonal and nonnegative eigenvalues. In 2000 I was approached by a London fund management company who wanted to find the nearest correlation matrix (NCM) in the Frobenius norm to an almost correlation matrix: a symmetric matrix having a significant number of (small) negative eigenvalues. This problem arises when the data from which the correlations are constructed is asynchronous or incomplete, or when models are stress-tested by artificially adjusting individual correlations. Solving the NCM problem (or obtaining a true correlation matrix some other way) is important in order to avoid subsequent calculations breaking down due to negative variances or volatilities, for example.

Algorithms

The convexity properties of the problem mean that there is a unique nearest correlation matrix, which is hence a global minimizer. In the 1990s several algorithms had been proposed for computing it, but none was guaranteed to work. Prompted by the approach from the company, I investigated the problem. I proved some results characterizing the solution and derived an alternating projections algorithm for computing it 1. The algorithm repeatedly projects onto the set of matrices with unit diagonal and the cone of symmetric positive semidefinite matrices. It is guaranteed to converge to the minimum, but does so at a linear rate. An important feature of the algorithm is that other projections can be added on. Thus, for example, if we want to leave the trailing principal submatrix of order three unchanged, we simply restore it at the end of each iteration 2, 3.

The alternating projections algorithm is widely used, but can be slow to converge, especially for large matrices 4. In 2006, Qi and Sun 5 derived a Newton method for the NCM problem. They work with the dual of the original problem, which is unconstrained. The objective function of the dual is not twice continuously differentiable, but by using the theory of strongly semismooth matrix functions Qi and Sun show that Newton’s method nevertheless has global quadratic convergence.

Ruediger Borsdorf and I, building on work in his M.Sc. thesis 3, built an algorithm that solves the Newton equations using minres with a Jacobi preconditioner (a nontrivial task since the coefficient matrix is not explicitly available), and has some other refinements described in 6. This algorithm has been implemented in the NAG Library 7.

In subsequent work, Borsdorf, Marcos Raydan and I 8 , 9 used the spectral projected gradient method (SPGM) to solve the k-factor NCM, in which the correlation matrix is constrained to have the form of a diagonal matrix plus a rank-k matrix. This problem variant arises in multifactor normal copula models, collateralized debt obligations (CDOs), and multivariate time series. One existing previous algorithm can fail to converge or solve the problem, but the SPGM has guaranteed convergence to a stationary point. This algorithm has also been implemented in the NAG Library.

The NCM problem has proved to be of very wide interest beyond the world of finance, as indicated by the fact that 1 is now my third best cited paper on the Web of Science. Recent applications in which the problem arises include reconstructing 20th century sea levels, genetic evaluations for thoroughbred horse breeding, modelling public health data sets, modelling storm damage of buildings, and a Kriging model for reservoirs.

Software

I regularly receive emails asking for software implementing algorithms for the NCM problem. I thought it would be useful to summarize what is available. In general, the Newton method is preferred, but the alternating projections method is more flexible as regards incorporating additional constraints.

Original NCM Problem

Alternating Projections Method

Newton Method

k-Factor NCM Problem

A MATLAB Alternating Projections Function

I thought it would be useful to provide my own MATLAB function nearcorr.m implementing the alternating projections algorithm. The listing is below. To see how it compares with the NAG code g02aa.m I ran the test code

%NEARCORR_TEST  Compare g02aa and nearcorr.

rng(10)                                  % Seed random number generators.
n = 100;
A = gallery('randcorr',n);               % Random correlation matrix. 
E  = randn(n)*1e-1;  A = A + (E + E')/2; % Perturb it.
tol = 1e-10;

% A = cor1399; tol = 1e-4;

fprintf('g02aa:\n')
maxits = int64(-1);  % For linear equation solver.
maxit = int64(-1);   % For Newton iteration.
tic
[~,X1,iter1,feval,nrmgrd,ifail] = g02aa(A,'errtol',tol,'maxits',maxits, ...
                                          'maxit',maxit);
toc

fprintf('  Newton steps taken: %d\n', iter1);
fprintf('  Norm of gradient of last Newton step: %6.4f\n', nrmgrd);
if ifail > 0, fprintf('  g02aa failed with ifail = %g\n', ifail), end

fprintf('nearcorr:\n')
tic
[X2,iter2] = nearcorr(A,tol,[],[],[],[],1);
toc
fprintf('  Number of iterations: %d\n', iter2);

fprintf('  Normwise relative difference between computed solutions:')
fprintf('%9.2e\n', norm(X1-X2,1)/norm(X1,1))

Running under Windows 7 on an Ivy Bridge Core i7 processor @4.4Ghz I obtained the following results, where the “real-life” matrix is based on stock data:

Matrix Code Time (secs) Iterations
1. Random (100), tol = 1e-10 g02aa 0.023 4
nearcorr 0.052 15
2. Random (500), tol = 1e-10 g02aa 0.48 4
nearcorr 3.01 26
3. Real-life (1399), tol = 1e-4 g02aa 6.8 5
nearcorr 100.6 68

The results show that while nearcorr can be fast for small dimensions, the number of iterations, and hence its run time, tends to increase with the dimension and it can be many times slower than the Newton method. This is a stark illustration of the difference between quadratic convergence and linear (with problem-dependent constant) convergence. Here is my MATLAB function nearcorr.m.

function [X,iter] = nearcorr(A,tol,flag,maxits,n_pos_eig,w,prnt)
%NEARCORR    Nearest correlation matrix.
%   X = NEARCORR(A,TOL,FLAG,MAXITS,N_POS_EIG,W,PRNT)
%   finds the nearest correlation matrix to the symmetric matrix A.
%   TOL is a convergence tolerance, which defaults to 16*EPS.
%   If using FLAG == 1, TOL must be a 2-vector, with first component
%   the convergence tolerance and second component a tolerance
%   for defining "sufficiently positive" eigenvalues.
%   FLAG = 0: solve using full eigendecomposition (EIG).
%   FLAG = 1: treat as "highly non-positive definite A" and solve
%             using partial eigendecomposition (EIGS).
%   MAXITS is the maximum number of iterations (default 100, but may
%   need to be increased).
%   N_POS_EIG (optional) is the known number of positive eigenvalues of A.
%   W is a vector defining a diagonal weight matrix diag(W).
%   PRNT = 1 for display of intermediate output.

%   By N. J. Higham, 13/6/01, updated 30/1/13, 15/11/14, 07/06/15.
%   Reference:  N. J. Higham, Computing the nearest correlation
%   matrix---A problem from finance. IMA J. Numer. Anal.,
%   22(3):329-343, 2002.

if ~isequal(A,A'), error('A must by symmetric.'), end
if nargin < 2 || isempty(tol), tol = length(A)*eps*[1 1]; end
if nargin < 3 || isempty(flag), flag = 0; end
if nargin < 4 || isempty(maxits), maxits = 100; end
if nargin < 6 || isempty(w), w = ones(length(A),1); end
if nargin < 7, prnt = 1; end

n = length(A);
if flag >= 1
   if nargin < 5 || isempty(n_pos_eig)
      [V,D] = eig(A); d = diag(D);
      n_pos_eig = sum(d >= tol(2)*d(n));
   end
   if prnt, fprintf('n = %g, n_pos_eig = %g\n', n, n_pos_eig), end
end

X = A; Y = A;
iter = 0;
rel_diffX = inf; rel_diffY = inf; rel_diffXY = inf;
dS = zeros(size(A));

w = w(:); Whalf = sqrt(w*w');

while max([rel_diffX rel_diffY rel_diffXY]) > tol(1)

   Xold = X;
   R = Y - dS;
   R_wtd = Whalf.*R;
   if flag == 0
      X = proj_spd(R_wtd);
   elseif flag == 1
      [X,np] = proj_spd_eigs(R_wtd,n_pos_eig,tol(2));
   end
   X = X ./ Whalf;
   dS = X - R;
   Yold = Y;
   Y = proj_unitdiag(X);
   rel_diffX = norm(X-Xold,'fro')/norm(X,'fro');
   rel_diffY = norm(Y-Yold,'fro')/norm(Y,'fro');
   rel_diffXY = norm(Y-X,'fro')/norm(Y,'fro');
   iter = iter + 1;
   if prnt
      fprintf('%2.0f:  %9.2e  %9.2e  %9.2e', ...
               iter, rel_diffX, rel_diffY, rel_diffXY)
      if flag >= 1, fprintf('  np = %g\n',np), else fprintf('\n'), end
   end
   if iter > maxits
       error(['Stopped after ' num2str(maxits) ' its. Try increasing MAXITS.'])
   end

end

%%%%%%%%%%%%%%%%%%%%%%%%
function A = proj_spd(A)
%PROJ_SPD

if ~isequal(A,A'), error('Not symmetric!'), end
[V,D] = eig(A);
A = V*diag(max(diag(D),0))*V';
A = (A+A')/2; % Ensure symmetry.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [A,n_pos_eig_found] = proj_spd_eigs(A,n_pos_eig,tol)
%PROJ_SPD_EIGS

if ~isequal(A,A'), error('Not symmetric!'), end
k = n_pos_eig + 10; % 10 is safety factor.
if k > length(A), k = n_pos_eig; end
opts.disp = 0;
[V,D] = eigs(A,k,'LA',opts); d = diag(D);
j = (d > tol*max(d));
n_pos_eig_found = sum(j);
A = V(:,j)*D(j,j)*V(:,j)';  % Build using only the selected eigenpairs.
A = (A+A')/2; % Ensure symmetry.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function A = proj_unitdiag(A)
%PROJ_SPD

n = length(A);
A(1:n+1:n^2) = 1;

Updates

  • Links updated August 4, 2014.
  • nearcorr.m corrected November 15, 2014: iter was incorrectly initialized (thanks to Mike Croucher for pointing this out).
  • Added link to Mike Croucher’s Python alternating directions code, November 17, 2014.
  • Corrected an error in the convergence test, June 7, 2015. Effect on performance will be minimal (thanks to Nataša Strabić for pointing this out).

Footnotes:

1

Nicholas J. Higham, Computing the Nearest Correlation Matrix—A Problem from Finance, IMA J. Numer. Anal. 22, 329–343, 2002.

2

Craig Lucas, Computing Nearest Covariance and Correlation Matrices, M.Sc. Thesis, University of Manchester, 2001.

3

Ruediger Borsdorf, A Newton Algorithm for the Nearest Correlation Matrix, M.Sc. Thesis, University of Manchester, 2007.

4

Rene Escalante and Marcos Raydan, Alternating Projection Methods, SIAM, 2011.

5

Hou-Duo Qi and Defeng Sun, A Quadratically Convergent Newton Method for Computing the Nearest Correlation Matrix, SIAM J. Matrix Anal. Appl. 28, 360-385, 2006

6

Ruediger Borsdorf and Nicholas J. Higham, A Preconditioned Newton Algorithm for the Nearest Correlation Matrix, IMA J. Numer. Anal. 30, 94-107, 2010.

8

Ruediger Borsdorf, Nicholas Higham and Marcos Raydan, Computing a Nearest Correlation Matrix with Factor Structure, SIAM J. Matrix Anal., Appl. 31, 2603-2622, 2010

9

Ruediger Borsdorf, Structured Matrix Nearness Problems: Theory and Algorithms, Ph.D. Thesis, University of Manchester, 2012.

26 thoughts on “The Nearest Correlation Matrix

  1. Excellent post. One minor suggestion: change the Newton method to the Newton-CG method (a generic name). The point is that the computational cost of each Newton step is O(n^4), which is not affordable for large scale problems while the cost of each Newton-CG step is roughly in the order of O(n^3) given the nice structure of the NCM. — Defeng Sun

    1. Nick, I have a situation where I am unable to change the off-diagonals, I can only change the diagonals, that is, I can only add a diagonal matrix to my current matrix to make it positive definite. I only want to add just enough to the diagonal to make it positive definite and no more. Do you have any thoughts? I just posted the question on stack exchange but no responses so far,
      http://math.stackexchange.com/questions/665026/adding-elements-to-diagonal-of-symmetric-matrix-to-ensure-positive-definiteness

  2. Hi, I’m a student in a finance program at a university. We’re trying to apply the solutions presented here to an actual data set. However, we’re having problems finding an actual financial data set, whose correlation matrix would yield negative (small) eigenvalues. Can someone please suggest where to look?

    1. There is a nice small example in the MSc thesis by Craig Lucas, reference 2 above. You can apply the same principle to any (financial) data set that you have (delete some data, then compute the correlation matrix via the pairwise deletion method). While this does not guarantee that the computed approximate correlation matrix will be indefinite, after a bit of trial and error, you will get your test matrix. This also allows you to easily compare the output of the algorithm with the correlation matrix from the original (full) data set.

  3. In the nearcorr algorithm, is the convergence criteria sufficient if weights are being used? I can come up with examples where by utilizing weights, the algorithm would not converge under a low tolerance.

    1. I don’t have much experience with the weighted case. You could try modifying the relative differences to be weighted relative differences in this case if the existing test is not working well. Of course it may simply be that you need to increase the value of maxits. The alternating projections algorithm can be very slow to converge and the speed of convergence depends very much on the particular problem.

      1. Thanks for the reply. From a practical perspective, the weighted case is very useful. We may need to give larger or important lines of business more weight than others.

        I will try adjusting the relative differences.

        Thanks

  4. I would like to mention that the nearest correlation matrix also can be formulated and solved as a semidefinite optimization problem. The advantages of semidefinite optimization is that the problem can be solved in polynomial time and it is easy to include linear constraints on the computed matrix. The disadvantage is that can be an computational expensive way of solving the problem.

    See for example http://docs.mosek.com/7.0/matlabfusion/Nearest_correlation.html for a concrete implementation of that approach.

    Several software packages are available semidefinite optimization e.g. mosek, SeDuMi, SDPT3….

  5. Hi Dr. Nick,
    I am trying to control the weights applied to each correlation pair instead of using a diagonal weights matrix where I can only control all the correlations with row / column. Can i put individual weights to each pair? Will it break the logic? I am not sure. It would be great if you could clear my doubt.

      1. Sorry my bad! nearPD is a part of the matrix package in r 3.3.1. I was trying to install the nearPD package which is not available in r 3.3.1

  6. I am trying to write code in R for applying the alternating projections method as is described in this paper(http://www.maths.manchester.ac.uk/~higham/narep/narep369.pdf).
    In my case I am trying to write my own simpler code without using so many constraints as these
    that are used in the function “nearPD”.
    I would like just to project to the 2 sets ( S and U) and use the Dykstra’s correction but I dont have too much experience in R. Could anyone help me?
    Thanks in advance.

  7. Thank you Nick for this. Very helpful.
    I am working on reconstructing regulatory gene networks from expression data in the malaria vector, Anopheles gambiae, using gaussian graphical models, and am simulating some data to validate the algorithm. Cheers.

  8. Hi Dr. Nick,
    thank you for the extremely valuable content of this page.
    I am trying to deal with a performance problem arising during the Montecarlo simulation of a multivariate stochastic process with a state-dependent (almost) correlation matrix.
    The simulated vector is typically small, i.e. 3 to 10 dimensions. However, since the (almost) correlation matrix is state-dependent, I need to solve millions of NCM problems during the process simulation.
    For this reason I’d need an extremely fast algorithm for low dimensional matrices, a rarely discussed topic in the literature to my knowledge.
    Moreover, since the Montecarlo simulation runs on a GPU to optimize performances, an interesting feature for the NCM algorithm would be to have a fixed number of operations (i.e. avoid minimizers and loops).
    Do you have any suggestions on my problem or any literature on the subject to recommend?

    Thank you very much.

  9. Hi Nick,

    Thanks you for the great blog posts.

    I am currently facing a nearest *covariance* problem, rather than correlation. I have searched through many of your papers / Nag documentation, but couldn’t see any advice on this.

    Is there a simple way of adapting the solutions from the correlation setting?

    Thank you.

  10. Even though it’s already 2023, this post still proves to be very helpful to me! It truly is a post of exceptional quality!

Leave a comment