Update of Catalogue of Software for Matrix Functions

funm_ex.jpg

Edvin Hopkins and I have updated to version 3.0 the catalogue of software for matrix functions that we originally produced in 2014 and updated in 2016. It covers what is available in various languages (C++, Fortran, Java, Julia, Python, Rust), problem solving environments (GNU Octave, Maple, Mathematica, MATLAB and associated toolboxes, R, Scilab), and libraries (Armadillo, GNU Scientific Library, NAG Library, SLEPc, SLICOT).

Here are some highlights of changes in the last four years that are reflected in the new version.

  • Several new MATLAB third-party functions have been written, by various authors, notably for f(A)b and for arbitrary precision evaluation of the exponential and logarithm.
  • Matrix function support in Julia has been expanded.
  • Armadillo, Rust, SLEPc, and Tensorflow are included in new entries.

In addition, all URLs and references have been updated.

Suggestions for inclusion in a future revision are welcome.

Updated Catalogue of Software for Matrix Functions

help-matfun.jpg
From “help matfun” in MATLAB.

Edvin Deadman and I have updated the catalogue of software for matrix functions that we produced in 2014 (and which was discussed in this post). The new version, which has undergone some minor reorganization, is available here. It covers what is available in languages (C++, Fortran, Java, Julia, Python), problem solving environments (GNU Octave, Maple, Mathematica, MATLAB, R, Scilab), and libraries (GNU Scientific Library, NAG Library, SLICOT).

nag-mark25.jpg
From NAG LIbrary Mark 25 News.

Here are some highlights of changes in the last two years that are reflected in the new version.

Other changes to the catalogue include these.

  • SLICOT has been added.
  • Two more R packages are included.

Suggestions for inclusion in a future revision are welcome.

Jeff Rohl’s Fortran TV Course

As a first year mathematics undergraduate at the University of Manchester in 1979, I had to choose one course from another department. Like the majority of students, I chose the Fortran Programming course CS151 provided for mathematics students by the Department of Computer Science.

The course tutor was Simon Lavington, who is now perhaps best known for his historical research into early British computers (and can be seen on this video about the Ferranti Atlas computer). It used a videotaped set of lectures by Jeff Rohl. Jeff was an Australian who had come to Manchester in 1960 to do a PhD on compilers with Tony Brooker. He became a Professor at UMIST in the early 1970s and returned to Australia in 1976 to found the Department of Computer Science at the University of Western Australia.

rohl73.jpg
rohl-jeff.jpg
Jeff Rohl

The ten black and white videos, Programming in Fortran (1973), were accompanied by a 124-page book of the same title, written by Jeff and published by the University of Manchester Press.

These were the early days of computing. The book talked about punched cards, which thankfully we students did not have to use, and employed flowcharts (which it called “flow diagrams”) to illustrate the logical flow of programs. The book included the complete Fortran 66 standard in an appendix—something that would be inconceivable with most languages of today!

rohl73-p2-9.jpg

rohl73-p5-5.jpg

Many years later I met Jeff while we were both visiting the Computer Science Department at Cornell University. He said that people regularly tell him that they learned Fortran from his book and lectures and that the videos were recorded in one continuous take. In this YouTube era it is easy to forget how innovative these early 1970s video lectures were.

Fortran is of course still around and has a large user community. Indeed it ranks 24th in the January 2016 version of the TIOBE Programming Community Index. For some context on its usage see my article Programming Languages: An Applied Mathematics View in The Princeton Companion to Applied Mathematics.

The most recent standard is Fortran 2008 and another revision is in preparation. An old joke goes “I don’t know what language we’ll be using in 50 years time, but it will be called Fortran.”

NAG-advert-920900-cartoon.jpg
From a 1990s advert for the NAG Fortran 90 compiler (http://www.nag.co.uk/nagware/np.asp).

I was sorry to discover that Jeff passed away in 2003.

Simon Lavington has kindly provided me with more information about the TV lecture courses—three in total—recorded by him and Jeff Rohl in the Department of Computer Science. I will write about these in a subsequent post.

I am grateful to Jeff’s son Andrew Rohl for providing the photo of Jeff above.

Cataloguing Software for Matrix Functions

I began working on functions of matrices just over thirty years ago, when I was an MSc student, my original interest being in the matrix square root. In those days relatively little research had been done on the topic and no software for evaluating matrix functions was generally available. Since then interest in matrix functions has grown greatly.

Functions of interest include the exponential, the logarithm, and real powers, along with all kinds of trigonometric functions and some less generally known functions such as the sign function and the unwinding function.

130410-1301-49-0762.jpg
Gil Strang at the Advances in Matrix Functions and Matrix Equations Workshop in Manchester, April 2013.

A large amount of software for evaluating matrix functions now exists, covering many languages (C++, Fortran, Julia, Python, …) and problem solving environments (GNU Octave, Maple, Mathematica, MATLAB, R, Scilab, …). Some of it is part of a core product or package, while other codes are available individually on researchers’ web sites. It is hard to keep up with what is available.

Edvin Deadman and I therefore decided to produce a catalogue of matrix function software, which is available in the form of MIMS EPrint 2014.8. We have organized the catalogue by language/package and documented the algorithms that are implemented. The EPrint also contains a summary of the rapidly growing number of applications in which matrix functions are used, including some that we discovered only recently, and a list of what we regard as the best current algorithms.

Producing the catalogue took more work than I expected. Many packages are rather poorly documented and we sometimes had to delve deep into documentation or source code in order to find out which algorithms had been implemented.

One thing our survey shows is that the most complete and up to date collection of codes for matrix functions currently available is that in the NAG Library, which contains over 40 codes all implementing state of the art algorithms. This is no accident. We recently completed a three year Knowledge Transfer Partnership (KTP) funded by the Technology Strategy Board, the University of Manchester, and EPSRC, whose purpose was to translate matrix function algorithms into software for the NAG Engine (the underlying code base from which all NAG products are built) and to embed processes and expertise in developing matrix functions software into the company. Edvin was the Associate employed on the project and wrote all the codes, which are in Fortran.

A video about the KTP project been made by the University of Manchester and more information about the project can be obtained from Edvin’s posts at the NAG blog:

We intend to update the catalogue from time to time and welcome notification of errors and omissions.

The Nearest Correlation Matrix

A correlation matrix is a symmetric matrix with unit diagonal and nonnegative eigenvalues. In 2000 I was approached by a London fund management company who wanted to find the nearest correlation matrix (NCM) in the Frobenius norm to an almost correlation matrix: a symmetric matrix having a significant number of (small) negative eigenvalues. This problem arises when the data from which the correlations are constructed is asynchronous or incomplete, or when models are stress-tested by artificially adjusting individual correlations. Solving the NCM problem (or obtaining a true correlation matrix some other way) is important in order to avoid subsequent calculations breaking down due to negative variances or volatilities, for example.

Algorithms

The convexity properties of the problem mean that there is a unique nearest correlation matrix, which is hence a global minimizer. In the 1990s several algorithms had been proposed for computing it, but none was guaranteed to work. Prompted by the approach from the company, I investigated the problem. I proved some results characterizing the solution and derived an alternating projections algorithm for computing it 1. The algorithm repeatedly projects onto the set of matrices with unit diagonal and the cone of symmetric positive semidefinite matrices. It is guaranteed to converge to the minimum, but does so at a linear rate. An important feature of the algorithm is that other projections can be added on. Thus, for example, if we want to leave the trailing principal submatrix of order three unchanged, we simply restore it at the end of each iteration 2, 3.

The alternating projections algorithm is widely used, but can be slow to converge, especially for large matrices 4. In 2006, Qi and Sun 5 derived a Newton method for the NCM problem. They work with the dual of the original problem, which is unconstrained. The objective function of the dual is not twice continuously differentiable, but by using the theory of strongly semismooth matrix functions Qi and Sun show that Newton’s method nevertheless has global quadratic convergence.

Ruediger Borsdorf and I, building on work in his M.Sc. thesis 3, built an algorithm that solves the Newton equations using minres with a Jacobi preconditioner (a nontrivial task since the coefficient matrix is not explicitly available), and has some other refinements described in 6. This algorithm has been implemented in the NAG Library 7.

In subsequent work, Borsdorf, Marcos Raydan and I 8 , 9 used the spectral projected gradient method (SPGM) to solve the k-factor NCM, in which the correlation matrix is constrained to have the form of a diagonal matrix plus a rank-k matrix. This problem variant arises in multifactor normal copula models, collateralized debt obligations (CDOs), and multivariate time series. One existing previous algorithm can fail to converge or solve the problem, but the SPGM has guaranteed convergence to a stationary point. This algorithm has also been implemented in the NAG Library.

The NCM problem has proved to be of very wide interest beyond the world of finance, as indicated by the fact that 1 is now my third best cited paper on the Web of Science. Recent applications in which the problem arises include reconstructing 20th century sea levels, genetic evaluations for thoroughbred horse breeding, modelling public health data sets, modelling storm damage of buildings, and a Kriging model for reservoirs.

Software

I regularly receive emails asking for software implementing algorithms for the NCM problem. I thought it would be useful to summarize what is available. In general, the Newton method is preferred, but the alternating projections method is more flexible as regards incorporating additional constraints.

Original NCM Problem

Alternating Projections Method

Newton Method

k-Factor NCM Problem

A MATLAB Alternating Projections Function

I thought it would be useful to provide my own MATLAB function nearcorr.m implementing the alternating projections algorithm. The listing is below. To see how it compares with the NAG code g02aa.m I ran the test code

%NEARCORR_TEST  Compare g02aa and nearcorr.

rng(10)                                  % Seed random number generators.
n = 100;
A = gallery('randcorr',n);               % Random correlation matrix. 
E  = randn(n)*1e-1;  A = A + (E + E')/2; % Perturb it.
tol = 1e-10;

% A = cor1399; tol = 1e-4;

fprintf('g02aa:\n')
maxits = int64(-1);  % For linear equation solver.
maxit = int64(-1);   % For Newton iteration.
tic
[~,X1,iter1,feval,nrmgrd,ifail] = g02aa(A,'errtol',tol,'maxits',maxits, ...
                                          'maxit',maxit);
toc

fprintf('  Newton steps taken: %d\n', iter1);
fprintf('  Norm of gradient of last Newton step: %6.4f\n', nrmgrd);
if ifail > 0, fprintf('  g02aa failed with ifail = %g\n', ifail), end

fprintf('nearcorr:\n')
tic
[X2,iter2] = nearcorr(A,tol,[],[],[],[],1);
toc
fprintf('  Number of iterations: %d\n', iter2);

fprintf('  Normwise relative difference between computed solutions:')
fprintf('%9.2e\n', norm(X1-X2,1)/norm(X1,1))

Running under Windows 7 on an Ivy Bridge Core i7 processor @4.4Ghz I obtained the following results, where the “real-life” matrix is based on stock data:

Matrix Code Time (secs) Iterations
1. Random (100), tol = 1e-10 g02aa 0.023 4
nearcorr 0.052 15
2. Random (500), tol = 1e-10 g02aa 0.48 4
nearcorr 3.01 26
3. Real-life (1399), tol = 1e-4 g02aa 6.8 5
nearcorr 100.6 68

The results show that while nearcorr can be fast for small dimensions, the number of iterations, and hence its run time, tends to increase with the dimension and it can be many times slower than the Newton method. This is a stark illustration of the difference between quadratic convergence and linear (with problem-dependent constant) convergence. Here is my MATLAB function nearcorr.m.

function [X,iter] = nearcorr(A,tol,flag,maxits,n_pos_eig,w,prnt)
%NEARCORR    Nearest correlation matrix.
%   X = NEARCORR(A,TOL,FLAG,MAXITS,N_POS_EIG,W,PRNT)
%   finds the nearest correlation matrix to the symmetric matrix A.
%   TOL is a convergence tolerance, which defaults to 16*EPS.
%   If using FLAG == 1, TOL must be a 2-vector, with first component
%   the convergence tolerance and second component a tolerance
%   for defining "sufficiently positive" eigenvalues.
%   FLAG = 0: solve using full eigendecomposition (EIG).
%   FLAG = 1: treat as "highly non-positive definite A" and solve
%             using partial eigendecomposition (EIGS).
%   MAXITS is the maximum number of iterations (default 100, but may
%   need to be increased).
%   N_POS_EIG (optional) is the known number of positive eigenvalues of A.
%   W is a vector defining a diagonal weight matrix diag(W).
%   PRNT = 1 for display of intermediate output.

%   By N. J. Higham, 13/6/01, updated 30/1/13, 15/11/14, 07/06/15.
%   Reference:  N. J. Higham, Computing the nearest correlation
%   matrix---A problem from finance. IMA J. Numer. Anal.,
%   22(3):329-343, 2002.

if ~isequal(A,A'), error('A must by symmetric.'), end
if nargin < 2 || isempty(tol), tol = length(A)*eps*[1 1]; end
if nargin < 3 || isempty(flag), flag = 0; end
if nargin < 4 || isempty(maxits), maxits = 100; end
if nargin < 6 || isempty(w), w = ones(length(A),1); end
if nargin < 7, prnt = 1; end

n = length(A);
if flag >= 1
   if nargin < 5 || isempty(n_pos_eig)
      [V,D] = eig(A); d = diag(D);
      n_pos_eig = sum(d >= tol(2)*d(n));
   end
   if prnt, fprintf('n = %g, n_pos_eig = %g\n', n, n_pos_eig), end
end

X = A; Y = A;
iter = 0;
rel_diffX = inf; rel_diffY = inf; rel_diffXY = inf;
dS = zeros(size(A));

w = w(:); Whalf = sqrt(w*w');

while max([rel_diffX rel_diffY rel_diffXY]) > tol(1)

   Xold = X;
   R = Y - dS;
   R_wtd = Whalf.*R;
   if flag == 0
      X = proj_spd(R_wtd);
   elseif flag == 1
      [X,np] = proj_spd_eigs(R_wtd,n_pos_eig,tol(2));
   end
   X = X ./ Whalf;
   dS = X - R;
   Yold = Y;
   Y = proj_unitdiag(X);
   rel_diffX = norm(X-Xold,'fro')/norm(X,'fro');
   rel_diffY = norm(Y-Yold,'fro')/norm(Y,'fro');
   rel_diffXY = norm(Y-X,'fro')/norm(Y,'fro');
   iter = iter + 1;
   if prnt
      fprintf('%2.0f:  %9.2e  %9.2e  %9.2e', ...
               iter, rel_diffX, rel_diffY, rel_diffXY)
      if flag >= 1, fprintf('  np = %g\n',np), else fprintf('\n'), end
   end
   if iter > maxits
       error(['Stopped after ' num2str(maxits) ' its. Try increasing MAXITS.'])
   end

end

%%%%%%%%%%%%%%%%%%%%%%%%
function A = proj_spd(A)
%PROJ_SPD

if ~isequal(A,A'), error('Not symmetric!'), end
[V,D] = eig(A);
A = V*diag(max(diag(D),0))*V';
A = (A+A')/2; % Ensure symmetry.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [A,n_pos_eig_found] = proj_spd_eigs(A,n_pos_eig,tol)
%PROJ_SPD_EIGS

if ~isequal(A,A'), error('Not symmetric!'), end
k = n_pos_eig + 10; % 10 is safety factor.
if k > length(A), k = n_pos_eig; end
opts.disp = 0;
[V,D] = eigs(A,k,'LA',opts); d = diag(D);
j = (d > tol*max(d));
n_pos_eig_found = sum(j);
A = V(:,j)*D(j,j)*V(:,j)';  % Build using only the selected eigenpairs.
A = (A+A')/2; % Ensure symmetry.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function A = proj_unitdiag(A)
%PROJ_SPD

n = length(A);
A(1:n+1:n^2) = 1;

Updates

  • Links updated August 4, 2014.
  • nearcorr.m corrected November 15, 2014: iter was incorrectly initialized (thanks to Mike Croucher for pointing this out).
  • Added link to Mike Croucher’s Python alternating directions code, November 17, 2014.
  • Corrected an error in the convergence test, June 7, 2015. Effect on performance will be minimal (thanks to Nataša Strabić for pointing this out).

Footnotes:

1

Nicholas J. Higham, Computing the Nearest Correlation Matrix—A Problem from Finance, IMA J. Numer. Anal. 22, 329–343, 2002.

2

Craig Lucas, Computing Nearest Covariance and Correlation Matrices, M.Sc. Thesis, University of Manchester, 2001.

3

Ruediger Borsdorf, A Newton Algorithm for the Nearest Correlation Matrix, M.Sc. Thesis, University of Manchester, 2007.

4

Rene Escalante and Marcos Raydan, Alternating Projection Methods, SIAM, 2011.

5

Hou-Duo Qi and Defeng Sun, A Quadratically Convergent Newton Method for Computing the Nearest Correlation Matrix, SIAM J. Matrix Anal. Appl. 28, 360-385, 2006

6

Ruediger Borsdorf and Nicholas J. Higham, A Preconditioned Newton Algorithm for the Nearest Correlation Matrix, IMA J. Numer. Anal. 30, 94-107, 2010.

8

Ruediger Borsdorf, Nicholas Higham and Marcos Raydan, Computing a Nearest Correlation Matrix with Factor Structure, SIAM J. Matrix Anal., Appl. 31, 2603-2622, 2010

9

Ruediger Borsdorf, Structured Matrix Nearness Problems: Theory and Algorithms, Ph.D. Thesis, University of Manchester, 2012.