Making Sense of Multivalued Matrix Functions with the Matrix Unwinding Function

Try the following quiz. Let A be an n\times n real or complex matrix. Consider the principal logarithm—the one for which \log z has imaginary part in (-\pi,\pi]—and define z^{t} = e^{t \log z} for t\in\mathbb{C} (an important special case being t = 1/p for an integer p) .

True or false:

  1. \log e^A = A for all A, in other words passing A through the exponential then the logarithm takes us on a round trip.
  2. (I-A^2)^{1/2} = (I-A)^{1/2}(I+A)^{1/2} for all A.
  3. (AB)^{t} = A^{t}B^{t} whenever A and B commute.

The answers are

  1. False. Yet e^{\log A} = A is always true.
  2. True. Yet the similar identity (A^2-I)^{1/2}=(A-I)^{1/2}(A+I)^{1/2} is false.
  3. False.

At first sight these results may seem rather strange. How can we understand them? If you take the viewpoint that each occurrence of \log and a power t in the above expressions stands for the families of all possible logarithms and powers then the identities are all true. But from a computational viewpoint we are usually concerned with a particular branch of each function, the principal branch, so equality cannot be taken for granted.

An excellent tool for understanding these identities is a new matrix function called the matrix unwinding function. This function is defined for any square matrix A by U(A) = (A - \log e^A )/(2\pi i), and it arises from the scalar unwinding number introduced by Corless, Hare and Jeffrey in 1996 1, 2. There is nothing special about A and B being matrices in this quiz; the answers are the same if they are scalars. But the matrix unwinding function neatly handles the extra subtleties of the matrix case.

Mary’s talk at the 2014 SIAM Annual Meeting in San Diego.

From the definition we have \log e^A = A + 2\pi i U(A), so the relation in the first quiz question is clearly valid when U(A) = 0, which is the case when the eigenvalues of A have imaginary parts lying on the interval (-\pi,\pi]. Each of the above identities can be understood by deriving an exact relation in which the unwinding function provides the discrepancy between the left and right-hand sides. For example,

(AB)^t = A^t B^t e^{-2\pi t i U(\log A + \log B)}.

Mary Aprahamian and I have recently published the paper The Matrix Unwinding Function, with an Application to Computing the Matrix Exponential, (SIAM J. Matrix. Anal. Appl., 35, 88-109, 2014), in which we introduce the matrix unwinding function and develop its many interesting properties. We analyze the identities discussed above, along with various others. Thanks to the University of Manchester’s Open Access funds, that paper is available for anyone to download from the SIAM website, using the given link.

The matrix unwinding function has another use. Note that e^A=e^{\log e^A}=e^{A-2\pi i U(A)} and the matrix A-2\pi i U(A) has eigenvalues with imaginary parts in (-\pi,\pi]. The scaling and squaring method for computing the matrix exponential is at its most efficient when A has norm of order 1, and this argument reduction operation tends to reduce the norm of A when A has eigenvalues with large imaginary part. In the paper we develop this argument reduction and show that it can lead to substantial computational savings.

How can we compute U(A)? The following incomplete MATLAB code implements the Schur algorithm developed in the paper. The full code is available.

function U = unwindm(A,flag)
%UNWINDM  Matrix unwinding function.
%   UNWINDM(A) is the matrix unwinding function of the square matrix A.

%   Reference: M. Aprahamian and N. J. Higham.
%   The matrix unwinding function, with an application to computing the
%   matrix exponential.  SIAM J. Matrix Anal. Appl., 35(1):88-109, 2014.

%   Mary Aprahamian and Nicholas J. Higham, 2013.

if nargin < 2, flag = 1; end
[Q,T] = schur(A,'complex');

ord = blocking(T);
[ord, ind] = swapping(ord);  % Gives the blocking.
ord = max(ord)-ord+1;        % Since ORDSCHUR puts highest index top left.
[Q,T] = ordschur(Q,T,ord);
U = Q * unwindm_tri(T) * Q';

function F = unwindm_tri(T)
%UNWINDM_tri   Unwinding matrix of upper triangular matrix.

n = length(T);
F = diag( unwind( diag(T) ) );

% Compute off-diagonal of F by scalar Parlett recurrence.
for j=2:n
     for i = j-1:-1:1
         if F(i,i) == F(j,j)
            F(i,j) = 0;        % We're within a diagonal block.
            s = T(i,j)*(F(i,i)-F(j,j));
            if j-i >= 2
               k = i+1:j-1;
               s = s + F(i,k)*T(k,j) - T(i,k)*F(k,j);
            F(i,j) = s/(T(i,i)-T(j,j));

function u = unwind(z)
%UNWIND  Unwinding number.
%   UNWIND(A) is the (scalar) unwinding number.

u = ceil( (imag(z) - pi)/(2*pi) );

... Other subfunctions omitted

Here is an example. As it illustrates, the unwinding matrix of a real matrix is usually pure imaginary.

>> A = [1 4; -1 1]*4, U = unwindm(A)
A =
     4    16
    -4     4
U =
   0.0000 + 0.0000i   0.0000 - 2.0000i
   0.0000 + 0.5000i  -0.0000 + 0.0000i
>> residual = A - logm(expm(A))
residual =
   -0.0000   12.5664
   -3.1416   -0.0000
>> residual - 2*pi*i*U
ans =
   1.0e-15 *
  -0.8882 + 0.0000i   0.0000 + 0.0000i
  -0.8882 + 0.0000i  -0.8882 + 0.3488i



Robert Corless and David Jeffrey, The Unwinding Number, SIGSAM Bull 30, 28-35, 1996


David Jeffrey, D. E. G. Hare and Robert Corless, Unwinding the Branches of the Lambert W Function, Math. Scientist 21, 1-7, 1996

Second Edition (2014) of Handbook of Linear Algebra edited by Hogben

One of the two or three largest books I have ever owned was recently delivered to me. The second edition of the Handbook of Linear Algebra, edited by Leslie Hogben (with the help of associate editors Richard Brualdi and G. W. (Pete) Stewart), comes in at over 1900 pages, 7cm thick and about half a kilogram. It is the same height and width, but much thicker than, the fourth edition of Golub and Van Loan’s Matrix Computations.


The second edition is substantially expanded from the 1400 page first edition of 2007, with 95 articles as opposed to the original 77. The table of contents and list of contributors is available at the book’s website.

The handbook aims to cover the major topics of linear algebra at both undergraduate and graduate level, as well as numerical linear algebra, combinatorial linear algebra, applications to different areas, and software.

The distinguished list of about 120 authors have produced articles in the CRC handbook style, which requires everything to be presented as a definition, a fact (without proof), an algorithm, or an example. As the author of the chapter on Functions of Matrices, I didn’t find this a natural style to write in, but one benefit is that it encourages the presentation of examples and the large number of illustrative examples is a characteristic feature of the book.

The 18 new chapters include

  • Tensors and Hypermatrices by Lek-Heng Lim
  • Matrix Polynomials by Joerg Liesen and Christian Mehl
  • Matrix Equations by Beatrice Meini
  • Invariant Subspaces by G. W. Stewart
  • Tournaments by T. S. Michael
  • Nonlinear Eigenvalue Problems by Heinrich Voss
  • Linear Algebra in Mathematical Population Biology and Epidemiology by Fred Brauer and Carlos Castillo-Chavez
  • Sage by Robert A. Bezer, Robert Bradshaw, Jason Grout, and William Stein

A notable absence from the applications chapters is network analysis, which in recent years has increasingly made use of linear algebra to define concepts such as centrality and communicability. However, it is impossible to cover every topic and in such a major project I would expect that some articles are invited but do not come to fruition by publication time.

The book is typeset in \LaTeX, like the first edition, but now using the Computer Modern fonts, which I feel give better readability than the font used previously.

A huge amount of thought has gone into the book. It has a 9 page initial section called Preliminaries that lists key definitions, a 51 page glossary, a 12 page notation index, and a 54 page main index.

For quite a while I was puzzled by index entries such as “50-12–17”. I eventually noticed that the second dash is an en-dash and realized that the notation means “pages 12 to 17 of article 50”. This should have been noted at the start of the index.

In fact my only serious criticism of the book is the index. It is simply too hard to find what you are looking for. For example, there is no entry for Gerhsgorin’s theorem, which appears on page 16-6. Nor is there one for Courant-Fischer, whose variational eigenvalue characterization theorem is on page 16-4. There is no index entry under “exponential”, but the matrix exponential appears under two other entries and they point to only one of the various pages where the exponential appears. The index entry for Loewner partial ordering points to Chapter 22, but the topic also has a substantial appearance in Section 9.5. Surprisingly, most of these problems were not present in the index to the first edition, which is also two pages longer!

Fortunately the glossary is effectively a high-level index with definitions of terms (and an interesting read in itself). So to get the best from the book use the glossary and index together!

An alternative book for reference is Bernstein’s Matrix Mathematics (second edition, 2009), which has an excellent 100+ page index, but no glossary. I am glad to have both books on my shelves (the first edition at home and the second edition at work, or vice versa—these books are too heavy to carry around!).

Overall, Leslie Hogben has done an outstanding job to produce a book of this size in a uniform style with such a high standard of editing and typesetting. Ideally one would have both the hard copy and the ebook version, so that one can search the latter. Unfortunately, the ebook appears to have the same relatively high list price as the hard copy (“unlimited access for $169.95”) and I could not see a special deal for buying both. Nevertheless, this is certainly a book to ask your library to order and maybe even to purchase yourself.