What Is the Matrix Unwinding Function?

Care is needed when dealing with multivalued functions because identities that hold for positive scalars can fail in the complex plane. For example, it is not always true that (z-1)^{1/2}(z+1)^{1/2} = (z^2-1)^{1/2} or \log(z_1 z_2) = \log z_1 + \log z_2 for all z,z_1,z_2\in\mathbb{C}. Here, the square root is the principal square root (the one lying in the right half-plane) and the logarithm is the principal logarithm (the one with imaginary part in (-\pi,\pi]).

A powerful tool for dealing with multivalued complex functions is the unwinding number, defined for z\in\mathbb{C} by

\notag     \mathcal{U}(z) = \displaystyle\frac{z - \log \mathrm{e}^z}{2\pi \mathrm{i}}.

The unwinding number provides a correction term for the putative identity \log \mathrm{e}^z = z, in that

\notag    \qquad\qquad\qquad\qquad     z = \log \mathrm{e}^z + 2\pi \mathrm{i}\, \mathcal{U}(z)    \qquad\qquad\qquad\qquad (*)

for all z.

A useful formula for the unwinding number is

\notag    \mathcal{U}(z) = \left\lceil\displaystyle\frac{\mathrm{Im} z - \pi}{2\pi}\right\rceil,

where \lceil\cdot\rceil is the ceiling function, which returns the smallest integer greater than or equal to its argument. It follows that \mathcal{U}(z) = 0 if and only if \mathrm{Im} z \in (-\pi, \pi]. Hence \log \mathrm{e}^z = z if and only if \mathrm{Im} z \in (-\pi, \pi].

The unwinding number provides correction terms for various identities. For example, for z_1,z_2\in\mathbb{C}, replacing z by \log z_1 \pm \log z_2 in (*), we have

\notag \begin{aligned}     \log z_1 \pm \log z_2       &= \log \bigl( \mathrm{e}^{\log z_1 \pm \log z_2} \bigr)          + 2\pi\mathrm{i} \, \mathcal(\log z_1 \pm \log z_2)\\       &= \log \bigl(\mathrm{e}^{\log z_1} \mathrm{e}^{\pm \log z_2} \bigr)          + 2\pi\mathrm{i} \, \mathcal(\log z_1 \pm \log z_2)\\       &= \log \bigl( z_1 z_2^{\pm 1} \bigr)          + 2\pi\mathrm{i} \, \mathcal(\log z_1 \pm \log z_2). \end{aligned}

This gives the identities

\notag \begin{aligned} \log (z_1 z_2) &= \log z_1 + \log z_2 - 2\pi \mathrm{i}\, \mathcal{U}(\log z_1 +\log z_2), \\  \qquad\qquad\qquad   \log (z_1/z_2) &= \log z_1 - \log z_2 -       2\pi \mathrm{i}\,\mathcal{U}(\log z_1 - \log z_2).  \qquad\qquad\qquad (\#) \end{aligned}

Note that in textbooks one can find identities such as \log (z_1 z_2) = \log z_1 + \log z_2, in which each occurrence of \log is interpreted as a possibly different branch. For computational purposes we want formulas that contain a specific branch, usually the principal branch.

An application of the unwinding number to matrix functions is in computing the logarithm of a 2\times 2 upper triangular matrix. For any function f, we have

\notag     f\left( \begin{bmatrix}                      \lambda_1 & t_{12}  \\                            0   & \lambda_2             \end{bmatrix} \right)          = \begin{bmatrix}               f(\lambda_1) & t_{12} f[\lambda_1,\lambda_2] \\                 0          & f(\lambda_2)             \end{bmatrix},

where the divided difference

\notag   f[\lambda_1,\lambda_2]    = \begin{cases}      \displaystyle\frac{f(\lambda_2)-f(\lambda_1)}{\lambda_2-\lambda_1},      & \lambda_1 \ne \lambda_2, \\      f'(\lambda_1), & \lambda_1 = \lambda_2.     \end{cases}

When \lambda_1 \approx \lambda_2 this formula suffers from numerical cancellation. For the logarithm, we can rewrite it, using (\#), as

\notag  \begin{aligned}   \log\lambda_2 - \log\lambda_1   &= \log \left(\frac{\lambda_2}{\lambda_1}\right) + 2\pi \mathrm{i} \,        \mathcal{U}(\log \lambda_2 - \log \lambda_1)       \\   &= \log \left(\frac{1+z}{1-z}\right) + 2\pi \mathrm{i}\,     \mathcal{U}(\log \lambda_2 - \log \lambda_1),  \end{aligned}

where z = (\lambda_2-\lambda_1)/(\lambda_2+\lambda_1). Using the hyperbolic arc tangent, defined by

\notag     \mathrm{atanh}(z) = \frac{1}{2}\log\left( \displaystyle\frac{1+z}{1-z} \right),

we obtain

\notag   f[\lambda_1,\lambda_2]    = \displaystyle\frac{2\mathrm{atanh}(z) + 2\pi \mathrm{i}\,      \mathcal{U}(\log \lambda_2 - \log \lambda_1)}{\lambda_2-\lambda_1},    \quad \lambda_1 \ne \lambda_2.

Assuming that we have an accurate \mathrm{atanh} function this formula will provide an accurate value of f[\lambda_1,\lambda_2] provided that z is not too close to \pm 1 (the singularities of \mathrm{atanh}) and not too large. This formula is used in the MATLAB function logm.

Matrix Unwinding Function

The unwinding number leads to the matrix unwinding function, defined for A\in\mathbb{C}^{n\times n} by

\notag   \mathcal{U}(A) = \displaystyle\frac{A - \log \mathrm{e}^A}{2\pi \mathrm{i}}.

Here, \log is the principal matrix logarithm, defined by the property that its eigenvalues have imaginary parts in the interval (-\pi,\pi]. It can be shown that \mathcal{U}(A) = 0 if and only if the imaginary parts of all the eigenvalues of A lie in the interval (-\pi, \pi]. Furthermore, \mathcal{U}(A) is a diagonalizable matrix with integer eigenvalues.

As an example, the matrix

\notag   A = \left[\begin{array}{rrrr}    3 & 1 & -1 & -9\\    -1 & 3 & 9 & -1\\    -1 & -9 & 3 & 1\\    9 & -1 & -1 & 3 \end{array}\right],    \quad \Lambda(A) = \{ 2\pm 8\mathrm{i}, 4 \pm 10\mathrm{i} \}

has unwinding matrix function

\notag   X = \mathcal{U}(A) = \mathrm{i} \left[\begin{array}{rrrr}    0 & -\frac{1}{2} & 0 & \frac{3}{2}\\    \frac{1}{2} & 0 & -\frac{3}{2} & 0\\    0 & \frac{3}{2} & 0 & -\frac{1}{2}\\    -\frac{3}{2} & 0 & \frac{1}{2} & 0 \end{array}\right],    \quad \Lambda(X) = \{ \pm 1, \pm 2 \}.

In general, if A is real then \mathcal{U}(A) is pure imaginary as long as A has no eigenvalue with imaginary part that is an odd multiple of \pi.

The matrix unwinding function is useful for providing correction terms for matrix identities involving multivalued functions. Here are four useful matrix identities, along with cases in which the correction term vanishes. See Aprahamian and Higham (2014) for proofs.

  • For nonsingular A and \alpha,\beta\in\mathbb{C},

    \notag   (A^\alpha)^{\beta} = A^{\alpha\beta}   \mathrm{e}^{-2\beta\pi \mathrm{i}\, \mathcal{U}(\alpha\log A)}.

    If \beta is an integer then the correction term is I. If \alpha\in(-1,1] and \beta = 1/\alpha then \mathcal{U}(\alpha\log A) = 0 and so

    \notag      (A^\alpha)^{1/\alpha} = A, \quad \alpha \in [-1,1],

    and this equation is obviously true for \alpha = -1, too.

  • If A and B are nonsingular and AB = BA then

    \notag      \log(AB^{\pm1}) = \log A \pm \log B - 2\pi \mathrm{i}\, \mathcal{U}(\log A \pm \log B).

    If \arg\lambda_i + \arg\mu_i \in(-\pi,\pi] for every eigenvalue \lambda_i of A and the corresponding eigenvalue \mu_i of B (there is a correspondence because of the commutativity of A and B, which implies that they are simultaneously unitarily diagonalizable), then \log(AB^{\pm 1}) = \log A \pm \log B.

  • If A and B are nonsingular and AB = BA then for any \alpha\in\mathbb{C},

    \notag   (AB)^\alpha = A^\alpha B^\alpha \mathrm{e}^{-2\pi\, \alpha \mathrm{i}\, \mathcal{U}(\log A + \log B)}.

    If \alpha is an integer or the eigenvalues of A and B have arguments in (-\pi/2,\pi/2] then (AB)^\alpha = A^\alpha B^\alpha.

  • For nonsingular A and \alpha\in\mathbb{C}.

    \notag     \log A^{\alpha} = \alpha \log A - 2\pi \mathrm{i}\, \mathcal{U}(\alpha \log A).

    If \alpha\in(-1,1] then \log A^{\alpha} = \alpha \log A, and this equation also holds for \alpha = -1 if A has no negative eigenvalues.

The matrix unwinding function can be computed by an adaptation of the Schur–Parlett algorithm. The algorithm computes a Schur decomposition and re-orders it into a block form with eigenvalues having the same unwinding number in the same diagonal block. The unwinding function of each diagonal block is then a multiple of the identity and the off-diagonal blocks are obtained by the block Parlett recurrence. This approach gives more accurate results than directly evaluating \mathcal{U}(A) from its definition in terms of the matrix logarithm and natrix exponential. A MATLAB code unwindm is available at https://github.com/higham/unwinding

Matrix Argument Reduction

The matrix unwinding function can be of use computationally for reducing the size of the imaginary parts of the eigenvalues of a matrix. The function

\notag   \mathrm{mod}(A) = A - 2\pi\mathrm{i}\, \mathcal{U}(A)

has eigenvalues \lambda with \mathrm{Im} \lambda \in(-\pi,\pi]. Using \mathrm{mod}(A) in place of A can be useful in computing the matrix exponential by the scaling and squaring algorithm because \mathrm{e}^A = \mathrm{e}^{\mathrm{mod}(A)} and \mathrm{mod}(A) can have a much smaller norm than A, giving potential reductions in cost. Combined with the Schur decomposition-based algorithm for \mathcal{U}(A) mentioned above, this idea leads to a numerical method for \mathrm{e}^A. See Aprahamian and Higham (2014) for details.

Round Trip Relations

If you apply a matrix function and then its inverse do you get back to where you started, that is, is f^{-1}(f(z)) = z? In principle yes, but if the inverse is multivalued the answer is not immediate. The matrix unwinding function is useful for analyzing such round trip relations. As an example, if A has no eigenvalues with real part of the form k \pi for an integer k, then

\notag    \mathrm{acos}(\cos A) = \bigl(A-2\pi\,\mathcal{U}    (\mathrm{i} A)\bigr)\mathrm{sign} \bigl(A-2\pi\mathcal{U}( \mathrm{i} A)\bigr).

Here, \mathrm{acos} is the principal arc cosine defined in Aprahamian and Higham (2016), where this result and analogous results for the arc sine, arc hyperbolic cosine, and arc hyperbolic sine are derived; and \mathrm{sign} is the matrix sign function.


The unwinding number was introduced by Corless, Hare and Jeffrey in 1996, to help implement computer algebra over the complex numbers. It was generalized to the matrix case by Aprahamian and Higham (2014).


This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Posted in what-is | Tagged | Leave a comment

What Is a (Non)normal Matrix?

An n\times n matrix is normal if A^*A = AA^*, that is, if A commutes with its conjugate transpose. Although the definition is simple to state, its significance is not immediately obvious.

The definition says that the inner product of the ith and jth columns equals the inner product of the ith and jth rows for all i and j. For i = j this means that the ith row and the ith column have the same 2-norm for all i. This fact can easily be used to show that a normal triangular matrix must be diagonal. It then follows from the Schur decomposition that A is normal if it is unitarily diagonalizable: A = QDQ^* for some unitary Q and diagonal D, where D contains the eigenvalues of A on the diagonal. Thus the normal matrices are those with a complete set of orthonormal eigenvectors.

For a general diagonalizable matrix, A = XDX^{-1}, the condition number \kappa(X) = \|X| \|X^{-1}\| can be arbitrarily large, but for a normal matrix X can be taken to have 2-norm condition number 1. This property makes normal matrices well-behaved for numerical computation.

Many equivalent conditions to A being normal are known: seventy are given by Grone et al. (1987) and a further nineteen are given by Elsner and Ikramov (1998).

The normal matrices include the classes of matrix given in this table:

Real Complex
Diagonal Diagonal
Symmetric Hermitian
Skew-symmetric Skew-Hermitian
Orthogonal Unitary
Circulant Circulant

Circulant matrices are n\times n Toeplitz matrices in which the diagonals wrap around:

\notag      \begin{bmatrix} c_1     & c_n    & \dots   & c_2     \\                      c_2     & c_1    & \dots   & \vdots  \\                      \vdots  & \ddots & \ddots  & c_n     \\                      c_n     & \dots  & c_2     & c_1     \\      \end{bmatrix}.

They are diagonalized by a unitary matrix known as the discrete Fourier transform matrix, which has (r,s) element \exp( -2\pi \mathrm{i} (r-1)(s-1) / n ).

A normal matrix is not necessarily of the form given in the table, even for n = 2. Indeed, a 2\times 2 normal matrix must have one of the forms

\notag    \left[\begin{array}{@{\mskip2mu}rr@{\mskip2mu}}      a & b\\      b & c    \end{array}\right], \quad    \left[\begin{array}{@{}rr@{\mskip2mu}}      a & b\\     -b & a    \end{array}\right].

The first matrix is symmetric. The second matrix is of the form aI + bJ, where J = \bigl[\begin{smallmatrix}\!\phantom{-}0 & 1\\\!-1 & 0 \end{smallmatrix}\bigr] is skew-symmetric and satisfies J^2 = -I, and it has eigenvalues a \pm \mathrm{i}b.

It is natural to ask what the commutator C = AA^*- A^*A can look like when A is not normal. One immediate observation is that C has zero trace, so its eigenvalues sum to zero, implying that C is an indefinite Hermitian matrix if it is not zero. Since an indefinite matrix has at least two different nonzero eigenvalues, C cannot be of rank 1.

In the polar decomposition A = UH, where U is unitary and H is Hermitian positive semidefinite, the polar factors commute if and only if A is normal.

The field of values, also known as the numerical range, is defined for A\in\mathbb{C}^{n\times n} by

F(A) = \biggl\{\, \displaystyle\frac{z^*Az}{z^*z} : 0 \ne z \in \mathbb{C}^n \, \biggr\}.

The set F(A) is compact and convex (a nontrivial property proved by Toeplitz and Hausdorff), and it contains all the eigenvalues of A. Normal matrices have the property that the field of values is the convex hull of the eigenvalues. The next figure illustrates two fields of values, with the eigenvalues plotted as dots. The one on the left is for the nonnormal matrix gallery('smoke',16) and that on the right is for the circulant matrix gallery('circul',x) with x constructed as x = randn(16,1); x = x/norm(x).


Measures of Nonnormality

How can we measure the degree of nonnormality of a matrix? Let A have the Schur decomposition A = QTQ^*, where Q is unitary and T is upper triangular, and write T = D+M, where D = \mathrm{diag}(\lambda_i) is diagonal with the eigenvalues of A on its diagonal and M is strictly upper triangular. If A is normal then M is zero, so \|M\|_F is a natural measure of how far A is from being normal. While M depends on Q (which is not unique), its Frobenius norm does not, since \|A\|_F^2 = \|T\|_F^2 = \|D\|_F^2 + \|M\|_F^2. Accordingly, Henrici defined the departure from normality by

\notag   \nu(A) = \biggl( \|A\|_F^2 - \displaystyle\sum_{j=1}^n |\lambda_j|^2 \biggr)^{1/2}.

Henrici (1962) derived an upper bound for \nu(A) and Elsner and Paardekooper (1987) derived a lower bound, both based on the commutator:

\notag   \displaystyle\frac{\|A^*A-AA^*\|_F}{4\|A\|_2}   \le \nu(A) \le \Bigl( \displaystyle\frac{n^3-n}{12} \Bigr)^{1/4} \|A^*A-AA^*\|_F^{1/2}.

The distance to normality is

\notag   d(A) = \min \bigl\{\, \|E\|_F: ~A+E \in \mathbb{C}^{n\times n}~~\mathrm{is~                                   normal} \,\bigr\}.

This quantity can be computed by an algorithm of Ruhe (1987). It is trivially bounded above by \nu(A) and is also bounded below by a multiple of it (László, 1994):

\notag     \displaystyle\frac{\nu(A)}{n^{1/2}} \le d(A) \le \nu(A).

Normal matrices are a particular class of diagonalizable matrices. For diagonalizable matrices various bounds are available that depend on the condition number of a diagonalizing transformation. Since such a transformation is not unique, we take a diagonalization A = XDX^{-1}, D = \mathrm{diag}(\lambda_i), with X having minimal 2-norm condition number:

\kappa_2(X) = \min\{\, \kappa_2(Y): A = YDY^{-1}, ~D~\mathrm{diagonal} \,\}.

Here are some examples of such bounds. We denote by \rho(A) the spectral radius of A, the largest absolute value of any eigenvalue of A.

  • By taking norms in the eigenvalue-eigenvector equation Ax = \lambda x we obtain \rho(A) \le \|A\|_2. Taking norms in A = XDX^{-1} gives \|A\|_2 \le \kappa_2(X) \|D\|_2 = \kappa_2(X)\rho(A). Hence

\notag    \displaystyle\frac{\|A\|_2}{\kappa_2(X)} \le \rho(A) \le \|A\|_2.

  • If A has singular values \sigma_1 \ge \sigma_2 \ge \cdots \ge   \sigma_n and its eigenvalues are ordered |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|, then (Ruhe, 1975)

    \notag      \displaystyle\frac{\sigma_i(A)}{\kappa_2(X)}      \le |\lambda_i(A)|      \le \kappa_2(X) \sigma_i(A), \quad i = 1\colon n.

    Note that for i=1 the previous upper bound is sharper.

  • For any real p > 0,

    \notag    \displaystyle\frac{\rho(A)^p}{\kappa_2(X)} \le     \|A^p\|_2 \le \kappa_2(X) \rho(A)^p.

  • For any function f defined on the spectrum of A,

    \notag    \displaystyle\frac{\max_i|f(\lambda_i)|}{\kappa_2(X)} \le     \|f(A)\|_2 \le \max_i|f(\lambda_i)|.

For normal A we can take X unitary and so all these bounds are equalities. The condition number \kappa_2(X) can therefore be regarded as another measure of non-normality, as quantified by these bounds.


This is a minimal set of references, which contain further useful references within.

Posted in what-is | Leave a comment

What Is the Matrix Logarithm?

A logarithm of a square matrix A is a matrix X such that \mathrm{e}^X = A, where \mathrm{e}^X is the matrix exponential. Just as in the scalar case, the matrix logarithm is not unique, since if \mathrm{e}^X = A then \mathrm{e}^{X+2k\pi\mathrm{i}I} = A for any integer k. However, for matrices the nonuniqueness is complicated by the presence of repeated eigenvalues. For example, the matrix

\notag    X(t) =     2\pi \mathrm{i}       \begin{bmatrix}1 & -2t & 2t^2 \\                      0 &  -1 & -t \\                      0 &  0 &  0       \end{bmatrix}

is an upper triangular logarithm of the 3\times 3 identity matrix I_3 for any t, whereas the obvious logarithms are the diagonal matrices 2\pi\mskip1mu\mathrm{i}\mskip1mu\mathrm{diag}(k_1,k_2,k_3), for integers k_1, k_2, and k_3. Notice that the repeated eigenvalue 1 of I_3 has been mapped to the distinct eigenvalues 2\pi, -2\pi, and 0 in X(t). This is characteristic of nonprimary logarithms, and in some applications such strange logarithms may be required—an example is the embeddability problem for Markov chains.

An important question is whether a nonsingular real matrix has a real logarithm. The answer is that it does if and only if the Jordan canonical form contains an even number of Jordan blocks of each size for every negative eigenvalue. This means, in particular, that if A has an unrepeated negative eigenvalue then it does not have a real logarithm. Minus the n\times n identity matrix has a real logarithm for even n but not for odd n. Indeed, for n=2,

\notag    X =  \pi\left[\begin{array}{@{}rr@{}}              0 & 1\\              -1 & 0             \end{array}\right]

satisfies \mathrm{e}^X = -I_2, as does Y = ZXZ^{-1} for any nonsingular Z, since \mathrm{e}^Y = Z\mathrm{e}^X Z^{-1} = -Z Z^{-1}  = -I_2.

For most practical purposes it is the principal logarithm that is of interest, defined for any matrix with no negative real eigenvalues as the unique logarithm whose eigenvalues lie in the strip \{\, z : -\pi < \mathrm{Im}(z)  < \pi \,\}. From this point on we assume that A has no negative eigenvalues and that the logarithm is the principal logarithm.

Various explicit representations of the logarithm are available, including

\notag \begin{aligned}       \log A  &= \int_0^1 (A-I)\bigl[ t(A-I) + I \bigr]^{-1} \,\mathrm{d}t, \\     \log(I+X) &= X - \frac{X^2}{2} + \frac{X^3}{3}                    - \frac{X^4}{4} + \cdots, \quad \rho(X)<1, \end{aligned}

where the spectral radius \rho(X) = \max\{\, |\lambda| : \lambda~\textrm{is an eigenvalue of}~X\,\}. A useful relation is \log (A^{\alpha}) = \alpha \log A for \alpha\in[-1,1], with important special cases \log (A^{-1}) = - \log A and \log (A^{1/2}) = \frac{1}{2} \log A (where the square root is the principal square root). Recurring the latter expression gives, for any positive integer k,

\notag       \log(A) = 2^k \log\bigl(A^{1/2^k}\bigr).

This formula is the basis for the inverse scaling and squaring method for computing the logarithm, which chooses k so that E = I - A^{1/2^k} is small enough that \log(I + E) can be efficiently approximated by Padé approximation. The MATLAB function logm uses the inverse scaling and squaring method together with a Schur decomposition.


This references contains more on the facts above, as well as further references.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Posted in what-is | Tagged | Leave a comment

What Is a QR Factorization?

A QR factorization of a rectangular matrix A\in\mathbb{R}^{m\times n} with m\ge n is a factorization A = QR with Q\in\mathbb{R}^{m\times m} orthonormal and R\in\mathbb{R}^{m\times n} upper trapezoidal. The R factor has the form R = \left[\begin{smallmatrix}R_1\\ 0\end{smallmatrix}\right], where R_1 is n\times n and upper triangular. Partitioning Q conformably with R we have

\notag    A = QR     =    \begin{array}[b]{@{\mskip-20mu}c@{\mskip0mu}c@{\mskip-1mu}c@{}}    & \mskip10mu\scriptstyle n & \scriptstyle m-n  \\       \mskip15mu          \begin{array}{r}              \scriptstyle m          \end{array}~    &       \multicolumn{2}{c}{\mskip-15mu          \left[\begin{array}{c@{~}c@{~}}                  Q_1 & Q_2                \end{array}\right]       }    \end{array} \mskip-10mu    \begin{array}[b]{@{\mskip-25mu}c@{\mskip-20mu}c@{}}    \scriptstyle n    \\    \multicolumn{1}{c}{        \left[\begin{array}{@{}c@{}}                  R_1\\                  0              \end{array}\right]}    & \mskip-12mu\          \begin{array}{l}              \scriptstyle n \\              \scriptstyle m-n          \end{array}    \end{array}    = Q_1 R_1.

There are therefore two forms of QR factorization:

  • A = QR is the full QR factorization,
  • A = Q_1R_1 is the reduced (also called economy-sized, or thin) QR factorization.

To prove the existence of a QR factorization note that if A has full rank then A^T\!A is symmetric positive definite. Since A = Q_1R_1 implies A^T\!A = R_1^TQ_1^TQ_1R_1 = R_1^TR_1, we can take R_1 to be the Cholesky factor of A^T\!A and then define Q_1 = AR_1^{-1}. The resulting Q_1 has orthonormal columns because

\notag        Q_1^TQ_1 = R_1^{-T} A^T A R_1^{-1}                 = R_1^{-T} R_1^T R_1 R_1^{-1}                  = I.

Therefore when A has full rank there is a unique reduced QR factorization if we require R_1 to have positive diagonal elements. (Without this requirement we can multiply the ith column of Q and the ith row of R by -1 and obtain another QR factorization.)

When A has full rank the columns of Q_1 span the column space (or range) of A. Indeed Ax = Q_1R_1x = Q_1(R_1x) implies \mathrm{range}(A) \subseteq \mathrm{range}(Q_1) while Q_1x = Q_1R_1\cdot R_1^{-1}x =: Ay implies \mathrm{range}(Q_1) \subseteq \mathrm{range}(A), so \mathrm{range}(Q_1) = \mathrm{range}(A). Furthermore, Q^TA = R gives Q_2^TA = 0, so the columns of Q_2 span the null space of A^T.

The QR factorization provides a way of orthonormalizing the columns of a matrix. An alternative is provided by the polar decomposition A = UH, where U has orthonormal columns and H is positive semidefinite. The orthogonal polar factor U is the closest matrix with orthonormal columns to A in any unitarily invariant norm, but it is more expensive to compute than the Q factor.

There are three standard ways of computing a QR factorization.

Gram–Schmidt orthogonalization computes the reduced factorization. It has the disadvantage that in floating-point arithmetic the computed \widehat{Q} is not guaranteed to be orthonormal to the working precision. The modified Gram–Schmidt method (a variation of the classical method) is better behaved numerically in that \|\widehat{Q}^T\widehat{Q} - I\|_F \le c_1(m,n)\kappa_2(A)u for some constant c_1(m,n), where u is the unit roundoff, so the loss of orthogonality is bounded.

Householder QR factorization and Givens QR factorization both construct Q^T as a product of orthogonal matrices that are chosen to reduce A to upper trapezoidal form. In both methods, at the start of the kth stage we have

\notag   \qquad\qquad\qquad\qquad    A^{(k)} = Q_{k-1}^T A =    \begin{array}[b]{@{\mskip35mu}c@{\mskip20mu}c@{\mskip-5mu}c@{}c}    \scriptstyle k-1 &    \scriptstyle 1 &    \scriptstyle n-k &    \\    \multicolumn{3}{c}{    \left[\begin{array}{c@{\mskip10mu}cc}    R_{k-1} & y_k  & B_k \\        0   & z_k  & C_k    \end{array}\right]}    & \mskip-12mu    \begin{array}{c}    \scriptstyle k-1 \\    \scriptstyle m-k+1    \end{array}    \end{array},   \qquad\qquad\qquad\qquad (*)

where R_{k-1} is upper triangular and Q_{k-1} is a product of Householder transformations or Givens rotations. Working on A^{(k)}(k:m,k:n) we now apply a Householder transformation or n-k Givens rotations in order to zero out the last n-k elements of z_k and thereby take the matrix one step closer to upper trapezoidal form.

Householder QR factorization is the method of choice for general matrices, but Givens QR factorization is preferred for structured matrices with a lot of zeros, such as upper Hessenberg matrices and tridiagonal matrices.

Both these methods produce Q in factored form and if the product is explicitly formed they yield a computed \widehat{Q} that is orthogonal to the working precision, that is, \|\widehat{Q}^T\widehat{Q} - I\|_F \le c_2(m,n)u, for some constant c_2.

Modified Gram–Schmidt, Householder QR, and Givens QR all have the property that there exists an exactly orthogonal Q such that the computed \widehat{R} satisfies

\notag   A + \Delta A = Q \widehat{R}, \quad   \|\Delta A\|_F \le c_3(m,n)u \|A\|_F,

for some constant c_3.

Another way of computing a QR factorization is by the technique in the existence proof above, via Cholesky factorization of A^T\!A. This is known as the Cholesky QR algorithm and it has favorable computational cost when m \gg n. In its basic form, this method is not recommended unless A is extremely well conditioned, because the computed \widehat{Q} is far from orthonormal for ill conditioned matrices. The method can be made competitive with the others either by using extra precision or by iterating the process.

Column Pivoting and Rank-Revealing QR Factorization

In practice, we often want to compute a basis for the range of A when A is rank deficient. The basic QR factorization may not do so. Householder QR factorization with column pivoting reveals rank deficiency by incorporating column interchanges. At the kth stage, before applying a Householder transformation to (*), the column of largest 2-norm of C_k, the jth say, is determined, and if its norm exceeds that of z_K then the kth and (k+j)th columns of A^{(k)} are interchanged. The result is a factorization A\Pi = QR, where \Pi is a permutation matrix and R satisfies the inequalities

\notag    |r_{kk}|^2 \ge \displaystyle\sum_{i=k}^j |r_{ij}|^2,     \quad j=k+1\colon n, \quad k=1\colon n.

In particular,

\notag   \qquad\qquad\qquad\qquad\qquad\qquad   |r_{11}| \ge |r_{22}| \ge \cdots \ge |r_{nn}|.   \qquad\qquad\qquad\qquad\qquad\qquad (\dagger)

If A is rank deficient then R has the form

\notag R = \begin{bmatrix}R_{11} & R_{12} \\ 0 & 0 \end{bmatrix},

with R_{11} nonsingular, and the rank of A is the dimension of R_{11}.

Near rank deficiency of A to tends to be revealed by a small trailing diagonal block of R, but this is not guaranteed. Indeed for the Kahan matrix

\notag        U_n(\theta) = \mathrm{diag}(1,s,\dots,s^{n-1})                   \begin{bmatrix} 1 & -c & -c     & \dots & -c \\                               & 1  & -c     & \dots & -c \\                               &    & \ddots &\ddots & \vdots \\                               &    &        &\ddots & -c \\                               &    &        &       &  1                    \end{bmatrix}

where c =\cos\theta and s = \sin\theta, u_{nn} is of order 2^n times larger than the smallest singular value for small \theta and U_n(\theta) is invariant under QR factorization with column pivoting.

In practice, column pivoting reduces the efficiency of Householder QR factorization because it limits the amount of the computation that can be expressed in terms of matrix multiplication. This has motivated the development of methods that select the pivot columns using randomized projections. These methods gain speed and produce factorizations of similar rank-revealing quality, though they give up the inequalities (\dagger).

It is known that at least one permutation matrix \Pi exists such that the QR factorization of A\Pi is rank-revealing. Computing such a permutation is impractical, but heuristic algorithms for producing an approximate rank-revealing factorization are available.


This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Posted in what-is | Leave a comment

What is the Cayley–Hamilton Theorem?

The Cayley–Hamilton Theorem says that a square matrix A satisfies its characteristic equation, that is p(A) = 0 where p(t) = \det(tI-A) is the characteristic polynomial. This statement is not simply the substitution “p(A) = \det(A - A) = 0”, which is not valid since t must remain a scalar inside the \det term. Rather, for an n\times n A, the characteristic polynomial has the form

\notag    p(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0

and the Cayley–Hamilton theorem says that

\notag    p(A) = A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = 0.

Various proofs of the theorem are available, of which we give two. The first is the most natural for anyone familiar with the Jordan canonical form. The second is more elementary but less obvious.

First proof.

Consider a 4\times 4 Jordan block with eigenvalue \lambda:

\notag     J =      \begin{bmatrix}    \lambda & 1       & 0        & 0       \\            & \lambda & 1        & 0\\            &         & \lambda  & 1\\            &         &          & \lambda         \end{bmatrix}.

We have

\notag     J - \lambda I =      \begin{bmatrix}          0 & 1       & 0        & 0       \\            &  0      & 1        & 0\\            &         & 0        & 1\\            &         &          & 0         \end{bmatrix}, \quad     (J - \lambda I)^2 =      \begin{bmatrix}            0  & 0  & 1 & 0 \\               & 0  & 0 & 1 \\               &    & 0 & 0 \\               &    &   & 0         \end{bmatrix}, \quad     (J - \lambda I)^3 =      \begin{bmatrix}            0  & 0  & 0 & 1 \\               & 0  & 0 & 0 \\               &    & 0 & 0 \\               &    &   & 0         \end{bmatrix},

and then (J - \lambda I)^4 = 0. In general, for an n \times n Jordan block J with eigenvalue \lambda, (J - \lambda I)^k is zero apart from a kth superdiagonal of ones for k\le n-1, and (J - \lambda I)^n = 0.

Let A have the Jordan canonical form A = Z JZ^{-1}, where J = \mathrm{diag}(J_1, \dots, J_k) and each J_i is an m_i\times m_i Jordan block with eigenvalue \lambda_i. The characteristic polynomial of A can be factorized as p(t) = (t-\lambda_1)^{m_1}(t-\lambda_2)^{m_2}\dots(t-\lambda_k)^{m_k}. Note that A^i = Z J^i Z^{-1} for all i, and hence q(A) = Z q(J) Z^{-1} for any polynomial q. Then

\notag  Z^{-1}p(A)Z = p(J)        = \mathrm{diag}\bigl( p(J_1), p(J_2), \dots, p(J_k) \bigr),

and p(J_i) is zero because it contains a factor (J_i - \lambda_i I\bigr)^{m_i} and this factor is zero, as noted above. Hence Z^{-1}p(A)Z = 0 and therefore p(A) = 0.

Second Proof

Recall that the adjugate \mathrm{adj}(B) of an n\times n matrix B is the transposed matrix of cofactors, where a cofactor is a signed sum of products of n-1 entries of B, and that B\mathrm{adj}(B) = \det(B)I. With B = tI - A, each entry of \mathrm{adj}(B) is a polynomial of degree n-1 in t, so B^{-1} = \mathrm{adj}(B)/\det(B) can be written

\notag    (tI - A)^{-1} = \displaystyle\frac{\mathrm{adj}(tI - A)}{\det(tI - A)}                  = \displaystyle\frac{t^{n-1}C_{n-1} + \cdots + tC_1 + C_0}{p(t)},

for some matrices C_{n-1}, \dots, C_0 not depending on t. Rearranging, we obtain

\notag   (tI - A) \bigl(t^{n-1}C_{n-1} + \cdots + tC_1 + C_0 \bigr) =  p(t)I,

and equating coefficients of t^n, …, t^0 gives

\notag \begin{aligned}   C_{n-1} &= I,\\   C_{n-2} - A C_{n-1} &= a_{n-1}I,\\          & \vdots\\   C_0 - A C_1 &= a_1 I,\\   - A C_0 &= a_0I. \end{aligned}

Premultiplying the first equation by A^n, the second by A^{n-1}, and so on, and adding, gives

\notag     0 = A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = p(A),

as required. This proof is by Buchheim (1884).

Applications and Generalizations

A common use of the Cayley–Hamilton theorem is to show that A^{-1} is expressible as a linear combination of I, A, …, A^{n-1}. Indeed for a nonsingular A, p(A) = 0 implies that

\notag    A^{-1} = -\displaystyle\frac{1}{a_0}             \bigl( A^{n-1} + a_{n-1}A^{n-2} + \cdots + a_1 I \bigr),

since a_0 = \det(A) \ne 0.

Similarly, A^k for any k \ge n can be expressed as a linear combination of I, A, …, A^{n-1}. An interesting implication is that any matrix power series is actually a polynomial in the matrix. Thus the matrix exponential \mathrm{e}^A = I + A + A^2/2! + \cdots can be written \mathrm{e}^A = c_{n-1} A^{n-1} + \cdots + C_1A + c_0I for some scalars c_{n-1}, …, c_0. However, the c_i depend on A, which reduces the usefulness of the polynomial representation. A rare example of an explicit expression of this form is Rodrigues’s formula for the exponential of a skew-symmetric matrix A \in \mathbb{R}^{3\times 3}:

\notag      \mathrm{e}^A = I + \displaystyle\frac{\sin\theta}{\theta} A +                \displaystyle\frac{1-\cos\theta}{\theta^2} A^2,

where \theta = \sqrt{\|A\|_F^2/2}.

Cayley used the Cayley–Hamilton theorem to find square roots of a 2\times 2 matrix. If X^2 = A then applying the theorem to X gives X^2 - \mathrm{trace}(X)X + \det(X)I = 0, or

\notag \qquad\qquad\qquad\qquad   A - \mathrm{trace}(X)X + \det(X)I = 0, \qquad\qquad\qquad\qquad (*)

which gives

X = \displaystyle\frac{A + \det(X)I}{\mathrm{trace}(X)}.

Now \det(X) = \sqrt{\det(A)} and taking the trace in (*) gives an equation for \mathrm{trace}(X), leading to

\notag         X = \displaystyle\frac{ A + \sqrt{\det(A)} \,I}                  {\sqrt{\mathrm{trace}(A) + 2 \sqrt{\det(A)}}}.

With appropriate choices of signs for the square roots this formula gives all four square roots of A when A has distinct eigenvalues, but otherwise the formula can break down.

Expressions obtained from the Cayley–Hamilton theorem are of little practical use for general matrices, because algorithms that compute the coefficients a_i of the characteristic polynomial are typically numerically unstable.

The Cayley–Hamilton theorem has been generalized in various directions. The theorem can be interpreted as saying that the powers A^i for all nonnegative i generate a vector space of dimension at most n. Gerstenhaber (1961) proved that if A and B are two commuting n\times n matrices then the matrices A^iB^j, for all nonnegative i and j, generate a vector space of dimension at most n. It is conjectured that this result extends to three matrices.

Historical Note

The Cayley–Hamilton theorem appears in the 1858 memoir in which Cayley introduced matrix algebra. Cayley gave a proof for n = 2 and stated that he had verified the result for n = 3, adding “I have not thought it necessary to undertake the labour of a formal proof of the theorem in the general case of a matrix of any degree.” Hamilton had proved the result for quaternions in 1853. Cayley actually discovered a more general version of the Cayley–Hamilton theorem, which appears in an 1857 letter to Sylvester but not in any of his published work: if the square matrices A and B commute and f(x,y) = \det(x A - y B) then f(B,A) = 0.


Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Posted in what-is | Leave a comment

What Is the CS Decomposition?

The CS (cosine-sine) decomposition reveals close relationships between the singular value decompositions (SVDs) of the blocks an orthogonal matrix expressed in block 2\times 2 form. In full generality, it applies when the diagonal blocks are not necessarily square. We focus here mainly on the most practically important case of square diagonal blocks.

Let Q\in\mathbb{R}^{n \times n} be orthogonal and suppose that n = 2p is even and Q is partitioned into four equally sized blocks:

\notag    Q =    \begin{array}[b]{@{\mskip35mu}c@{\mskip-20mu}c@{\mskip-10mu}c@{}}    \scriptstyle p &    \scriptstyle p &    \\    \multicolumn{2}{c}{        \left[\begin{array}{c@{~}c@{~}}                  Q_{11}& Q_{12} \\                  Q_{21}& Q_{22} \\              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle p \\              \scriptstyle p              \end{array}    \end{array}.

Then there exist orthogonal matrices U_1,U_2,V_1,V_2\in\mathbb{R}^{p \times p} such that

\notag    \begin{bmatrix}  U_1^T & 0\\                          0   & U_2^T    \end{bmatrix}    \begin{bmatrix}  Q_{11} & Q_{12}\\                          Q_{21} & Q_{22}    \end{bmatrix}    \begin{bmatrix}  V_1 & 0\\                          0   & V_2    \end{bmatrix}    =    \begin{array}[b]{@{\mskip36mu}c@{\mskip-13mu}c@{\mskip-10mu}c@{}}    \scriptstyle p &    \scriptstyle p &    \\    \multicolumn{2}{c}{        \left[\begin{array}{@{\mskip3mu}rr@{~}}                  C &    S         \\                 -S &    C              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle p \\              \scriptstyle p              \end{array}    \end{array},

where C = \mathrm{diag}(c_i) and S = \mathrm{diag}(s_i) with c_i \ge 0, s_i \ge 0, and c_i^2 + s_i^2 = 1 for all i. This CS decomposition comprises four SVDs:

\notag   \begin{alignedat}{2}   Q_{11} &= U_1CV_1^T,   &\quad Q_{12} &= U_1 S V_2^T, \\   Q_{21} &= U_2 (-S) V_1^T,   &\quad Q_{22} &= U_2C V_2^T.   \end{alignedat}

(Strictly speaking, for Q_{21} we need to move the minus sign from S to U_2 or V_1 to obtain an SVD.) The orthogonality ensures that there are only four different singular vector matrices instead of eight, and it makes the singular values of the blocks closely linked. We also obtain SVDs of four cross products of the blocks: Q_{11}^TQ_{12} = V_1^T CS V_2^T, etc.

Note that for p = 1, the CS decomposition reduces to the fact that any 2\times 2 orthogonal matrix is of the form \left[\begin{smallmatrix} c & s \\ -s & c \end{smallmatrix}\right] (a rotation ) up to multiplication of a row or column by -1.

A consequence of the decomposition is that Q_{11} and Q_{22} have the same 2-norms and Frobenius norms, as do their inverses if they are nonsingular. The same is true for Q_{12} and Q_{21}.

Now we drop the requirement that n is even and consider diagonal blocks of different sizes:

\notag    Q =    \begin{array}[b]{@{\mskip33mu}c@{\mskip-16mu}c@{\mskip-10mu}c@{}}    \scriptstyle p &    \scriptstyle n-p &    \\    \multicolumn{2}{c}{        \left[\begin{array}{c@{~}c@{~}}                  Q_{11}& Q_{12} \\                  Q_{21}& Q_{22} \\              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle p \\              \scriptstyle n-p              \end{array}    \end{array}, \quad p \le \displaystyle\frac{n}{2}.

The CS decomposition now has the form

\notag    \begin{bmatrix}  U_1^T & 0\\                          0   & U_2^T    \end{bmatrix}    \begin{bmatrix}  Q_{11} & Q_{12}\\                          Q_{21} & Q_{22}    \end{bmatrix}    \begin{bmatrix}  V_1 & 0\\                          0   & V_2    \end{bmatrix}    =    \begin{array}[b]{@{\mskip35mu}c@{\mskip30mu}c@{\mskip-10mu}c@{}c}    \scriptstyle p &    \scriptstyle p &    \scriptstyle n-2p &    \\    \multicolumn{3}{c}{    \left[\begin{array}{c@{~}|c@{~}c}    C &    S      & 0   \\    \hline   -S &    C      & 0   \\    0 &    0      & I_{n-2p}    \end{array}\right]}    & \mskip-12mu    \begin{array}{c}    \scriptstyle p \\    \scriptstyle p \\    \scriptstyle n-2p    \end{array}    \end{array},

with U_1, U_2, C, and S, and V_1 and V_2 (both now (n-p) \times )n-p)), having the same properties as before. The new feature for p < n/2 is the identity matrix in the bottom right-hand corner on the right-hand side. Here is an example with p = 2 and n=5, with elements shown to two decimal places:

\notag \begin{aligned} \left[\begin{array}{rr|rrr}  0.71  & -0.71  &  0     &  0     &  0     \\ -0.71  & -0.71  &  0     &  0     &  0     \\\hline  0     &  0     &  0.17  &  0.61  & -0.78  \\  0     &  0     & -0.58  & -0.58  & -0.58  \\  0     &  0     & -0.80  &  0.54  &  0.25  \\  \end{array}\right] \left[\begin{array}{rr|rrr} -0.60  & -0.40  & -0.40  & -0.40  & -0.40  \\  0.40  &  0.60  & -0.40  & -0.40  & -0.40  \\\hline  0.40  & -0.40  &  0.60  & -0.40  & -0.40  \\  0.40  & -0.40  & -0.40  &  0.60  & -0.40  \\  0.40  & -0.40  & -0.40  & -0.40  &  0.60  \\ \end{array}\right] \\ \times \left[\begin{array}{rr|rrr} -0.71  &  0.71  &  0     &  0     &  0     \\ -0.71  & -0.71  &  0     &  0     &  0     \\\hline  0     &  0     &  0.17  &  0.58  & -0.80  \\  0     &  0     &  0.61  &  0.58  &  0.54  \\  0     &  0     & -0.78  &  0.58  &  0.25  \\ \end{array}\right] = \left[\begin{array}{rr|rrr}  1.00  &  0     &  0     &  0     &  0     \\  0     &  0.20  &  0     &  0.98  &  0     \\\hline  0     &  0     &  1.00  &  0     &  0     \\  0     & -0.98  &  0     &  0.20  &  0     \\  0     &  0     &  0     &  0     &  1.00  \\ \end{array}\right]. \end{aligned}

We mention two interesting consequences of the CS decomposition.

  • With p=1: if q_{11} = 0 then Q_{22} is singular.
  • For unequally sized diagonal blocks it is no longer always true that Q_{11} and Q_{22} have the same norms, but their inverses do: \|Q_{11}^{-1}\|_2 = \|Q_{22}^{-1}\|_2 = 1/\min_ic_i \ge 1. When p = 1, this relation becomes \|Q_{22}^{-1}\|_2 = 1/|q_{11}|.

The CS decomposition also exists for a rectangular matrix with orthonormal columns,

\notag    Q =    \begin{array}[b]{@{\mskip-25mu}c@{\mskip-20mu}c@{}}    \scriptstyle n    \\    \multicolumn{1}{c}{        \left[\begin{array}{@{}c@{}}                  Q_{1}\\                  Q_{2}              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle p \\              \scriptstyle q          \end{array}    \end{array}, \quad p\ge n, \quad q \ge n.

Now the decomposition takes the form

\notag    \begin{bmatrix}  U_1^T & 0\\                          0   & U_2^T    \end{bmatrix}    \begin{bmatrix}  Q_{1}\\                     Q_{2}    \end{bmatrix}    V    =    \begin{array}[b]{@{\mskip-25mu}c@{\mskip-20mu}c@{}}    \scriptstyle n    \\    \multicolumn{1}{c}{        \left[\begin{array}{c@{~}}                  C\\                  S              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle p \\              \scriptstyle q          \end{array}    \end{array},

where U_1\in\mathbb{R}^{p\times p}, U_2\in\mathbb{R}^{q\times q}, and V\in\mathbb{R}^{n\times n} are orthogonal and C and S have the same form as before except that they are rectangular.

The most general form of the CS decomposition is for an orthogonal matrix with diagonal blocks that are not square. Now the matrix on the right-hand side has a more complicated block structure (see the references for details).

The CS decomposition arises in measuring angles and distances between subspaces. These are defined in terms of the orthogonal projectors onto the subspaces, so singular values of orthonormal matrices naturally arise.

Software for computing the CS decomposition is available in LAPACK, based on an algorithm of Sutton (2009). We used a MATLAB interface to it, available on MathWorks File Exchange, for the numerical example. Note that the output of this code is not quite in the form in which we have presented the decomposition, so some post-processing is required to achieve it.


This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Posted in what-is | Leave a comment

What’s New in MATLAB R2020a and R2020b?

In this post I discuss new features in MATLAB R2020a and R2020b. As usual in this series, I focus on a few of the features most relevant to my work. See the release notes for a detailed list of the many changes in MATLAB and its toolboxes.

Exportgraphics (R2020a)

The exportgraphics function is very useful for saving to a file a tightly cropped version of a figure with the border white instead of gray. Simple usages are


I have previously used the export_fig function, which is not built into MATLAB but is available from File Exchange; I think I will be using exportgraphics instead from now on.

Svdsketch (R2020b)

The new svdsketch function computes the singular value decomposition (SVD) USV^T of a low rank approximation to a matrix (U and V orthogonal, S diagonal with nonnegative diagonal entries). It is mainly intended for use with matrices that are close to having low rank, as is the case in various applications.

This function uses a randomized algorithm that computes a sketch of the given m-by-n matrix A, which is essentially a product Q^TA, where Q is an orthonormal basis for the product A\Omega, where \Omega is a random n-by-k matrix. The value of k is chosen automatically to achieve \|USV^T-A\|_F \le \mathrm{tol}\|A\|_F, where \mathrm{tol} is a tolerance that defaults to \epsilon^{1/4} and must not be less than \epsilon^{1/2}, where \epsilon is the machine epsilon (2\times 10^{-16} for double precision). The algorithm includes a power method iteration that refines the sketch before computing the SVD.

The output of the function is an SVD in which U and V are numerically orthogonal and the singular values in S of size \mathrm{tol} or larger are good approximations to singular values of A, but smaller singular values in S may not be good approximations to singular values of A.

Here is an example. The code

n = 8; rng(1); 8; A = gallery('randsvd',n,1e8,3);
[U,S,V] = svdsketch(A,1e-3);
rel_res = norm(A-U*S*V')/norm(A)
singular_values = [svd(A) [diag(S); zeros(n-length(S),1)]]

produces the following output, with the exact singular values in the first column and the approximate ones in the second column:

rel_res =
singular_values =
   1.0000e+00   1.0000e+00
   7.1969e-02   7.1969e-02
   5.1795e-03   5.1795e-03
   3.7276e-04   3.7276e-04
   2.6827e-05   2.6827e-05
   1.9307e-06            0
   1.3895e-07            0
   1.0000e-08            0

The approximate singular values are correct down to around 10^{-5}, which is more than the 10^{-3} requested. This is a difficult matrix for svdsketch because there is no clear gap in the singular values of A.

Axis Padding (R2020b)

The padding property of an axis puts some padding between the axis limits and the surrounding box. The code

x = linspace(0,2*pi,50); plot(x,tan(x),'linewidth',1.4)
title('Original axis')
axis padded, title('Padded axis')

produces the output



Turbo Colormap (2020b)

The default colormap changed from jet (the rainbow color map) to parula in R2014b (with a tweak in R2017a), because parula is more perceptually uniform and maintains information when printed in monochrome. The new turbo colormap is a more perceptually uniform version of jet, as these examples show. Notice that turbo has a longer transition through the greens and yellows. If you can’t give up on jet, use turbo instead.







ND Arrays (R2020b)

The new pagemtimes function performs matrix multiplication on pages of n-dimensional arrays, while pagetranspose and pagectranspose carry out the transpose and conjugate transpose, respectively, on pages of n-dimensional arrays.


Both releases report significantly improved speed of certain functions, including some of the ODE solvers.

Posted in software | Tagged | Leave a comment

What Is the Singular Value Decomposition?

A singular value decomposition (SVD) of a matrix A\in\mathbb{R}^{m\times n} is a factorization

\notag     A = U\Sigma V^T,

where U\in\mathbb{R}^{m\times m} and V\in\mathbb{R}^{n\times n} are orthogonal, \Sigma = \mathrm{diag}(\sigma_1,\dots, \sigma_p)\in\mathbb{R}^{m\times n}, where p = \min(m,n), and \sigma_1\ge \sigma_2\ge \cdots \ge \sigma_p \ge 0.

Partition U =[ u_1,\dots,u_m] and V = [v_1,\dots, v_n]. The \sigma_i are called the singular values of A and the u_i and v_i are the left and right singular vectors. We have Av_i = \sigma_i u_i, i = 1 \colon p. The matrix \Sigma is unique but U and V are not. The form of \Sigma is

\notag  \Sigma =       \left[\begin{array}{ccc}\sigma_1&&\\ &\ddots&\\& &\sigma_n\\\hline         &\rule{0cm}{15pt} \text{\Large 0} &      \end{array}\right]      \mathrm{for}~ m \ge n, \quad    \Sigma =     \begin{bmatrix}         \begin{array}{ccc|c@{\mskip5mu}}\sigma_1&&\\ &\ddots&               & \text{\Large 0}           \\& &\sigma_m\end{array}\\      \end{bmatrix} \mathrm{for}~ m \le n

Here is an example, in which the entries of A have been specially chosen to give simple forms for the elements of the factors:

\notag A = \left[\begin{array}{rr} 0 & \frac{4}{3}\\[\smallskipamount]                        -1 & -\frac{5}{3}\\[\smallskipamount]                        -2 & -\frac{2}{3} \end{array}\right] = \underbrace{ \displaystyle\frac{1}{3} \left[\begin{array}{rrr} 1 & -2 & -2\\ -2 & 1 & -2\\ -2 & -2 & 1 \end{array}\right] }_U \mskip5mu \underbrace{ \left[\begin{array}{cc} 2\,\sqrt{2} & 0\\ 0 & \sqrt{2}\\ 0 & 0 \end{array}\right] }_{\Sigma} \mskip5mu \underbrace{ \displaystyle\frac{1}{\sqrt{2}} \left[\begin{array}{cc} 1 & 1\\ 1 & -1  \end{array}\right] }_{V^T}.

The power of the SVD is that it reveals a great deal of useful information about norms, rank, and subspaces of a matrix and it enables many problems to be reduced to a trivial form.

Since U and V are nonsingular, \mathrm{rank}(A) = \mathrm{rank}(\Sigma) = r, where r \le p is the number of nonzero singular values. Since the 2-norm and Frobenius norm are invariant under orthogonal transformations, \|A\| = \|\Sigma\| for both norms, giving

\notag   \|A\|_2 = \sigma_1, \quad   \|A\|_F = \Bigl(\displaystyle\sum_{i=1}^r \sigma_i^2\Bigr)^{1/2},

and hence \|A\|_2 \le \|A\|_F \le r^{1/2} \|A\|_2. The range space and null space of A are given in terms of the columns of U and V by

\notag \begin{aligned} \mathrm{null}(A)  &= \mathrm{span} \{ v_{r+1}, \dots,v_n \},\\ \mathrm{range}(A) &= \mathrm{span} \{u_1,u_2,\dots, u_r\}. \end{aligned}

We can write the SVD as

\notag \qquad\qquad     A = \begin{bmatrix} u_1, u_2 \dots, u_r \end{bmatrix}         \mathrm{diag}(\sigma_1,\dots, \sigma_r)        \begin{bmatrix} v_1^T\\ v_2^T\\ \vdots\\ v_r^T \end{bmatrix}     = \displaystyle\sum_{i=1}^{r} \sigma_i u_i v_i^T,  \qquad\qquad(*)

which expresses A as a sum of r rank-1 matrices, the ith of which has 2-norm \sigma_i. The famous Eckart–Young theorem (1936) says that

\notag  \min_{\mathrm{rank}(B) = k} \|A-B\|_q =  \begin{cases}      \sigma_{k+1},                                & q = 2,  \\      \Bigl(\sum_{i=k+1}^r \sigma_i^2\Bigr)^{1/2}, & q = F,   \end{cases}

and that the minimum is attained at

\notag    A_k = U D_k V^T, \quad    D_k = \mathrm{diag}(\sigma_1, \dots, \sigma_k, 0, \dots, 0).

In other words, truncating the sum (*) after k < r terms gives the best rank-k approximation to A in both the 2-norm and the Frobenius norm. In particular, this result implies that when A has full rank the distance from A to the nearest rank-deficient matrix is \sigma_r.

Relations with Symmetric Eigenvalue Problem

The SVD is not directly related to the eigenvalues and eigenvectors of A. However, for m\ge n, A = U \Sigma V^T implies

\notag  A^T\!A = V \mathrm{diag}(\sigma_1^2,\dots,\sigma_n^2) V^T, \quad  AA^T = U \mathrm{diag}(\sigma_1^2,\dots,\sigma_n^2,\underbrace{0,\dots,0}_{m-n}) U^T,

so the singular values of A are the square roots of the eigenvalues of the symmetric positive semidefinite matrices A^T\!A and AA^T (modulo m-n zeros in the latter case), and the singular vectors are eigenvectors. Moreover, the eigenvalues of the (m+n)\times (m+n) matrix

\notag     C = \begin{bmatrix}               0 & A \\               A^T & 0         \end{bmatrix}

are plus and minus the singular values of A, together with |m-n| additional zeros if m \ne n, and the eigenvectors of C and the singular vectors of A are also related.

Consequently, by applying results or algorithms for the eigensystem of a symmetric matrix to A^T\!A, AA^T, or C one obtains results or algorithms for the singular value decomposition of A.

Connections with Other Problems

The pseudoinverse of a matrix A\in\mathbb{R}^{n\times n} can be expressed in terms of the SVD as

\notag    A^+ = V\mathrm{diag}(\sigma_1^{-1},\dots,\sigma_r^{-1},0,\dots,0)U^T.

The least squares problem \min_x \|b - Ax\|_2, where A\in\mathbb{R}^{m\times n} with m \ge n is solved by x = A^+b, and when A is rank-deficient this is the solution of minimum 2-norm. For m < n this is an underdetermined system and x = A^+b gives the minimum 2-norm solution.

We can write A = U\Sigma V^T = UV^T \cdot V \Sigma V^T \equiv PQ, where P is orthogonal and Q is symmetric positive semidefinite. This decomposition A = PQ is the polar decomposition and Q = (A^T\!A)^{1/2} is unique. This connection between the SVD and the polar decomposition is useful both theoretically and computationally.


The SVD is used in a very wide variety of applications—too many and varied to attempt to summarize here. We just mention two.

The SVD can be used to help identify to which letters vowels and consonants have been mapped in a substitution cipher (Moler and Morrison, 1983).

An inverse use of the SVD is to construct test matrices by forming a diagonal matrix of singular values from some distribution then pre- and post-multiplying by random orthogonal matrices. The result is matrices with known singular values and 2-norm condition number that are nevertheless random. Such “randsvd” matrices are widely used to test algorithms in numerical linear algebra.

History and Computation

The SVD was introduced independently by Beltrami in 1873 and Jordan in 1874. Golub popularized the SVD as an essential computational tool and developed the first reliable algorithms for computing it. The Golub–Reinsch algorithm, dating from the late 1960s and based on bidiagonalization and the QR algorithm, is the standard way to compute the SVD. Various alternatives are available; see the references.


This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Posted in what-is | 1 Comment

What Is the Complex Step Approximation?

In many situations we need to evaluate the derivative of a function but we do not have an explicit formula for the derivative. The complex step approximation approximates the derivative (and the function value itself) from a single function evaluation. The catch is that it involves complex arithmetic.

For an analytic function f we have the Taylor expansion

\notag     \qquad\qquad\qquad\qquad  f(x + \mathrm{i}h) = f(x) + \mathrm{i}h f'(x) - h^2\displaystyle\frac{f''(x)}{2}                             + O(h^3),    \qquad\qquad\qquad\qquad(*)

where \mathrm{i} = \sqrt{-1} is the imaginary unit. Assume that f maps the real line to the real line and that x and h are real. Then equating real and imaginary parts in (*) gives \mathrm{Re} f(x+\mathrm{i}h) = f(x) + O(h^2) and \mathrm{Im} f(x+\mathrm{i}h) = hf'(x) + O(h^3). This means that for small h, the approximations

\notag     f(x) \approx \mathrm{Re} f(x+\mathrm{i}h), \quad     f'(x) \approx \mathrm{Im}  \displaystyle\frac{f(x+\mathrm{i}h)}{h}

both have error O(h^2). So a single evaluation of f at a complex argument gives, for small h, a good approximation to f'(x), as well as a good approximation to f(x) if we need it.

The usual way to approximate derivatives is with finite differences, for example by the forward difference approximation

\notag       f'(x) \approx \displaystyle\frac{f(x+h) - f(x)}{h}.

This approximation has error O(h) so it is less accurate than the complex step approximation for a given h, but more importantly it is prone to numerical cancellation. For small h, f(x+h) and f(x) agree to many significant digits and so in floating-point arithmetic the difference approximation suffers a loss of significant digits. Consequently, as h decreases the error in the computed approximation eventually starts to increase. As numerical analysis textbooks explain, the optimal choice of h that balances truncation error and rounding errors is approximately

\notag   h_{\mathrm{opt}} = 2\Bigl|\displaystyle\frac{u f(x)}{f''(x))} \Bigr|^{1/2},

where u is the unit roundoff. The optimal error is therefore of order u^{1/2}.

A simple example illustrate these ideas. For the function f(x) = \mathrm{e}^x with x = 1, we plot in the figure below the relative error for the finite difference, in blue, and the relative error for the complex step approximation, in orange, for h ranging from about 10^{-5} to 10^{-11}. The dotted lines show u and u^{1/2}. The computations are in double precision (u \approx 1.1\times 10^{-16}). The finite difference error decreases with h until it reaches about h_{\mathrm{opt}} = 2.1\times 10^{-8}; thereafter the error grows, giving the characteristic V-shaped error curve. The complex step error decreases steadily until it is of order u for h \approx u^{1/2}, and for each h it is about the square of the finite difference error, as expected from the theory.


Remarkably, one can take h extremely small in the complex step approximation (e.g., h = 10^{-100}) without any ill effects from roundoff.

The complex step approximation carries out a form of approximate automatic differentiation, with the variable h functioning like a symbolic variable that propagates through the computations in the imaginary parts.

The complex step approximation applies to gradient vectors and it can be extended to matrix functions. If f is analytic and maps real n\times n matrices to real n\times n matrices and A and E are real then (Al-Mohy and Higham, 2010)

\notag     L_f(A,E) \approx \mathrm{Im} \displaystyle\frac{f(A+\mathrm{i}hE)}{h},

where L_f(A,E) is the Fréchet derivative of f at A in the direction E. It is important to note that the method used to evaluate f must not itself use complex arithmetic (as methods based on the Schur decomposition do); if it does, then the interaction of those complex terms with the much smaller \mathrm{i}hE term can lead to damaging subtractive cancellation.

The complex step approximation has also been extended to higher derivatives by using “different imaginary units” in different components (Lantoine et al., 2012).

Here are some applications where the complex step approximation has been used.

  • Sensitivity analysis in engineering applications (Giles et al., 2003).
  • Approximating gradients in deep learning (Goodfellow et al., 2016).
  • Approximating the exponential of an operator in option pricing (Ackerer and Filipović, 2019).

Software has been developed for automatically carrying out the complex step method—for example, by Shampine (2007).

The complex step approximation has been rediscovered many times. The earliest published appearance that we are aware of is in a paper by Squire and Trapp (1998), who acknowledge earlier work of Lyness and Moler on the use of complex variables to approximate derivatives.


This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Posted in what-is | 2 Comments

What Is the Sherman–Morrison–Woodbury Formula?

When a nonsingular n\times n matrix A is perturbed by a matrix of rank k, the inverse also undergoes a rank-k perturbation. More precisely, if E has rank k and B = A+E is nonsingular then the identity A^{-1} - B^{-1} =  A^{-1} (B-A) B^{-1} shows that

\notag   \mathrm{rank}(A^{-1} -  B^{-1})    = \mathrm{rank}(A^{-1} E B^{-1}) = \mathrm{rank}(E) = k.

The Sherman–Morrison–Woodbury formula provides an explicit formula for the inverse of the perturbed matrix B.

Sherman–Morrison Formula

We will begin with the simpler case of a rank-1 perturbation: B = A + uv^*, where u and v are n-vectors, and we consider first the case where A = I. We might expect that (I + uv^*)^{-1} = I + \theta uv^* for some \theta (consider a binomial expansion of the inverse). Multiplying out, we obtain

\notag   (I + uv^*) (I + \theta uv^*) = I + (1 + \theta + \theta v^*u) uv^*,

so the product equals the identity matrix when \theta = -1/(1 + v^*u). The condition that I + uv* be nonsingular is v^*u \ne -1 (as can also be seen from \det(I + uv^*) = 1 + v^*u, derived in What Is a Block Matrix?). So

\notag    (I + uv^*)^{-1} = I - \displaystyle\frac{1}{1 + v^*u} uv^*.

For the general case write B = A + uv^* = A(I + A^{-1}u v^*). Inverting this equation and applying the previous result gives

\notag   (A + uv^*)^{-1} = A^{-1} - \displaystyle\frac{A^{-1} uv^* A^{-1}}{1 + v^* A^{-1} u},

subject to the nonsingularity condition v^*A^{-1}x \ne -1. This is known as the Sherman–Morrison formula. It explicitly identifies the rank-1 change to the inverse.

As an example, if we take u = te_i and v = e_j (where e_k is the kth column of the identity matrix) then, writing A^{-1} = (\alpha_{ij}), we have

\notag   \bigl(A + te_ie_j^*\bigr)^{-1}    = A^{-1} - \displaystyle\frac{tA^{-1}e_i e_j^* A^{-1}}{1 +  t \alpha_{ji}}.

The Frobenius norm of the change to A^{-1} is

\notag   \displaystyle\frac{  |t|\, \| A^{-1}e_i\|_2 \| e_j^*A^{-1}\|_2 }                     {|1 + t \alpha_{ji}|}.

If t is sufficiently small then this quantity is approximately maximized for i and j such that the product of the norms of ith column and jth row of A^{-1} is maximized. For an upper triangular matrix i = n and j = 1 are likely to give the maximum, which means that the inverse of an upper triangular matrix is likely to be most sensitive to perturbations in the (n,1) element of the matrix. To illustrate, we consider the matrix

\notag  T =  \left[\begin{array}{rrrr}   1 & -1 & -2 & -3\\   0 & 1 & -4 & -5\\   0 & 0 & 1 & -6\\   0 & 0 & 0 & 1   \end{array}\right]

The (i,j) element of the following matrix is \| T^{-1} - (T + 10^{-3}e_ie_j^*)^{-1} \|_F:

\notag   \left[\begin{array}{cccc}  0.044  &  0.029  &  0.006  &  0.001  \\  0.063  &  0.041  &  0.009  &  0.001  \\  0.322  &  0.212  &  0.044  &  0.007  \\  2.258  &  1.510  &  0.321  &  0.053  \\   \end{array}\right]

As our analysis suggests, the (4,1) entry is the most sensitive to perturbation.

Sherman–Morrison–Woodbury Formula

Now consider a perturbation UV^*, where U and V are n\times k. This perturbation has rank at most k, and its rank is k if U and V are both of rank k. If I + V^* A^{-1} U is nonsingular then A + UV^* is nonsingular and

\notag   (A + UV^*)^{-1} = A^{-1} - A^{-1} U (I + V^* A^{-1} U)^{-1} V^*                      A^{-1},

which is the Sherman–Morrison–Woodbury formula. The significance of this formula is that I + V^* A^{-1} U is k\times k, so if k\ll n and A^{-1} is known then it is much cheaper to evaluate the right-hand side than to invert A + UV^* directly. In practice, of course, we rarely invert matrices, but rather exploit factorizations of them. If we have an LU factorization of A then we can use it in conjunction with the Sherman–Morrison–Woodbury formula to solve (A + UV^*)x = b in O(n^2 + k^3) flops, as opposed to the O(n^3) flops required to factorize A + UV^* from scratch.

The Sherman–Morrison–Woodbury formula is straightforward to verify, by showing that the product of the two sides is the identity matrix. How can the formula be derived in the first place? Consider any two matrices F and G such that FG and GF are both defined. The associative law for matrix multiplication gives F(GF) = (FG)F, or (I + FG)F = F (I + GF), which can be written as F(I+GF)^{-1} = (I+FG)^{-1}F. Postmultiplying by G gives

\notag   F(I+GF)^{-1}G = (I+FG)^{-1}FG                 = (I+FG)^{-1}(I + FG - I)                 = I - (I+FG)^{-1}.

Setting F = U and G = V^* gives the special case of the Sherman–Morrison–Woodbury formula with A = I, and the general formula follows from A + UV^* = A(I + A^{-1}U V^*).

General Formula

We will give a different derivation of an even more general formula using block matrices. Consider the block matrix

\notag   X =  \begin{bmatrix} A & U \\ V^* & -W^{-1} \end{bmatrix}

where A is n\times n, U and V are n\times k, and W is k\times k. We will obtain a formula for (A + UWV^*)^{-1} by looking at X^{-1}.

It is straightforward to verify that

\notag    \begin{bmatrix} A & U \\ V^* & -W^{-1} \end{bmatrix}     =    \begin{bmatrix} I & 0 \\ V^*A^{-1} & I \end{bmatrix}    \begin{bmatrix} A & 0 \\ 0 & -(W^{-1} + V^*A^{-1}U) \end{bmatrix}    \begin{bmatrix} I & A^{-1}U \\ 0 & I \end{bmatrix}.


\notag \begin{aligned}    \begin{bmatrix} A & U \\ V^* & -W^{-1} \end{bmatrix}^{-1}     &=    \begin{bmatrix} I & -A^{-1}U \\ 0 & I \end{bmatrix}.    \begin{bmatrix} A^{-1} & 0 \\ 0 & -(W^{-1} + V^*A^{-1}U)^{-1} \end{bmatrix}    \begin{bmatrix} I & 0 \\ -V^*A^{-1} & I \end{bmatrix}\\[\smallskipamount]     &=    \begin{bmatrix} A^{-1} - A^{-1}U(W^{-1} + V^*A^{-1}U)^{-1}V^*A^{-1} &                    A^{-1}U(W^{-1} + V^*A^{-1U})^{-1} \\          (W^{-1} + V^*A^{-1}U)^{-1} V^*A^{-1} & -(W^{-1} + V^*A^{-1}U)^{-1}      \end{bmatrix}. \end{aligned}

In the (1,1) block we see the right-hand side of a Sherman–Morrison–Woodbury-like formula, but it is not immediately clear how this relates to (A + UWV^*)^{-1}. Let P = \bigl[\begin{smallmatrix} 0 & I \\ I & 0 \end{smallmatrix} \bigr], and note that P^{-1} = P. Then

\notag     PXP = \begin{bmatrix} -W^{-1} & V^* \\ U & A \end{bmatrix}

and applying the above formula (appropriately renaming the blocks) gives, with \times denoting a block whose value does not matter,

\notag     PX^{-1}P = (PXP)^{-1}  = \begin{bmatrix} \times & \times \\ \times & (A + UWV^*)^{-1} \end{bmatrix}.

Hence (X^{-1})_{11} = (A + UWV^*)^{-1}. Equating our two formulas for (X^{-1})_{11} gives

\notag    \qquad\qquad\qquad   (A + UWV^*)^{-1} = A^{-1} - A^{-1}U (W^{-1} + V^*A^{-1}U)^{-1} V^*A^{-1},    \qquad\qquad\qquad(*)

provided that W^{-1} + V^*A^{-1}U is nonsingular.

To see one reason why this formula is useful, suppose that the matrix A and its perturbation are symmetric and we wish to preserve symmetry in our formulas. The Sherman–Morrison–Woodbury requires us to write the perturbation as UU^*, so the perturbation must be positive semidefinite. In (*), however, we can write an arbitrary symmetric perturbation as UWU^*, with W symmetric but possibly indefinite, and obtain a symmetric formula.

The matrix -(W^{-1} + V^*A^{-1}U) is the Schur complement of A in X. Consequently the inversion formula (*) is intimately connected with the theory of Schur complements. By manipulating the block matrices in different ways it is possible to derive variations of (*). We mention just the simple rewriting

\notag   (A + UWV^*)^{-1} = A^{-1} - A^{-1}U W(I + V^*A^{-1}UW)^{-1} V^*A^{-1},

which is valid if W is singular, as long as I + WV^*A^{-1}U is nonsingular. Note that the formula is not symmetric when V = U and W = W^*. This variant can also be obtained by replacing U by UW in the Sherman–Morrison–Woodbury formula.

Historical Note

Formulas for the change in a matrix inverse under low rank perturbations have a long history. They have been rediscovered on multiple occasions, sometimes appearing without comment within other formulas. Equation (*) is given by Duncan (1944), which is the earliest appearance in print that I am aware of. For discussions of the history of these formulas see Henderson and Searle (1981) or Puntanen and Styan (2005).


This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Posted in what-is | 4 Comments