The Improved MATLAB Functions Expm and Logm

Equation from the Princeton Companion to Applied Mathematics, article “Functions of Matrices” (p. 97)

The matrix exponential is a ubiquitous matrix function, important both for theory and for practical computation. The matrix logarithm, an inverse to the exponential, is also increasingly used (see my earlier post, 400 Years of Logarithms).

MATLAB R2015b introduced new versions of the expm and logm functions. The Release Notes say

The algorithms for expm, logm, and sqrtm show increased accuracy, with logm and sqrtm additionally showing improved performance.

The help text for expm and logm is essentially unchanged from the previous versions, so what’s different about the new functions? (I will discuss sqrtm in a future post.)

The answer is that both functions make use of new backward error bounds that can be much smaller than the old ones for very nonnormal matrices, and so help to avoid a phenomenon known as overscaling. The key change is that when bounding a matrix power series p(X) = a_0 I + a_1 X + a_2 X^2 + \cdots, instead of bounding the kth term a_k X^k by |a_k| \|X\|^k, a potentially smaller bound is used.

This is best illustrated by example. Suppose we want to bound \|X^{12}\| and are not willing to compute X^{12} but are willing to compute lower powers of X. We have 12 = 6 \times 2 = 4 \times 3, so \|X^{12}\| is bounded by each of the terms (\|X^2\|)^6, (\|X^3\|)^4, (\|X^4\|)^3, and (\|X^6\|)^2. But it is easy to see that (\|X^6\|)^2 \le (\|X^2\|)^6 and (\|X^6\|)^2 \le (\|X^3\|)^4, so we can discard two of the bounds, ending up with

\|X^{12}\| \le \min( \|X^4\|^3, \|X^6\|^2 ).

This argument can be generalized so that every power of X is bounded in terms of the norms of X^p for values of p up to some small, fixed value. The gains can be significant. Consider the matrix

X = \begin{bmatrix}1 & 100 \\ 0 & 1 \end{bmatrix}.

We have \|X\|^{12} \approx 10^{24}, but

X^k = \begin{bmatrix}1 & 100k \\ 0 & 1 \end{bmatrix},

so the bound above is roughly \|X^{12}\| \le 6 \times 10^{7}, which is a significant improvement.

One way to understand what is happening is to note the inequality

\rho(X) \le \| X^k\| ^{1/k} \le \|X\|,

where \rho is the spectral radius (the largest modulus of any eigenvalue). The upper bound corresponds to the usual analysis. The lower bound is something that we cannot use to bound the norm of the power series. The middle term is what we are using, and as k\to\infty the middle term converges to the lower bound, which can be arbitrarily smaller than the upper bound.

What is the effect of these bounds on the algorithms in expm and logm? Both algorithms make use of Padé approximants, which are good only for small-normed matrices, so the algorithms begin by reducing the norm of the input matrix. Backward error bounds derived by bounding a power series as above guide the norm reduction and if the bounds are weak then the norm is reduced too much, which can result in loss of accuracy in floating point arithmetic—the phenomenon of overscaling. The new bounds greatly reduce the chance of overscaling.

In his blog post A Balancing Act for the Matrix Exponential, Cleve Moler describes a badly scaled 3-by-3 matrix for which the original expm suffers from overscaling and a loss of accuracy, but notes that the new algorithm does an excellent job.

The new logm has another fundamental change: it applies inverse scaling and squaring and Padé approximation to the whole triangular Schur factor, whereas the previous logm applied this technique to the individual diagonal blocks in conjunction with the Parlett recurrence.

For more on the algorithms underlying the new codes see these papers. The details of how the norm of a matrix power series is bounded are given in Section 4 of the first paper.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s