# Hyphenation Question: Row-wise or Rowwise?

Sam Clark of T&T Productions, the copy editor for the third edition of MATLAB Guide (co-authored with Des Higham and to be published by SIAM in December 2016), recently asked whether we would like to change “row-wise” to “rowwise”.

A search of my hard disk reveals that I have always used the hyphen, probably because I don’t like consecutive w’s. Indeed, in 1999 I published a paper Row-Wise Backward Stable Elimination Methods for the Equality Constrained Least Squares Problem .

A bit more searching found recent SIAM papers containing “rowwise”, so it is clearly acceptable usage to omit the hyphen..

My dictionaries and usage guides don’t provide any guidance as far as I can tell. Here is what some more online searching revealed.

• The Oxford English Dictionary does not contain either form (in the entry for “row” or elsewhere), but the entry for “column” contains “column-wise” but not “columnwise”.
• The Google Ngram Viewer shows a great prevalence of the hyphenated form, which was about three time as common as the unhyphenated form in the year 2000.
• A search for “row-wise” and “rowwise” at google.co.uk finds about 724,000 and 248,00 hits, respectively.
• A Google Scholar search for “row-wise” and “rowwise” finds 31,600 and 18,900 results, respectively. For each spelling, there are plenty of papers with that form in the title. The top hit for “rowwise” is a 1993 paper The rowwise correlation between two proximity matrices and the partial rowwise correlation, which manages to include the word twice for good measure!

Since the book is about MATLAB, it also seemed appropriate to check how the MATLAB documentation hyphenates the term. I could only find the hyphenated form:

doc flipdim:
When the value of dim is 1, the array is flipped row-wise down


But for columnwise I found that MATLAB R2016b is inconsistent, as the following extracts illustrate, the first being from the documentation for the Symbolic Math Toolbox version of the function.

doc reshape:
The elements are taken column-wise from A ...
Reshape a matrix row-wise by transposing the result.

doc rmmissing:
1 for row-wise (default) | 2 for column-wise

doc flipdim:
When dim is 2, the array is flipped columnwise left to right.

doc unwrap:
If P is a matrix, unwrap operates columnwise.


So what is our conclusion? We’re sticking to “row-wise” because we think it is easier to parse, especially for those whose first language is not English.

# What’s New in MATLAB R2016b

MATLAB R2016b was released in the middle of September 2016. In this post I discuss some of its new features (I will not consider the toolboxes). This is personal selection of highlights; for a complete overview see the Release Notes.

The features below are discussed in greater detail in the third edition of MATLAB Guide, to be published by SIAM in December 2016.

## Live Editor

The Live Editor, introduced in R2016a, provides an interactive environment for editing and running MATLAB code. When the code that you enter is executed the results (numerical or graphical) are displayed in the editor. The code is divided into sections that can be evaluated, and subsequently edited and re-evaluated, individually. The Live Editor works with live scripts, which have a .mlx extension. Live scripts can be published (exported) to HTML or PDF. R2016b adds more functionality to the Live Editor, including an improved equation editor and the ability to pan, zoom, and rotate axes in output figures​.

The Live Editor is particularly effective with the Symbolic Math Toolbox, thanks to the rendering of equations in typeset form, as the following image shows.

The Live Editor is a major development, with significant benefits for teaching and for script-based workflows. No doubt we will see it developed further in future releases.

## Local Functions

Local functions are what used to be called subfunctions: they are functions within functions or, to be more precise, functions that appear after the main function in a function file. What’s new in R2016b is that a script can have local functions. This is a capability I have long wanted. When writing a script I often find that a particular computation or task, such as printing certain statistics, needs to be repeated at several places in the script. Now I can put that code in a local function instead of repeating the code or having to create an external function for it.

## String Arrays

MATLAB has always had strings, created with a single quote, as in

s = 'a string'


which sets up a 1-by-8 character vector. A new datatype string has been introduced. A string is an array for which each entry contains a character vector. The syntax

str = string('a string')


sets up a 1-by-1 string array, whereas

str = string({'Boston','Manchester'})


sets up a 1-by-2 string array via a cell array. String arrays are more memory efficient than cell arrays and allow for more flexible handling of strings. They are particularly useful in conjunction with tables. According to this MathWorks video, string arrays “have a multitude of new methods that have been optimized for performance”. At the moment, support for strings across MATLAB is limited and it is inconvenient to have to set up a string array by passing a cell array to the string function. No doubt string arrays will be integrated more seamlessly in future releases.

## Tall Arrays

Tall arrays provide a way to work with data that does not fit into memory. Calculations on tall arrays are delayed until a result is explicitly requested with the gather function. MATLAB optimizes the calculations to try to minimize the number of passes through the data. Tall arrays are created with the tall function, which take as argument an array (numeric, string, datetime, or one of several other data types) or a datastore. Many MATLAB functions (but not the linear algebra functions, for example) work the same way with tall arrays as they do with in-memory arrays.

## Timetables

Tables were introduced in R2013b. They store tabular data, with columns representing variables of possibly different types. The newly introduced timetable is a table for which each row has an associated date and time, stored in the first column. The timerange function produces a subscript that can be used to select rows corresponding to a particular time interval. These new features make MATLAB an even more powerful tool for data analysis.

## Missing Values

MATLAB has new capabilities for dealing with missing values, which are defined according to the data type: NaNs (Not-a-Number) for double and single data, NaTs (Not-a-Time) for datetime data, <undefined> for categorical data, and so on. For example, the function ismissing detects missing data and fillmissing fills in missing data according to one of several rules.

## Mac OS Version

I have Mac OS X Version 10.9.5 (Mavericks) on my Mac. Although this OS is not officially supported, R2016b installed without any problem and runs fine. The pre-release versions of R2016a and R2016b would not install on Mavericks, so it seems that compatibility with older operating systems is ensured only for the actual release.

At the time of writing, there are some compatibility problems with Mac OS Version 10.12 (Sierra) for certain settings of Language & Region.

# Implicit Expansion: A Powerful New Feature of MATLAB R2016b

The latest release of MATLAB, R2016b, contains a feature called implicit expansion, which is an extension of the scalar expansion that has been part of MATLAB for many years. Scalar expansion is illustrated by

>> A = spiral(2), B = A - 1
A =
1     2
4     3
B =
0     1
3     2


Here, MATLAB subtracts 1 from every element of A, which is equivalent to expanding the scalar 1 into a matrix of ones then subtracting that matrix from A.

Implicit expansion takes this idea further by expanding vectors:

>> A = ones(2), B = A + [1 5]
A =
1     1
1     1
B =
2     6
2     6


Here, the result is the same as if the row vector was replicated along the first dimension to produce the matrix [1 5; 1 5] then that matrix was added to ones(2). In the next example a column vector is added and the replication is across the columns:

>> A = ones(2) + [1 5]'
A =
2     2
6     6


Implicit expansion also works with multidimensional arrays, though we will focus here on matrices and vectors.

So MATLAB now treats “matrix plus vector” as a legal operation. This is a controversial change, as it means that MATLAB now allows computations that are undefined in linear algebra.

Why have MathWorks made this change? A clue is in the R2016b Release Notes, which say

For example, you can calculate the mean of each column in a matrix A, then subtract the vector of mean values from each column with A - mean(A).

This suggests that the motivation is, at least partly, to simplify the coding of manipulations that are common in data science.

Implicit expansion can also be achieved with the function bsxfun that was introduced in release R2007a, though I suspect that few MATLAB users have heard of this function:

>> A = [1 4; 3 2], bsxfun(@minus,A,mean(A))
A =
1     4
3     2
ans =
-1     1
1    -1

>> A - mean(A)
ans =
-1     1
1    -1


Prior to the introduction of bsxfun, the repmat function could be used to explicitly carry out the expansion, though less efficiently and less elegantly:

>> A - repmat(mean(A),size(A,1),1)
ans =
-1     1
1    -1


An application where the new functionality is particularly attractive is multiplication by a diagonal matrix.

>> format short e
>> A = ones(3); d = [1 1e-4 1e-8];
>> A.*d  %  A*diag(d)
ans =
1.0000e+00   1.0000e-04   1.0000e-08
1.0000e+00   1.0000e-04   1.0000e-08
1.0000e+00   1.0000e-04   1.0000e-08
>> A.*d' % diag(d)*A
ans =
1.0000e+00   1.0000e+00   1.0000e+00
1.0000e-04   1.0000e-04   1.0000e-04
1.0000e-08   1.0000e-08   1.0000e-08


The .* expressions are faster than forming and multiplying by diag(d) (as is the syntax bsxfun(@times,A,d)). We can even multiply with the inverse of diag(d) with

>> A./d
ans =
1       10000   100000000
1       10000   100000000
1       10000   100000000


It is now possible to add a column vector to a row vector, or to subtract them:

>> d = (1:3)'; d - d'
ans =
0    -1    -2
1     0    -1
2     1     0


This usage allows very short expressions for forming the Hilbert matrix and Cauchy matrices (look at the source code for hilb.m with type hilb or edit hilb).

The max and min functions support implicit expansion, so an elegant way to form the matrix $A$ with $(i,j)$ element $\min(i,j)$ is with

d = (1:n); A = min(d,d');


and this is precisely what gallery('minij',n) now does.

Another function that can benefit from implicit expansion is vander, which forms a Vandermonde matrix. Currently the function forms the matrix in three lines, with calls to repmat and cumprod. Instead we can do it as follows, in a formula that is closer to the mathematical definition and hence easier to check.

A = (v(:) .^ (n-1:-1:0)')';  % Equivalent to A = vander(v)


The latter code is, however, slower than the current vander for large dimensions, presumably because exponentiating each element independently is slower than using repeated multiplication.

An obvious objection to implicit expansion is that it could cause havoc in linear algebra courses, where students will be able to carry out operations that the instructor and textbook have said are not allowed. Moreover, it will allow programs with certain mistyped expressions to run that would previously have generated an error, making debugging more difficult.

I can see several responses to this objection. First, MATLAB was already inconsistent with linear algebra in its scalar expansion. When a mathematician writes (with a common abuse of notation) $A - \sigma$, with a scalar $\sigma$, he or she usually means $A - \sigma I$ and not $A - \sigma E$ with $E$ the matrix of ones.

Second, I have been using the prerelease version of R2016b for a few months, while working on the third edition of MATLAB Guide, and have not encountered any problems caused by implicit expansion—either with existing codes or with new code that I have written.

A third point in favour of implicit expansion is that it is particularly compelling with elementwise operations (those beginning with a dot), as the multiplication by a diagonal matrix above illustrates, and since such operations are not a part of linear algebra confusion is less likely.

Finally, it is worth noting that implicit expansion fits into the MATLAB philosophy of “useful defaults” or “doing the right thing”, whereby MATLAB makes sensible choices when a user’s request is arguably invalid or not fully specified. This is present in the many functions that have optional arguments. But it can also be seen in examples such as

% No figure is open and no parallel pool is running.
>> close         % Close figure.
>> delete(gcp)   % Shut down parallel pool.


where no error is generated even though there is no figure to close or parallel pool to shut down.

I suspect that people’s reactions to implicit expansion will be polarized: they will either be horrified or will regard it as natural and useful. Now that I have had time to get used to the concept—and especially now that I have seen the benefits both for clarity of code (the minij matrix) and for speed (multiplication by a diagonal matrix)—I like it. It will be interesting to see the community’s reaction.

# The One-Line Maze Program in MATLAB

A classic one-line program for the Commodore 64 microcomputer is

10 PRINT CHR\$(205.5+RND(1)); GOTO 10


This is essentially what was printed in the section “Random Graphics” of the Commodore 64 User’s Guide (1982). The program prints a random maze that gradually builds up on-screen. The following video demonstrates it

I recently came across a 309-page book by Montfort et al. (MIT Press, 2013) dedicated to discussing this program from every conceivable angle. The book, which can be freely downloaded in PDF form, has as its title the program itself and was written by a team of ten authors with backgrounds in digital media, art, literature, and computer science. I found the book an interesting read, not least for the memories it brought back of my days programming the Commodore PET and Commodore 64 machines in the early 1980s (discussed in this post about the Commodore PET and this post about the Commodore 64). I suspect that never has so much been written about a single line of code!

Various translations of the program into other languages have been done, but I could not find a MATLAB version. Here, then, is my MATLAB offering, which takes advantage of the MATLAB char function’s ability to produce Unicode characters:

while 1, fprintf('%s\n',char(rand(1,80)+9585.5)); pause(.2), end


The pause command is not necessary but helps to slow the printing down, and the argument of pause may need adjusting for your computer.

Are there other interesting MATLAB one-liners? This one is from Mike Croucher:

x=[-2:.001:2]; plot(x,(sqrt(cos(x)).*cos(200*x)+sqrt(abs(x))-0.7).*(4-x.*x).^0.01)


And here is one to compute the Mandelbrot set, condensed from the code mandel in MATLAB Guide:

[x,y]=ndgrid(-1.5:5e-3:1.5); c=x+1i*y; z=c; for k=1:50, z=z.^2+c; end, contourf(x,y,abs(z)<1e6)


If you know any other good one-liners please put them in the comment box below.

# Iterating MATLAB Commands

Some MATLAB commands can be “applied to themselves”. A good example is help help, which provides help on the help function. For what other functions fun is fun fun a legal statement that produces something interesting? Here are some examples (by no means an exhaustive list).

char char
demo demo
diary diary
doc doc
diff diff
double double
edit edit
error error
iskeyword iskeyword
length length
magic magic
max max
size size
sort sort
spy spy
unique unique
upper upper
warning warning
which which


Can you see why these are legal and guess what the effect of these calls will be? Search for “command-function duality” in the MATLAB help for some clues.

Are there any legal triple invocations? Yes, for example

and and and
char char char
isa isa isa
max max max


Indeed char can be iterated as many times as you like, but and and isa can be iterated three times but not twice.

# Improved MATLAB Function Sqrtm

The MATLAB function sqrtm, for computing a square root of a matrix, first appeared in the 1980s. It was improved in MATLAB 5.3 (1999) and again in MATLAB 2015b. In this post I will explain how the recent changes have brought significant speed improvements.

Recall that every $n$-by-$n$ nonsingular matrix $A$ has a square root: a matrix $X$ such that $X^2 = A$. In fact, it has at least two square roots, $\pm X$, and possibly infinitely many. These two extremes occur when $A$ has a single block in its Jordan form (two square roots) and when an eigenvalue appears in two or more blocks in the Jordan form (infinitely many square roots).

In practice, it is usually the principal square root that is wanted, which is the one whose eigenvalues lie in the right-half plane. This square root is unique if $A$ has no real negative eigenvalues. We make it unique in all cases by taking the square root of a negative number to be the one with nonnegative imaginary part.

The original sqrtm transformed $A$ to Schur form and then applied a recurrence of Parlett, designed for general matrix functions; in fact it simply called the MATLAB funm function of that time. This approach can be unstable when $A$ has nearly repeated eigenvalues. I pointed out the instability in a 1999 report A New Sqrtm for MATLAB and provided a replacement for sqrtm that employs a recurrence derived specifically for the square root by Björck and Hammarling in 1983. The latter recurrence is perfectly stable. My function also provided an estimate of the condition number of the matrix square root.

The importance of sqrtm has grown over the years because logm (for the matrix logarithm) depends on it, as do codes for other matrix functions, for computing arbitrary powers of a matrix and inverse trigonometric and inverse hyperbolic functions.

For a triangular matrix $T$, the cost of the recurrence for computing $T^{1/2}$ is the same as the cost of computing $T^{-1}$, namely $n^3/3$ flops. But while the inverse of a triangular matrix is a level 3 BLAS operation, and so has been very efficiently implemented in libraries, the square root computation is not in the level 3 BLAS standard. As a result, my sqrtm implemented the Björck–Hammarling recurrence in M-code as a triply nested loop and was rather slow.

The new sqrtm introduced in MATLAB 2015b contains a new implementation of the Björck–Hammarling recurrence that, while still in M-code, is much faster. Here is a comparison of the underlying function sqrtm_tri (contained in toolbox/matlab/matfun/private/sqrtm_tri.m) with the relevant piece of code extracted from the old sqrtm. Shown are execution times in seconds for random triangular matrices an a quad-core Intel Core i7 processor.

n sqrtm_tri old sqrtm
10 0.0024 0.0008
100 0.0017 0.014
1000 0.45 3.12

For $n=10$, the new code is slower. But for $n=100$ we already have a factor 8 speedup, rising to a factor 69 for $n=1000$. The slowdown for $n=10$ is for a combination of two reasons: the new code is more general, as it supports the real Schur form, and it contains function calls that generate overheads for small $n$.

How does sqrtm_tri work? It uses a recursive partitioning technique. It writes

$T = \begin{bmatrix} T_{11} & T_{12} \\ 0 & T_{22} \\ \end{bmatrix}$

and notes that

$T^{1/2} = \begin{bmatrix} T_{11}^{1/2} & X_{12} \\ 0 & T_{22}^{1/2} \\ \end{bmatrix},$

where $T_{11}^{1/2} X_{12} + X_{12} T_{22}^{1/2} = T_{12}$. In this way the task of computing the square root of $T$ is reduced to the tasks of recursively computing the square roots of the smaller matrices $T_{11}$ and $T_{22}$ and then solving the Sylvester equation for $X_{12}$. The Sylvester equation is solved using an LAPACK routine, for efficiency. If you’d like to take a look at the code, type edit private/sqrtm_tri at the MATLAB prompt. For more on this recursive scheme for computing square roots of triangular matrices see Blocked Schur Algorithms for Computing the Matrix Square Root (2013) by Deadman, Higham and Ralha.

The sqrtm in MATLAB 2015b includes two further refinements.

• For real matrices it uses the real Schur form, which means that all computations are carried out in real arithmetic, giving a speed advantage and ensuring that the result will not be contaminated by imaginary parts at the roundoff level.
• It estimates the 1-norm condition number of the matrix square root instead of the 2-norm condition number, so exploits the normest1 function.

Finally, I note that the product of two triangular matrices is also not a level-3 BLAS routine, yet again it is needed in matrix function codes. A proposal for it to be included in the Intel Math Kernel Library was made recently by Peter Kandolf, and I strongly support the proposal.

# A Collection of Invalid Correlation Matrices

I’ve written before (here) about the increasingly common problem of matrices that are supposed to be correlation matrices (symmetric and positive semidefinite with ones on the diagonal) turning out to have some negative eigenvalues. This is usually bad news because it means that subsequent computations are unjustified and even dangerous. The problem occurs in a wide variety of situations. For example in portfolio optimization a consequence could be to take arbitrarily large positions in a stock, as discussed by Schmelzer and Hauser in Seven Sins in Portfolio Optimization.

Much research has been done over the last fifteen years or so on how to compute the nearest correlation matrix to a given matrix, and these techniques provide a natural way to correct an “invalid” correlation matrix. Of course, other approaches can be used, such as going back to the underlying data and massaging it appropriately, but it is clear from the literature that this is not always possible and practitioners may not have the mathematical or statistical knowledge to do it.

Nataša Strabić and I have built up a collection of invalid correlation matrices, which we used most recently in work on Bounds for the Distance to the Nearest Correlation Matrix. These are mostly real-life matrices, which makes them valuable for test purposes.

We have made our collection of invalid correlation matrices available in MATLAB form on GitHub as the repository matrices-correlation-invalid. I am delighted to be able to include, with the permission of investment company Orbis, two relatively large matrices, of dimensions 1399 and 3120, arising in finance. These were the matrices I used in my original 2002 paper.

# Empty Matrices in MATLAB

What matrix has zero norm, unit determinant, and is its own inverse? The conventional answer would be that there is no such matrix. But the empty matrix [ ] in MATLAB satisfies these conditions:

>> A = []; norm(A), det(A), inv(A)
ans =
0
ans =
1
ans =
[]


While many MATLAB users will be familiar with the use of [ ] as a way of removing a row or column of a matrix (e.g., A(:,1) = []), or omitting an argument in a function call (e.g., max(A,[],2)), fewer will be aware that [ ] is just one in a whole family of empty matrices. Indeed [ ] is the 0-by-0 empty matrix

>> size([])
ans =
0     0


Empty matrices can have dimension n-by-0 or 0-by-n for any nonnegative integer n. One way to construct them is with double.empty (or the empty method of any other MATLAB class):

>> double.empty
ans =
[]
>> double.empty(4,0)
ans =
Empty matrix: 4-by-0


What makes empty matrices particularly useful is that they satisfy natural generalizations of the rules of matrix algebra. In particular, matrix multiplicaton is defined whenever the inner dimensions match up.

>> A = double.empty(0,5)*double.empty(5,0)
A =
[]
>> A = double.empty(2,0)*double.empty(0,4)
A =
0     0     0     0
0     0     0     0


As the second example shows, the product of empty matrices with positive outer dimensions has zero entries. This ensures that expressions like the following work as we would hope:

>> p = 0; A = ones(3,2); B = ones(3,p); C = ones(2,3); D = ones(p,3);
>> [A B]*[C; D]
ans =
2     2     2
2     2     2
2     2     2


In examples such as this empty matrices are very convenient, as they avoid us having to program around edge cases.

Empty matrices have been in MATLAB since 1986, their inclusion having been suggested by Rod Smart and Rob Schreiber $\lbrack 1 \rbrack$. A 1989 MATLAB manual says

We’re not sure we’ve done it correctly, or even consistently, but we have found the idea useful.

In those days there was only one empty matrix, the 0-by-0 matrix, and this led Nett and Haddad (1993) to describe the MATLAB implementation of the empty matrix concept as “neither correct, consistent, or useful, at least not for system-theoretic applications”. Nowadays MATLAB gets it right and indeed it adheres to the rules suggested by those authors and by de Boor (1990). If you are wondering how the values for norm([]), det([]) and inv([]) given above are obtained, see de Boor’s article for an explanation in terms of linear algebra transformations.

The concept of empty matrices dates back to before MATLAB. The earliest reference I am aware of is a 1970 book by Stoer and Witzgall. As the extract below shows, these authors recognized the need to support empty matrices of varying dimension and they understood how multiplication of empty matrices should work.

## Reference

1. John N. Little, The MathWorks Newsletter, 1 (1), March 1986.

# Updated Catalogue of Software for Matrix Functions

Edvin Deadman and I have updated the catalogue of software for matrix functions that we produced in 2014 (and which was discussed in this post). The new version, which has undergone some minor reorganization, is available here. It covers what is available in languages (C++, Fortran, Java, Julia, Python), problem solving environments (GNU Octave, Maple, Mathematica, MATLAB, R, Scilab), and libraries (GNU Scientific Library, NAG Library, SLICOT).

Here are some highlights of changes in the last two years that are reflected in the new version.

Other changes to the catalogue include these.

• Two more R packages are included.

Suggestions for inclusion in a future revision are welcome.

# The Improved MATLAB Functions Expm and Logm

The matrix exponential is a ubiquitous matrix function, important both for theory and for practical computation. The matrix logarithm, an inverse to the exponential, is also increasingly used (see my earlier post, 400 Years of Logarithms).

MATLAB R2015b introduced new versions of the expm and logm functions. The Release Notes say

The algorithms for expm, logm, and sqrtm show increased accuracy, with logm and sqrtm additionally showing improved performance.

The help text for expm and logm is essentially unchanged from the previous versions, so what’s different about the new functions? (I will discuss sqrtm in a future post.)

The answer is that both functions make use of new backward error bounds that can be much smaller than the old ones for very nonnormal matrices, and so help to avoid a phenomenon known as overscaling. The key change is that when bounding a matrix power series $p(X) = a_0 I + a_1 X + a_2 X^2 + \cdots$, instead of bounding the $k$th term $a_k X^k$ by $|a_k| \|X\|^k$, a potentially smaller bound is used.

This is best illustrated by example. Suppose we want to bound $\|X^{12}\|$ and are not willing to compute $X^{12}$ but are willing to compute lower powers of $X$. We have $12 = 6 \times 2 = 4 \times 3$, so $\|X^{12}\|$ is bounded by each of the terms $(\|X^2\|)^6$, $(\|X^3\|)^4$, $(\|X^4\|)^3$, and $(\|X^6\|)^2$. But it is easy to see that $(\|X^6\|)^2 \le (\|X^2\|)^6$ and $(\|X^6\|)^2 \le (\|X^3\|)^4$, so we can discard two of the bounds, ending up with

$\|X^{12}\| \le \min( \|X^4\|^3, \|X^6\|^2 ).$

This argument can be generalized so that every power of $X$ is bounded in terms of the norms of $X^p$ for values of $p$ up to some small, fixed value. The gains can be significant. Consider the matrix

$X = \begin{bmatrix}1 & 100 \\ 0 & 1 \end{bmatrix}.$

We have $\|X\|^{12} \approx 10^{24}$, but

$X^k = \begin{bmatrix}1 & 100k \\ 0 & 1 \end{bmatrix},$

so the bound above is roughly $\|X^{12}\| \le 6 \times 10^{7}$, which is a significant improvement.

One way to understand what is happening is to note the inequality

$\rho(X) \le \| X^k\| ^{1/k} \le \|X\|,$

where $\rho$ is the spectral radius (the largest modulus of any eigenvalue). The upper bound corresponds to the usual analysis. The lower bound is something that we cannot use to bound the norm of the power series. The middle term is what we are using, and as $k\to\infty$ the middle term converges to the lower bound, which can be arbitrarily smaller than the upper bound.

What is the effect of these bounds on the algorithms in expm and logm? Both algorithms make use of Padé approximants, which are good only for small-normed matrices, so the algorithms begin by reducing the norm of the input matrix. Backward error bounds derived by bounding a power series as above guide the norm reduction and if the bounds are weak then the norm is reduced too much, which can result in loss of accuracy in floating point arithmetic—the phenomenon of overscaling. The new bounds greatly reduce the chance of overscaling.

In his blog post A Balancing Act for the Matrix Exponential, Cleve Moler describes a badly scaled 3-by-3 matrix for which the original expm suffers from overscaling and a loss of accuracy, but notes that the new algorithm does an excellent job.

The new logm has another fundamental change: it applies inverse scaling and squaring and Padé approximation to the whole triangular Schur factor, whereas the previous logm applied this technique to the individual diagonal blocks in conjunction with the Parlett recurrence.

For more on the algorithms underlying the new codes see these papers. The details of how the norm of a matrix power series is bounded are given in Section 4 of the first paper.