A subspace of is an *invariant subspace* of if for all . If we partition and conformably we can write

which gives , showing that the columns of span an invariant subspace of . Furthermore, . The first column of is an eigenvector of corresponding to the eigenvalue , but the other columns are not eigenvectors, in general. Eigenvectors can be computed by solving upper triangular systems involving , where is an eigenvalue.

Write , where and is strictly upper triangular. Taking Frobenius norms gives , or

Hence is independent of the particular Schur decomposition and it provides a measure of the departure from normality. The matrix is normal (that is, ) if and only if . So a normal matrix is unitarily diagonalizable: .
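The departure from normality can be computed from any Schur decomposition. Here is a minimal sketch in Python using SciPy's `schur` (the 2-by-2 test matrix is my own illustrative choice, not from the text):

```python
import numpy as np
from scipy.linalg import schur

# A non-normal test matrix (illustrative values).
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

# Complex Schur decomposition: A = Q T Q*.
T, Q = schur(A, output='complex')

# Departure from normality ||N||_F, where T = D + N with D diagonal
# and N strictly upper triangular; equivalently
# sqrt(||A||_F^2 - sum_i |lambda_i|^2), which is independent of the
# particular Schur decomposition.
dep = np.sqrt(np.linalg.norm(A, 'fro')**2
              - np.sum(np.abs(np.diag(T))**2))
```

For this matrix `dep` is 2; it is zero precisely when the matrix is normal.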

An important application of the Schur decomposition is to compute matrix functions. The relation shows that computing reduces to computing a function of a triangular matrix. Matrix functions illustrate what Van Loan (1975) describes as “one of the most basic tenets of numerical algebra”, namely “anything that the Jordan decomposition can do, the Schur decomposition can do better!”. Indeed the Jordan canonical form is built on a possibly ill conditioned similarity transformation while the Schur decomposition employs a perfectly conditioned unitary similarity, and the full upper triangular factor of the Schur form can do most of what the Jordan form’s bidiagonal factor can do.

An *upper quasi-triangular matrix* is a block upper triangular matrix

whose diagonal blocks are either or . A real matrix has a *real Schur decomposition* in which all the factors are real, is orthogonal, and is upper quasi-triangular with any diagonal blocks having complex conjugate eigenvalues. If is normal then the blocks have the form

which has eigenvalues .

The Schur decomposition can be computed by the QR algorithm at a cost of about flops for and , or flops for only.

In MATLAB, the Schur decomposition is computed with the `schur` function: the command `[Q,T] = schur(A)` returns the real Schur form if is real and otherwise the complex Schur form. The complex Schur form for a real matrix can be obtained with `[Q,T] = schur(A,'complex')`. The `rsf2csf` function converts the real Schur form to the complex Schur form. The `ordschur` function takes a Schur decomposition and modifies it so that the eigenvalues appear in a specified order along the diagonal of .

- Nicholas J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
- C. F. Van Loan, A Study of the Matrix Exponential, Numerical Analysis Report No. 10, University of Manchester, UK, 1975.

This article is part of the “What Is” series, available from https://nhigham.com/index-of-what-is-articles/ and in PDF form from the GitHub repository https://github.com/higham/what-is.

Premultiplying a matrix by reorders the rows and postmultiplying by reorders the columns. A permutation matrix that has the desired reordering effect is constructed by doing the same operations on the identity matrix.

Examples of permutation matrices are the identity matrix , the reverse identity matrix , and the shift matrix (also called the cyclic permutation matrix), illustrated for by

Pre- or postmultiplying a matrix by reverses the order of the rows and columns, respectively. Pre- or postmultiplying a matrix by shifts the rows or columns, respectively, one place forward and moves the first one to the last position—that is, it cyclically permutes the rows or columns. Note that is a symmetric Hankel matrix and is a circulant matrix.

An *elementary permutation matrix* differs from in just two rows and columns, and , say. It can be written , where is the th column of . Such a matrix is symmetric and so satisfies , and it has determinant . A general permutation matrix can be written as a product of elementary permutation matrices , where is such that .
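As a small check of these properties, the following Python/NumPy sketch (my own illustration) builds an elementary permutation matrix from the formula above and confirms it is symmetric, involutory, and has determinant :

```python
import numpy as np

n = 5
I = np.eye(n)

# Elementary permutation matrix swapping rows/columns i and j:
# P = I - (e_i - e_j)(e_i - e_j)^T.
i, j = 1, 3
v = I[:, i] - I[:, j]
P = I - np.outer(v, v)

# P is symmetric, satisfies P^2 = I, and has determinant -1.
```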

It is easy to show that , which means that the eigenvalues of are , where is the th root of unity. The matrix has two diagonals of s, which move up through the matrix as it is powered: for and . The following animated gif superposes MATLAB spy plots of , , …, .
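The powers of the shift matrix are easy to experiment with. A hedged Python/NumPy sketch (illustrative, with the superdiagonal orientation chosen arbitrarily):

```python
import numpy as np

n = 8
# Cyclic shift (circulant permutation) matrix: ones on the first
# superdiagonal and a one in the bottom-left corner.
K = np.diag(np.ones(n - 1), 1)
K[-1, 0] = 1

# K^n = I, so the eigenvalues of K are the nth roots of unity,
# all of modulus 1.
Kn = np.linalg.matrix_power(K, n)
evals = np.linalg.eigvals(K)
```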

The shift matrix plays a fundamental role in characterizing irreducible permutation matrices. Recall that a matrix is irreducible if there does not exist a permutation matrix such that

where and are square, nonempty submatrices.

Theorem 1. For a permutation matrix the following conditions are equivalent.

- is irreducible.
- There exists a permutation matrix such that
- The eigenvalues of are .

One consequence of Theorem 1 is that for any irreducible permutation matrix , .

The next result shows that a reducible permutation matrix can be expressed in terms of irreducible permutation matrices.

Theorem 2. Every reducible permutation matrix is permutation similar to a direct sum of irreducible permutation matrices.

Another notable permutation matrix is the vec-permutation matrix, which relates to , where is the Kronecker product.

A permutation matrix is an example of a *doubly stochastic matrix*: a nonnegative matrix whose row and column sums are all equal to . A classic result characterizes doubly stochastic matrices in terms of permutation matrices.

Theorem 3 (Birkhoff). A matrix is doubly stochastic if and only if it is a convex combination of permutation matrices.

In coding, memory can be saved by representing a permutation matrix as an integer vector , where is the column index of the within the th row of . MATLAB functions that return permutation matrices can also return the permutation in vector form. Here is an example with the MATLAB `lu` function that illustrates how permuting a matrix can be done using the vector permutation representation.

>> A = gallery('frank',4), [L,U,P] = lu(A); P
A =
     4     3     2     1
     3     3     2     1
     0     2     2     1
     0     0     1     1
P =
     1     0     0     0
     0     0     1     0
     0     0     0     1
     0     1     0     0
>> P*A
ans =
     4     3     2     1
     0     2     2     1
     0     0     1     1
     3     3     2     1
>> [L,U,p] = lu(A,'vector'); p
p =
     1     3     4     2
>> A(p,:)
ans =
     4     3     2     1
     0     2     2     1
     0     0     1     1
     3     3     2     1

For more on handling permutations in MATLAB see section 24.3 of MATLAB Guide.
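The same vector representation works in other languages. A hedged Python/NumPy sketch (0-based analogue of the MATLAB example above):

```python
import numpy as np

# Permutation stored as an index vector p: row i of the permuted
# matrix is row p[i] of A, so no permutation matrix is ever formed.
A = np.array([[4, 3, 2, 1],
              [3, 3, 2, 1],
              [0, 2, 2, 1],
              [0, 0, 1, 1]])
p = np.array([0, 2, 3, 1])   # 0-based analogue of MATLAB's p = [1 3 4 2]

# Apply the permutation by indexing.
PA = A[p, :]

# The equivalent explicit permutation matrix, built by doing the
# same row reordering on the identity.
P = np.eye(4)[p, :]
```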

For proofs of Theorems 1–3 see Zhang (2011, Sec. 5.6). Theorem 3 is also proved in Horn and Johnson (2013, Thm. 8.7.2).

Permutations play a key role in the fast Fourier transform and its efficient implementation; see Van Loan (1992).

- Desmond Higham and Nicholas Higham, MATLAB Guide, third edition, SIAM, 2017.
- Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013. My review of the second edition.
- Charles F. Van Loan, Computational Frameworks for the Fast Fourier Transform, SIAM, 1992.
- Fuzhen Zhang, Matrix Theory: Basic Results and Techniques, Springer, 2011.

MATLAB Online now has themes, including a dark theme (which is my preference). We will have to wait for a future release for themes to be supported on desktop MATLAB.

I recall that @nhigham was asking for this. Currently @MATLAB Online only at the moment though. Desktop MATLAB isn’t there yet

— Mike Croucher (@walkingrandomly) March 11, 2022

One can now write `qr(A,'econ')` instead of `qr(A,0)` and `gsvd(A,B,'econ')` instead of `gsvd(A,B)` for the “economy size” decompositions. This is useful as the `'econ'` form is more descriptive. The `svd` function already supported the `'econ'` argument. The economy-size QR factorization is sometimes called the thin QR factorization.

The `round` function, which rounds to the nearest integer, now breaks ties by rounding away from zero by default and has several other tie-breaking options (albeit not stochastic rounding). See a sequence of four blog posts on this topic by Cleve Moler starting with this one from February 2021.
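To see why the tie-breaking rule matters, here is a hedged Python sketch contrasting round-half-away-from-zero (the new MATLAB default) with Python's built-in `round`, which rounds ties to even (the `round_away` helper is my own illustration):

```python
import math

def round_away(x):
    # Round half away from zero: ties like 2.5 go to 3, -2.5 to -3.
    return math.copysign(math.floor(abs(x) + 0.5), x)

away = [round_away(v) for v in (2.5, -2.5, 0.5)]    # [3.0, -3.0, 1.0]
to_even = [round(v) for v in (2.5, -2.5, 0.5)]      # [2, -2, 0]
```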

The `null` (nullspace) and `orth` (orthonormal basis for the range) functions now accept a tolerance as a second argument, and any singular values less than that tolerance are treated as zero. The default tolerance is `max(size(A)) * eps(norm(A))`. This change brings the two functions into line with `rank`, which already accepted the tolerance. If you are working in double precision (the MATLAB default) and your matrix has inherent errors of order (for example), you might set the tolerance to , since singular values smaller than this are indistinguishable from zero.

The unit testing framework can now generate docx, html, and pdf reports after test execution, by using the function `generatePDFReport` in the latter case. This is useful for keeping a record of test results and for printing them. We use unit testing in Anymatrix and have now added an option to return the results in a variable so that the user can call one of these new functions.

Previously, if you wanted to check whether a matrix had all finite values you would need to use a construction such as `all(all(isfinite(A)))` or `all(isfinite(A),'all')`. The new `allfinite` function does this in one go: `allfinite(A)` returns true or false according as all the elements of `A` are finite or not, and it works for arrays of any dimension.

Similarly, `anynan` and `anymissing` check for NaNs or missing values. A missing value is a NaN for numerical arrays, but is indicated in other ways for other data types.

The new `pagemldivide`, `pagemrdivide`, and `pageinv` functions solve linear equations and calculate matrix inverses using pages of -dimensional arrays, while `tensorprod` calculates tensor products (inner products, outer products, or a combination of the two) between two -dimensional arrays.

The append option of the `exportgraphics` function now supports the GIF format, enabling one to create animated GIFs (previously only multipage PDF files were supported). The key command is `exportgraphics(gca,file_name,"Append",true)`. There are other ways of creating animated GIFs in MATLAB, but this one is particularly easy. Here is an example M-file (based on cheb3plot in MATLAB Guide) with its output below.

%CHEB_GIF  Animated GIF of Chebyshev polynomials.
%   Based on cheb3plot in MATLAB Guide.
x = linspace(-1,1,1500)';
p = 49;
Y = ones(length(x),p);
Y(:,2) = x;
for k = 3:p
  Y(:,k) = 2*x.*Y(:,k-1) - Y(:,k-2);
end

delete cheby_animated.gif
a = get(groot,'defaultAxesColorOrder');
m = length(a);

for j = 1:p-1 % length(k)
  plot(x,Y(:,j),'LineWidth',1.5,'color',a(1+mod(j-1,m),:));
  xlim([-1 1]), ylim([-1 1])  % Must freeze axes.
  title(sprintf('%2.0f', j),'FontWeight','normal')
  exportgraphics(gca,"cheby_animated.gif","Append",true)
end

The inverse of also satisfies , as we now show. The equation says that for , where is the th column of and is the th unit vector. Hence the columns of span , which means that the columns are linearly independent. Now , so every column of is in the null space of . But this contradicts the linear independence of the columns of unless , that is, .

The inverse of a nonsingular matrix is unique. If then premultiplying by gives , or, since , .

The inverse of the inverse is the inverse: , which is just another way of interpreting the equations .

Since the determinant of a product of matrices is the product of the determinants, the equation implies , so the inverse can only exist when . In fact, the inverse always exists when .

An explicit formula for the inverse is

where the adjugate is defined by

and where denotes the submatrix of obtained by deleting row and column . A special case is the formula

The equation implies .

The following result collects some equivalent conditions for a matrix to be nonsingular. We denote by the null space of (also called the kernel).

Theorem 1. For the following conditions are equivalent to being nonsingular:

- ,
- ,
- has a unique solution , for any ,
- none of the eigenvalues of is zero,
- .

A useful formula is

Here are some facts about the inverses of matrices of special types.

- A diagonal matrix is nonsingular if for all , and .
- An upper (lower) triangular matrix is nonsingular if its diagonal elements are nonzero, and the inverse is upper (lower) triangular with element .
- If and , then is nonsingular and
This is the Sherman–Morrison formula.
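The Sherman–Morrison formula is easy to verify numerically. A hedged Python/NumPy sketch with a random, diagonally shifted test matrix of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix
u = rng.standard_normal(n)
v = rng.standard_normal(n)

Ainv = np.linalg.inv(A)

# Sherman-Morrison:
# (A + u v^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u),
# valid when the denominator is nonzero.
denom = 1.0 + v @ Ainv @ u
B_inv = Ainv - np.outer(Ainv @ u, v @ Ainv) / denom
```

The point of the formula is that when `Ainv` is already known, the inverse of the rank-one update costs only matrix-vector products.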

The Cayley–Hamilton theorem says that a matrix satisfies its own characteristic equation, that is, if , then . In other words, , and if is nonsingular then multiplying through by gives , since .

This means that is expressible as a polynomial of degree at most in (with coefficients that depend on ).
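As a hedged illustration of this polynomial expression for the inverse, the following Python/NumPy sketch works through a 2-by-2 example of my own choosing, using `np.poly` for the characteristic polynomial coefficients:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Characteristic polynomial coefficients, leading coefficient first.
c = np.poly(A)          # x^2 - 5x + 6, so c = [1, -5, 6]

# Cayley-Hamilton: A^2 + c[1]*A + c[2]*I = 0, and multiplying by
# A^{-1} gives A^{-1} = -(A + c[1]*I)/c[2], a degree-1 polynomial in A.
Ainv = -(A + c[1] * np.eye(2)) / c[2]
```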

The inverse is an important theoretical tool, but it is rarely necessary to compute it explicitly. If we wish to solve a linear system of equations then computing and then forming is both slower and less accurate in floating-point arithmetic than using LU factorization (Gaussian elimination) to solve the system directly. Indeed, for one would not solve by computing .

For sparse matrices, computing the inverse may not even be practical, as the inverse is usually dense.

If one needs to compute the inverse, how should one do it? We will consider the cost of different methods, measured by the number of elementary arithmetic operations (addition, subtraction, division, multiplication) required. Using (1), the cost is that of computing one determinant of order and determinants of order . Since computing a determinant costs at least operations by standard methods, this approach costs at least operations, which is prohibitively expensive unless is very small. Instead one can compute an LU factorization with pivoting and then solve the systems for the columns of , at a total cost of operations.

Equation (2) does not give a good method for computing , because computing the coefficients and evaluating a matrix polynomial are both expensive.

It is possible to exploit fast matrix multiplication methods, which compute the product of two matrices in operations for some . By using a block LU factorization recursively, one can reduce matrix inversion to matrix multiplication. If we use Strassen’s fast matrix multiplication method, which has , then we can compute in operations.

MATLAB uses the backslash and forward slash for “matrix division”, with the meanings and . Note that because matrix multiplication is not commutative, , in general. We have and . In MATLAB, the inverse can be computed with `inv(A)`, which uses LU factorization with pivoting.

If is then the equation requires to be , as does . Rank considerations show that at most one of these equations can hold if . For example, if is a nonzero row vector, then for , but . This is an example of a generalized inverse.

Here is a triangular matrix with an interesting inverse. This example is adapted from the LINPACK Users’ Guide, which has the matrix, with “LINPACK” replacing “INVERSE” on the front cover and the inverse on the back cover.

- What Is a Generalized Inverse? (2020)
- What Is a Sparse Matrix? (2020)
- What Is an LU Factorization? (2021)
- What Is the Adjugate of a Matrix? (2020)
- What Is the Determinant of a Matrix? (2021)
- What Is the Sherman–Morrison–Woodbury Formula? (2020)

is a solution, in some appropriate sense, of the equation

It suffices to consider the case , because backslash treats the columns independently, and we write this as

The MATLAB backslash operator handles several cases depending on the relative sizes of the row and column dimensions of and whether it is rank deficient.

When is square, backslash returns , computed by LU factorization with partial pivoting (and of course without forming ). There is no special treatment for singular matrices, so for them division by zero may occur and the output may contain NaNs (in practice, what happens will usually depend on the rounding errors). For example:

>> A = [1 0; 0 0], b = [1 0]', x = A\b
A =
     1     0
     0     0
b =
     1
     0
Warning: Matrix is singular to working precision.
x =
     1
   NaN

Backslash takes advantage of various kinds of structure in ; see MATLAB Guide (section 9.3) or `doc mldivide` in MATLAB.

An overdetermined system has no solutions, in general. Backslash yields a least squares (LS) solution, which is unique if has full rank. If is rank-deficient then there are infinitely many LS solutions, and backslash returns a *basic solution*: one with at most nonzeros. Such a solution is not, in general, unique.

An underdetermined system has fewer equations than unknowns, so either there is no solution or there are infinitely many. In the latter case produces a basic solution and in the former case a basic LS solution. Example:

>> A = [1 1 1; 1 1 0]; b = [3 2]'; x = A\b
x =
   2.0000e+00
            0
   1.0000e+00

Another basic solution is , and the minimum -norm solution is .
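By contrast, NumPy's `lstsq` returns the minimum 2-norm solution for an underdetermined system rather than a basic one. A hedged sketch with the same system as the example above:

```python
import numpy as np

# Underdetermined system from the example above.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([3.0, 2.0])

# lstsq (SVD-based) gives the minimum 2-norm solution [1, 1, 1],
# not a basic solution with at most rank(A) nonzeros as MATLAB's
# backslash does.
x, *_ = np.linalg.lstsq(A, b, rcond=None)
```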

Now we turn to the special case , which in terms of equation (1) is a solution to . If then is not a basic solution, so ; in fact, if and it is a matrix of NaNs if .

For an underdetermined system with full-rank , is not necessarily the identity matrix:

>> A = [1 0 1; 0 1 0], X = A\A
A =
     1     0     1
     0     1     0
X =
     1     0     1
     0     1     0
     0     0     0

But for an overdetermined system with full-rank , *is* the identity matrix:

>> A'\A'
ans =
   1.0000e+00            0
  -1.9185e-17   1.0000e+00

The MATLAB definition of is a pragmatic one, as it computes a solution or LS solution to in the most efficient way, using LU factorization or QR factorization. Often, one wants the solution of minimum -norm, which can be expressed as , where is the pseudoinverse of . In MATLAB, can be computed by `lsqminnorm(A,b)` or `pinv(A)*b`, the former expression being preferred as it avoids the unnecessary computation of and it uses a complete orthogonal factorization instead of an SVD.

When the right-hand side is a matrix, , `lsqminnorm(A,B)` and `pinv(A)*B` give the solution of minimal Frobenius norm, which we write as . Then , which is the orthogonal projector onto , and it is equal to the identity matrix when and has full rank. For the matrix above:

>> A = [1 0 1; 0 1 0], X = lsqminnorm(A,A)
A =
     1     0     1
     0     1     0
X =
   5.0000e-01            0   5.0000e-01
            0   1.0000e+00            0
   5.0000e-01            0   5.0000e-01
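The same projector can be reproduced with NumPy's `pinv`; a hedged sketch confirming that the result is symmetric and idempotent, as an orthogonal projector must be:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

# X = pinv(A) @ A is the orthogonal projector onto the row space of A.
# It is symmetric and idempotent, but not the identity here, since A
# has fewer rows than columns.
X = np.linalg.pinv(A) @ A
```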

- What Is an LU Factorization? (2021)
- What Is a QR Factorization? (2020)
- What Is a Rank-Revealing Factorization? (2021)

Theorem (Jordan canonical form). Any matrix can be expressed as

where is nonsingular and . The matrix is unique up to the ordering of the blocks .

The matrix is (up to reordering of the diagonal blocks) the *Jordan canonical form* of (or the Jordan form, for short).

The bidiagonal matrices are called *Jordan blocks*. Clearly, the eigenvalues of are repeated times and has a single eigenvector, . Two different Jordan blocks can have the same eigenvalues.

In total, has linearly independent eigenvectors, and the same is true of .

The Jordan canonical form is an invaluable tool in matrix analysis, as it provides a concrete way to prove and understand many results. However, the Jordan form can not be reliably computed in finite precision arithmetic, so it is of little use computationally, except in special cases such as when is Hermitian or normal.

For a Jordan block we have

The superdiagonal of ones moves up to the right with each increase in the index of the power, until it disappears off the top corner of the matrix.

It is easy to see that , and so

For , these quantities provide information about the size of the Jordan blocks associated with . To be specific, let

and

By considering the equations above, it can be shown that is the number of Jordan blocks of size at least in which appears. Moreover, the number of Jordan blocks of size is . Therefore if we know the eigenvalues and the ranks of for each eigenvalue and appropriate then we can determine the Jordan structure. As an important special case, if then we know that appears in a single Jordan block. The sequence of is known as the *Weyr characteristic*, and it satisfies .
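The rank computations above are easy to try out. Here is a hedged Python/NumPy sketch on a hypothetical test matrix built with a known Jordan structure for the eigenvalue 0 (one block of size 3 and one of size 1):

```python
import numpy as np

# One 3x3 Jordan block with eigenvalue 0, plus one 1x1 block.
J = np.zeros((4, 4))
J[:3, :3] = np.diag(np.ones(2), 1)

# Rank of successive powers: for eigenvalue 0,
# w_k = rank(J^{k-1}) - rank(J^k) is the number of Jordan blocks
# of size at least k.
ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(J, k))
         for k in range(4)]
w = [ranks[k] - ranks[k + 1] for k in range(3)]
# w = [2, 1, 1]: two blocks of size >= 1, one of size >= 2, one of
# size >= 3, so the structure (one 3x3 and one 1x1 block) is recovered.
```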

As an example of a matrix for which we can easily deduce the Jordan form consider the nilpotent matrix , for which and all the eigenvalues are zero. Since , we have , , and . Hence and , so there are Jordan blocks. (In fact, can be permuted into Jordan form by a similarity transformation.)

Here is an example with the matrix `anymatrix('core/collatz',11)`. We have and , so and are simple eigenvalues. All the other eigenvalues are and they have the following and values:

We conclude that the eigenvalue occurs in one block of order , one block of order , and two blocks of order .

A matrix and its transpose have the same Jordan form. One way to see this is to note that implies

where is the identity matrix with its columns reversed. A consequence is that and are similar.

A version of the Jordan form with and real exists for . The main change is how complex eigenvalues are represented. Since the eigenvalues now occur in complex conjugate pairs and , and each of the pair has the same Jordan structure (which follows from the fact that a matrix and its complex conjugate have the same rank), pairs of Jordan blocks corresponding to and are combined into a real block of twice the size. For example, Jordan blocks

become

in the real Jordan form, where . Note that the eigenvalues of are .

Proofs of the Jordan canonical form and its real variant can be found in many textbooks. See also Brualdi (1987) and Fletcher and Sorensen (1983), who give proofs that go via the Schur decomposition.

- Richard A. Brualdi. The Jordan Canonical Form: An Old Proof. Amer. Math. Monthly, 94(3):257–267, 1987.
- Richard A. Brualdi, Pei Pei, and Xingzhi Zhan. An Extremal Sparsity Property of the Jordan Canonical Form. Linear Algebra Appl., 429(10):2367–2372, 2008.
- R. Fletcher and D. C. Sorensen. An Algorithmic Derivation of the Jordan Canonical Form. Amer. Math. Monthly, 90(1):12–16, 1983.

It arises when a second derivative is approximated by the central second difference . (Accordingly, the second difference matrix is sometimes defined as .) In MATLAB, can be generated by `gallery('tridiag',n)`, which is returned as a sparse matrix.

This is Gil Strang’s favorite matrix. The photo, from his home page, shows a birthday cake representation of the matrix.

The second difference matrix is symmetric positive definite. The easiest way to see this is to define the full rank rectangular matrix

and note that . The factorization corresponds to representing a central difference as the product of a forward difference and a backward difference. Other properties of the second difference matrix are that it is diagonally dominant, a Toeplitz matrix, and an -matrix.

In an LU factorization the pivots are , , , …, . Hence the pivots form a decreasing sequence tending to 1 as . The diagonal of the Cholesky factor contains the square roots of the pivots. This means that in the Cholesky factorization with upper bidiagonal, and as .

Since the determinant is the product of the pivots, .
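These facts are easy to check numerically. A hedged Python/NumPy sketch for (my own illustrative size), verifying the determinant and reading the pivots off the diagonal of the Cholesky factor:

```python
import numpy as np

n = 6
# Second difference matrix: 2 on the diagonal, -1 on the sub- and
# superdiagonals.
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# det(A) = n + 1, the product of the pivots 2, 3/2, 4/3, ..., (n+1)/n.
d = np.linalg.det(A)

# The diagonal of the Cholesky factor contains the square roots of
# the pivots.
L = np.linalg.cholesky(A)
pivots = np.diag(L)**2
```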

The inverse of is full, with element for . For example,

The -norm condition number satisfies (as follows from the formula (1) below for the eigenvalues).

The eigenvalues of are

where , with corresponding eigenvector

The matrix with

is therefore an eigenvector matrix for : .

Various modifications of the second difference matrix arise and similar results can be derived. For example, consider the matrix obtained by changing the element to :

It can be shown that has element and eigenvalues , .

If we perturb the element of to , we obtain a singular matrix, but suppose we perturb the element to :

The inverse is , where with element is a matrix of Givens, and the eigenvalues are , .

The factorization is noted by Strang (2012).

For a derivation of the eigenvalues and eigenvectors see Todd (1977, p. 155 ff.). For the eigenvalues of see Fortiana and Cuadras (1997). Givens’s matrix is described by Newman and Todd (1958) and Todd (1977).

This is a minimal set of references, which contain further useful references within.

- J. Fortiana and C. N. Cuadras, A Family of Matrices, the Discretized Brownian Bridge, and Distance-Based Regression, Linear Algebra Appl. 264, 173–188, 1997.
- Morris Newman and John Todd, The Evaluation of Matrix Inversion Programs, J. Soc. Indust. Appl. Math. 6(4), 466–476, 1958.
- Gilbert Strang, Essays in Linear Algebra, Wellesley-Cambridge Press, Wellesley, MA, USA, 2012. Chapter A.3: “My Favorite Matrix”.
- John Todd, Basic Numerical Mathematics, Vol. : Numerical Algebra, Birkhäuser, Basel, and Academic Press, New York, 1977.

- What Is a Diagonally Dominant Matrix? (2021)
- What Is a Sparse Matrix? (2020)
- What Is a Symmetric Positive Definite Matrix? (2020)
- What Is a Tridiagonal Matrix? (2022)
- What Is an M-Matrix? (2021)

was introduced by Frank in 1958 as a test matrix for eigensolvers.

Taking the Laplace expansion about the first column, we obtain , and since we have .

In MATLAB, the computed determinant of the matrix and its transpose can be far from the exact values of :

>> n = 20; F = anymatrix('gallery/frank',n);
>> [det(F), det(F')]
ans =
   1.0000e+00  -1.4320e+01
>> n = 49; F = anymatrix('gallery/frank',n); det(F)
ans =
    -1.406934439401568e+45

This behavior illustrates the sensitivity of the determinant rather than numerical instability in the evaluation and it is very dependent on the arithmetic (different results may be obtained in different releases of MATLAB). The sensitivity can be seen by noting that

which means that changing the element from 1 to changes the determinant by .

It is easily verified that

Hence is lower Hessenberg with integer entries. This factorization provides another way to see that .

From (1) we see that is singular for , which implies that

for any subordinate matrix norm, showing that the condition number grows very rapidly with . In fact, this lower bound is within a factor of the condition number for the -, -, and -norms for .

Denote by the characteristic polynomial of . By expanding about the first column one can show that

Using (3), one can show by induction that

This means that the eigenvalues of occur in reciprocal pairs, and hence that is an eigenvalue when is odd. It also follows that is palindromic when is even and anti-palindromic when is odd. Examples:

>> charpoly( anymatrix('gallery/frank',6) )
ans =
     1   -21   120  -215   120   -21     1
>> charpoly( anymatrix('gallery/frank',7) )
ans =
     1   -28   231  -665   665  -231    28    -1

Since has nonzero subdiagonal entries, for any , and hence is nonderogatory, that is, no eigenvalue appears in more than one Jordan block in the Jordan canonical form. It can be shown that the eigenvalues are real and positive and that they can be expressed in terms of the zeros of Hermite polynomials. Furthermore, the eigenvalues are distinct.
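The reciprocal-pair structure of the eigenvalues can be confirmed numerically. A hedged Python/NumPy sketch that builds the Frank matrix directly (the elementwise formula below is my reading of `gallery('frank',n)` from the example earlier in the text):

```python
import numpy as np

def frank(n):
    # Frank matrix: f(i,j) = n + 1 - max(i,j) on and above the
    # subdiagonal, zero below it (1-based indices).
    i, j = np.indices((n, n)) + 1
    return np.where(i <= j + 1, n + 1 - np.maximum(i, j), 0).astype(float)

F = frank(7)
evals = np.sort(np.linalg.eigvals(F).real)

# Eigenvalues are real, positive, and occur in reciprocal pairs, so
# lambda_k * lambda_{n+1-k} = 1 (and hence det(F) = 1).
products = evals * evals[::-1]
```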

If each eigenvalue of an matrix undergoes a small perturbation then the determinant, being the product of the eigenvalues, undergoes a perturbation up to about times as large. From (1) we see that a change to of order can perturb by , and it follows that some eigenvalues must undergo a large relative perturbation. The condition number of a simple eigenvalue is defined as the reciprocal of the cosine of the angle between its left and right eigenvectors. For the Frank matrix it is the small eigenvalues that are ill conditioned, as shown here for .

>> n = 9; F = anymatrix('gallery/frank',n);
>> [V,D,cond_evals] = condeig(F); evals = diag(D);
>> [~,k] = sort(evals,'descend');
>> [evals(k) cond_evals(k)]
ans =
   2.2320e+01   1.9916e+00
   1.2193e+01   2.3903e+00
   6.1507e+00   1.5268e+00
   2.6729e+00   3.6210e+00
   1.0000e+00   6.8615e+01
   3.7412e-01   1.5996e+03
   1.6258e-01   1.1907e+04
   8.2016e-02   2.5134e+04
   4.4803e-02   1.4762e+04

Frank found that “gives our selected procedures difficulties” and that “accuracy was lost in the smaller roots”. The difficulties encountered by Frank’s codes were shown by Wilkinson to be caused by the inherent sensitivity of the eigenvalues to perturbations in the matrix.

This is a minimal set of references, which contain further useful references within.

- Werner Frank, Computing Eigenvalues of Complex Matrices by Determinant Evaluation and by Methods of Danilewski and Wielandt, J. Soc. Indust. Appl. Math. 6(4), 378–392, 1958.
- Efruz Özlem Mersin and Mustafa Bahşi, Sturm Theorem for the Generalized Frank Matrix, Hacettepe Journal of Mathematics and Statistics 50, 1002–1011, 2021.
- J. H. Wilkinson, Error Analysis of Floating-Point Computation, Numer. Math. 2(1), 319–340, 1960.
- J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, 1965, Section 5.45.

where the norm is assumed to satisfy .

Note that the limit is taken from above. If we take the limit from below then we obtain a generally different quantity: writing ,

The logarithmic norm is not a matrix norm; indeed it can be negative: .

The logarithmic norm can also be expressed in terms of the matrix exponential.

Lemma 1. For ,

We give some basic properties of the logarithmic norm.

It is easy to see that

For , we define for and .

Lemma 2. For and ,

- ,
- ,
- ,
- .

The next result shows the crucial property that features in an easily evaluated bound for the norm of and that, moreover, is the smallest constant that can appear in this bound.

Theorem 3. For and a consistent matrix norm,

Moreover,

Proof. Given any , by Lemma 1 there exists such that

or

(since for all ). Then for any integer , , and hence holds for all . Since is arbitrary, it follows that .

Finally, if for all then for all and taking we conclude that .

The logarithmic norm was introduced by Dahlquist (1958) and Lozinskii (1958) in order to obtain error bounds for numerical methods for solving differential equations. The bound (2) can alternatively be proved by using differential inequalities (see Söderlind (2006)). Not only is (2) sharper than , but is increasing in while potentially decays, since is possible.

The *spectral abscissa* is defined by

where denotes the spectrum of (the set of eigenvalues). Whereas the norm bounds the spectral radius (), the logarithmic norm bounds the spectral abscissa, as shown by the next result.

Theorem 4. For and a consistent matrix norm,

Combining (1) with (2) gives

In view of Lemma 1, the logarithmic norm can be expressed as the one-sided derivative of at , so determines the initial rate of change of (as well as providing the bound for all ). The long-term behavior is determined by the spectral abscissa , since as if and only if , which can be proved using the Jordan canonical form.

The next result provides a bound on the norm of the inverse of a matrix in terms of the logarithmic norm.

Theorem 5. For a nonsingular matrix and a subordinate matrix norm, if or then

Explicit formulas are available for the logarithmic norms corresponding to the -, -, and -norms.

Theorem 6. For ,

where denotes the largest eigenvalue of a Hermitian matrix.

Proof. We have

The formula for follows, since implies . For the -norm, using , we have

As a special case of (4), if is normal, so that with unitary and , then .
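The 2-norm formula and the bound of Theorem 3 are easy to check numerically. A hedged Python sketch using SciPy's `expm`, with a small stable test matrix of my own choosing:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-2.0, 1.0],
              [0.0, -1.0]])

# 2-norm logarithmic norm: mu_2(A) = lambda_max((A + A^*)/2).
mu2 = np.linalg.eigvalsh((A + A.T) / 2).max()

# Check the bound ||exp(tA)||_2 <= exp(t * mu_2(A)) for t >= 0.
ts = np.linspace(0.0, 5.0, 11)
norms = np.array([np.linalg.norm(expm(t * A), 2) for t in ts])
bounds = np.exp(ts * mu2)
```

Here `mu2` is negative, so the bound shows that decays, even though for a non-normal matrix the norm need not decrease monotonically in general.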

We can make some observations about for .

- If then .
- if and only if for all and is strictly row diagonally dominant.
- For the -norm the bound (3) is the same as a bound based on diagonal dominance except that it is applicable only when the diagonal is one-signed.

For a numerical example, consider the tridiagonal matrix `anymatrix('gallery/lesp')`, which has the form illustrated for by

We find that and , and it is easy to see that and for all . Therefore Theorem 4 shows that as and gives a faster decaying bound than and .

Now consider the subordinate matrix norm based on the vector norm , where is a Hermitian positive definite matrix. The corresponding logarithmic norm can be expressed as the largest eigenvalue of a Hermitian definite generalized eigenvalue problem.

Theorem 7. For ,

Theorem 7 allows us to make a connection with the theory of ordinary differential equations (ODEs).

Let be symmetric positive definite and consider the inner product and the corresponding norm defined by . It can be shown that for ,

The function satisfies a one-sided Lipschitz condition if there is a function such that

for all in some region and all . For the linear differential equation with in (5), using (6) we obtain

and so the logarithmic norm can be taken as a one-sided Lipschitz constant. This observation leads to results on contractivity of ODEs; see Lambert (1991) for details.

For more on the use of the logarithmic norm in differential equations see Coppel (1965) and Söderlind (2006).

The proof of Theorem 3 is from Hinrichsen and Pritchard (2005).

This is a minimal set of references, which contain further useful references within.

- W. A. Coppel. Stability and Asymptotic Behavior of Differential Equations. D. C. Heath and Company, Boston, MA, USA, 1965.
- Germund Dahlquist. Stability and Error Bounds in the Numerical Integration of Ordinary Differential Equations. PhD thesis, Royal Inst. of Technology, Stockholm, Sweden, 1958.
- Diederich Hinrichsen and Anthony J. Pritchard. Mathematical Systems Theory I. Modelling, State Space Analysis, Stability and Robustness. Springer-Verlag, Berlin, Germany, 2005.
- J. D. Lambert. Numerical Methods for Ordinary Differential Systems. The Initial Value Problem. John Wiley, Chichester, 1991.
- Gustaf Söderlind. The Logarithmic Norm. History and Modern Theory. BIT, 46(3):631–652, 2006.
- Torsten Ström. On Logarithmic Norms. SIAM J. Numer. Anal., 12(5):741–753, 1975.

- Anymatrix: An Extensible MATLAB Matrix Collection (2021)
- What is a Diagonally Dominant Matrix? (2021)
- What Is a Matrix Norm? (2021)

An important example is the matrix that arises in discretizing the Poisson partial differential equation by a standard five-point operator, illustrated for by

It is symmetric positive definite, diagonally dominant, a Toeplitz matrix, and an -matrix.
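One common way to build this matrix (a sketch of ours, assuming the standard second-difference building block tridiag(−1, 2, −1)) is as a Kronecker sum:

```python
import numpy as np

def poisson2d(m):
    # 5-point Poisson matrix on an m x m grid as a Kronecker sum of
    # 1-D second-difference matrices tridiag(-1, 2, -1)
    T = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    return np.kron(np.eye(m), T) + np.kron(T, np.eye(m))

A = poisson2d(3)
assert np.allclose(A, A.T)                  # symmetric
assert np.min(np.linalg.eigvalsh(A)) > 0    # positive definite
```

For an m × m grid this produces an m² × m² matrix with 4 on the diagonal and −1 in the positions corresponding to the four grid neighbors.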

Tridiagonal matrices have many special properties and various algorithms exist that exploit their structure.

It can be useful to symmetrize a matrix by transforming it with a diagonal matrix. The next result shows when symmetrization is possible by similarity.

Theorem 1. If is tridiagonal and for all then there exists a diagonal with positive diagonal elements such that is symmetric, with element .

Proof. Let . Equating and elements in the matrix gives

or

As long as for all we can set and solve (2) to obtain real, positive , . The formula for the off-diagonal elements of the symmetrized matrix follows from (1) and (2).
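The proof is constructive, and the recurrence for the diagonal scaling is easy to implement. The Python sketch below (function names and the convention that b is the superdiagonal and c the subdiagonal are ours) assumes the products of corresponding off-diagonal pairs are positive and returns the diagonal of the scaling matrix with first element 1.

```python
import numpy as np

def symmetrize_scaling(a, b, c):
    """Diagonal of D such that inv(D) @ A @ D is symmetric, where A is
    tridiagonal with diagonal a, superdiagonal b, subdiagonal c and
    b[i] * c[i] > 0 for all i."""
    n = len(a)
    d = np.ones(n)
    for i in range(n - 1):
        d[i + 1] = d[i] * np.sqrt(c[i] / b[i])
    return d

a = np.array([4.0, 5.0, 6.0, 7.0])
b = np.array([1.0, 2.0, 3.0])    # superdiagonal
c = np.array([4.0, 1.0, 12.0])   # subdiagonal
A = np.diag(a) + np.diag(b, 1) + np.diag(c, -1)
d = symmetrize_scaling(a, b, c)
S = np.diag(1 / d) @ A @ np.diag(d)
assert np.allclose(S, S.T)
# for positive b, c the off-diagonal entries of S are sqrt(b[i]*c[i])
assert np.allclose(np.diag(S, 1), np.sqrt(b * c))
```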

The LU factors of a tridiagonal matrix are bidiagonal:

The equation gives the recurrence

The recurrence breaks down with division by zero if one of the leading principal submatrices , , is singular. In general, partial pivoting must be used to ensure existence and numerical stability, giving a factorization where has at most two nonzeros per column and has an extra superdiagonal. The growth factor is easily seen to be bounded by .
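The recurrence can be written down directly. The sketch below is a minimal Python version without pivoting (so it is subject to the breakdown just described); variable names are ours.

```python
import numpy as np

def tridiag_lu(a, b, c):
    """LU factors of the tridiagonal matrix with diagonal a,
    superdiagonal b, subdiagonal c (no pivoting): returns the
    subdiagonal l of the unit lower bidiagonal L and the diagonal u
    of the upper bidiagonal U (whose superdiagonal is b itself)."""
    n = len(a)
    u = np.empty(n)
    l = np.empty(n - 1)
    u[0] = a[0]
    for k in range(n - 1):
        if u[k] == 0.0:
            raise ZeroDivisionError("a leading principal submatrix is singular")
        l[k] = c[k] / u[k]
        u[k + 1] = a[k + 1] - l[k] * b[k]
    return l, u

a = np.full(5, 4.0); b = np.full(4, 1.0); c = np.full(4, 1.0)
l, u = tridiag_lu(a, b, c)
A = np.diag(a) + np.diag(b, 1) + np.diag(c, -1)
L = np.eye(5) + np.diag(l, -1)
U = np.diag(u) + np.diag(b, 1)
assert np.allclose(L @ U, A)
```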

For a tridiagonal Toeplitz matrix

the elements of the LU factors converge as if is diagonally dominant.

Theorem 2. Suppose that has an LU factorization with LU factors (3) and that and . Then decreases monotonically and

From (4), it follows that under the conditions of Theorem 2, increases monotonically and . Note that the conditions of Theorem 2 are satisfied if is diagonally dominant by rows, since implies . Note also that if we symmetrize using Theorem 1 then we obtain the matrix , which is irreducibly diagonally dominant and hence positive definite if .
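In the Toeplitz case the pivot recurrence reads u₁ = a, u_{k+1} = a − bc/u_k, so the limit is a fixed point of x ↦ a − bc/x. A quick Python check of the monotone decrease under the conditions of Theorem 2 (here with a = 4, b = c = 1, so the matrix is diagonally dominant):

```python
import numpy as np

# Pivot recurrence for the Toeplitz tridiagonal matrix tridiag(c, a, b):
# u_1 = a, u_{k+1} = a - b*c/u_k.
a, b, c = 4.0, 1.0, 1.0
u = [a]
for _ in range(40):
    u.append(a - b * c / u[-1])
u = np.array(u)

assert np.all(np.diff(u) <= 0)                    # monotone decrease
assert abs(u[-1] - (a - b * c / u[-1])) < 1e-12   # converged to a fixed point
print(u[-1])   # close to 2 + sqrt(3), the larger root of x^2 - 4x + 1 = 0
```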

The inverse of a tridiagonal matrix is full, in general. For example,

Since an tridiagonal matrix depends on only parameters, the same must be true of its inverse, meaning that there must be relations between the elements of the inverse. Indeed, any submatrix whose elements lie in the upper triangle is singular, and the submatrix is also singular. The next result explains this special structure. We note that a tridiagonal matrix is irreducible if and are nonzero for all .

Theorem 3. If is tridiagonal, nonsingular, and irreducible then there are vectors , , , and , all of whose elements are nonzero, such that

The theorem says that the upper triangle of the inverse agrees with the upper triangle of a rank- matrix () and the lower triangle of the inverse agrees with the lower triangle of another rank- matrix (). This explains the singular submatrices that we see in the example above.
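The rank-1 triangle structure can be verified numerically: for an irreducible tridiagonal matrix, every 2 × 2 submatrix of the inverse whose four entries all lie in the upper triangle has zero determinant. A Python sketch (the matrix is a randomly generated example of ours, made diagonally dominant so the inverse is well conditioned):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
# Random irreducible tridiagonal matrix: all sub/superdiagonal
# entries nonzero.
A = (np.diag(rng.uniform(6.0, 7.0, n))
     + np.diag(rng.uniform(1.0, 2.0, n - 1), 1)
     + np.diag(rng.uniform(1.0, 2.0, n - 1), -1))
B = np.linalg.inv(A)

# Every 2x2 submatrix of B with rows i1 < i2 and columns j1 < j2
# satisfying i2 <= j1 lies in the upper triangle and is singular.
for i1 in range(n):
    for i2 in range(i1 + 1, n):
        for j1 in range(i2, n):
            for j2 in range(j1 + 1, n):
                det2 = B[i1, j1] * B[i2, j2] - B[i1, j2] * B[i2, j1]
                assert abs(det2) < 1e-10
```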

If a tridiagonal matrix is reducible with then it has the block form

where , and so

in which the block is rank if . This blocking can be applied recursively until Theorem 3 can be applied to all the diagonal blocks.

The inverse of the Toeplitz tridiagonal matrix is known explicitly; see Dow (2003, Sec. 3.1).

The most widely used methods for finding eigenvalues and eigenvectors of Hermitian matrices reduce the matrix to tridiagonal form by a finite sequence of unitary similarity transformations and then solve the tridiagonal eigenvalue problem. Tridiagonal eigenvalue problems also arise directly, for example in connection with orthogonal polynomials and special functions.

The eigenvalues of the Toeplitz tridiagonal matrix in (5) are given by

The eigenvalues are also known for certain variations of the symmetric matrix in which the and elements are modified (Gregory and Karney, 1969).
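For reference, the standard closed form for the eigenvalues of the n × n Toeplitz tridiagonal matrix tridiag(c, a, b) with bc > 0 is a + 2√(bc) cos(kπ/(n + 1)), k = 1, …, n. A Python check for the second-difference matrix tridiag(−1, 2, −1):

```python
import numpy as np

n, a, b, c = 7, 2.0, -1.0, -1.0   # the second-difference matrix
A = a * np.eye(n) + b * np.eye(n, k=1) + c * np.eye(n, k=-1)
# Closed-form eigenvalues: a + 2*sqrt(b*c)*cos(k*pi/(n+1)), k = 1..n
formula = a + 2 * np.sqrt(b * c) * np.cos(np.arange(1, n + 1) * np.pi / (n + 1))
assert np.allclose(np.sort(np.linalg.eigvalsh(A)), np.sort(formula))
```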

Some special results hold for the eigenvalues of general tridiagonal matrices. A matrix is derogatory if an eigenvalue appears in more than one Jordan block in the Jordan canonical form, and nonderogatory otherwise.

Theorem 4. If is an irreducible tridiagonal matrix then it is nonderogatory.

Proof. Let , for any . The matrix is lower triangular with nonzero diagonal elements and hence it is nonsingular. Therefore has rank at least for all . If were derogatory then the rank of would be at most when is an eigenvalue, so must be nonderogatory.

Corollary 5. If is tridiagonal with for all then the eigenvalues of are real and simple.

Proof. By Theorem 1 the eigenvalues of are those of the symmetric matrix and so are real. The matrix is tridiagonal and irreducible so it is nonderogatory by Theorem 4, which means that its eigenvalues are simple because it is symmetric.

The formula (6) confirms the conclusion of Corollary 5 for tridiagonal Toeplitz matrices.

Corollary 5 guarantees that the eigenvalues are distinct but not that they are well separated. The spacing of the eigenvalues in (6) clearly reduces as increases. Wilkinson constructed a symmetric tridiagonal matrix called , defined by

For example,

Here are the two largest eigenvalues of , as computed by MATLAB.

>> A = anymatrix('matlab/wilkinson',21);
>> e = eig(A); e([20 21])
ans =
  10.746194182903322
  10.746194182903393

These eigenvalues (which are correct to the digits shown) agree almost to the machine precision.
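The same computation can be reproduced outside MATLAB. The sketch below builds the 21 × 21 Wilkinson matrix directly from its definition (diagonal 10, 9, …, 1, 0, 1, …, 10 and unit sub/superdiagonals) with NumPy:

```python
import numpy as np

# Wilkinson's 21 x 21 matrix: diagonal 10, 9, ..., 1, 0, 1, ..., 10
# and unit off-diagonals.
m = 10
W = (np.diag(np.abs(np.arange(-m, m + 1)).astype(float))
     + np.diag(np.ones(2 * m), 1)
     + np.diag(np.ones(2 * m), -1))
e = np.sort(np.linalg.eigvalsh(W))
print(e[-2:])                 # two nearly equal eigenvalues near 10.7462
assert e[-1] - e[-2] < 1e-12  # agreement almost to machine precision
```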

Theorem 2 is obtained for symmetric matrices by Malcolm and Palmer (1974), who suggest saving work in computing the LU factorization by setting for once is close enough to the limit.

A sample reference for Theorem 3 is Ikebe (1979), which is one of many papers on inverses of banded matrices.

The eigenvectors of a symmetric tridiagonal matrix satisfy some intricate relations; see Parlett (1998, sec. 7.9).

This is a minimal set of references, which contain further useful references within.

- Murray Dow. Explicit Inverses of Toeplitz and Associated Matrices. ANZIAM J., 44:E185–E215, 2003.
- Robert Gregory and David Karney. A Collection of Matrices for Testing Computational Algorithms. Wiley, 1969.
- Yasuhiko Ikebe. On Inverses of Hessenberg Matrices. Linear Algebra Appl., 24:93–97, 1979.
- Michael A. Malcolm and John Palmer. A Fast Method for Solving a Class of Tridiagonal Linear Systems. Comm. ACM, 17:14–17, 1974.
- Beresford Parlett. The Symmetric Eigenvalue Problem. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1998.
- J. H. Wilkinson. The Algebraic Eigenvalue Problem. Oxford University Press, 1965. Section 5.45.