What Is a Tridiagonal Matrix?

A tridiagonal matrix is a square matrix whose elements are zero away from the main diagonal, the subdiagonal, and the superdiagonal. In other words, it is a banded matrix with upper and lower bandwidths both equal to $1$ . It has the form

$\notag A = \begin{bmatrix} d_1 & e_1 & & &\\ c_2 & d_2 & e_2 & &\\ & c_3 & \ddots & \ddots &\\ & & \ddots & \ddots & e_{n-1}\\ & & & c_n & d_n \end{bmatrix} \in\mathbb{C}^{n\times n}.$

An important example is the matrix $T_n$ that arises in discretizating the Poisson partial differential equation by a standard five-point operator, illustrated for $n = 5$ by

$\notag T_5 = \left[ \begin{array}{@{}*{4}{r@{\mskip10mu}}r} 4 & -1 & & & \\ -1 & 4 & -1 & & \\ & -1 & 4 & -1 & \\ & & -1 & 4 & -1 \\ & & & -1 & 4 \end{array}\right].$

It is symmetric positive definite, diagonally dominant, a Toeplitz matrix, and an $M$ -matrix.

Tridiagonal matrices have many special properties and various algorithms exist that exploit their structure.

Symmetrization

It can be useful to symmetrize a matrix by transforming it with a diagonal matrix. The next result shows when symmetrization is possible by similarity.

Theorem 1. If $A\in\mathbb{R}^{n\times n}$ is tridiagonal and $c_i e_{i-1} > 0$ for all $i$ then there exists a diagonal $D\in\mathbb{R}^{n\times n}$ with positive diagonal elements such that $D^{-1}AD$ is symmetric, with $(i-1,i)$ element $(c_ie_{i-1})^{1/2}$ .

Proof. Let $D = \mathrm{diag}(\omega_i)$ . Equating $(i,i-1)$ and $(i-1,i)$ elements in the matrix $D^{-1}AD$ gives

$\notag c_i \displaystyle\frac{\omega_{i-1}}{\omega_i} = e_{i-1} \displaystyle\frac{\omega_i}{\omega_{i-1}}, \quad i = 2:n, \qquad(1)$

or

$\notag \biggl(\displaystyle\frac{\omega_{i-1}}{\omega_i}\biggr)^2 = \frac{e_{i-1} }{c_i}, \quad i = 2:n. \qquad (2)$

As long as $c_i e_{i-1} > 0$ for all $i$ we can set $\omega_1 = 1$ and solve (2) to obtain real, positive $\omega_i$ , $i = 2:n$ . The formula for the off-diagonal elements of the symmetrized matrix follows from (1) and (2).

LU Factorization

The LU factors of a tridiagonal matrix are bidiagonal:

$\notag L = \begin{bmatrix} 1 & & & & \\ \ell_2 & 1 & & & \\ & \ell_3 & 1 & & \\ & & \ddots & \ddots & \\ & & & \ell_n & 1 \end{bmatrix}, \quad U = \begin{bmatrix} u_1 & e_1 & & & \\ & u_2 & e_2 & & \\ & & \ddots & \ddots & \\ & & & \ddots & e_{n-1}\\ & & & & u_n \end{bmatrix}. \qquad (3)$

The equation $A = LU$ gives the recurrence

$\notag u_1 = d_1, \qquad \ell_i = c_i/u_{i-1}, \quad u_i = d_i - \ell_i e_{i-1}, \quad i=2\colon n. \qquad (4)$

The recurrence breaks down with division by zero if one of the leading principal submatrices $A(1:k,1:k)$ , $k = 1:n-1$ , is singular. In general, partial pivoting must be used to ensure existence and numerical stability, giving a factorization $PA = LU$ where $L$ has at most two nonzeros per column and $U$ has an extra superdiagonal. The growth factor $\rho_n$ is easily seen to be bounded by $2$ .

For a tridiagonal Toeplitz matrix

$\notag T_n(c,d,e) = \begin{bmatrix} d & e & & \\ c & d & \ddots & \\ & \ddots & \ddots & e \\ & & c & d \end{bmatrix} \in\mathbb{C}^{n\times n} \qquad (5)$

the elements of the LU factors converge as $n\to\infty$ if $A$ is diagonally dominant.

Theorem 2. Suppose that $T_n(c,d,e)$ has an LU factorization with LU factors (3) and that $ce > 0$ and $|d| \ge 2\sqrt{ce}$ . Then $|u_j|$ decreases monotonically and

$\notag \lim_{n\to \infty}u_n = \frac{1}{2} \Bigl(d + \mathrm{sign}(d) \sqrt{ d^2 - 4ce } \Bigr).$

From (4), it follows that under the conditions of Theorem 2, $|\ell_i|$ increases monotonically and $\lim_{n\to\infty}\ell_n = e/\lim_{n\to\infty}u_n$ . Note that the conditions of Theorem 2 are satisfied if $A$ is diagonally dominant by rows, since $ce> 0$ implies $d \ge c + e \ge 2\sqrt{ce}$ . Note also that if we symmetrize $A$ using Theorem 1 then we obtain the matrix $T_n(\sqrt{ce},d, \sqrt{ce})$ , which is irreducibly diagonally dominant and hence positive definite if $d > 0$ .

Inverse

The inverse of a tridiagonal matrix is full, in general. For example,

$\notag T_5(-1,3,-1)^{-1} = \left[\begin{array}{rrrrr} 3 & -1 & 0 & 0 & 0\\ -1 & 3 & -1 & 0 & 0\\ 0 & -1 & 3 & -1 & 0\\ 0 & 0 & -1 & 3 & -1\\ 0 & 0 & 0 & -1 & 3 \end{array}\right]^{-1} = \frac{1}{144} \left[\begin{array}{ccccc} 55 & 21 & 8 & 3 & 1 \\ 21 & 63 & 24 & 9 & 3 \\ 8 & 24 & 64 & 24 & 8 \\ 3 & 9 & 24 & 63 & 21\\ 1 & 3 & 8 & 21 & 55 \end{array}\right].$

Since an $n\times n$ tridiagonal matrix depends on only $3n-2$ parameters, the same must be true of its inverse, meaning that there must be relations between the elements of the inverse. Indeed, in $T_5(-1,3,-1)^{-1}$ any $2\times 2$ submatrix whose elements lie in the upper triangle is singular, and the $(1:3,3:5)$ submatrix is also singular. The next result explains this special structure. We note that a tridiagonal matrix is irreducible if $a_{i+1,i}$ and $a_{i,i+1}$ are nonzero for all $i$ .

Theorem 3. If $A\in\mathbb{C}^{n\times n}$ is tridiagonal, nonsingular, and irreducible then there are vectors $u$ , $v$ , $x$ , and $y$ , all of whose elements are nonzero, such that

$\notag (A^{-1})_{ij} = \begin{cases} u_iv_j, & i\le j,\\ x_iy_j, & i\ge j.\\[4pt] \end{cases}$

The theorem says that the upper triangle of the inverse agrees with the upper triangle of a rank- $1$ matrix ( $uv^T$ ) and the lower triangle of the inverse agrees with the lower triangle of another rank- $1$ matrix ( $xy^T$ ). This explains the singular submatrices that we see in the example above.

If a tridiagonal matrix $A$ is reducible with $a_{k,k+1} = 0$ then it has the block form

$\notag \begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix},$

where $A_{21} = a_{k+1,k}e_1e_k^T$ , and so

$\notag \begin{bmatrix} A_{11}^{-1} & 0 \\ - A_{22}^{-1} A_{21} A_{11}^{-1} & A_{22}^{-1} \end{bmatrix},$

in which the $(2,1)$ block is rank $1$ if $a_{k+1,k}\ne 0$ . This blocking can be applied recursively until Theorem 1 can be applied to all the diagonal blocks.

The inverse of the Toeplitz tridiagonal matrix $T_n(a,b,c)$ is known explicitly; see Dow (2003, Sec. 3.1).

Eigenvalues

The most widely used methods for finding eigenvalues and eigenvectors of Hermitian matrices reduce the matrix to tridiagonal form by a finite sequence of unitary similarity transformations and then solve the tridiagonal eigenvalue problem. Tridiagonal eigenvalue problems also arise directly, for example in connection with orthogonal polynomials and special functions.

The eigenvalues of the Toeplitz tridiagonal matrix $T_n(c,d,e)$ in (5) are given by

$\notag d + 2 (ce)^{1/2} \cos\biggl( \displaystyle\frac{k \pi}{n+1} \biggr), \quad k = 1:n. \qquad (6)$

The eigenvalues are also known for certain variations of the symmetric matrix $T_n(c,d,c)$ in which the $(1,1)$ and $(n,n)$ elements are modified (Gregory and Karney, 1969).

Some special results hold for the eigenvalues of general tridiagonal matrices. A matrix is derogatory if an eigenvalue appears in more than one Jordan block in the Jordan canonical form, and nonderogatory otherwise.

Theorem 4. If $A\in\mathbb{C}^{n\times n}$ is an irreducible tridiagonal matrix then it is nonderogatory.

Proof. Let $G = A - \lambda I$ , for any $\lambda$ . The matrix $G(1:n-1,2:n)$ is lower triangular with nonzero diagonal elements $e_1, \dots, e_{n-1}$ and hence it is nonsingular. Therefore $G$ has rank at least $n-1$ for all $\lambda$ . If $A$ were derogatory then the rank of $G$ would be at most $n-2$ when $\lambda$ is an eigenvalue, so $A$ must be nonderogatory.

Corollary 5. If $A\in\mathbb{R}^{n\times n}$ is tridiagonal with $c_i e_{i-1} > 0$ for all $i$ then the eigenvalues of $A$ are real and simple.

Proof. By Theorem 1 the eigenvalues of $A$ are those of the symmetric matrix $S = D^{-1}AD$ and so are real. The matrix $S$ is tridiagonal and irreducible so it is nonderogatory by Theorem 4, which means that its eigenvalues are simple because it is symmetric.

The formula (6) confirms the conclusion of Corollary 5 for tridiagonal Toeplitz matrices.

Corollary 5 guarantees that the eigenvalues are distinct but not that they are well separated. The spacing of the eigenvalues in (6) clearly reduces as $n$ increases. Wilkinson constructed a symmetric tridiagonal matrix called $W_n^+$ , defined by

$\notag \begin{aligned} d_i &= \frac{n+1}{2} - i = d_{n-i+1}, \quad i = 1:n/2, \qquad d_{(n+1)/2} = 0 \quad \mathrm{if}~n~\mathrm{is~odd}, \\ c_i &= e_{i-1} = 1. \end{aligned}$

For example,

$\notag W_5^+ = \left[\begin{array}{ccccc} 2 & 1 & 0 & 0 & 0\\ 1 & 1 & 1 & 0 & 0\\ 0 & 1 & 0 & 1 & 0\\ 0 & 0 & 1 & 1 & 1\\ 0 & 0 & 0 & 1 & 2 \end{array}\right].$

Here are the two largest eigenvalues of $W_{21}^+$ , as computed by MATLAB.

>> A = anymatrix('matlab/wilkinson',21); 
>> e = eig(A); e([20 21])
ans =
  10.746194182903322
  10.746194182903393

These eigenvalues (which are correct to the digits shown) agree almost to the machine precision.

Notes

Theorem 2 is obtained for symmetric matrices by Malcolm and Palmer (1974), who suggest saving work in computing the LU factorization by setting $u_j = u_k$ for $j > k$ once $u_k$ is close enough to the limit.

A sample reference for Theorem 3 is Ikebe (1979), which is one of many papers on inverses of banded matrices.

The eigenvectors of a symmetric tridiagonal matrix satisfy some intricate relations; see Parlett (1998, sec. 7.9).

References

This is a minimal set of references, which contain further useful references within.

Murray Dow, Explicit Inverses of Toeplitz and Associated Matrices, ANZIAM J. 44, E185–E215, 2003.
Robert Gregory and David Karney, A Collection of Matrices for Testing Computational Algorithms, Wiley, 1969.
Yasuhiko Ikebe, On inverses of Hessenberg matrices. Linear Algebra Appl., 24:93–97, 1979.
Michael A. Malcolm and John Palmer, A Fast Method for Solving a Class of Tridiagonal Linear Systems, Comm. ACM 17, 14–17, 1974.
Beresford Parlett, The Symmetric Eigenvalue Problem, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1998.
J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, 1965, Section 5.45.

What Is a Blocked Algorithm?

In numerical linear algebra a blocked algorithm organizes a computation so that it works on contiguous chunks of data. A blocked algorithm and the corresponding unblocked algorithm (with blocks of size $1$ ) are mathematically equivalent, but the blocked algorithm is generally more efficient on modern computers.

A simple example of blocking is in computing an inner product $s = x^Ty$ of vectors $x,y\in\mathbb{R}^n$ . The unblocked algorithm

$s = x_1y_1$
for $i = 2\colon n$
$s = s + x_ky_k$
end

can be expressed in blocked form, with block size $b$ , as

for $i=1\colon n/b$
$s_i = \sum_{j=(i-1)b+1}^{ib} x_jy_j$ % Compute by the unblocked algorithm.
end
$s = \sum_{i=1}^{n/b} s_i$

The sums of $b$ terms in line 2 of the blocked algorithm are independent and could be evaluated in parallel, whereas the unblocked form is inherently sequential.

To see the full benefits of blocking we need to consider an algorithm operating on matrices, of which matrix multiplication is the most important example. Suppose we wish to compute the product $C = AB$ of $n\times n$ matrices $A$ and $B$ . The natural computation is, from the definition of matrix multiplication, the “point algorithm”

$C = 0$
for $i=1\colon n$
for $j=1\colon n$
for $k=1\colon n$
$c_{ij} = c_{ij} + a_{ik}b_{kj}$
end
end
end

Let $A = (A_{ij})$ , $B = (B_{ij})$ , and $C = (C_{ij})$ be partitioned into blocks of size $b$ , where $r = n/b$ is assumed to be an integer. The blocked computation is

$C = 0$
for $i=1\colon r$
for $j=1\colon r$
for $k=1\colon r$
$C_{ij} = C_{ij} + A_{ik}B_{kj}$
end
end
end

On a computer with a hierarchical memory the blocked form can be much more efficient than the point form if the blocks fit into the high speed memory, as much less data transfer is required. Indeed line 5 of the blocked algorithm performs $O(b^3)$ flops on about $O(n^2)$ data, whereas the point algorithm performs $O(1)$ flops on $O(1)$ data on line 5, or $O(n)$ flops on $O(n)$ data if we combine lines 4–6 into a vector inner product. It is the $O(b)$ flops-to-data ratio that gives the blocked algorithm its advantage, because it masks the memory access costs.

The LAPACK (first released in 1992) was the first program library to systematically use blocked algorithms for a wide range of linear algebra computations.

An extra advantage that blocked algorithms have over unblocked algorithms is a reduction in the error constant in a rounding error bound by a factor $b$ or more and, typically, a reduction in the actual error (Higham, 2021).

The adjective “block” is sometimes used in place of “blocked”, but we prefer to reserve “block” for the meaning described in the next section.

Block Algorithms

A block algorithm is a generalization of a scalar algorithm in which the basic scalar operations become matrix operations ( $\alpha+\beta$ , $\alpha\beta$ , and $\alpha/\beta$ become $A+B$ , $AB$ , and $AB^{-1}$ ). It usually computes a block factorization, in which a matrix property based on the nonzero structure becomes the corresponding property blockwise; in particular, the scalars 0 and 1 become the zero matrix and the identity matrix, respectively.

An example of a block factorization is block LU factorization. For a $4\times 4$ matrix $A$ an LU factorization can be written in the form

$\notag A = \left[ \begin{tabular}{cc|cc} 1 & & & \\ x & 1 & & \\ \hline x & x & 1 & \\ x & x & x & 1 \end{tabular} \right] \left[ \begin{tabular}{cc|cc} x & x & x & x \\ & x & x & x \\ \hline & & x & x \\ & & & x \end{tabular} \right].$

A block LU factorization has the form

$\notag A = \left[ \begin{tabular}{cc|cc} 1 & 0 & & \\ 0 & 1 & & \\ \hline x & x & 1 & 0 \\ x & x & 0 & 1 \end{tabular} \right] \left[ \begin{tabular}{cc|cc} x & x & x & x \\ x & x & x & x \\ \hline & & x & x \\ & & x & x \end{tabular} \right].$

Clearly, these are different factorizations. In general, a block LU factorization has the form $A = LU$ with $L$ block lower triangular, with identity matrices on the diagonal, and $U$ block upper triangular. A blocked algorithm for computing the LU factorization and a block algorithm for computing the block LU factorization are readily derived.

The adjective “block” also applies to a variety of matrix properties, for which there are often special-purpose block algorithms. For example, the matrix

$\notag \begin{bmatrix} A_{11} & A_{12} & & & \\ A_{21} & A_{22} & A_{23} & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \ddots & A_{n-1,1} \\ & & & A_{n,n-1} & A_{n,n} \end{bmatrix}$

is a block tridiagonal matrix, and a block Toeplitz matrix has constant block diagonals:

$\notag \begin{bmatrix} A_1 & A_2 & \dots & \dots & A_n \\ A_{-1} & A_1 & A_{2} & & \vdots \\ \vdots & A_{-1} & \ddots & \ddots & \vdots \\ \vdots & & \ddots & \ddots & A_2 \\ A_{1-n}& \dots & \dots & A_{-1} & A_1 \end{bmatrix}.$

One can define block diagonal dominance as a generalization of diagonal dominance. A block Householder matrix generalizes a Householder matrix: it is a perturbation of the identity matrix by a matrix of rank greater than or equal to $1$ .

References

This is a minimal set of references, which contain further useful references within.

Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002. (Chapter 13, “Block LU Factorization”.)
Nicholas J. Higham, Numerical Stability of Algorithms at Extreme Scale and Low Precisions, MIMS EPrint 2021.14, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, September 2021.

What Is a Matrix Norm?

A matrix norm is a function $\|\cdot\| : \mathbb{C}^{m\times n} \to \mathbb{R}$ satisfying

$\|A\| \ge 0$ with equality if and only if $A=0$ (nonnegativity),
$\|\alpha A\| =|\alpha| \|A\|$ for all $\alpha\in\mathbb{C}$ , $A\in\mathbb{C}^{m\times n}$ (homogeneity),
$\|A+B\| \le \|A\|+\|B\|$ for all $A,B\in\mathbb{C}^{m\times n}$ (the triangle inequality).

These are analogues of the defining properties of a vector norm.

An important class of matrix norms is the subordinate matrix norms. Given a vector norm on $\mathbb{C}^n$ the corresponding subordinate matrix norm on $\mathbb{C}^{m\times n}$ is defined by

$\notag \|A\| = \displaystyle\max_{x\ne0} \frac{ \|Ax\| }{ \|x\| },$

or, equivalently,

$\|A\| = \displaystyle\max_{\|x\| = 1} \|Ax\|.$

For the $1$ -, $2$ -, and $\infty$ -vector norms it can be shown that

$\notag \begin{alignedat}{2} \|A\|_1 &= \max_{1\le j\le n}\sum_{i=1}^m |a_{ij}|, &&\qquad \mbox{``max column sum'',} \\ \|A\|_{\infty} &= \max_{1\le i\le m}\sum_{j=1}^n |a_{ij}|, &&\qquad \mbox{``max row sum'',} \\ \|A\|_2 &= \sigma_{\max}(A), &&\qquad \mbox{spectral norm,} \end{alignedat}$

where $\sigma_{\max}(A)$ denotes the largest singular value of $A$ . To remember the formulas for the $1$ – and $\infty$ -norms, note that 1 is a vertical symbol (for columns) and $\infty$ is a horizontal symbol (for rows). For the general Hölder $p$ -norm no explicit formula is known for $\|A\|_p$ for $p\ne 1,2,\infty$ .

An immediate implication of the definition of subordinate matrix norm is that

$\notag \|AB\| \le \|A\| \|B\|$

whenever the product is defined. A norm satisfying this condition is called consistent or submultiplicative.

Another commonly used norm is the Frobenius norm,

$\notag \|A\|_F = \biggl( \displaystyle\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 \biggr)^{1/2} = \bigl( \mathrm{trace}(A^*\!A) \bigr)^{1/2}.$

The Frobenius norm is not subordinate to any vector norm (since $\|I_n\|_F = n^{1/2}$ , whereas $\|I_n\| = 1$ for any subordinate norm), but it is consistent.

The $2$ -norm and the Frobenius norm are unitarily invariant: they satisfy $\|UAV\| = \|A\|$ for any unitary matrices $U$ and $V$ . For the Frobenius norm the invariance follows easily from the trace formula.

As for vector norms, all matrix norms are equivalent in that any two norms differ from each other by at most a constant. This table shows the optimal constants $\alpha_{pq}$ such that $\|A\|_p \le \alpha_{pq} \|A\|_q$ for $A\in\mathbb{C}^{m\times n}$ and the norms mentioned above. Each inequality in the table is attained for some $A$ .

For any $A\in\mathbb{C}^{n\times n}$ and any consistent matrix norm,

$\notag \rho(A) \le \|A\|, \qquad\qquad (1)$

where $\rho$ is the spectral radius (the largest absolute value of any eigenvalue). To prove this inequality, let $\lambda$ be an eigenvalue of $A$ and $x$ the corresponding eigenvector (necessarily nonzero), and form the matrix $X=[x,x,\dots,x] \in \mathbb{C}^{n\times n}$ . Then $AX = \lambda X$ , so $|\lambda| \|X\| = \|AX\| \le \|A\| \, \|X\|$ , giving $|\lambda|\le\|A\|$ since $X \ne 0$ . For a subordinate norm it suffices to take norms in the equation $Ax = \lambda x$ .

A useful relation is

$\notag \|A\|_2 \le \bigl( \|A\|_1 \|A\|_{\infty}\bigr)^{1/2}, \qquad\qquad (2)$

which follows from $\|A\|_2^2 = \|A^*A\|_2 = \rho(A^*A) \le \| A^*A \|_{\infty} \le \|A^*\|_{\infty} \|A\|_{\infty} = \|A\|_1 \|A\|_{\infty}$ , where the first two equalities are obtained from the singular value decomposition and we have used (1).

Mixed Subordinate Matrix Norms

A more general subordinate matrix norm can be defined by taking different vector norms in the numerator and denominator:

$\notag \|A\|_{\alpha,\beta} = \displaystyle\max_{x\ne 0} \frac{ \|Ax\|_{\beta} }{ \|x\|_{\alpha} }.$

Some authors denote this norm by $\|A\|_{\alpha\to\beta}$ .

A useful characterization of $\|A\|_{\alpha,\beta}$ is given in the next result. Recall that $\|\cdot\|^D$ denotes the dual of the vector norm $\|\cdot\|$ .

Theorem 1. For $A\in\mathbb{C}^{m\times n}$ ,

$\notag \|A\|_{\alpha,\beta} = \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha}.$

Proof. We have

$\notag \begin{aligned} \displaystyle \max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha} &= \max_{x\in\mathbb{C}^n} \frac{1}{\|x\|_\alpha} \max_{y\in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D} \\ &= \max_{x\in\mathbb{C}^n} \frac{\| Ax \|_\beta }{\|x\|_\alpha} = \|A\|_{\alpha,\beta}, \end{aligned}$

where the second equality follows from the definition of dual vector norm and the fact that the dual of the dual norm is the original norm.

We can now obtain a connection between the norms of $A$ and $A^*$ . Here, $\|A^*\|_{\beta^D,\alpha^D}$ denotes $\max_{x\ne 0} \|Ax\|_\alpha^D / \|x\|_\beta^D$ .

Theorem 2. If $A\in\mathbb{C}^{m\times n}$ then $\|A\|_{\alpha,\beta} = \|A^*\|_{\beta^D,\alpha^D}$ .

Proof. Using Theorem 1, we have

$\notag \|A^*\|_{\alpha,\beta} = \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha} = \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m} \frac{\mathrm{Re}\, x^*(A^*y)}{\|x\|_\alpha \|y\|_{\beta^D}} = \|A^*\|_{\beta^D,\alpha^D}. \quad\square$

If we take the $\alpha$ – and $\beta$ -norms to be the same $p$ -norm then we have $\|A\|_p = \|A^*\|_q$ , where $p^{-1} + q^{-1} = 1$ (giving, in particular, $\|A\|_2 = \|A^*\|_2$ and $\|A\|_1 = \|A^*\|_\infty$ , which are easily obtained directly).

Now we give explicit formulas for the $(\alpha,\beta)$ norm when $\alpha$ or $\beta$ is $1$ or $\infty$ and the other is a general norm.

Theorem 3. For $A\in\mathbb{C}^{m\times n}$ ,

$\notag \begin{alignedat}{2} \|A\|_{1,\beta} &= \max_j \| A(:,j) \|_{\beta}, &\qquad\qquad& (3) \\ \|A\|_{\alpha,\infty} &= \max_i \|A(i,:)^*\|_{\alpha}^D, &\qquad\qquad& (4) \\ \end{alignedat}$

For $A\in\mathbb{R}^{m\times n}$ ,

$\notag \|A\|_{\infty,\beta} = \displaystyle\max_{x\in Z} \|Az\|_{\beta}, \qquad\qquad (5)$

where

$Z = \{ z\in\mathbb{R}^n: z_i = \pm 1~\mathrm{for~all}~i \,\},$

and if $A$ is symmetric positive semidefinite then

$\notag \|A\|_{\infty,1} = \displaystyle\max_{x\in Z} z^T\!Az.$

Proof. For (3),

$\notag \|Ax\|_{\beta} = \Big\| \sum_j x_j A(\colon,j) \Bigr\|_{\beta} \le \displaystyle\max_j \| A(\colon,j) \|_{\beta} \sum_j |x_j|,$

with equality for $x=e_k$ , where the maximum is attained for $j=k$ . For (4), using the Hölder inequality,

$\|Ax\|_{\infty} = \displaystyle\max_i | A(i,\colon) x | \le \max_i \bigl( \|A(i,\colon)^*\|_{\alpha}^D \|x\|_{\alpha} \bigr) = \|x\|_{\alpha} \max_i \|A(i,\colon)^*\|_{\alpha}^D.$

Equality is attained for an $x$ that gives equality in the Hölder inequality involving the $k$ th row of $A$ , where the maximum is attained for $i=k$ .

Turning to (5), we have $\|A\|_{\infty,\beta} = \max_{ \|x\|_\infty =1} \|Ax\|_\beta$ . The unit cube $\{\, x\in\mathbb{R}^n: -e \le x \le e\,\}$ , where $e = [1,1,\dots,1]^T$ , is a convex polyhedron, so any point within it is a convex combination of the vertices, which are the elements of $Z$ . Hence $\|x\|_\infty = 1$ implies

$\notag x = \displaystyle\sum_{z\in Z} \lambda_z Z, \quad \mathrm{where} \quad \lambda_z \ge 0, \quad \sum_{z\in Z} \lambda_z = 1$

and then

$\notag \|Ax\|_\beta = \biggl\| \displaystyle\sum_{z\in Z} \lambda_z Az \biggr\|_\beta \le \max_{z\in Z} \|Az\|_\beta.$

Hence $\max_{\|x\|_\infty = 1} \|Ax\|_\beta \le \max_{z\in Z} \|Az\|_\beta$ , but trivially $\max_{z\in Z} \|Az\|_\beta \le \max_{\|x\|_\infty = 1} \|Ax\|_\beta$ and (5) follows.

Finally, if $A$ is symmetric positive semidefinite let $\xi_z = \mathrm{sign}(Az) \in Z$ . Then, using a Cholesky factorization $A = R^T\!R$ (which exists even if $A$ is singular) and the Cauchy–Schwarz inequality,

$\notag \begin{aligned} \max_{z\in Z} \|Az\|_1 &= \max_{z\in Z} \xi_z^T Az = \max_{z\in Z} \xi_z^T R^T\!Rz\\ &= \max_{z\in Z} (R\xi_z)^T Rz \le \max_{z\in Z} \|R\xi_z\|_2 \|Rz\|_2\\ &\le \max_{z\in Z} \|Rz\|_2^2\\ &= \max_{z\in Z} z^T\!Az. \end{aligned}$

Conversely, for $z\in Z$ we have

$\notag z^T\!Az = |z^T\!Az| \le |z|^T|Az| = e^T |Az| = \|Az\|_1,$

so $\max_{z\in Z} z^T\!Az \le \max_{z\in Z} \|Az\|_1$ . Hence $\max_{z\in Z} z^T\!Az = \max_{z\in Z} \|Az\|_1 = \|A\|_{\infty,1}$ , using (5). $\square$

As special cases of (3) and (4) we have

$\notag \begin{aligned} \|A\|_{1,\infty} &= \max_{i,j} |a_{ij}|, \qquad\qquad\qquad (6) \\ \|A\|_{2,\infty} &= \max_i \|A(i,:)^*\|_2, \\ \|A\|_{1,2} &= \max_J \|A(:,j)\|_2. \end{aligned}$

We also obtain by using Theorem 2 and (5), for $A\in\mathbb{R}^{m\times n}$ ,

$\notag \|A\|_{2,1} = \displaystyle\max_{z\in Z} \|A^Tz\|_2.$

The $(2,\infty)$ -norm has recently found use in statistics (Cape, Tang, and Priebe, 2019), the motivation being that because it satisfies

$\notag \|A\|_{2,\infty} \le \|A\|_2 \le m^{1/2} \|A\|_{2,\infty},$

the $(2,\infty)$ -norm can be much smaller than the $2$ -norm when $m \gg n$ and so can be a better norm to use in bounds. The $(2,\infty)$ – and $(\infty,2)$ -norms are used by Rebrova and Vershynin (2018) in bounding the $2$ -norm of a random matrix after zeroing a submatrix. They note that the $2$ -norm of a random $n\times n$ matrix involves maximizing over infinitely many random variables, while the $(\infty,2)$ -norm and $(2,\infty)$ -norm involve only $2^n$ and $n$ random variables, respectively.

The ( $\alpha,\beta$ ) norm is not consistent, but for any vector norm $\gamma$ , we have

$\notag \begin{aligned} \|AB\|_{\alpha,\beta} &= \max_{x\ne0} \frac{ \|ABx\|_\beta }{ \|x\|_\alpha} = \max_{x\ne0} \frac{ \|A(Bx)\|_\beta }{ \|Bx\|_\gamma} \frac{ \|Bx\|_\gamma}{ \|x\|_\alpha} \le \max_{x\ne0} \frac{ \|A(Bx)\|_\beta }{ \|Bx\|_\gamma} \max_{x\ne0} \frac{ \|Bx\|_\gamma}{ \|x\|_\alpha}\\ &\le \|A\|_{\gamma,\beta} \|B\|_{\alpha,\gamma}. \end{aligned}$

Matrices With Constant $p$ -Norms

Let $A$ be a nonnegative square matrix whose row and column sums are all equal to $\mu$ . This class of matrices includes magic squares and doubly stochastic matrices. We have $\|A\|_1 = \|A\|_{\infty} = \mu$ , so $\|A\|_2 \le \mu$ by (2). But $Ae = \mu e$ for the vector $e$ of $1$ s, so $\mu$ is an eigenvalue of $A$ and hence $\|A\|_2 \ge \mu$ by (1). Hence $\|A\|_1 = \|A\|_2 = \|A\|_{\infty} = \mu$ . In fact, $\|A\|_p = \mu$ for all $p\in[1,\infty]$ (see Higham (2002, Prob. 6.4) and Stoer and Witzgall (1962)).

Computing Norms

The problem of computing or estimating $\|A\|_{\alpha,\beta}$ may be difficult for two reasons: we may not have an explicit formula for the norm, or $A$ may be given implicitly, as $A = f(B)$ , $A = B^{-1}$ , or $A = BC$ , for example, and we may wish to compute the norm without forming $A$ . Both difficulties are overcome by a generalization of the power method proposed by Tao in 1975 for arbitrary norms and independently Boyd in 1984 for $\alpha$ and $\beta$ both $p$ -norms (see Higham, 2002, Sec. $\,$ 15.2). The power method accesses $A$ only through matrix–vector products with $A$ and $A^*$ , so $A$ does not need to be explicitly available.

Let us focus on the case where $\alpha$ and $\beta$ are the same $p$ -norm. The power method is implemented in the Matrix Computation Toolbox as pnorm and we use it here to estimate the $\pi$ -norm and the $99$ -norm, which we compare with the $1$ -, $2$ -, and $\infty$ -norms.

>> A = gallery('frank',4)
A =
     4     3     2     1
     3     3     2     1
     0     2     2     1
     0     0     1     1
>> [norm(A,1) norm(A,2) pnorm(A,pi), pnorm(A,99), norm(A,inf)]
ans =
   8.0000e+00   7.6237e+00   8.0714e+00   9.8716e+00   1.0000e+01

The plot in the top half of the following figure shows the estimated $p$ -norm for the matrix gallery('binomial',8) for $p \in[1,25]$ . This shape of curve is typical, because, as the plot in the lower half indicates, $\log(\|A\|_p)$ is a convex function of $1/p$ for $p\ge 1$ (see Higham, 2002, Sec. $\,$ 6.3).

The power method for the $(1,\infty)$ norm, which is the max norm in (6), is investigated by Higham and Relton (2016) and extended to estimate the $k$ largest values $a_{ij}$ or $|a_{ij}|$ .

A Norm Related to Grothendieck’s Inequality

A norm on $\mathbb{C}^{n\times n}$ can be defined by

$\notag \begin{aligned} \|A\|^{(t)} = \max \biggl\{ \bigg|\sum_{i,j=1}^n a_{ij} y^*_jx_i \bigg| : x_i,y_j \in\mathbb{C}^t, \; \|x_i\|_2 \le 1, \; \|y_j\|_2 \le 1, \; i,j=1\colon t \, \biggr\}. \end{aligned}$

Note that

$\notag \|A\|^{(1)} = \displaystyle\max_{0\ne x,y\in\mathbb{C}^n} \frac{|y^*Ax|}{\|y\|_\infty \|x\|_\infty} = \|A\|_{\infty,1}.$

A famous inequality of Grothendieck states that $\|A\|^{(n)} \le c \|A\|^{(1)}$ for all $A$ , for some constant $c$ independent of $n$ . Much work has been done to estimate the smallest possible constant $c$ , which is known to be in the interval $[1.33,1.41]$ , or $[1.67,1.79]$ for the analogous inequality for real data. See Friedland, Lim, and Zhang (2018) and the references therein.

Notes

The formula (5) with $\beta = 1$ is due to Rohn (2000), who shows that evaluating it is NP-hard. For general $\beta$ , the formula is given by Lewis (2010).

Computing mixed subordinate matrix norms based on $p$ -norms is generally NP-hard (Hendrickx and Olshevsky, 2010).

References

This is a minimal set of references, which contain further useful references within.

Joshua Cape, Minh Tang, and Carey Priebe, The Two-To-Infinity Norm and Singular Subspace Geometry with Applications to High-Dimensional Statistics, Ann. Statist. 47(5), 2405–2439, 2019.
Shmuel Friedland, Lek-Heng Lim and Jinjie Zhang, An Elementary and Unified Proof of Grothendieck’s Inequality, L’Enseignement Mathématique 64 (3), 327–351, 2018.
Julien Hendrickx and Alex Olshevsky, Matrix $p$ -Norms are NP-Hard to Approximate if $p\neq1,2,\infty$ , SIAM J. Matrix Anal. Appl. 31(5), 2802–2812, 2010.
Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002.
Nicholas J. Higham and Samuel Relton, Estimating the Largest Elements of a Matrix, SIAM J. Sci. Comput. 38(5), C584–C601, 2016,
Andrew D. Lewis, A Top Nine List: Most Popular Induced Matrix Norms, Manuscript, 2010.
E. Rebrova and R. Vershynin, Norms of Random Matrices: Local and Global Problems, Adv. Math. 324, 40–83, 2018.
Josef Stoer and C. Witzgall, Transformations by Diagonal Matrices in a Normed Space, Numer. Math. 4(1), 158–171, 1962.

What Is a Vector Norm?

A vector norm measures the size, or length, of a vector. For complex $n$ -vectors, a vector norm is a function $\|\cdot\| : \mathbb{C}^n \to \mathbb{R}$ satisfying

$\|x\| \ge 0$ with equality if and only if $x=0$ (nonnegativity),
$\|\alpha x\| =|\alpha| \|x\|$ for all $\alpha\in\mathbb{C}$ $x\in\mathbb{C}^n$ (homogeneity),
$\|x+y\| \le \|x\|+\|y\|$ for all $x,y\in\mathbb{C}^n$ (the triangle inequality).

An important class of norms is the Hölder $p$ -norms

$\|x\|_p = \biggl( \displaystyle\sum_{i=1}^n |x_i|^p \biggr)^{1/p}, \quad p\ge 1. \qquad\qquad (1)$

The $\infty$ -norm is interpreted as $\lim_{p\to\infty}\|x\|_p$ and is given by

$\notag \|x\|_{\infty} = \displaystyle\max_{1\le i\le n} |x_i|.$

Other important special cases are

$\notag \begin{alignedat}{2} \|x\|_1 &= \sum_{i=1}^n |x_i|, &\quad& \mbox{``Manhattan'' or ``taxi~cab'' norm}, \\ \|x\|_2 &= \biggl( \sum_{i=1}^n |x_i|^2 \biggr)^{1/2} = (x^*x)^{1/2}, &\quad& \mbox{Euclidean length}. \end{alignedat}$

A useful concept is that of the dual norm associated with a given vector norm, which is defined by

$\notag \|y\|^D = \displaystyle\max_{x\ne0} \displaystyle\frac{\mathop{\mathrm{Re}}x^* y}{\|x\|}.$

The maximum is attained because the definition is equivalent to $\|y\|^D = \max\{ \, \mathop{\mathrm{Re}} x^*y: \|x\| = 1\,\}$ , in which we are maximizing a continuous function over a compact set. Importantly, the dual of the dual norm is the original norm (Horn and Johnson, 2013, Thm. $\,$ 5.5.9(c)).

We can rewrite the definition of dual norm, using the homogeneity of vector norms, as

$\notag \|y\|^D = \displaystyle\max_{|c| = 1} \| cy \|^D = \max_{|c| = 1} \max_{x\ne 0} \frac{\mathop{\mathrm{Re}} x^*(cy) }{\|x\|} = \max_{x\ne 0} \max_{|c| = 1} \frac{\mathop{\mathrm{Re}} c(x^*y) }{\|x\|} = \max_{x\ne 0} \frac{ |x^*y| }{\|x\|}.$

Hence we have the attainable inequality

$\notag |x^*y| \le \|x\| \, \|y\|^D, \qquad\qquad (2)$

which is the generalized Hölder inequality.

The dual of the $p$ -norm is the $q$ -norm, where $p^{-1} + q^{-1} = 1$ , so for the $p$ -norms the inequality (2) becomes the (standard) Hölder inequality,

$\notag |x^*y| \le \|x\|_p \, \|y\|_q, \quad \displaystyle\frac{1}{p} + \frac{1}{q} = 1.$

An important special case is the Cauchy–Schwarz inequality,

$\notag |x^*y| \le \|x\|_2 \, \|y\|_2.$

The notation $\|x\|_0$ is used to denote the number of nonzero entries in $x$ , even though it is not a vector norm and is not obtained from (1) with $p = 0$ . In portfolio optimization, if $x_k$ specifies how much to invest in stock $k$ then the inequality $\|x\|_0 \le k$ says “invest in at most $k$ stocks”.

In numerical linear algebra, vector norms play a crucial role in the definition of a subordinate matrix norm, as we will explain in the next post in this series.

All norms on $\mathbb{C}^n$ are equivalent in the sense that for any two norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$ there exist positive constants $\nu_1$ and $\nu_2$ such that

$\nu_1\|x\|_\alpha \le \|x\|_\beta \le \nu_2 \|x\|_\alpha \quad \mathrm{for~all}~x\in \mathbb{C}^n.$

For example, it is easy to see that

$\notag \begin{aligned} \|x\|_2 &\le \|x\|_1 \le \sqrt{n} \|x\|_2,\\ \|x\|_\infty &\le \|x\|_2 \le \sqrt{n} \|x\|_\infty,\\ \|x\|_\infty &\le \|x\|_1 \le n \|x\|_\infty. \end{aligned}$

The 2-norm is invariant under unitary transformations: if $Q^*Q = I$ , then $\|Qx\|^2 = x^*Q^* Qx = x^*x = \|x\|_2^2$ .

Care must be taken to avoid overflow and (damaging) underflow when evaluating a vector $p$ -norm in floating-point arithmetic for $p\ne 1,\infty$ . One can simply use the formula $\|x\|_p = \| (x/\|x\|_{\infty}) \|_p \|x\|_{\infty}$ , but this requires two passes over the data (the first to evaluate $\|x\|_{\infty}$ ). For more efficient one-pass algorithms for the $2$ -norm see Higham (2002, Sec. 21.8) and Harayama et al. (2021).

References

This is a minimal set of references, which contain further useful references within.

Takeyuki Harayama, Shuhei Kudo, Daichi Mukunoki, Toshiyuki Imamura, and Daisuke Takahashi, A Rapid Euclidean Norm Calculation Algorithm that Reduces Overflow and Underflow, in Computational Science and Its Applications–ICCSA 2021, O. Gervasi et al., eds, 95–110, Springer, 2021.
Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, second edition, Society for Industrial and Applied Mathematics,
Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013. My review of the second edition.

Note: This article was revised on October 12, 2021 to change the definition of dual norm to use $\mathrm{Re}$ .

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is a Totally Nonnegative Matrix?

The determinant of a square submatrix of a matrix is called a minor. A matrix $A\in\mathbb{R}^{n\times n}$ is totally positive if every minor is positive. It is totally nonnegative if every minor is nonnegative. These definitions require, in particular, that all the matrix elements must be nonnegative or positive, as must $\det(A)$ .

An important property is that total nonnegativity is preserved under matrix multiplication and hence under taking positive integer powers.

Theorem 1. If $A,B\in\mathbb{R}^{n\times n}$ are totally nonnegative then so is $AB$ .

Theorem 1 is a direct consequence of the Binet–Cauchy theorem on determinants (also known as the Cauchy–Binet theorem). To state it, we need a way of specifying submatrices. We say the vector $\alpha = [\alpha_1,\alpha_2,\dots,\alpha_k]$ is an index vector of order $k$ if its components are integers from the set $\{1,2,\dots,n\}$ satisfying $\alpha_1 < \alpha_2 < \cdots < \alpha_k$ . If $\alpha$ and $\beta$ are index vectors of order $k$ and $\ell$ , respectively, then $A(\alpha, \beta)$ denotes the $k\times \ell$ matrix with ( $i,j$ ) element $a_{\alpha_i,\beta_j}$ .

Theorem 2. (Binet–Cauchy) Let $A\in\mathbb{R}^{m\times n}$ , $B\in\mathbb{R}^{n\times p}$ , and $C = AB$ . If $\alpha$ and $\beta$ are index vectors of order $k$ and $1 \le k \le \min(m,n,p)$ then

$\notag \det(C(\alpha,\beta)) = \sum_{\gamma} \det( A(\alpha,\gamma) ) \det( B(\gamma,\beta) ), \qquad (1)$

where the sum is over all index vectors $\gamma$ of order $k$ .

Note than when $k = m = n = p$ , (1) reduces to the well-known relation $\det(AB) = \det(A)\det(B)$ , while when $k = 1$ , (1) reduces to the definition of matrix multiplication.

Totally nonnegative matrices have many interesting determinantal properties. For example, they satisfy Fischer’s inequality, first proved for symmetric positive definite matrices.

Theorem 3. (Fischer) If $A\in\mathbb{R}^{n\times n}$ is totally nonnegative then for any index vector $\alpha$ ,

$\notag \det(A) \le \det(A(\alpha)) \det(A(\alpha^c)), \qquad (2)$

where $\alpha^c$ comprises the indices not in $\alpha$ .

By repeatedly applying (2) with $\alpha$ containing just one element, we obtain Hadamard’s inequality for totally nonnegative $A$ :

$\notag \det(A) \le a_{11} a_{22} \cdots a_{nn}.$

Examples

We give some examples of totally positive matrices, showing how they can be generated in MATLAB. We use the Anymatrix toolbox.

A matrix well known to be positive definite, but which is also totally positive, is the Hilbert matrix $H\in\mathbb{R}^{n\times n}$ , with $h_{ij} = 1/(i+j-1)$ . The Hilbert matrix is a particular case of a Cauchy matrix $C$ , with $c_{ij} = 1/(x_i + y_j)$ for given vectors $x,y\in\mathbb{R}^{n\times n}$ . A Cauchy matrix is totally positive if $0 < x_1 < x_2 < \cdots < x_n$ and $0 < y_1 < y_2 < \cdots < y_n$ , which follows from the formula

$\notag \det(C_n) = \displaystyle\frac{\displaystyle\prod_{1\le i < j \le n} (x_j-x_i) (y_j-y_i) } {\displaystyle\prod_{1\le i,j \le n} (x_i+y_j) }.$

In MATLAB, the Hilbert matrix is hilb(n) and the Cauchy matrix can be generated by gallery('cauchy',x,y) (or anymatrix('gallery/cauchy',x,y)).

A Vandermonde matrix

$\notag V = V(x_1,x_2,\dots,x_n) = \begin{bmatrix} 1 & 1 & \dots & 1 \\ x_1 & x_2 & \dots & x_n \\ \vdots &\vdots & & \vdots \\ x_1^{n-1} & x_2^{n-1} & \dots & x_n^{n-1} \end{bmatrix} \in \mathbb{C}^{n\times n}$

is totally positive if the points $x_i$ satisfy $0 < x_1 < x_2 < \cdots < x_n$ . As a partial check, the general formula

$\notag \det(V) = \displaystyle\prod_{1\le i < j \le n}^n (x_i - x_j)$

shows that every leading principal minor is positive. In MATLAB, a Vandermonde matrix can be generated by anymatrix('core/vand',x).

The Pascal matrix $P_n\in\mathbb{R}^{n\times n}$ is defined by

$p_{ij} = \displaystyle\frac{ (i+j-2)! }{ (i-1)! (j-1)! } = {i+j-2 \choose j-1}.$

For example, in MATLAB:

>> P = pascal(5)
P =
     1     1     1     1     1
     1     2     3     4     5
     1     3     6    10    15
     1     4    10    20    35
     1     5    15    35    70

The Pascal matrix is totally positive for all $n$ (see the section below on bidiagonal factorizations).

The one-parameter correlation matrix $C_n(\theta)$ with off-diagonal elements given by $\theta$ with $0 \le \theta < 1$ , illustrated by

$\notag C_3(\theta) = \begin{bmatrix} 1 & \theta & \theta \\ \theta & 1 & \theta \\ \theta & \theta & 1 \\ \end{bmatrix},$

is not totally positive because while the principal minors are all positive, the submatrix $A([1,2],[2,3]) = \bigl[\begin{smallmatrix} \theta & \theta \\ 1 & \theta \end{smallmatrix}\bigr]$ has nonpositive determinant. However, the Kac–Murdock–Szegö matrix $K_n(\theta) = (\theta^{|i-j|})$ , with $0 \le \theta < 1$ , illustrated by

$\notag K_3(\theta) = \begin{bmatrix} 1 & \theta & \theta^2 \\ \theta & 1 & \theta \\ \theta^2 & \theta & 1 \\ \end{bmatrix}$

is totally positive thanks to the decay of the elements way from the diagonal. In MATLAB, the Kac–Murdock–Szegö matrix can be generated by gallery('kms',n,rho).

The lower Hessenberg Toeplitz matrix $H_n$ with all elements $1$ on and below the superdiagonal, illustrated for $n = 4$ by

$\notag H_4 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ \end{bmatrix},$

is totally nonnegative. It has $\lfloor n/2 \rfloor$ zero eigenvalues, which appear in a single Jordan block, and its largest eigenvalue is $2(1+\cos(2\pi/(n+2)))$ . In MATLAB, this matrix can be generated by anymatrix('core/hessfull01',n). This and other binary totally nonnegative matrices are studied by Brualdi and Kirkland (2010).

Finally, consider a nonnegative $4\times 4$ bidiagonal matrix factorized into a product of elementary nonnegative bidiagonal matrices (nonnegative means that the elements of the matrix are nonnegative):

$\notag \begin{aligned} L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ \ell_{21} & 1 & 0 & 0 \\ 0 & \ell_{32} & 1 & 0 \\ 0 & 9 & \ell_{43} & 1 \\ \end{bmatrix} &= \begin{bmatrix} 1 & 0 & 0 & 0 \\ \ell_{21} & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & \ell_{32} & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & \ell_{43} & 1 \\ \end{bmatrix}\\ &\equiv L_1(\ell_{21}) L_2(\ell_{32}) L_3(\ell_{43}). \end{aligned}$

It is easy to see by inspection that $L_1$ , $L_2$ , and $L_3$ are totally nonnegative, so $L$ is totally nonnegative by Theorem 1. With $D = \mathrm{diag}(1,-1,1,1,-1)$ , we have

$\notag \begin{aligned} DL^{-1}D &= (DLD)^{-1} = (DL_1D \cdot DL_2D \cdot DL_3D )^{-1}\\ &= (DL_3D)^{-1} (DL_2D)^{-1} (DL_3D)^{-1}\\ &= L_3(\ell_{43}) L_2(\ell_{32}) L_1(\ell_{21}), \qquad (3)\ \end{aligned}$

which is a product of totally nonnegative matrices and hence is totally nonnegative by Theorem 1. This example clearly generalizes to show that an $n\times n$ nonnegative bidiagonal matrix is totally nonnegative.

Inverse

Recall that the inverse of a nonsingular $A\in\mathbb{R}^{n\times n}$ is given by $A^{-1} = \mathrm{adj}(A)/\det(A)$ , where

$\mathrm{adj}(A) = \bigl( (-1)^{i+j} \det(A_{ji}) \bigr)$

and $A_{pq}$ denotes the submatrix of $A$ obtained by deleting row $p$ and column $q$ . If $A$ is nonsingular and totally nonnegative then it follows that $A^{-1}$ has a checkerboard (alternating) sign pattern. Indeed, we can write $A^{-1} = DBD$ , where $D = \mathrm{diag}((-1)^{i+1})$ and $B$ has nonnegative elements, and in fact it can be shown that $B$ is totally nonnegative using Theorem 1, Theorem 6, and (3). For example, here is the inverse of the $4\times 4$ Pascal matrix:

>> inv(sym(pascal(5)))
ans =
[  5, -10,  10,  -5,  1]
[-10,  30, -35,  19, -4]
[ 10, -35,  46, -27,  6]
[ -5,  19, -27,  17, -4]
[  1,  -4,   6,  -4,  1]

Eigensystem

A totally nonnegative matrix has nonnegative trace and determinant, so the sum and product of its eigenvalues are both nonnegative. In fact, all the eigenvalues are real and nonnegative. Since a Jordan block corresponding to a nonnegative eigenvalue is totally nonnegative any Jordan form with nonnegative eigenvalues is possible. More can be said of $A$ is irreducible. Recall that a matrix $A\in\mathbb{C}^{n\times n}$ is irreducible if there does not exist a permutation matrix $P$ such that

$\notag P^TAP = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}$

where $A_{11}$ and $A_{22}$ are square, nonempty submatrices.

Theorem 4. If $A\in\mathbb{R}^{n\times n}$ is totally nonnegative then its eigenvalues are all real and nonnegative. If $A$ is also irreducible then the positive eigenvalues are distinct.

If $A$ is nonsingular and totally nonnegative and irreducible then by the theorem we can write the eigenvalues as $\lambda_1 > \lambda_2 > \cdots > \lambda_n >0$ . It is known that the eigenvector $x_k$ associated with $\lambda_k$ has $k-1$ sign changes, that is, $(x_k)_{i+1}$ and ( $x_k)_i$ have opposite signs for $k-1$ values of $i$ (any zero elements are deleted before counting sign changes). Note that for $k=1$ , we already know from Perron–Frobenius theory that there is a positive eigenvector $x_1$ . This result is illustrated by the Pascal matrix above:

>> A = pascal(5); [V,d] = eig(A,'vector'); [~,k] = sort(d,'descend');
>> evals = d', evecs = V(:,k)
evals =
   1.0835e-02   1.8124e-01   1.0000e+00   5.5175e+00   9.2290e+01
evecs =
   1.7491e-02   2.4293e-01  -7.6605e-01  -5.7063e-01   1.6803e-01
   7.4918e-02   4.8079e-01  -3.8302e-01   5.5872e-01  -5.5168e-01
   2.0547e-01   6.1098e-01   1.6415e-01   2.5292e-01   7.0255e-01
   4.5154e-01   4.1303e-01   4.3774e-01  -5.1785e-01  -4.0710e-01
   8.6486e-01  -4.0736e-01  -2.1887e-01   1.7342e-01   9.0025e-02

Note that the number of sign changes (but not the number of negative elements) increases by $1$ as we go from one column to the next

The class of nonsingular totally nonnegative irreducible matrices is known as the oscillatory matrices, because such matrices arise in the analysis of small oscillations of elastic systems. An equivalent definition (in fact, the usual definition) is that an oscillatory matrix is a totally nonnegative matrix for which $A^q$ is totally positive for some positive integer $q$ .

LU Factorization

The next result shows that a totally nonnegative matrix has an LU factorization with special properties. We will need the following special case of Fischer’s inequality (Theorem 3):

$\notag \det(A) \le \det \bigl( A(1\colon p,1\colon p) \bigr) \det \bigl( A(p+1\colon n,p+1\colon n) \bigr), \quad p=1\colon n-1. \qquad (4)$

Theorem 5. If $A\in\mathbb{R}^{n\times n}$ is nonsingular and totally nonnegative then it has an LU factorization with $L$ and $U$ totally nonnegative and the growth factor $\rho_n = 1$ .

Proof. Since $A$ is nonsingular and every minor is nonnegative, (4) shows that $\det(A(1\colon p,1\colon p))>0$ for $p=1\colon n-1$ , which guarantees the existence of an LU factorization. That the elements of $L$ and $U$ are nonnegative follows from explicit determinantal formulas for the elements of $L$ and $U$ . The total nonnegativity of $L$ and $U$ is proved by Cryer (1976). Gaussian elimination starts with $A^{(1)} = A$ and computes $a_{ij}^{(k+1)} = a_{ij}^{(k)} - m_{ik}a_{kj}^{(k)} = a_{ij}^{(k)} - \ell_{ik} u_{kj} \le a_{ij}^{(k)}$ , since $\ell_{ik}, u_{kj} \ge 0$ . Thus $a_{ij} = a_{ij}^{(1)} \ge a_{ij}^{(2)} \ge \cdots \ge a_{ij}^{(r)}$ , $r = \min (i,j)$ . For $i > j$ , $a_{ij}^{(r)} \ge a_{ij}^{(r+1)} = \cdots = a_{ij}^{(n)} = 0$ ; for $j \ge i$ , $a_{ij}^{(r)} = \cdots = a_{ij}^{(n)} = u_{ij} \ge 0$ . Thus $0 \le a_{ij}^{(k)} \le a_{ij}$ for all $i,j,k$ and hence $\rho_n \le 1$ . But $\rho_n\ge1$ , so $\rho_n=1$ .

Theorem 5 implies that it is safe to compute the LU factorization without pivoting of a nonsingular totally nonnegativity matrix: the factorization does not break down and it is numerically stable. In fact, the computed LU factors have a strong componentwise form of stability. As shown by De Boor and Pinkus (1977), for small enough unit roundoff $u$ the computed factors $\widehat{L}$ and $\widehat{U}$ will have nonnegative elements and so from the standard backward error result for LU factorization,

$\notag \widehat{L}\widehat{U} = A + \Delta A, \quad |\Delta A| \le \gamma_n |\widehat{L}||\widehat{U}| \quad \Bigl(\gamma_n = \displaystyle\frac{nu}{1-nu} \Bigr),$

we have

$\notag |\widehat{L}||\widehat{U}| = |\widehat{L}\widehat{U}| = |A + \Delta A| \le |A| + \gamma_n |\widehat{L}||\widehat{U}|,$

which gives $|\widehat{L}||\widehat{U}| \le (1 - \gamma_n)^{-1}|A|$ and hence

$\notag \widehat{L}\widehat{U} = A + \Delta A, \quad |\Delta A| \le \displaystyle\frac{\gamma_n}{1-\gamma_n} |A|,$

which is about as strong a backward error result as we could hope for. The significance of this result is reduced, however, by the fact that for some important classes of totally nonnegative matrices, including Vandermonde matrices and Cauchy matrices, structure-exploiting linear system solvers exist that are substantially faster, and potentially more accurate, than LU factorization.

Factorization into a Product of Bidiagonal Matrices

We showed above that any nonnegative bidiagonal matrix is totally nonnegative. The next result shows that any nonsingular totally nonnegative matrix has an LU factorization in which $L$ and $U$ can be factorized into a product of nonnegative bidiagonal matrices.

Theorem 6. (Gasca and Peña, 1996) A nonsingular matrix $A\in\mathbb{R}^{n\times n}$ is totally nonnegative if and only if it it can be factorized as

$\notag A = L_{n-1} L_{n-2} \dots L_1 D U_1 U_2 \dots U_{n-1}, \qquad (5)$

where $D$ is a diagonal matrix with positive diagonal entries and $L_i$ and $U_i$ are unit lower and unit upper bidiagonal matrices, respectively, with the first $i-1$ entries along the subdiagonal of $L_i$ and $U_i^T$ zero and the rest nonnegative.

An analogue of Theorem 6 holds for totally positive matrices, the only difference being that the last $n-i+1$ subdiagonal entries of $L_i$ and $U_i^T$ are positive.

The factorization (5) can be computed by Neville elimination, which is a version of Gaussian elimination in which the eliminations are between adjacent rows, working from the bottom of each column upwards.

This factorization into bidiagonal factors can be used to obtain simple proofs of various properties of totally nonnegative matrices and totally positive matrices (Fallat, 2001). It also provides an efficient way to generates such matrices. If all the parameters in $D$ and the $L_i$ and $U_i$ are set to $1$ then the Pascal matrix is generated.

Testing for Total Positivity

An $n\times n$ matrix has $\sum_{i=1}^n {n \choose k} = 2^n-1$ principal minors (ones based on submatrices centred on the diagonal) and $\sum_{i=1}^n {n \choose k}^2 = {2n \choose n} -1 \approx 4^n/(n\pi)^{1/2}$ minors in total. However, it is not necessary to check all these minors to test for total positivity.

Theorem 7. (Gasca and Peña, 1996) The matrix $A\in\mathbb{R}^{n\times n}$ is totally positive if and only if $\det(A(\alpha,\beta)) > 0$ for all index vectors $\alpha$ and $\beta$ such that one of $\alpha$ and $\beta$ is $[1,2,\dots,k]$ and the entries of the other are $k$ consecutive integers.

Theorem 7 shows that only $n(n+1)$ minors need to be tested. Gasca and Peña have also show that total nonnegativity can be tested by checking about $2^{n+1} + n^2/2$ minors. A more efficient way to test for total nonnegativity is to compute the factorization in Theorem 6 and check the signs of the entries.

Notes

The results we have described show that totally nonnegative and totally positive matrices are analogous in many ways to symmetric positive (semi)definite matrices. The analogies go further because totally nonnegative and totally positive matrices also satisfy eigenvalue interlacing inequalities (albeit weaker than for symmetric matrices) and the eigenvalues of an oscillatory matrix majorize the diagonal elements. See Fallat and Johnson (2011) or Fallat (2014) for details.

References

This is a minimal set of references, which contain further useful references within.

Richard A. Brualdi and Steve Kirkland, Totally Nonnegative (0,1)-Matrices, Linear Algebra Appl. 432, 1650–1662, 2010.
Colin W. Cryer, Some Properties of Totally Positive Matrices, Linear Algebra Appl. 15, 1–25, 1976.
Carl de Boor and Allan Pinkus, Backward Error Analysis for Totally Positive Linear Systems, 27, 485–490, 1977.
Mariano Gasca and Juan M. Peña, On Factorizations of Totally Positive Matrices, in Mariano Gasca and Charles Micchelli, eds, Total Positivity and Its Applications, 109–130, Springer, 1996.
Shaun M. Fallat, Bidiagonal Factorizations of Totally Nonnegative Matrices, Amer. Math. Monthly 108 (6), 697–712, 2001.
Shaun M. Fallat, Totally Positive and Totally Nonnegative Matrices, in Handbook of Linear Algebra, Leslie Hogben, ed, 29.1–29.17, Chapman and Hall/CRC, 2014.
Shaun M. Fallat and Charles R. Johnson, Totally Nonnegative Matrices, Princeton University Press, Princeton, NJ, USA, 2011.

What Is the Perron–Frobenius Theorem?

A real matrix is nonnegative if all its elements are nonnegative and it is positive if all its elements are positive. Nonnegative matrices arise in a wide variety of applications, for example as matrices of probabilities in Markov processes and as adjacency matrices of graphs. Information about the eigensystem is often essential in these applications.

Perron (1907) proved results about the eigensystem of a positive matrix and Frobenius (1912) extended them to nonnegative matrices.

The following three results of increasing specificity summarize the key spectral properties of nonnegative matrices proved by Perron and Frobenius. Recall that a simple eigenvalue of an $n\times n$ matrix is one with algebraic multiplicity $1$ , that is, it occurs only once in the set of $n$ eigenvalues. We denote by $\rho(A)$ the spectral radius of $A$ , the largest absolute value of any eigenvalue of $A$ .

Theorem 1. (Perron–Frobenius) If $A\in\mathbb{R}^{n\times n}$ is nonnegative then

$\rho(A)$ is an eigenvalue of $A$ ,

there is a nonnegative eigenvector $x$ such that $Ax = \rho(A)x$ .

A matrix $A\in\mathbb{C}^{n\times n}$ is reducible if there is a permutation matrix $P$ such that

$\notag P^TAP = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}$

where $A_{11}$ and $A_{22}$ are square, nonempty submatrices; it is irreducible if it is not reducible. Examples of reducible matrices are triangular matrices and matrices with a zero row or column. A positive matrix is trivially irreducible.

Theorem 2. (Perron–Frobenius) If $A\in\mathbb{R}^{n\times n}$ is nonnegative and irreducible then

$\rho(A)$ is an eigenvalue of $A$ ,

$\rho(A)>0$ ,

there is a positive eigenvector $x$ such that $Ax = \rho(A) x$ ,

$\rho(A)$ is a simple eigenvalue.

Theorem 3. (Perron) If $A\in\mathbb{R}^{n\times n}$ is positive then Theorem 2 holds and, in addition, $|\lambda| < \rho(A)$ for any eigenvalue $\lambda$ with $\lambda \ne \rho(A)$ .

For nonnegative, irreducible $A$ , the eigenvalue $\rho(A)$ is called the Perron root of $A$ and the corresponding positive eigenvector $x$ , normalized so that $\|x\|_1 = 1$ , is called the Perron vector.

It is a good exercise to apply the theorems to all binary $2\times 2$ matrices. Here are some interesting cases.

$A = \bigl[\begin{smallmatrix}0 & 1 \\ 0 & 0 \end{smallmatrix}\bigr]$ : Theorem 1 says that $\rho(A) = 0$ is an eigenvalue and and that it has a nonnegative eigenvector. Indeed $[1~0]^T$ is an eigenvector. Note that $A$ is reducible and $0$ is a repeated eigenvalue.
$A = \bigl[\begin{smallmatrix}0 & 1 \\ 1 & 0 \end{smallmatrix}\bigr]$ : $A$ is irreducible and Theorem 2 says that $\rho(A)$ is a simple eigenvalue with positive eigenvector. Indeed the eigenvalues are $\pm 1$ and $[1~1]^T/2$ is the Perron vector for the Perron root $1$ . This matrix has two eigenvalues of maximal modulus.
$A = \bigl[\begin{smallmatrix}1 & 1 \\ 1 & 1 \end{smallmatrix}\bigr]$ : Theorem 3 says that $\rho(A) = 2$ is an eigenvalue with positive eigenvector and that the other eigenvalue has modulus less than $2$ . Indeed the eigenvalues are the Perron root $2$ , with Perron vector $[1~1]^T/2$ , and $0$ .

For another example, consider the irreducible matrix

$\notag B = \begin{bmatrix} 0 & 0 & 1\\ 1 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix}, \quad \Lambda(B) = \bigl\{ 1, \textstyle\frac{1}{2}( -1 \pm \sqrt{3}\mskip1mu\mathrm{i} ) \bigr\}.$

Note that $B$ is a companion matrix and a permutation matrix. Theorem 2 correctly tells us that $\rho(A) = 1$ is an eigenvalue of $A$ , and that it has a corresponding positive eigenvector, the Perron vector $[1~1~1]^T/3$ . Two of the eigenvalues are complex, however, and all three eigenvalues have modulus 1, as they must because $B$ is orthogonal.

A stochastic matrix is a nonnegative matrix whose row sums are all equal to $1$ . A stochastic matrix satisfies $Ae = e$ , where $e = [1,1,\dots,1]^T$ , which means that $A$ has an eigenvalue $1$ , and so $\rho(A) \ge 1$ . Since $\rho(A) \le \|A\|$ for any norm, by taking the $\infty$ -norm we conclude that $\rho(A) = 1$ . For a stochastic matrix, Theorem 1 does not give any further information. If $A$ is irreducible then Theorem 2 tells us that $\rho(A)$ is a simple eigenvalue, and if $A$ is positive Theorem 3 tells us that every other eigenvalue has modulus less than $\rho(A)$ .

The next result is easily proved using Theorem 3 together with the Jordan canonical form. It shows that the powers of a positive matrix behave like multiples of a rank-1 matrix.

Theorem 4. If $A\in\mathbb{R}^{n\times n}$ is positive, $x$ is the Perron vector of $A$ , and $y$ is the Perron vector of $A^T$ then

$\notag \displaystyle\lim_{k\to\infty} \left( \displaystyle\frac{A}{\rho(A)} \right)^k = \displaystyle\frac{xy^T}{y^Tx}.$

Note that $y$ in the theorem is a left eigenvector of $A$ corresponding to $\rho(A)$ , that is, $y^TA = \rho(A)y^T$ (since $\rho(A^T) = \rho(A)$ ).

If $A$ is stochastic and positive then Theorem 4 is applicable and $x = n^{-1}e$ . If $A$ also has unit column sums, so that it is doubly stochastic, then $y = n^{-1}e$ and Theorem 4 says that $\lim_{k\to\infty}A^k = n^{-1}ee^T$ . We illustrate this result in MATLAB using a scaled magic square matrix.

>> n = 4; M = magic(n), A = M/sum(M(1,:)) % Doubly stochastic matrix.
A =
    16     2     3    13
     5    11    10     8
     9     7     6    12
     4    14    15     1
A =
   4.7059e-01   5.8824e-02   8.8235e-02   3.8235e-01
   1.4706e-01   3.2353e-01   2.9412e-01   2.3529e-01
   2.6471e-01   2.0588e-01   1.7647e-01   3.5294e-01
   1.1765e-01   4.1176e-01   4.4118e-01   2.9412e-02

>> for k = 8:8:32, fprintf('%11.2e',norm(A^k-ones(n)/n,1)), end, disp(' ')
   3.21e-05   7.37e-10   1.71e-14   8.05e-16

References

This is a minimal set of references, which contain further useful references within.

Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013. Chapter 8. My review of the second edition.
Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2000. Chapter 8.
Helene Shapiro, Linear Algebra and Matrices. Topics for a Second Course, American Mathematical Society, Providence, RI, USA, 2015. Chapter 17.

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is the Kac–Murdock–Szegö Matrix?

The Kac–Murdock–Szegö matrix is the symmetric Toeplitz matrix

$\notag A_n(\rho) = \begin{bmatrix} 1 & \rho & \rho^2 & \dots & \rho^{n-1} \\ \rho & 1 & \rho & \dots & \rho^{n-2} \\ \rho^2 & \rho & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \rho \\ \rho^{n-1} & \rho^{n-2} & \dots & \rho & 1 \end{bmatrix} \in\mathbb{R}^{n\times n}. \qquad(1)$

It was considered by Kac, Murdock, and Szegö (1953), who investigated its spectral properties. It arises in the autoregressive AR(1) model in statistics and signal processing.

The matrix is singular for $\rho=1$ , as $A_n(1)$ is the rank- $1$ matrix $ee^T$ , and it is also rank- $1$ for $\rho = -1$ , as in this case every column is a multiple of the vector with alternating elements $\pm 1$ . The determinant $\det(A_n(\rho)) = (1-\rho^2)^{n-1}$ . For $\rho \ne \pm 1$ , $A_n$ is nonsingular and the inverse is the tridiagonal (but not Toeplitz) matrix

$\notag A_n(\rho)^{-1} = \displaystyle\frac{1}{1-\rho^2} \begin{bmatrix} 1 & -\rho & 0 & \dots & \dots & 0 \\ -\rho & 1+\rho^2 & -\rho & \dots & \dots & 0 \\ 0 & -\rho & 1+\rho^2 & \ddots & \dots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \dots & \dots & -\rho & 1+\rho^2 & -\rho \\ 0 & \dots & \dots & 0 & -\rho & 1 \end{bmatrix}. \qquad (2)$

For $-1 < \rho < 1$ , $A_n(\rho)$ is positive definite, since every leading principal submatrix has positive determinant, as can also be seen by noting that the inverse is diagonally dominant with positive diagonal, so that $A_n^{-1}$ is positive definite and hence $A_n$ is positive definite.

For $-1 \le \rho \le 1$ , $A_n(\rho)$ is positive semidefinite, so it is a correlation matrix for $\rho$ in this range.

For $0 \le \rho \le 1$ , $A_n(\rho)$ is totally nonnegative, that is. every submatrix has nonnegative determinant. For $0 < \rho < 1$ , we know that $A_n(\rho)$ is nonsingular, and it is clearly irreducible, and together with the total nonnegativity these properties imply that the eigenvalues are distinct and positive (this can also be deduced from the fact that the inverse is tridiagonal with nonzero subdiagonal and superdiagonal entries).

It is straightforward to verify that $A_n$ has a factorization $A_n = LDL^*$ with $L$ the inverse of a unit lower bidiagonal matrix:

$\notag L = \begin{bmatrix} 1 & & & &\\ -\rho & 1 & & &\\ & -\rho & 1 & &\\ & & \ddots & \ddots &\\ & & &-\rho &1 \end{bmatrix}^{-1}, \quad D = \mathrm{diag}(1, 1-\rho^2, 1-\rho^2, \dots, 1-\rho^2). \qquad (3)$

This factorization can be used to prove all the properties stated above.

From (1) and (2) we can derive the formulas

$\notag \begin{aligned} \|A_n\|_{1,\infty} &= 2 \left(\displaystyle\frac{1-\rho^{k+1}}{1-\rho}\right) -1 - (2k-n+1) \rho^k, \quad k = \lfloor n/2 \rfloor, \\ \|A_n^{-1}\|_{\infty} &= (1+2\rho+\rho^2)/(1-\rho^2) = (1+\rho)/(1-\rho). \end{aligned}$

Hence we have an explicit formula for the condition number $\kappa_p(A_n) = \|A_n\|_{1,\infty} \|A_n^{-1}\|_{1,\infty}$ for $p = 1,\infty$ .

We can allow $\rho$ to be complex, in which case the definition (1) is modified to conjugate the elements below the diagonal. The factorization $A = LDL^*$ continues to hold with $D$ in $(2)$ replaced by $\mathrm{diag}(1, 1-|\rho|^2, 1-|\rho|^2, \dots, 1-|\rho|^2)$ .

The Kac–Murdock–Szegö matrix (for real or complex $\rho$ ) can be generated in MATLAB as gallery('kms',n,rho).

References

This is a minimal set of references, which contain further useful references within.

George Fikioris, Spectral Properties of Kac–Murdock-Szegö Matrices with a Complex Parameter, Linear Algebra Appl 553, 182–210, 2018.
M. Kac, W. L. Murdock, and G. Szegö, On the Eigen-values of Certain Hermitian Forms, Journal of Rational Mechanics and Analysis 2, 767–800, 1953.

What Is the Determinant of a Matrix?

The determinant of an $n\times n$ matrix $A$ is defined by

$\notag \det(A) = \displaystyle\sum_j (-1)^{\mathop{\mathrm{sgn}}j} a_{1,j_1}a_{2,j_2} \dots a_{n,j_n}, \qquad (1)$

where the sum is over all $n!$ permutations $j = (j_1,j_2,\dots,j_n$ ) of the sequence $(1,2,\dots,n)$ and $\mathop{\mathrm{sgn}}j$ is the number of inversions in $j$ , that is, the number of pairs $(j_k,j_\ell)$ with $k j_\ell$ . Each term in the sum is a signed product of $n$ entries of $A$ and the product contains one entry taken from each row and one from each column.

The determinant is sometimes written with vertical bars, as $|A|$ .

Three fundamental properties are

$\notag \begin{aligned} \det(\alpha A) &= \alpha^n \det(A)\; \mathrm{for~any~scalar~}\alpha,\qquad(2)\\ \det(A^T) &= \det(A), \qquad(3)\\ \det(AB) &= \det(A)\det(B) \mathrm{~for~} n\times n~ A \mathrm{~and~} B.\qquad(4) \end{aligned}$

The first property is immediate, the second can be proved using properties of permutations, and the third is proved in texts on linear algebra and matrix theory.

An alternative, recursive expression for the determinant is the Laplace expansion

$\notag \det(A) = \displaystyle\sum_{j=1}^n (-1)^{i+j} a_{ij} \det (A_{ij}). \qquad(5)$

for any $i\in\{1,2,\dots,n\}$ , where $A_{ij}$ denotes the $(n-1)\times (n-1)$ submatrix of $A$ obtained by deleting row $i$ and column $j$ , and $\det(a) = a$ for a scalar $a$ . This formula is called the expansion by minors because $\det (A_{ij})$ is a minor of $A$ .

For some types of matrices the determinant is easy to evaluate. If $T$ is triangular then $\det(T) = \prod_{i=1}^n t_{ii}$ . If $Q$ is unitary then $Q^*Q = I$ implies $|\det(Q)| = 1$ on using (3) and (4). An explicit formula exists for the determinant of a Vandermonde matrix.

The determinant of $A$ is connected with the eigenvalues $\lambda_i$ of $A$ via the property $\det(A) = \prod_{i=1}^n \lambda_i$ . Since the eigenvalues are the roots of the characteristic polynomial $\det(tI - A)$ , this relation follows by setting $t = 0$ in the expression

$\notag \det(tI - A) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0 = \displaystyle\prod_{i=1}^n (t - \lambda_i).$

For $n=2$ , the determinant is

$\notag \det\biggl( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \biggr) = ad - bc,$

but already for $n=3$ the determinant is tedious to write down. If one must compute $\det(A)$ , the formulas (1) and (5) are too expensive unless $n$ is very small: they have an exponential cost. The best approach is to use a factorization of $A$ involving factors that are triangular or orthogonal, so that the determinants of the factors are easily computed. If $PA = LU$ is an LU factorization, with $P$ a permutation matrix, $L$ unit lower triangular, and $U$ upper triangular, then $\det(A) = \det(P) \prod_{i=1}^n u_{ii} = \pm \prod_{i=1}^n u_{ii}$ . As this expression indicates, the determinant is prone to overflow and underflow in floating-point arithmetic, so it may be preferable to compute $\log(|\det(A)|) = \sum_{i=1}^n \log|u_{ii}|$ .

The determinant features in the formula

$\notag A^{-1} = \displaystyle\frac{\mathrm{adj}(A)}{\det(A)}$

for the inverse, where $\mathrm{adj}(A)$ is the adjugate of $A$ (recall that $\mathrm{adj}(A)$ has $(i,j)$ element $(-1)^{i+j} \det(A_{ji})$ ). More generally, Cramer’s rule says that the components of the solution to a linear system $Ax = b$ are given by $x_i = \det(A_i(b))/\det(A)$ , where $A_i(b)$ denotes $A$ with its $i$ th column replaced by $b$ . While mathematically elegant, Cramer’s rule is of no practical use, as it is both expensive and numerically unstable in finite precision arithmetic.

Inequalities

A celebrated bound for the determinant of a Hermitian positive definite matrix $H\in\mathbb{C}^{n\times n}$ is Hadamard’s inequality. Note that for such $H$ , $\det(H)$ is real and positive (being the product of the eigenvalues, which are real and positive) and the diagonal elements are also real and positive (since $h_{ii} = e_i^*He_i > 0$ ).

Theorem 1 (Hadamard’s inequality). For a Hermitian positive definite matrix $H\in\mathbb{C}^{n\times n}$ ,

$\notag \det(H) \le \displaystyle\prod_{i=1}^n h_{ii},$

with equality if and only if $H$ is diagonal.

Theorem 1 is easy to prove using a Cholesky factorization.

The following corollary can be obtained by applying Theorem 1 to $H = A^*A$ or by using a QR factorization of $A$ .

Corollary 2. For $A = [a_1,a_2,\dots,a_n] \in\mathbb{C}^{n\times n}$ ,

$\notag \det(A) \le \displaystyle\prod_{i=1}^n \|a_i\|_2,$

with equality if and only if the columns of $A$ are orthogonal.

Obviously, one can apply the corollary to $A^*$ and obtain the analogous bound with column norms replaced by row norms.

The determinant of $A\in\mathbb{R}^{n\times n}$ can be interpreted as the volume of the parallelepiped $\{\, \sum_{i=1}^n t_ia_i : 0 \le t_i \le 1, ~ i = 1\colon n\,\}$ , whose sides are the columns of $A$ . Corollary 2 says that for columns of given lengths the volume is maximized when the columns are orthogonal.

Nearness to Singularity and Conditioning

The determinant characterizes nonsingularity: $A$ is singular if and only if $\det(A) = 0$ . It might be tempting to use $|\det(A)|$ as a measure of how close a nonsingular matrix $A$ is to being singular, but this measure is flawed, not least because of the sensitivity of the determinant to scaling. Indeed if $Q$ is unitary then $\det(\alpha Q) = \alpha^n \det(Q)$ can be given any value by a suitable choice of $\alpha$ , yet $\alpha Q$ is perfectly conditioned: $\kappa_2(\alpha Q) = 1$ , where $\kappa(A) = \|A\| \|A^{-1}\|$ is the condition number.

To deal with the poor scaling one might normalize the determinant: in view of Corollary 2,

$\notag \psi(A) = \displaystyle\frac{\prod_{i=1}^n \|a_i\|_2} {\det(A)}$

satisfies $\psi(A) \ge 1$ and $\psi(A) = 1$ if and only if the columns of $A$ are orthogonal. Birkhoff (1975) calls $\psi$ the Hadamard condition number. In general, $\psi$ is not related to the condition number $\kappa$ , but if $A$ has columns of unit $2$ -norm then it can be shown that $\kappa_2(A) < 2\psi(A)$ (Higham, 2002, Prob. 14.13). Dixon (1984) shows that for classes of $n\times n$ random matrices $A_n$ that include matrices with elements independently drawn from a normal distribution with mean $0$ , the probability that the inequality

$n^{1/4 - \epsilon} \mathrm{e}^{n/2} < \psi(A_n) < n^{1/4 + \epsilon} \mathrm{e}^{n/2}$

holds tends to $1$ as $n\to\infty$ for any $\epsilon > 0$ , so $\psi(A_n) \approx n^{1/4}\mathrm{e}^{n/2}$ for large $n$ . This exponential growth is much faster than the growth of $\kappa$ , for which Edelman (1998) showed that for the standard normal distribution, $\mathbb{E}(\log(\kappa_2(A_n))) \approx \log n + 1.537$ , where $\mathbb{E}$ denotes the mean value. This MATLAB example illustrates these points.

>> rng(1); n = 50; A = randn(n); 
>> psi = prod(sqrt(sum(A.*A)))/abs(det(A)), kappa2 = cond(A)
psi =
   5.3632e+10
kappa2 =
   1.5285e+02
>> ratio = psi/(n^(0.25)*exp(n/2))
ratio =
   2.8011e-01

The relative distance from $A$ to the set of singular matrices is equal to the reciprocal of the condition number.

Theorem 3 (Gastinel, Kahan). For $A\in\mathbb{C}^{n\times n}$ and any subordinate matrix norm,

$\notag \min \left\{ \displaystyle\frac{\|\Delta A\|}{\|A\|} : A+\Delta A\mathrm{~singular} \right\} = \displaystyle\frac{1}{\kappa(A)}.$

Notes

Determinants came before matrices, historically. Most linear algebra textbooks make significant use of determinants, but a lot can be done without them. Axler (1995) shows how the theory of eigenvalues can be developed without using determinants.

Determinants have little application in practical computations, but they are a useful theoretical tool in numerical analysis, particularly for proving nonsingularity.

There is a large number of formulas and identities for determinants. Sir Thomas Muir collected many of them in his five-volume magnum opus The Theory of Determinants in the Historical Order of Development, published between 1890 and 1930. Brualdi and Schneider (1983) give concise derivations of many identities using Gaussian elimination, bringing out connections between the identities.

The quantity obtained by modifying the definition (1) of determinant to remove the $(-1)^{\mathop{\mathrm{sgn}}j}$ term is the permanent. The permanent arises in combinatorics and quantum mechanics and is much harder to compute than the determinant: no algorithm is known for computing the permanent in $p(n)$ operations for a polynomial $p$ .

References

This is a minimal set of references, which contain further useful references within.

Sheldon Axler, Down With Determinants!, Amer. Math. Monthly 102, 139–154, 1995.
Garrett Birkhoff, Two Hadamard Numbers for Matrices, Comm. ACM 18, 25–29, 1975.
Richard Brualdi and Hans Schneider, Determinantal Identities: Gauss, Schur, Cauchy, Sylvester, Kronecker, Jacobi, Binet, Laplace, Muir, and Cayley, Linear Algebra Appl. 52–53, 769–791, 1983.
John Dixon, How Good is Hadamard’s Inequality for Determinants?, Canadian Math. Bulletin 27, 260–264, 1984.
Alan Edelman, Eigenvalues and Condition Numbers of Random Matrices, SIAM J. Matrix Anal. Appl. 9(4), 543–560, 1988.
Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002.

What Is a Vandermonde Matrix?

A Vandermonde matrix is defined in terms of scalars $x_1$ , $x_2$ , …, $x_n\in\mathbb{C}$ by

The $x_i$ are called points or nodes. Note that while we have indexed the nodes from $1$ , they are usually indexed from $0$ in papers concerned with algorithms for solving Vandermonde systems.

Vandermonde matrices arise in polynomial interpolation. Suppose we wish to find a polynomial $p_{n-1}(x) = a_nx^{n-1} + a_{n-1}x^{n-2} + \cdots + a_1$ of degree at most $n-1$ that interpolates to the data $(x_i,f_i)_{i=1}^n$ , that is, $p_{n-1}(x_i) = f_i$ , $i=1\colon n$ . These equations are equivalent to

$\notag V^Ta = f \quad \mathrm{(dual)},$

where $a = [a_1,a_2,\dots,a_n]^T$ is the vector of coefficients. This is known as the dual problem. We know from polynomial interpolation theory that there is a unique interpolant if the $x_i$ are distinct, so this is the condition for $V$ to be nonsingular.

The problem

$\notag Vy = b \quad \mathrm{(primal)}$

is called the primal problem, and it arises when we determine the weights for a quadrature rule: given moments $b_i$ find weights $y_i$ such that $\sum_{j=1}^n y_j^{} x_j^{\,i-1} = b_i$ , $i=1\colon n$ .

Determinant

The determinant of $V$ is a function of the $n$ points $x_i$ . If $x_i = x_j$ for some $i\ne j$ then $V$ has identical $i$ th and $j$ th columns, so is singular. Hence the determinant must have a factor $x_i - x_j$ . Consequently, we have

$\notag \det( V(x_1,x_2,\dots,x_n) ) = c \displaystyle\prod_{i,j = 1\atop i > j}^n (x_i - x_j),$

where, since both sides have degree $n(n-1)/2$ in the $x_i$ , $c$ is a constant. But $\det(V)$ contains a term $x_2 x_3^2 \dots x_n^{n-1}$ (from the main diagonal), so $c = 1$ . Hence

$\notag \det(V) = \displaystyle\prod_{i,j = 1\atop i > j}^n (x_i - x_j). \qquad (1)$

This formula confirms that $V$ is nonsingular precisely when the $x_i$ are distinct.

Inverse

Now assume that $V$ is nonsingular and let $V^{-1} = W = (w_{ij})_{i,j=1}^n$ . Equating elements in the $i$ th row of $WV = I$ gives

$\sum_{j=1}^n w_{ij} x_k^{\mskip1mu j-1} = \delta_{ik}, \quad k=1\colon n,$

where $\delta_{ij}$ is the Kronecker delta (equal to $1$ if $i=j$ and $0$ otherwise). These equations say that the polynomial $\sum_{j=1}^n w_{ij} x^{\mskip1mu j-1}$ takes the value $1$ at $x = x_i$ and $0$ at $x = x_k$ , $k\ne i$ . It is not hard to see that this polynomial is the Lagrange basis polynomial:

$\notag \sum_{j=1}^n w_{ij} x^{j-1} = \displaystyle\prod_{k=1\atop k\ne i}^n \left( \frac{x-x_k}{x_i-x_k} \right) =: \ell_i(x). \qquad (2)$

We deduce that

$\notag w_{ij} = \displaystyle\frac{ (-1)^{n-j} \sigma_{n-j}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_n) } { \displaystyle\prod_{k=1 \atop k\ne i}^n (x_i-x_k) }, \qquad (3)$

where $\sigma_k(y_1,\dots,y_n)$ denotes the sum of all distinct products of $k$ of the arguments $y_1,\dots,y_n$ (that is, $\sigma_k$ is the $k$ th elementary symmetric function).

From (1) and (3) we see that if the $x_i$ are real and positive and arranged in increasing order $0 < x_1 < x_2 < \cdots 0$ and $V^{-1}$ has a checkerboard sign pattern: the $(i,j)$ element has sign $(-1)^{i+j}$ .

Note that summing (2) over $i$ gives

$\notag \displaystyle\sum_{j=1}^n x^{j-1} \sum_{i=1}^n w_{ij} = \sum_{i=1}^n \ell_i(x) = 1,$

where the second equality follows from the fact that $\sum_{i=1}^n \ell_i(x)$ is a degree $n-1$ polynomial that takes the value $1$ at the $n$ distinct points $x_i$ . Hence

$\notag \displaystyle\sum_{i=1}^n w_{ij} = \delta_{j1},$

so the elements in the $j$ th column of the inverse sum to $1$ for $j = 1$ and $0$ for $j\ge 2$ .

Example

To illustrate the formulas above, here is an example, with $x_i = (i-1)/(n-1)$ and $n = 5$ :

$\notag V = \left[\begin{array}{ccccc} 1 & 1 & 1 & 1 & 1\\ 0 & \frac{1}{4} & \frac{1}{2} & \frac{3}{4} & 1\\[\smallskipamount] 0 & \frac{1}{16} & \frac{1}{4} & \frac{9}{16} & 1\\[\smallskipamount] 0 & \frac{1}{64} & \frac{1}{8} & \frac{27}{64} & 1\\[\smallskipamount] 0 & \frac{1}{256} & \frac{1}{16} & \frac{81}{256} & 1 \end{array}\right], \quad V^{-1} = \left[\begin{array}{ccccc} 1 & -\frac{25}{3} & \frac{70}{3} & -\frac{80}{3} & \frac{32}{3}\\[\smallskipamount] 0 & 16 & -\frac{208}{3} & 96 & -\frac{128}{3}\\ 0 & -12 & 76 & -128 & 64\\[\smallskipamount] 0 & \frac{16}{3} & -\frac{112}{3} & \frac{224}{3} & -\frac{128}{3}\\[\smallskipamount] 0 & -1 & \frac{22}{3} & -16 & \frac{32}{3} \end{array}\right],$

for which $\det(V) = 9/32768$ .

Conditioning

Vandermonde matrices are notorious for being ill conditioned. The ill conditioning stems from the monomials being a poor basis for the polynomials on the real line. For arbitrary distinct points $x_i$ , Gautschi showed that $V_n = V(x_1, x_2, \dots, x_n)$ satisfies

$\notag \displaystyle\max_i \displaystyle\prod_{j\ne i} \frac{ \max(1,|x_j|) }{ |x_i-x_j| } \le \|V_n^{-1}\|_{\infty} \le \displaystyle\max_i \prod_{j\ne i} \frac{ 1+|x_j| }{ |x_i-x_j| },$

with equality on the right when $x_j = |x_j| e^{\mathrm{i}\theta}$ for all $j$ with a fixed $\theta$ (in particular, when $x_j\ge0$ for all $j$ ). Note that the upper and lower bounds differ by at most a factor $2^{n-1}$ . It is also known that for any set of real points $x_i$ ,

$\notag \kappa_2(V_n) \ge \Bigl(\displaystyle\frac{2}{n}\Bigr)^{1/2} \, (1+\sqrt{2})^{n-2}$

and that for $x_i = 1/i$ we have $\kappa_{\infty}(V_n) > n^{n+1}$ , where the lower bound is an extremely fast growing function of the dimension!

These exponential lower bounds are alarming, but they do not necessarily rule out the use of Vandermonde matrices in practice. One of the reasons is that there are specialized algorithms for solving Vandermonde systems whose accuracy is not dependent on the condition number $\kappa$ , and which in some cases can be proved to be highly accurate. The first such algorithm is an $O(n^2)$ operation algorithm for solving $V_ny =b$ of Björck and Pereyra (1970). There is now a long list of generalizations of this algorithm in various directions, including for confluent Vandermonde-like matrices (Higham, 1990), as well as for more specialized problems (Demmel and Koev, 2005) and more general ones (Bella et al., 2009). Another important observation is that the exponential lower bounds are for real nodes. For complex nodes $V_n$ can be much better conditioned. Indeed when the $x_i$ are the roots of unity, $V_n/\sqrt{n}$ is the unitary Fourier matrix and so $V_n$ is perfectly conditioned.

Generalizations

Two ways in which Vandermonde matrices have been generalized are by allowing confluency of the points $x_i$ and by replacing the monomials by other polynomials. Confluency arises when the $x_i$ are not distinct. If we assume that equal $x_i$ are contiguous then a confluent Vandermonde matrix is obtained by “differentiating” the previous column for each of the repeated points. For example, with points $x_1, x_1, x_1, x_2, x_2$ we obtain

$\notag \begin{bmatrix} 1 & 0 & 0 & 1 & 0 \\ x_1 & 1 & 0 & x_2 & 1 \\ x_1^2 & 2x_1 & 2 & x_2^2 & 2x_2 \\ x_1^3 & 3x_1^2 & 6x_1 & x_2^3 & 3x_2^2 \\ x_1^4 & 4x_1^3 & 12x_1^2 & x_2^4 & 4x_2^3 \end{bmatrix}. \qquad (4)$

The transpose of a confluent Vandermonde matrix arises in Hermite interpolation; it is nonsingular if the points corresponding to the “nonconfluent columns” are distinct (that is, if $x_1 \ne x_2$ in the case of (4)).

A Vandermonde-like matrix is defined in terms of a set of polynomials $\{p_i(x)\}_{i=0}^n$ with $p_i$ having degree $i$ :

$\notag \begin{bmatrix} p_0(x_1) & p_0(x_2) & \dots & p_0(x_n)\\ p_1(x_1) & p_1(x_2) & \dots & p_1(x_n)\\ \vdots & \vdots & \dots & \vdots\\ p_{n-1}(x_1) & p_{n-1}(x_2) & \dots & p_{n-1}(x_n)\\ \end{bmatrix}.$

Of most interest are polynomials that satisfy a three-term recurrence, in particular, orthogonal polynomials. Such matrices can be much better conditioned than general Vandermonde matrices.

Notes

Algorithms for solving confluent Vandermonde-like systems and their rounding error analysis are described in the chapter “Vandermonde systems” of Higham (2002).

Gautschi has written many papers on the conditioning of Vandermonde matrices, beginning in 1962. We mention just his most recent paper on this topic: Gautschi (2011).

References

This is a minimal set of references, which contain further useful references within.

T. Bella, Y. Eidelman. I. Gohberg. I. Koltracht, and V. Olshevsky, A Fast Björck–Pereyra-Type Algorithm for Solving Hessenberg-Quasiseparable-Vandermonde Systems, SIAM J. Matrix Anal. Appl. 31(2), 790–815, 2009.
James W. Demmel and Plamen Koev, The Accurate and Efficient Solution of a Totally Positive Generalized Vandermonde Linear System, SIAM J. Matrix Anal. Appl. 2791, 142–152, 2005.
Walter Gautschi, Optimally Scaled and Optimally Conditioned Vandermonde and Vandermonde-like matrices, BIT 51, 103–125, 2011.
Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002.
Nicholas J. Higham, Stability Analysis of Algorithms for Solving Confluent Vandermonde-like Systems, SIAM J. Matrix Anal. Appl. 11, 23–41, 1990.

What Is the Wilson Matrix?

The $4\times 4$ matrix

$\notag W = \begin{bmatrix} 5 & 7 & 6 & 5 \\ 7 & 10 & 8 & 7 \\ 6 & 8 & 10 & 9 \\ 5 & 7 & 9 & 10 \end{bmatrix}$

appears in a 1946 paper by Morris, in which it is described as having been “devised by Mr. T. S. Wilson.” The matrix is symmetric positive definite with determinant $1$ and inverse

$\notag W^{-1} = \begin{bmatrix} 68 & -41 & -17 & 10\\ -41 & 25 & 10 & -6\\ -17 & 10 & 5 & -3\\ 10 & -6 & -3 & 2 \end{bmatrix},$

so it is moderately ill conditioned with $\kappa_2(W) = \|W\|_2 \|W^{-1}\|_2 \approx 2.98409\times 10^3$ . This little matrix has been used as an example and for test purposes in many research papers and books over the years, in particular by John Todd, who described it as “the notorious matrix $W$ of T. S. Wilson”.

Rutishauser (1968) stated that “the famous Wilson matrix is not a very striking example of an ill-conditioned matrix”, on the basis that $\kappa_2(A)\le 40{,}000$ for a “positive definite symmetric $4\times 4$ matrix with integer elements not exceeding $10$ ” and he gave the positive definite matrix

$\notag A_0 = \begin{bmatrix} 10 & 1 & 4 & 0 \\ 1 & 10 & 5 & -1 \\ 4 & 5 & 10 & 7 \\ 0 & -1 & 7 & 9 \end{bmatrix}, \quad A_0^{-1} =\begin{bmatrix} 105& 167 & -304 & 255\\ 167 & 266 & -484 & 406\\ -304 & -484 & 881 & -739\\ 255 & 406 & -739 & 620 \end{bmatrix},$

for which $\kappa_2(A_0) = 3.57924\times 10^4$ . The matrix $A_0$ is therefore a factor 12 more ill conditioned than $W$ . Rutishauser did not give a proof of the stated bound.

Moler (2018) asked how ill-conditioned $W$ is relative to matrices in the set

$\notag \begin{aligned} \mathcal{S} &= \{\, A\in\mathbb{R}^{4\times 4}: A=A^T \mathrm{~is~nonsingular~with~integer~entries}\nonumber\\ & \hspace{2.9cm} \mathrm{between~1~and~10} \,\}. \end{aligned}$

He generated one million random matrices from $\mathcal{S}$ and found that about 0.21 percent of them had a larger condition number than $W$ . The matrix with the largest condition number was the indefinite matrix

$\notag A_1 = \begin{bmatrix} 1 & 3 & 10 & 10\\ 3 & 4 & 8 & 9\\ 10 & 8 & 3 & 9\\ 10 & 9 & 9 & 3 \end{bmatrix}, \quad A_1^{-1} = \begin{bmatrix} 573 & -804 & 159 & 25\\ -804 & 1128 & -223 & -35\\ 159 & -223 & 44 & 7\\ 25 & -35 & 7 & 1 \end{bmatrix},$

for which $\kappa_2(A_1) \approx 4.80867\times 10^4$ . How far is this matrix from being a worst case?

As the Wilson matrix is positive definite, we are also interested in how ill conditioned a matrix in the set

$\notag \begin{aligned} \mathcal{P} &= \{\, A\in\mathbb{R}^{4\times 4}: A=A^T \mathrm{~is~symmetric~positive~definite ~with~integer~entries}\nonumber\\ & \hspace{2.9cm} \mathrm{between~1~and~10} \,\} \end{aligned}$

can be.

Condition Number Bounds

We first consider bounds on $\kappa_2(A)$ for $A \in \mathcal{S}$ . It is possible to obtain a bound from first principles by using the relation $A^{-1} = \mathrm{adj}(A)/\det(A)$ , where $\mathrm{adj}(A)$ is the adjugate matrix, along with the fact that $|\det(A)| \ge 1$ since $A$ has integer entries. Higham and Lettington (2021) found that the smallest bound they could obtain came from a bound of Merikoski et al. (1997): for nonsingular $B\in\mathbb{R}^{n\times n}$ ,

$\notag \kappa_2(B) \le \left(\displaystyle\frac{1+x}{1-x}\right)^{1/2}, \quad x = \sqrt{1 - (n/\|B\|_F^2)^n |\det(B)|^2 }.$

Applying this bound to $A\in\mathcal{S}$ , using the fact that $(1+x)/(1-x)$ is monotonically increasing for $x\in(0,1)$ , gives

$\notag \kappa_2(A) \le 2.97606\dots \times 10^5 =: \beta_S, \quad A\in\mathcal{S}. \qquad (1)$

Another result from Merikoski et al. (1997) gives, for symmetric positive definite $C\in\in\mathbb{R}^{n\times n}$ ,

$\notag \kappa_2(C) \le \displaystyle\frac{1+x}{1-x}, \quad x = \sqrt{1 - (n/\mathrm{trace}(C))^n \det(C) }.$

For $A\in\mathcal{P}$ , since $\det(A) \ge 1$ we have $x \le \sqrt{1 - (1/10)^4}$ , and hence

$\notag \kappa_2(A) \le 3.99980 \times 10^4 =: \beta_P, \quad A\in\mathcal{P}. \qquad (2)$

Recall that Rutishauser’s bound is $4\times 10^4$ . The bounds (1) and (2) remain valid if we modify the definitions of $\mathcal{S}$ and $\mathcal{P}$ to allow zero elements (note that Rutishauser’s matrix $A_0$ has a zero element).

Experiment

The sets $\mathcal{S}$ and $\mathcal{P}$ are large: $\mathcal{S}$ has on the order of $10^{10}$ elements. Exhaustively searching over the sets in reasonable time is possible with a carefully optimized code. Higham and Lettington (2021) use a MATLAB code that loops over all symmetric matrices with integer elements between $1$ and $10$ and

evaluates $\det(A)$ from an explicit expression (exactly computed for such matrices) and discards $A$ if the matrix is singular;
computes the eigenvalues $\lambda_i$ of $A$ and obtains the condition number as $\kappa_2(A) = \max_i |\lambda_i|/\min_i |\lambda_i|$ (since $A$ is symmetric); and
for $\mathcal{P}$ , checks whether $A$ is positive definite by checking whether the smallest eigenvalue is positive.

The code is available at https://github.com/higham/wilson-opt.

The maximum over $\mathcal{S}$ is attained for

$\notag A_2 = \begin{bmatrix} 2 & 7 & 10 & 10\\ 7 & 10 & 10 & 9\\ 10 & 10 & 10 & 1\\ 10 & 9 & 1 & 9 \end{bmatrix}, \quad A_2^{-1} = \begin{bmatrix} 640 & -987 & 323 & 240\\ -987 & 1522 & -498 & -370\\ 323 & -498 & 163 & 121\\ 240 & -370 & 121 & 90 \end{bmatrix},$

which has $\kappa_2(A_2) \approx 7.6119 \times 10^4$ . and determinant $-1$ . The maximum over $\mathcal{P}$ is attained for

$\notag A_3 = \begin{bmatrix} 9 & 1 & 1 & 5\\ 1 & 10 & 1 & 9\\ 1 & 1 & 10 & 1\\ 5 & 9 & 1 & 10 \end{bmatrix}, \quad A_3^{-1} = \begin{bmatrix} 188 & 347 & -13 & -405\\ 347 & 641 & -24 & -748\\ -13 & -24 & 1 & 28\\ -405 & -748 & 28 & 873 \end{bmatrix}.$

which has $\kappa_2(A_3) \approx 3.5529 \times 10^4$ and determinant $1$ . Obviously, symmetric permutations of these matrices are also optimal.

The following table summarizes the condition numbers of the matrices discussed and how close they are to the bounds.

Matrix $A$	Comment	$\kappa_2(A)$	$\beta_S/\kappa_2(A)$	$\beta_P/\kappa_2(A)$
$W$	Wilson matrix	$2.98409\times 10^3$	99.73	13.40
$A_9$	Rutishauser’s matrix	$3.57924\times 10^4$	8.31	1.12
$A_1$	By random sampling	$4.80867\times 10^4$	6.19	—
$A_2$	Optimal matrices in $\mathcal{S}$	$7.61190\times 10^4$	3.91	—
$A_3$	Optimal matrices in $\mathcal{P}$	$3.55286\times 10^4$	8.38	1.13

Clearly, the bounds are reasonably sharp.

We do not know how Wilson constructed his matrix or to what extent he tried to maximize the condition number subject to the matrix entries being small integers. One possibility is that he constructed it via the factorization in the next section.

Integer Factorization

The Cholesky factor of the Wilson matrix is

$\notag R = \begin{bmatrix} \sqrt{5} & \frac{7\,\sqrt{5}}{5} & \frac{6\,\sqrt{5}}{5} & \sqrt{5}\\[\smallskipamount] 0 & \frac{\sqrt{5}}{5} & -\frac{2\,\sqrt{5}}{5} & 0\\[\smallskipamount] 0 & 0 & \sqrt{2} & \frac{3\,\sqrt{2}}{2}\\[\smallskipamount] 0 & 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix} \quad (W = R^TR).$

Apart from the zero $(2,4)$ element, it is unremarkable. If we factor out the diagonal then we obtain the $LDL^T$ factorization, which has rational elements:

$\notag L = \begin{bmatrix}1 & 0 & 0 & 0\\ \frac{7}{5} & 1 & 0 & 0\\ \frac{6}{5} & -2 & 1 & 0\\ 1 & 0 & \frac{3}{2} & 1 \end{bmatrix}, \quad D = \begin{bmatrix}5 & 0 & 0 & 0\\ 0 & \frac{1}{5} & 0 & 0\\ 0 & 0 & 2 & 0\\ 0 & 0 & 0 & \frac{1}{2} \end{bmatrix} \quad (W = LDL^T).$

Suppose we drop the requirement of triangularity and ask whether the Wilson matrix has a factorization $W = Z^T\!Z$ with a $4\times4$ matrix $Z$ of integers. It is known that every symmetric positive definite $n\times n$ matrix $A$ of integers with determinant $1$ has a factorization $A = Z^T\!Z$ with $Z$ an $n\times n$ matrix of integers as long as $n \le 7$ , but examples are known for $n = 8$ for which the factorization does not exist. This result is mentioned by Taussky (1961) and goes back to Hermite, Minkowski, and Mordell. Higham and Lettington (2021) found the integer factor

$\notag Z_0 = \begin{bmatrix} 2 & 3 & 2 & 2\\ 1 & 1 & 2 & 1\\ 0 & 0 & 1 & 2\\ 0 & 0 & 1 & 1 \end{bmatrix}$

of $W$ , which is block upper triangular so can be thought of as a block Cholesky factor. Higham, Lettington, and Schmidt (2021) draw on recent research that links the existence of such factorizations to number-theoretic considerations of quadratic forms to show that for the existence of an integer solution $Z$ to $A = Z^TZ$ it is necessary that a certain quadratic equation in $n$ variables has an integer solution. In the case of the Wilson matrix the equation is

$2 w^2+x_1^2+x_1 x_2+x_1 x_3+x_2^2+x_2 x_3+x_3^2=952.$

The authors solve this equation computationally and find $Z_1$ and two rational factors:

$\notag Z_1=\left[ \begin{array}{cccc} \frac{1}{2} & 1 & 0 & 1 \\ \frac{3}{2} & 2 & 3 & 3 \\ \frac{1}{2} & 1 & 0 & 0 \\ \frac{3}{2} & 2 & 1 & 0 \\ \end{array} \right], \quad Z_2=\left[ \begin{array}{@{\mskip2mu}rrrr} \frac{3}{2} & 2 & 2 & 2 \\ \frac{3}{2} & 2 & 2 & 1 \\ \frac{1}{2} & 1 & 1 & 2 \\ -\frac{1}{2} & -1 & 1 & 1 \\ \end{array} \right].$

They show that these matrices are the only factors $Z\in\frac{1}{16}\mathbb{Z}$ of $W$ up to left multiplication by integer orthogonal matrices.

Conclusions

The Wilson matrix has provided sterling service throughout the digital computer era as a convenient symmetric positive definite matrix for use in textbook examples and for testing algorithms. The recent discovery of its integer factorization has led to the development of new theory on when general $n\times n$ integer matrices $A$ can be factored as $A = Z^TZ$ (when $A$ is symmetric positive definite) or $A = Z^2$ (a problem also considered in Higham, Lettington, and Schmidt (2021)), with integer $Z$ .

Olga Taussky Todd wrote in 1961 that “matrices with integral elements have been studied for a very long time and an enormous number of problems arise, both theoretical and practical.” We wonder what else can be learned from the Wilson matrix and other integer test matrices.

References

This is a minimal set of references, which contain further useful references within.

Nicholas J. Higham and Matthew C. Lettington, Optimizing and Factorizing the Wilson Matrix, to appear in Amer. Math. Monthly, 2021.
Nicholas J. Higham, Matthew C. Lettington, and Karl Michael Schmidt, Integer Matrix Factorisations, Superalgebras and the Quadratic Form Obstruction, Linear Algebra Appl. 622, 250–267, 2021.
Cleve B. Moler, Reviving Wilson’s Matrix, 2018.
H. Rutishauser, On Test Matrices, 7 (no. 165), 349-365, in: Programmation en Mathematiques Numeriques, Besancon, 1968.
Olga Taussky, Some Computational Problems Involving Integral Matrices, Journal of Research of the National Bureau of Standards Section B—Mathematical Sciences 65(1), 15–17, 1961.

Symmetrization

LU Factorization

Inverse

Eigenvalues

Notes

References

Related Blog Posts

Share this:

Block Algorithms

References

Related Blog Posts

Share this:

Mixed Subordinate Matrix Norms

Matrices With Constant -Norms

Computing Norms

A Norm Related to Grothendieck’s Inequality

Notes

References

Related Blog Posts

Share this:

References

Share this:

Examples

Inverse

Eigensystem

LU Factorization

Factorization into a Product of Bidiagonal Matrices

Testing for Total Positivity

Notes

References

Related Blog Posts

Share this:

References

Share this:

References

Related Blog Posts

Share this:

Inequalities

Nearness to Singularity and Conditioning

Notes

References

Related Blog Posts

Share this:

Determinant

Inverse

Example

Conditioning

Generalizations

Notes

References

Related Blog Posts

Share this:

Condition Number Bounds

Experiment

Integer Factorization

Conclusions

References

Related Blog Posts

Share this:

Matrices With Constant $p$ -Norms