What Is a Totally Nonnegative Matrix?

The determinant of a square submatrix of a matrix is called a minor. A matrix A\in\mathbb{R}^{n\times n} is totally positive if every minor is positive. It is totally nonnegative if every minor is nonnegative. These definitions require, in particular, that all the matrix elements must be nonnegative or positive, as must \det(A).

An important property is that total nonnegativity is preserved under matrix multiplication and hence under taking positive integer powers.

Theorem 1. If A,B\in\mathbb{R}^{n\times n} are totally nonnegative then so is AB.

Theorem 1 is a direct consequence of the Binet–Cauchy theorem on determinants (also known as the Cauchy–Binet theorem). To state it, we need a way of specifying submatrices. We say the vector \alpha = [\alpha_1,\alpha_2,\dots,\alpha_k] is an index vector of order k if its components are integers from the set \{1,2,\dots,n\} satisfying \alpha_1 < \alpha_2 < \cdots < \alpha_k. If \alpha and \beta are index vectors of order k and \ell, respectively, then A(\alpha, \beta) denotes the k\times \ell matrix with (i,j) element a_{\alpha_i,\beta_j}.

Theorem 2. (Binet–Cauchy) Let A\in\mathbb{R}^{m\times n}, B\in\mathbb{R}^{n\times p}, and C = AB. If \alpha and \beta are index vectors of order k and 1 \le k \le \min(m,n,p) then

\notag     \det(C(\alpha,\beta)) = \sum_{\gamma} \det( A(\alpha,\gamma) )                                           \det( B(\gamma,\beta) ),      \qquad (1)

where the sum is over all index vectors \gamma of order k.

Note than when k = m = n = p, (1) reduces to the well-known relation \det(AB) = \det(A)\det(B), while when k = 1, (1) reduces to the definition of matrix multiplication.

Totally nonnegative matrices have many interesting determinantal properties. For example, they satisfy Fischer’s inequality, first proved for symmetric positive definite matrices.

Theorem 3. (Fischer) If A\in\mathbb{R}^{n\times n} is totally nonnegative then for any index vector \alpha,

\notag     \det(A) \le \det(A(\alpha)) \det(A(\alpha^c)),      \qquad (2)

where \alpha^c comprises the indices not in \alpha.

By repeatedly applying (2) with \alpha containing just one element, we obtain Hadamard’s inequality for totally nonnegative A:

\notag      \det(A) \le a_{11} a_{22} \cdots a_{nn}.

Examples

We give some examples of totally positive matrices, showing how they can be generated in MATLAB. We use the Anymatrix toolbox.

A matrix well known to be positive definite, but which is also totally positive, is the Hilbert matrix H\in\mathbb{R}^{n\times n}, with h_{ij} = 1/(i+j-1). The Hilbert matrix is a particular case of a Cauchy matrix C, with c_{ij} = 1/(x_i + y_j) for given vectors x,y\in\mathbb{R}^{n\times n}. A Cauchy matrix is totally positive if 0 < x_1 < x_2 < \cdots < x_n and 0 < y_1 < y_2 < \cdots < y_n, which follows from the formula

\notag    \det(C_n) = \displaystyle\frac{\displaystyle\prod_{1\le i < j \le n} (x_j-x_i) (y_j-y_i) }                {\displaystyle\prod_{1\le i,j \le n} (x_i+y_j) }.

In MATLAB, the Hilbert matrix is hilb(n) and the Cauchy matrix can be generated by gallery('cauchy',x,y) (or anymatrix('gallery/cauchy',x,y)).

A Vandermonde matrix

\notag     V = V(x_1,x_2,\dots,x_n)       = \begin{bmatrix}                 1      &   1    & \dots & 1     \\                 x_1   & x_2   & \dots & x_n  \\                 \vdots &\vdots  &       & \vdots \\                 x_1^{n-1} & x_2^{n-1} & \dots & x_n^{n-1}    \end{bmatrix}       \in \mathbb{C}^{n\times n}

is totally positive if the points x_i satisfy 0 < x_1 < x_2 < \cdots < x_n. As a partial check, the general formula

\notag   \det(V) = \displaystyle\prod_{1\le i < j \le n}^n (x_i - x_j)

shows that every leading principal minor is positive. In MATLAB, a Vandermonde matrix can be generated by anymatrix('core/vand',x).

The Pascal matrix P_n\in\mathbb{R}^{n\times n} is defined by

p_{ij} = \displaystyle\frac{ (i+j-2)! }{ (i-1)! (j-1)! } =                    {i+j-2 \choose j-1}.

For example, in MATLAB:

>> P = pascal(5)
P =
     1     1     1     1     1
     1     2     3     4     5
     1     3     6    10    15
     1     4    10    20    35
     1     5    15    35    70

The Pascal matrix is totally positive for all n (see the section below on bidiagonal factorizations).

The one-parameter correlation matrix C_n(\theta) with off-diagonal elements given by \theta with 0 \le \theta < 1, illustrated by

\notag    C_3(\theta) =   \begin{bmatrix}         1 & \theta & \theta \\    \theta & 1      & \theta \\    \theta & \theta & 1 \\   \end{bmatrix},

is not totally positive because while the principal minors are all positive, the submatrix A([1,2],[2,3]) =    \bigl[\begin{smallmatrix}    \theta & \theta \\    1      & \theta    \end{smallmatrix}\bigr] has nonpositive determinant. However, the Kac–Murdock–Szegö matrix K_n(\theta) = (\theta^{|i-j|}), with 0 \le \theta < 1, illustrated by

\notag    K_3(\theta) =   \begin{bmatrix}         1   & \theta & \theta^2 \\    \theta   & 1      & \theta \\    \theta^2 & \theta & 1 \\   \end{bmatrix}

is totally positive thanks to the decay of the elements way from the diagonal. In MATLAB, the Kac–Murdock–Szegö matrix can be generated by gallery('kms',n,rho).

The lower Hessenberg Toeplitz matrix H_n with all elements 1 on and below the superdiagonal, illustrated for n = 4 by

\notag   H_4 =   \begin{bmatrix}   1 & 1 & 0 & 0 \\   1 & 1 & 1 & 0 \\   1 & 1 & 1 & 1 \\   1 & 1 & 1 & 1 \\   \end{bmatrix},

is totally nonnegative. It has \lfloor n/2 \rfloor zero eigenvalues, which appear in a single Jordan block, and its largest eigenvalue is 2(1+\cos(2\pi/(n+2))). In MATLAB, this matrix can be generated by anymatrix('core/hessfull01',n). This and other binary totally nonnegative matrices are studied by Brualdi and Kirkland (2010).

Finally, consider a nonnegative 4\times 4 bidiagonal matrix factorized into a product of elementary nonnegative bidiagonal matrices (nonnegative means that the elements of the matrix are nonnegative):

\notag \begin{aligned}  L = \begin{bmatrix}         1         & 0         & 0         & 0 \\         \ell_{21} & 1         & 0         & 0 \\         0         & \ell_{32} & 1         & 0 \\         0         & 9         & \ell_{43} & 1 \\      \end{bmatrix}  &=      \begin{bmatrix}         1         & 0         & 0         & 0 \\         \ell_{21} & 1         & 0         & 0 \\         0         & 0         & 1         & 0 \\         0         & 0         & 0         & 1 \\      \end{bmatrix}      \begin{bmatrix}         1         & 0         & 0         & 0 \\         0         & 1         & 0         & 0 \\         0         & \ell_{32} & 1         & 0 \\         0         & 0         & 0         & 1 \\      \end{bmatrix}      \begin{bmatrix}         1         & 0         & 0         & 0 \\         0         & 1         & 0         & 0 \\         0         & 0         & 1         & 0 \\         0         & 0         & \ell_{43} & 1 \\      \end{bmatrix}\\    &\equiv L_1(\ell_{21})            L_2(\ell_{32})            L_3(\ell_{43}). \end{aligned}

It is easy to see by inspection that L_1, L_2, and L_3 are totally nonnegative, so L is totally nonnegative by Theorem 1. With D = \mathrm{diag}(1,-1,1,1,-1), we have

\notag  \begin{aligned}   DL^{-1}D   &= (DLD)^{-1}   = (DL_1D \cdot DL_2D \cdot DL_3D )^{-1}\\   &= (DL_3D)^{-1} (DL_2D)^{-1} (DL_3D)^{-1}\\   &= L_3(\ell_{43}) L_2(\ell_{32}) L_1(\ell_{21}), \qquad (3)\ \end{aligned}

which is a product of totally nonnegative matrices and hence is totally nonnegative by Theorem 1. This example clearly generalizes to show that an n\times n nonnegative bidiagonal matrix is totally nonnegative.

Inverse

Recall that the inverse of a nonsingular A\in\mathbb{R}^{n\times n} is given by A^{-1} = \mathrm{adj}(A)/\det(A), where

\mathrm{adj}(A) = \bigl( (-1)^{i+j} \det(A_{ji}) \bigr)

and A_{pq} denotes the submatrix of A obtained by deleting row p and column q. If A is nonsingular and totally nonnegative then it follows that A^{-1} has a checkerboard (alternating) sign pattern. Indeed, we can write A^{-1} = DBD, where D = \mathrm{diag}((-1)^{i+1}) and B has nonnegative elements, and in fact it can be shown that B is totally nonnegative using Theorem 1, Theorem 6, and (3). For example, here is the inverse of the 4\times 4 Pascal matrix:

>> inv(sym(pascal(5)))
ans =
[  5, -10,  10,  -5,  1]
[-10,  30, -35,  19, -4]
[ 10, -35,  46, -27,  6]
[ -5,  19, -27,  17, -4]
[  1,  -4,   6,  -4,  1]

Eigensystem

A totally nonnegative matrix has nonnegative trace and determinant, so the sum and product of its eigenvalues are both nonnegative. In fact, all the eigenvalues are real and nonnegative. Since a Jordan block corresponding to a nonnegative eigenvalue is totally nonnegative any Jordan form with nonnegative eigenvalues is possible. More can be said of A is irreducible. Recall that a matrix A\in\mathbb{C}^{n\times n} is irreducible if there does not exist a permutation matrix P such that

\notag         P^TAP = \begin{bmatrix} A_{11} & A_{12} \\                                    0   & A_{22}                  \end{bmatrix}

where A_{11} and A_{22} are square, nonempty submatrices.

Theorem 4. If A\in\mathbb{R}^{n\times n} is totally nonnegative then its eigenvalues are all real and nonnegative. If A is also irreducible then the positive eigenvalues are distinct.

If A is nonsingular and totally nonnegative and irreducible then by the theorem we can write the eigenvalues as \lambda_1 > \lambda_2 > \cdots > \lambda_n >0. It is known that the eigenvector x_k associated with \lambda_k has k-1 sign changes, that is, (x_k)_{i+1} and (x_k)_i have opposite signs for k-1 values of i (any zero elements are deleted before counting sign changes). Note that for k=1, we already know from Perron–Frobenius theory that there is a positive eigenvector x_1. This result is illustrated by the Pascal matrix above:

>> A = pascal(5); [V,d] = eig(A,'vector'); [~,k] = sort(d,'descend');
>> evals = d', evecs = V(:,k)
evals =
   1.0835e-02   1.8124e-01   1.0000e+00   5.5175e+00   9.2290e+01
evecs =
   1.7491e-02   2.4293e-01  -7.6605e-01  -5.7063e-01   1.6803e-01
   7.4918e-02   4.8079e-01  -3.8302e-01   5.5872e-01  -5.5168e-01
   2.0547e-01   6.1098e-01   1.6415e-01   2.5292e-01   7.0255e-01
   4.5154e-01   4.1303e-01   4.3774e-01  -5.1785e-01  -4.0710e-01
   8.6486e-01  -4.0736e-01  -2.1887e-01   1.7342e-01   9.0025e-02

Note that the number of sign changes (but not the number of negative elements) increases by 1 as we go from one column to the next

The class of nonsingular totally nonnegative irreducible matrices is known as the oscillatory matrices, because such matrices arise in the analysis of small oscillations of elastic systems. An equivalent definition (in fact, the usual definition) is that an oscillatory matrix is a totally nonnegative matrix for which A^q is totally positive for some positive integer q.

LU Factorization

The next result shows that a totally nonnegative matrix has an LU factorization with special properties. We will need the following special case of Fischer’s inequality (Theorem 3):

\notag    \det(A) \le \det \bigl( A(1\colon p,1\colon p) \bigr)                \det \bigl( A(p+1\colon n,p+1\colon n) \bigr),    \quad p=1\colon n-1. \qquad (4)

Theorem 5. If A\in\mathbb{R}^{n\times n} is nonsingular and totally nonnegative then it has an LU factorization with L and U totally nonnegative and the growth factor \rho_n = 1.

Proof. Since A is nonsingular and every minor is nonnegative, (4) shows that \det(A(1\colon p,1\colon p))>0 for p=1\colon n-1, which guarantees the existence of an LU factorization. That the elements of L and U are nonnegative follows from explicit determinantal formulas for the elements of L and U. The total nonnegativity of L and U is proved by Cryer (1976). Gaussian elimination starts with A^{(1)} = A and computes a_{ij}^{(k+1)} = a_{ij}^{(k)} - m_{ik}a_{kj}^{(k)} = a_{ij}^{(k)} - \ell_{ik} u_{kj} \le a_{ij}^{(k)}, since \ell_{ik}, u_{kj} \ge 0. Thus a_{ij} = a_{ij}^{(1)} \ge a_{ij}^{(2)} \ge \cdots \ge a_{ij}^{(r)}, r = \min (i,j). For i > j, a_{ij}^{(r)} \ge a_{ij}^{(r+1)} = \cdots = a_{ij}^{(n)} = 0; for j \ge i, a_{ij}^{(r)} = \cdots = a_{ij}^{(n)} = u_{ij} \ge 0. Thus 0 \le a_{ij}^{(k)} \le a_{ij} for all i,j,k and hence \rho_n \le 1. But \rho_n\ge1, so \rho_n=1.

Theorem 5 implies that it is safe to compute the LU factorization without pivoting of a nonsingular totally nonnegativity matrix: the factorization does not break down and it is numerically stable. In fact, the computed LU factors have a strong componentwise form of stability. As shown by De Boor and Pinkus (1977), for small enough unit roundoff u the computed factors \widehat{L} and \widehat{U} will have nonnegative elements and so from the standard backward error result for LU factorization,

\notag          \widehat{L}\widehat{U} = A + \Delta A, \quad          |\Delta A| \le \gamma_n |\widehat{L}||\widehat{U}|         \quad \Bigl(\gamma_n = \displaystyle\frac{nu}{1-nu} \Bigr),

we have

\notag    |\widehat{L}||\widehat{U}| = |\widehat{L}\widehat{U}| = |A + \Delta A|    \le |A| + \gamma_n |\widehat{L}||\widehat{U}|,

which gives |\widehat{L}||\widehat{U}| \le (1 - \gamma_n)^{-1}|A| and hence

\notag        \widehat{L}\widehat{U} = A + \Delta A, \quad         |\Delta A| \le \displaystyle\frac{\gamma_n}{1-\gamma_n} |A|,

which is about as strong a backward error result as we could hope for. The significance of this result is reduced, however, by the fact that for some important classes of totally nonnegative matrices, including Vandermonde matrices and Cauchy matrices, structure-exploiting linear system solvers exist that are substantially faster, and potentially more accurate, than LU factorization.

Factorization into a Product of Bidiagonal Matrices

We showed above that any nonnegative bidiagonal matrix is totally nonnegative. The next result shows that any nonsingular totally nonnegative matrix has an LU factorization in which L and U can be factorized into a product of nonnegative bidiagonal matrices.

Theorem 6. (Gasca and Peña, 1996) A nonsingular matrix A\in\mathbb{R}^{n\times n} is totally nonnegative if and only if it it can be factorized as

\notag    A = L_{n-1} L_{n-2} \dots L_1 D U_1 U_2 \dots U_{n-1}, \qquad (5)

where D is a diagonal matrix with positive diagonal entries and L_i and U_i are unit lower and unit upper bidiagonal matrices, respectively, with the first i-1 entries along the subdiagonal of L_i and U_i^T zero and the rest nonnegative.

An analogue of Theorem 6 holds for totally positive matrices, the only difference being that the last n-i+1 subdiagonal entries of L_i and U_i^T are positive.

The factorization (5) can be computed by Neville elimination, which is a version of Gaussian elimination in which the eliminations are between adjacent rows, working from the bottom of each column upwards.

This factorization into bidiagonal factors can be used to obtain simple proofs of various properties of totally nonnegative matrices and totally positive matrices (Fallat, 2001). It also provides an efficient way to generates such matrices. If all the parameters in D and the L_i and U_i are set to 1 then the Pascal matrix is generated.

Testing for Total Positivity

An n\times n matrix has \sum_{i=1}^n {n \choose k} = 2^n-1 principal minors (ones based on submatrices centred on the diagonal) and \sum_{i=1}^n {n \choose k}^2 = {2n \choose n} -1 \approx 4^n/(n\pi)^{1/2} minors in total. However, it is not necessary to check all these minors to test for total positivity.

Theorem 7. (Gasca and Peña, 1996) The matrix A\in\mathbb{R}^{n\times n} is totally positive if and only if \det(A(\alpha,\beta)) > 0 for all index vectors \alpha and \beta such that one of \alpha and \beta is [1,2,\dots,k] and the entries of the other are k consecutive integers.

Theorem 7 shows that only n(n+1) minors need to be tested. Gasca and Peña have also show that total nonnegativity can be tested by checking about 2^{n+1} + n^2/2 minors. A more efficient way to test for total nonnegativity is to compute the factorization in Theorem 6 and check the signs of the entries.

Notes

The results we have described show that totally nonnegative and totally positive matrices are analogous in many ways to symmetric positive (semi)definite matrices. The analogies go further because totally nonnegative and totally positive matrices also satisfy eigenvalue interlacing inequalities (albeit weaker than for symmetric matrices) and the eigenvalues of an oscillatory matrix majorize the diagonal elements. See Fallat and Johnson (2011) or Fallat (2014) for details.

References

This is a minimal set of references, which contain further useful references within.

What Is the Perron–Frobenius Theorem?

A real matrix is nonnegative if all its elements are nonnegative and it is positive if all its elements are positive. Nonnegative matrices arise in a wide variety of applications, for example as matrices of probabilities in Markov processes and as adjacency matrices of graphs. Information about the eigensystem is often essential in these applications.

Perron (1907) proved results about the eigensystem of a positive matrix and Frobenius (1912) extended them to nonnegative matrices.

The following three results of increasing specificity summarize the key spectral properties of nonnegative matrices proved by Perron and Frobenius. Recall that a simple eigenvalue of an n\times n matrix is one with algebraic multiplicity 1, that is, it occurs only once in the set of n eigenvalues. We denote by \rho(A) the spectral radius of A, the largest absolute value of any eigenvalue of A.

Theorem 1. (Perron–Frobenius) If A\in\mathbb{R}^{n\times n} is nonnegative then

  1. \rho(A) is an eigenvalue of A,
  2. there is a nonnegative eigenvector x such that Ax = \rho(A)x.

A matrix A\in\mathbb{C}^{n\times n} is reducible if there is a permutation matrix P such that

\notag         P^TAP = \begin{bmatrix} A_{11} & A_{12} \\                                    0   & A_{22}                  \end{bmatrix}

where A_{11} and A_{22} are square, nonempty submatrices; it is irreducible if it is not reducible. Examples of reducible matrices are triangular matrices and matrices with a zero row or column. A positive matrix is trivially irreducible.

Theorem 2. (Perron–Frobenius) If A\in\mathbb{R}^{n\times n} is nonnegative and irreducible then

  1. \rho(A) is an eigenvalue of A,
  2. \rho(A)>0,
  3. there is a positive eigenvector x such that Ax = \rho(A) x,
  4. \rho(A) is a simple eigenvalue.

Theorem 3. (Perron) If A\in\mathbb{R}^{n\times n} is positive then Theorem 2 holds and, in addition, |\lambda| < \rho(A) for any eigenvalue \lambda with \lambda \ne \rho(A).

For nonnegative, irreducible A, the eigenvalue \rho(A) is called the Perron root of A and the corresponding positive eigenvector x, normalized so that \|x\|_1 = 1, is called the Perron vector.

It is a good exercise to apply the theorems to all binary 2\times 2 matrices. Here are some interesting cases.

  • A = \bigl[\begin{smallmatrix}0 & 1 \\ 0 & 0 \end{smallmatrix}\bigr]: Theorem 1 says that \rho(A) = 0 is an eigenvalue and and that it has a nonnegative eigenvector. Indeed [1~0]^T is an eigenvector. Note that A is reducible and 0 is a repeated eigenvalue.
  • A = \bigl[\begin{smallmatrix}0 & 1 \\ 1 & 0 \end{smallmatrix}\bigr]: A is irreducible and Theorem 2 says that \rho(A) is a simple eigenvalue with positive eigenvector. Indeed the eigenvalues are \pm 1 and [1~1]^T/2 is the Perron vector for the Perron root 1. This matrix has two eigenvalues of maximal modulus.
  • A = \bigl[\begin{smallmatrix}1 & 1 \\ 1 & 1 \end{smallmatrix}\bigr]: Theorem 3 says that \rho(A) = 2 is an eigenvalue with positive eigenvector and that the other eigenvalue has modulus less than 2. Indeed the eigenvalues are the Perron root 2, with Perron vector [1~1]^T/2, and 0.

For another example, consider the irreducible matrix

\notag   B = \begin{bmatrix} 0 & 0 & 1\\                 1 & 0 & 0\\                 0 & 1 &  0                 \end{bmatrix}, \quad      \Lambda(B) = \bigl\{ 1,             \textstyle\frac{1}{2}( -1 \pm \sqrt{3}\mskip1mu\mathrm{i} ) \bigr\}.

Note that B is a companion matrix and a permutation matrix. Theorem 2 correctly tells us that \rho(A) = 1 is an eigenvalue of A, and that it has a corresponding positive eigenvector, the Perron vector [1~1~1]^T/3. Two of the eigenvalues are complex, however, and all three eigenvalues have modulus 1, as they must because B is orthogonal.

A stochastic matrix is a nonnegative matrix whose row sums are all equal to 1. A stochastic matrix satisfies Ae = e, where e = [1,1,\dots,1]^T, which means that A has an eigenvalue 1, and so \rho(A) \ge 1. Since \rho(A) \le \|A\| for any norm, by taking the \infty-norm we conclude that \rho(A) = 1. For a stochastic matrix, Theorem 1 does not give any further information. If A is irreducible then Theorem 2 tells us that \rho(A) is a simple eigenvalue, and if A is positive Theorem 3 tells us that every other eigenvalue has modulus less than \rho(A).

The next result is easily proved using Theorem 3 together with the Jordan canonical form. It shows that the powers of a positive matrix behave like multiples of a rank-1 matrix.

Theorem 4. If A\in\mathbb{R}^{n\times n} is positive, x is the Perron vector of A, and y is the Perron vector of A^T then

\notag    \displaystyle\lim_{k\to\infty} \left( \displaystyle\frac{A}{\rho(A)} \right)^k = \displaystyle\frac{xy^T}{y^Tx}.

Note that y in the theorem is a left eigenvector of A corresponding to \rho(A), that is, y^TA = \rho(A)y^T (since \rho(A^T) = \rho(A)).

If A is stochastic and positive then Theorem 4 is applicable and x = n^{-1}e. If A also has unit column sums, so that it is doubly stochastic, then y = n^{-1}e and Theorem 4 says that \lim_{k\to\infty}A^k = n^{-1}ee^T. We illustrate this result in MATLAB using a scaled magic square matrix.

>> n = 4; M = magic(n), A = M/sum(M(1,:)) % Doubly stochastic matrix.
A =
    16     2     3    13
     5    11    10     8
     9     7     6    12
     4    14    15     1
A =
   4.7059e-01   5.8824e-02   8.8235e-02   3.8235e-01
   1.4706e-01   3.2353e-01   2.9412e-01   2.3529e-01
   2.6471e-01   2.0588e-01   1.7647e-01   3.5294e-01
   1.1765e-01   4.1176e-01   4.4118e-01   2.9412e-02

>> for k = 8:8:32, fprintf('%11.2e',norm(A^k-ones(n)/n,1)), end, disp(' ')
   3.21e-05   7.37e-10   1.71e-14   8.05e-16 

References

This is a minimal set of references, which contain further useful references within.

  • Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013. Chapter 8. My review of the second edition.
  • Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2000. Chapter 8.
  • Helene Shapiro, Linear Algebra and Matrices. Topics for a Second Course, American Mathematical Society, Providence, RI, USA, 2015. Chapter 17.

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is the Kac–Murdock–Szegö Matrix?

The Kac–Murdock–Szegö matrix is the symmetric Toeplitz matrix

\notag A_n(\rho) = \begin{bmatrix}    1          & \rho       & \rho^2 & \dots  & \rho^{n-1} \\    \rho       & 1          & \rho   & \dots  & \rho^{n-2} \\    \rho^2     & \rho       & 1      & \ddots & \vdots     \\    \vdots     & \vdots     & \ddots & \ddots & \rho       \\    \rho^{n-1} & \rho^{n-2} & \dots  & \rho   & 1 \end{bmatrix} \in\mathbb{R}^{n\times n}. \qquad(1)

It was considered by Kac, Murdock, and Szegö (1953), who investigated its spectral properties. It arises in the autoregressive AR(1) model in statistics and signal processing.

The matrix is singular for \rho=1, as A_n(1) is the rank-1 matrix ee^T, and it is also rank-1 for \rho = -1, as in this case every column is a multiple of the vector with alternating elements \pm 1. The determinant \det(A_n(\rho)) = (1-\rho^2)^{n-1}. For \rho \ne \pm 1, A_n is nonsingular and the inverse is the tridiagonal (but not Toeplitz) matrix

\notag   A_n(\rho)^{-1} = \displaystyle\frac{1}{1-\rho^2}   \begin{bmatrix}    1      & -\rho    & 0        & \dots  & \dots    & 0      \\    -\rho  & 1+\rho^2 & -\rho    & \dots  & \dots    & 0      \\    0      & -\rho    & 1+\rho^2 & \ddots & \dots    & \vdots \\    \vdots & \vdots   & \ddots   & \ddots & \ddots   & 0      \\    0      & \dots    & \dots    & -\rho  & 1+\rho^2 & -\rho  \\    0      & \dots    & \dots    & 0      & -\rho    & 1   \end{bmatrix}. \qquad (2)

For -1 < \rho < 1, A_n(\rho) is positive definite, since every leading principal submatrix has positive determinant, as can also be seen by noting that the inverse is diagonally dominant with positive diagonal, so that A_n^{-1} is positive definite and hence A_n is positive definite.

For -1 \le \rho \le 1, A_n(\rho) is positive semidefinite, so it is a correlation matrix for \rho in this range.

For 0 \le \rho \le 1, A_n(\rho) is totally nonnegative, that is. every submatrix has nonnegative determinant. For 0 < \rho < 1, we know that A_n(\rho) is nonsingular, and it is clearly irreducible, and together with the total nonnegativity these properties imply that the eigenvalues are distinct and positive (this can also be deduced from the fact that the inverse is tridiagonal with nonzero subdiagonal and superdiagonal entries).

It is straightforward to verify that A_n has a factorization A_n = LDL^* with L the inverse of a unit lower bidiagonal matrix:

\notag  L =   \begin{bmatrix}    1     &          &          &        &\\    -\rho & 1        &          &        &\\          & -\rho    & 1        &        &\\          &          & \ddots   & \ddots &\\          &          &          &-\rho   &1   \end{bmatrix}^{-1}, \quad  D = \mathrm{diag}(1, 1-\rho^2, 1-\rho^2, \dots, 1-\rho^2).  \qquad (3)

This factorization can be used to prove all the properties stated above.

From (1) and (2) we can derive the formulas

\notag    \begin{aligned}   \|A_n\|_{1,\infty} &=        2 \left(\displaystyle\frac{1-\rho^{k+1}}{1-\rho}\right)        -1 - (2k-n+1) \rho^k,    \quad k = \lfloor n/2 \rfloor, \\   \|A_n^{-1}\|_{\infty} &= (1+2\rho+\rho^2)/(1-\rho^2) = (1+\rho)/(1-\rho).    \end{aligned}

Hence we have an explicit formula for the condition number \kappa_p(A_n) = \|A_n\|_{1,\infty} \|A_n^{-1}\|_{1,\infty} for p = 1,\infty.

We can allow \rho to be complex, in which case the definition (1) is modified to conjugate the elements below the diagonal. The factorization A = LDL^* continues to hold with D in (2) replaced by \mathrm{diag}(1, 1-|\rho|^2, 1-|\rho|^2, \dots, 1-|\rho|^2).

The Kac–Murdock–Szegö matrix (for real or complex \rho) can be generated in MATLAB as gallery('kms',n,rho).

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is the Determinant of a Matrix?

The determinant of an n\times n matrix A is defined by

\notag   \det(A) = \displaystyle\sum_j (-1)^{\mathop{\mathrm{sgn}}j}                   a_{1,j_1}a_{2,j_2} \dots a_{n,j_n}, \qquad (1)

where the sum is over all n! permutations j = (j_1,j_2,\dots,j_n) of the sequence (1,2,\dots,n) and \mathop{\mathrm{sgn}}j is the number of inversions in j, that is, the number of pairs (j_k,j_\ell) with k  j_\ell. Each term in the sum is a signed product of n entries of A and the product contains one entry taken from each row and one from each column.

The determinant is sometimes written with vertical bars, as |A|.

Three fundamental properties are

\notag \begin{aligned} \det(\alpha A) &= \alpha^n \det(A)\; \mathrm{for~any~scalar~}\alpha,\qquad(2)\\ \det(A^T) &= \det(A), \qquad(3)\\ \det(AB) &= \det(A)\det(B) \mathrm{~for~} n\times n~ A \mathrm{~and~} B.\qquad(4) \end{aligned}

The first property is immediate, the second can be proved using properties of permutations, and the third is proved in texts on linear algebra and matrix theory.

An alternative, recursive expression for the determinant is the Laplace expansion

\notag   \det(A) = \displaystyle\sum_{j=1}^n (-1)^{i+j} a_{ij} \det (A_{ij}).  \qquad(5)

for any i\in\{1,2,\dots,n\}, where A_{ij} denotes the (n-1)\times (n-1) submatrix of A obtained by deleting row i and column j, and \det(a) = a for a scalar a. This formula is called the expansion by minors because \det (A_{ij}) is a minor of A.

For some types of matrices the determinant is easy to evaluate. If T is triangular then \det(T) = \prod_{i=1}^n t_{ii}. If Q is unitary then Q^*Q = I implies |\det(Q)| = 1 on using (3) and (4). An explicit formula exists for the determinant of a Vandermonde matrix.

The determinant of A is connected with the eigenvalues \lambda_i of A via the property \det(A) = \prod_{i=1}^n \lambda_i. Since the eigenvalues are the roots of the characteristic polynomial \det(tI - A), this relation follows by setting t = 0 in the expression

\notag   \det(tI - A) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0                = \displaystyle\prod_{i=1}^n (t - \lambda_i).

For n=2, the determinant is

\notag     \det\biggl( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \biggr)      = ad - bc,

but already for n=3 the determinant is tedious to write down. If one must compute \det(A), the formulas (1) and (5) are too expensive unless n is very small: they have an exponential cost. The best approach is to use a factorization of A involving factors that are triangular or orthogonal, so that the determinants of the factors are easily computed. If PA = LU is an LU factorization, with P a permutation matrix, L unit lower triangular, and U upper triangular, then \det(A) = \det(P) \prod_{i=1}^n u_{ii} = \pm \prod_{i=1}^n u_{ii}. As this expression indicates, the determinant is prone to overflow and underflow in floating-point arithmetic, so it may be preferable to compute \log(|\det(A)|) = \sum_{i=1}^n \log|u_{ii}|.

The determinant features in the formula

\notag   A^{-1} = \displaystyle\frac{\mathrm{adj}(A)}{\det(A)}

for the inverse, where \mathrm{adj}(A) is the adjugate of A (recall that \mathrm{adj}(A) has (i,j) element (-1)^{i+j} \det(A_{ji})). More generally, Cramer’s rule says that the components of the solution to a linear system Ax = b are given by x_i = \det(A_i(b))/\det(A), where A_i(b) denotes A with its ith column replaced by b. While mathematically elegant, Cramer’s rule is of no practical use, as it is both expensive and numerically unstable in finite precision arithmetic.

Inequalities

A celebrated bound for the determinant of a Hermitian positive definite matrix H\in\mathbb{C}^{n\times n} is Hadamard’s inequality. Note that for such H, \det(H) is real and positive (being the product of the eigenvalues, which are real and positive) and the diagonal elements are also real and positive (since h_{ii} = e_i^*He_i > 0).

Theorem 1 (Hadamard’s inequality). For a Hermitian positive definite matrix H\in\mathbb{C}^{n\times n},

\notag     \det(H) \le \displaystyle\prod_{i=1}^n h_{ii},

with equality if and only if H is diagonal.

Theorem 1 is easy to prove using a Cholesky factorization.

The following corollary can be obtained by applying Theorem 1 to H = A^*A or by using a QR factorization of A.

Corollary 2. For A = [a_1,a_2,\dots,a_n] \in\mathbb{C}^{n\times n},

\notag     \det(A) \le \displaystyle\prod_{i=1}^n \|a_i\|_2,

with equality if and only if the columns of A are orthogonal.

Obviously, one can apply the corollary to A^* and obtain the analogous bound with column norms replaced by row norms.

The determinant of A\in\mathbb{R}^{n\times n} can be interpreted as the volume of the parallelepiped \{\, \sum_{i=1}^n t_ia_i : 0 \le t_i \le 1, ~ i = 1\colon n\,\}, whose sides are the columns of A. Corollary 2 says that for columns of given lengths the volume is maximized when the columns are orthogonal.

Nearness to Singularity and Conditioning

The determinant characterizes nonsingularity: A is singular if and only if \det(A) = 0. It might be tempting to use |\det(A)| as a measure of how close a nonsingular matrix A is to being singular, but this measure is flawed, not least because of the sensitivity of the determinant to scaling. Indeed if Q is unitary then \det(\alpha Q) = \alpha^n \det(Q) can be given any value by a suitable choice of \alpha, yet \alpha Q is perfectly conditioned: \kappa_2(\alpha Q) = 1, where \kappa(A) = \|A\| \|A^{-1}\| is the condition number.

To deal with the poor scaling one might normalize the determinant: in view of Corollary 2,

\notag   \psi(A) = \displaystyle\frac{\prod_{i=1}^n \|a_i\|_2} {\det(A)}

satisfies \psi(A) \ge 1 and \psi(A) = 1 if and only if the columns of A are orthogonal. Birkhoff (1975) calls \psi the Hadamard condition number. In general, \psi is not related to the condition number \kappa, but if A has columns of unit 2-norm then it can be shown that \kappa_2(A) < 2\psi(A) (Higham, 2002, Prob. 14.13). Dixon (1984) shows that for classes of n\times n random matrices A_n that include matrices with elements independently drawn from a normal distribution with mean 0, the probability that the inequality

n^{1/4 - \epsilon} \mathrm{e}^{n/2}     < \psi(A_n) < n^{1/4 + \epsilon} \mathrm{e}^{n/2}

holds tends to 1 as n\to\infty for any \epsilon > 0, so \psi(A_n) \approx n^{1/4}\mathrm{e}^{n/2} for large n. This exponential growth is much faster than the growth of \kappa, for which Edelman (1998) showed that for the standard normal distribution, \mathbb{E}(\log(\kappa_2(A_n))) \approx \log n + 1.537, where \mathbb{E} denotes the mean value. This MATLAB example illustrates these points.

>> rng(1); n = 50; A = randn(n); 
>> psi = prod(sqrt(sum(A.*A)))/abs(det(A)), kappa2 = cond(A)
psi =
   5.3632e+10
kappa2 =
   1.5285e+02
>> ratio = psi/(n^(0.25)*exp(n/2))
ratio =
   2.8011e-01

The relative distance from A to the set of singular matrices is equal to the reciprocal of the condition number.

Theorem 3 (Gastinel, Kahan). For A\in\mathbb{C}^{n\times n} and any subordinate matrix norm,

\notag   \min \left\{ \displaystyle\frac{\|\Delta A\|}{\|A\|} : A+\Delta A\mathrm{~singular} \right\}  = \displaystyle\frac{1}{\kappa(A)}.

Notes

Determinants came before matrices, historically. Most linear algebra textbooks make significant use of determinants, but a lot can be done without them. Axler (1995) shows how the theory of eigenvalues can be developed without using determinants.

Determinants have little application in practical computations, but they are a useful theoretical tool in numerical analysis, particularly for proving nonsingularity.

There is a large number of formulas and identities for determinants. Sir Thomas Muir collected many of them in his five-volume magnum opus The Theory of Determinants in the Historical Order of Development, published between 1890 and 1930. Brualdi and Schneider (1983) give concise derivations of many identities using Gaussian elimination, bringing out connections between the identities.

The quantity obtained by modifying the definition (1) of determinant to remove the (-1)^{\mathop{\mathrm{sgn}}j} term is the permanent. The permanent arises in combinatorics and quantum mechanics and is much harder to compute than the determinant: no algorithm is known for computing the permanent in p(n) operations for a polynomial p.

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is a Vandermonde Matrix?

A Vandermonde matrix is defined in terms of scalars x_1, x_2, …, x_n\in\mathbb{C} by

\notag     V = V(x_1,x_2,\dots,x_n)       = \begin{bmatrix}                 1      &   1    & \dots & 1     \\                 x_1   & x_2   & \dots & x_n  \\                 \vdots &\vdots  &       & \vdots \\                 x_1^{n-1} & x_2^{n-1} & \dots & x_n^{n-1}    \end{bmatrix}       \in \mathbb{C}^{n\times n}.

The x_i are called points or nodes. Note that while we have indexed the nodes from 1, they are usually indexed from 0 in papers concerned with algorithms for solving Vandermonde systems.

Vandermonde matrices arise in polynomial interpolation. Suppose we wish to find a polynomial p_{n-1}(x) = a_nx^{n-1} + a_{n-1}x^{n-2} + \cdots + a_1 of degree at most n-1 that interpolates to the data (x_i,f_i)_{i=1}^n, that is, p_{n-1}(x_i) = f_i, i=1\colon n. These equations are equivalent to

\notag          V^Ta = f \quad \mathrm{(dual)},

where a = [a_1,a_2,\dots,a_n]^T is the vector of coefficients. This is known as the dual problem. We know from polynomial interpolation theory that there is a unique interpolant if the x_i are distinct, so this is the condition for V to be nonsingular.

The problem

\notag          Vy = b \quad \mathrm{(primal)}

is called the primal problem, and it arises when we determine the weights for a quadrature rule: given moments b_i find weights y_i such that \sum_{j=1}^n y_j^{} x_j^{\,i-1} = b_i, i=1\colon n.

Determinant

The determinant of V is a function of the n points x_i. If x_i = x_j for some i\ne j then V has identical ith and jth columns, so is singular. Hence the determinant must have a factor x_i - x_j. Consequently, we have

\notag  \det( V(x_1,x_2,\dots,x_n) ) = c \displaystyle\prod_{i,j = 1\atop i > j}^n (x_i - x_j),

where, since both sides have degree n(n-1)/2 in the x_i, c is a constant. But \det(V) contains a term x_2 x_3^2 \dots x_n^{n-1} (from the main diagonal), so c = 1. Hence

\notag   \det(V) = \displaystyle\prod_{i,j = 1\atop i > j}^n (x_i - x_j). \qquad (1)

This formula confirms that V is nonsingular precisely when the x_i are distinct.

Inverse

Now assume that V is nonsingular and let V^{-1} = W = (w_{ij})_{i,j=1}^n. Equating elements in the ith row of WV = I gives

\sum_{j=1}^n w_{ij} x_k^{\mskip1mu j-1} = \delta_{ik},    \quad k=1\colon n,

where \delta_{ij} is the Kronecker delta (equal to 1 if i=j and 0 otherwise). These equations say that the polynomial \sum_{j=1}^n w_{ij} x^{\mskip1mu j-1} takes the value 1 at x = x_i and 0 at x = x_k, k\ne i. It is not hard to see that this polynomial is the Lagrange basis polynomial:

\notag    \sum_{j=1}^n w_{ij} x^{j-1} = \displaystyle\prod_{k=1\atop k\ne i}^n    \left( \frac{x-x_k}{x_i-x_k} \right) =: \ell_i(x). \qquad (2)

We deduce that

\notag    w_{ij} = \displaystyle\frac{ (-1)^{n-j}                    \sigma_{n-j}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_n) }    { \displaystyle\prod_{k=1 \atop k\ne i}^n (x_i-x_k) }, \qquad (3)

where \sigma_k(y_1,\dots,y_n) denotes the sum of all distinct products of k of the arguments y_1,\dots,y_n (that is, \sigma_k is the kth elementary symmetric function).

From (1) and (3) we see that if the x_i are real and positive and arranged in increasing order 0 < x_1 < x_2 < \cdots  0 and V^{-1} has a checkerboard sign pattern: the (i,j) element has sign (-1)^{i+j}.

Note that summing (2) over i gives

\notag    \displaystyle\sum_{j=1}^n x^{j-1} \sum_{i=1}^n w_{ij} = \sum_{i=1}^n \ell_i(x) = 1,

where the second equality follows from the fact that \sum_{i=1}^n \ell_i(x) is a degree n-1 polynomial that takes the value 1 at the n distinct points x_i. Hence

\notag   \displaystyle\sum_{i=1}^n w_{ij} = \delta_{j1},

so the elements in the jth column of the inverse sum to 1 for j = 1 and 0 for j\ge 2.

Example

To illustrate the formulas above, here is an example, with x_i = (i-1)/(n-1) and n = 5:

\notag V = \left[\begin{array}{ccccc} 1 & 1 & 1 & 1 & 1\\ 0 & \frac{1}{4} & \frac{1}{2} & \frac{3}{4} & 1\\[\smallskipamount] 0 & \frac{1}{16} & \frac{1}{4} & \frac{9}{16} & 1\\[\smallskipamount] 0 & \frac{1}{64} & \frac{1}{8} & \frac{27}{64} & 1\\[\smallskipamount] 0 & \frac{1}{256} & \frac{1}{16} & \frac{81}{256} & 1 \end{array}\right], \quad V^{-1} = \left[\begin{array}{ccccc} 1 & -\frac{25}{3} & \frac{70}{3} & -\frac{80}{3} & \frac{32}{3}\\[\smallskipamount] 0 & 16 & -\frac{208}{3} & 96 & -\frac{128}{3}\\ 0 & -12 & 76 & -128 & 64\\[\smallskipamount] 0 & \frac{16}{3} & -\frac{112}{3} & \frac{224}{3} & -\frac{128}{3}\\[\smallskipamount] 0 & -1 & \frac{22}{3} & -16 & \frac{32}{3} \end{array}\right],

for which \det(V) = 9/32768.

Conditioning

Vandermonde matrices are notorious for being ill conditioned. The ill conditioning stems from the monomials being a poor basis for the polynomials on the real line. For arbitrary distinct points x_i, Gautschi showed that V_n = V(x_1, x_2, \dots, x_n) satisfies

\notag    \displaystyle\max_i \displaystyle\prod_{j\ne i} \frac{ \max(1,|x_j|) }{ |x_i-x_j| }    \le \|V_n^{-1}\|_{\infty}    \le  \displaystyle\max_i \prod_{j\ne i} \frac{ 1+|x_j| }{ |x_i-x_j| },

with equality on the right when x_j = |x_j| e^{\mathrm{i}\theta} for all j with a fixed \theta (in particular, when x_j\ge0 for all j). Note that the upper and lower bounds differ by at most a factor 2^{n-1}. It is also known that for any set of real points x_i,

\notag        \kappa_2(V_n) \ge         \Bigl(\displaystyle\frac{2}{n}\Bigr)^{1/2} \,         (1+\sqrt{2})^{n-2}

and that for x_i = 1/i we have \kappa_{\infty}(V_n) > n^{n+1}, where the lower bound is an extremely fast growing function of the dimension!

These exponential lower bounds are alarming, but they do not necessarily rule out the use of Vandermonde matrices in practice. One of the reasons is that there are specialized algorithms for solving Vandermonde systems whose accuracy is not dependent on the condition number \kappa, and which in some cases can be proved to be highly accurate. The first such algorithm is an O(n^2) operation algorithm for solving V_ny =b of Björck and Pereyra (1970). There is now a long list of generalizations of this algorithm in various directions, including for confluent Vandermonde-like matrices (Higham, 1990), as well as for more specialized problems (Demmel and Koev, 2005) and more general ones (Bella et al., 2009). Another important observation is that the exponential lower bounds are for real nodes. For complex nodes V_n can be much better conditioned. Indeed when the x_i are the roots of unity, V_n/\sqrt{n} is the unitary Fourier matrix and so V_n is perfectly conditioned.

Generalizations

Two ways in which Vandermonde matrices have been generalized are by allowing confluency of the points x_i and by replacing the monomials by other polynomials. Confluency arises when the x_i are not distinct. If we assume that equal x_i are contiguous then a confluent Vandermonde matrix is obtained by “differentiating” the previous column for each of the repeated points. For example, with points x_1, x_1, x_1, x_2, x_2 we obtain

\notag   \begin{bmatrix}   1      &  0     &   0     &   1          &   0    \\   x_1    &  1     &   0     &  x_2    &   1    \\   x_1^2  & 2x_1   &   2     &  x_2^2  & 2x_2  \\   x_1^3  & 3x_1^2 & 6x_1    & x_2^3   & 3x_2^2 \\   x_1^4  & 4x_1^3 & 12x_1^2 & x_2^4   & 4x_2^3   \end{bmatrix}. \qquad (4)

The transpose of a confluent Vandermonde matrix arises in Hermite interpolation; it is nonsingular if the points corresponding to the “nonconfluent columns” are distinct (that is, if x_1 \ne x_2 in the case of (4)).

A Vandermonde-like matrix is defined in terms of a set of polynomials \{p_i(x)\}_{i=0}^n with p_i having degree i:

\notag     \begin{bmatrix}     p_0(x_1) & p_0(x_2) & \dots & p_0(x_n)\\     p_1(x_1) & p_1(x_2) & \dots & p_1(x_n)\\    \vdots & \vdots & \dots & \vdots\\     p_{n-1}(x_1) & p_{n-1}(x_2) & \dots & p_{n-1}(x_n)\\    \end{bmatrix}.

Of most interest are polynomials that satisfy a three-term recurrence, in particular, orthogonal polynomials. Such matrices can be much better conditioned than general Vandermonde matrices.

Notes

Algorithms for solving confluent Vandermonde-like systems and their rounding error analysis are described in the chapter “Vandermonde systems” of Higham (2002).

Gautschi has written many papers on the conditioning of Vandermonde matrices, beginning in 1962. We mention just his most recent paper on this topic: Gautschi (2011).

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is the Wilson Matrix?

The 4\times 4 matrix

\notag   W = \begin{bmatrix}      5 & 7  & 6  & 5 \\      7 & 10 & 8  & 7 \\      6 & 8  & 10 & 9 \\      5 & 7  & 9  & 10   \end{bmatrix}

appears in a 1946 paper by Morris, in which it is described as having been “devised by Mr. T. S. Wilson.” The matrix is symmetric positive definite with determinant 1 and inverse

\notag    W^{-1} =   \begin{bmatrix}    68 & -41 & -17 & 10\\   -41 &  25 &  10 & -6\\   -17 &  10 &   5 & -3\\    10 &  -6 &  -3 &  2    \end{bmatrix},

so it is moderately ill conditioned with \kappa_2(W) = \|W\|_2 \|W^{-1}\|_2 \approx 2.98409\times 10^3. This little matrix has been used as an example and for test purposes in many research papers and books over the years, in particular by John Todd, who described it as “the notorious matrix W of T. S. Wilson”.

Rutishauser (1968) stated that “the famous Wilson matrix is not a very striking example of an ill-conditioned matrix”, on the basis that \kappa_2(A)\le 40{,}000 for a “positive definite symmetric 4\times 4 matrix with integer elements not exceeding 10” and he gave the positive definite matrix

\notag   A_0 = \begin{bmatrix}      10 & 1  & 4  &  0 \\      1  & 10 & 5  & -1 \\      4  & 5  & 10 &  7 \\      0  & -1 & 7  &  9   \end{bmatrix}, \quad    A_0^{-1} =\begin{bmatrix}       105& 167 & -304 & 255\\       167 & 266 & -484 & 406\\      -304 & -484 & 881 & -739\\       255 & 406 & -739 & 620       \end{bmatrix},

for which \kappa_2(A_0) = 3.57924\times 10^4. The matrix A_0 is therefore a factor 12 more ill conditioned than W. Rutishauser did not give a proof of the stated bound.

Moler (2018) asked how ill-conditioned W is relative to matrices in the set

\notag \begin{aligned}    \mathcal{S} &= \{\, A\in\mathbb{R}^{4\times 4}: A=A^T       \mathrm{~is~nonsingular~with~integer~entries}\nonumber\\       & \hspace{2.9cm} \mathrm{between~1~and~10} \,\}. \end{aligned}

He generated one million random matrices from \mathcal{S} and found that about 0.21 percent of them had a larger condition number than W. The matrix with the largest condition number was the indefinite matrix

\notag    A_1 = \begin{bmatrix}             1  &  3  & 10  & 10\\             3  &  4  &  8  &  9\\             10 &   8 &   3 &  9\\             10 &   9 &   9 &  3           \end{bmatrix},    \quad      A_1^{-1} =       \begin{bmatrix}       573 & -804 &  159 &  25\\     -804 & 1128 & -223 & -35\\       159 & -223 &   44 &   7\\        25 &  -35 &    7 &   1        \end{bmatrix},

for which \kappa_2(A_1) \approx 4.80867\times 10^4. How far is this matrix from being a worst case?

As the Wilson matrix is positive definite, we are also interested in how ill conditioned a matrix in the set

\notag \begin{aligned}    \mathcal{P} &= \{\, A\in\mathbb{R}^{4\times 4}: A=A^T    \mathrm{~is~symmetric~positive~definite ~with~integer~entries}\nonumber\\       & \hspace{2.9cm} \mathrm{between~1~and~10} \,\} \end{aligned}

can be.

Condition Number Bounds

We first consider bounds on \kappa_2(A) for A \in \mathcal{S}. It is possible to obtain a bound from first principles by using the relation A^{-1} = \mathrm{adj}(A)/\det(A), where \mathrm{adj}(A) is the adjugate matrix, along with the fact that |\det(A)| \ge 1 since A has integer entries. Higham and Lettington (2021) found that the smallest bound they could obtain came from a bound of Merikoski et al. (1997): for nonsingular B\in\mathbb{R}^{n\times n},

\notag  \kappa_2(B) \le  \left(\displaystyle\frac{1+x}{1-x}\right)^{1/2}, \quad      x = \sqrt{1 - (n/\|B\|_F^2)^n |\det(B)|^2 }.

Applying this bound to A\in\mathcal{S}, using the fact that (1+x)/(1-x) is monotonically increasing for x\in(0,1), gives

\notag     \kappa_2(A) \le 2.97606\dots \times 10^5 =: \beta_S, \quad A\in\mathcal{S}.      \qquad (1)

Another result from Merikoski et al. (1997) gives, for symmetric positive definite C\in\in\mathbb{R}^{n\times n},

\notag      \kappa_2(C) \le \displaystyle\frac{1+x}{1-x}, \quad      x = \sqrt{1 - (n/\mathrm{trace}(C))^n \det(C) }.

For A\in\mathcal{P}, since \det(A) \ge 1 we have x \le \sqrt{1 - (1/10)^4}, and hence

\notag   \kappa_2(A) \le 3.99980 \times 10^4 =: \beta_P, \quad A\in\mathcal{P}.      \qquad (2)

Recall that Rutishauser’s bound is 4\times 10^4. The bounds (1) and (2) remain valid if we modify the definitions of \mathcal{S} and \mathcal{P} to allow zero elements (note that Rutishauser’s matrix A_0 has a zero element).

Experiment

The sets \mathcal{S} and \mathcal{P} are large: \mathcal{S} has on the order of 10^{10} elements. Exhaustively searching over the sets in reasonable time is possible with a carefully optimized code. Higham and Lettington (2021) use a MATLAB code that loops over all symmetric matrices with integer elements between 1 and 10 and

  • evaluates \det(A) from an explicit expression (exactly computed for such matrices) and discards A if the matrix is singular;
  • computes the eigenvalues \lambda_i of A and obtains the condition number as \kappa_2(A) = \max_i |\lambda_i|/\min_i |\lambda_i| (since A is symmetric); and
  • for \mathcal{P}, checks whether A is positive definite by checking whether the smallest eigenvalue is positive.

The code is available at https://github.com/higham/wilson-opt.

The maximum over \mathcal{S} is attained for

\notag   A_2 = \begin{bmatrix}                2 & 7  & 10 & 10\\                7 & 10 & 10 & 9\\               10 & 10 & 10 & 1\\               10 & 9  & 1  & 9             \end{bmatrix},   \quad   A_2^{-1} =   \begin{bmatrix}   640 & -987 &  323 &  240\\  -987 & 1522 & -498 & -370\\   323 & -498 &  163 &  121\\   240 & -370 &  121 &   90   \end{bmatrix},

which has \kappa_2(A_2) \approx 7.6119 \times 10^4. and determinant -1. The maximum over \mathcal{P} is attained for

\notag   A_3 =     \begin{bmatrix}      9  &   1 &    1 &    5\\      1  &  10 &    1 &    9\\      1  &   1 &   10 &    1\\      5  &   9 &    1 &   10  \end{bmatrix}, \quad    A_3^{-1} =   \begin{bmatrix}   188 &  347 & -13 & -405\\   347 &  641 & -24 & -748\\   -13 &  -24 &   1 &   28\\  -405 & -748 &  28 &  873  \end{bmatrix}.

which has \kappa_2(A_3) \approx 3.5529 \times 10^4 and determinant 1. Obviously, symmetric permutations of these matrices are also optimal.

The following table summarizes the condition numbers of the matrices discussed and how close they are to the bounds.

Matrix A Comment \kappa_2(A) \beta_S/\kappa_2(A) \beta_P/\kappa_2(A)
W Wilson matrix 2.98409\times 10^3 99.73 13.40
A_9 Rutishauser’s matrix 3.57924\times 10^4 8.31 1.12
A_1 By random sampling 4.80867\times 10^4 6.19
A_2 Optimal matrices in \mathcal{S} 7.61190\times 10^4 3.91
A_3 Optimal matrices in \mathcal{P} 3.55286\times 10^4 8.38 1.13

Clearly, the bounds are reasonably sharp.

We do not know how Wilson constructed his matrix or to what extent he tried to maximize the condition number subject to the matrix entries being small integers. One possibility is that he constructed it via the factorization in the next section.

Integer Factorization

The Cholesky factor of the Wilson matrix is

\notag R = \begin{bmatrix} \sqrt{5} & \frac{7\,\sqrt{5}}{5} & \frac{6\,\sqrt{5}}{5} & \sqrt{5}\\[\smallskipamount] 0 & \frac{\sqrt{5}}{5} & -\frac{2\,\sqrt{5}}{5} & 0\\[\smallskipamount] 0 & 0 & \sqrt{2} & \frac{3\,\sqrt{2}}{2}\\[\smallskipamount] 0 & 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix} \quad (W = R^TR).

Apart from the zero (2,4) element, it is unremarkable. If we factor out the diagonal then we obtain the LDL^T factorization, which has rational elements:

\notag L = \begin{bmatrix}1 & 0 & 0 & 0\\ \frac{7}{5} & 1 & 0 & 0\\ \frac{6}{5} & -2 & 1 & 0\\ 1 & 0 & \frac{3}{2} & 1 \end{bmatrix}, \quad D = \begin{bmatrix}5 & 0 & 0 & 0\\ 0 & \frac{1}{5} & 0 & 0\\ 0 & 0 & 2 & 0\\ 0 & 0 & 0 & \frac{1}{2} \end{bmatrix}  \quad (W = LDL^T).

Suppose we drop the requirement of triangularity and ask whether the Wilson matrix has a factorization W = Z^T\!Z with a 4\times4 matrix Z of integers. It is known that every symmetric positive definite n\times n matrix A of integers with determinant 1 has a factorization A = Z^T\!Z with Z an n\times n matrix of integers as long as n \le 7, but examples are known for n = 8 for which the factorization does not exist. This result is mentioned by Taussky (1961) and goes back to Hermite, Minkowski, and Mordell. Higham and Lettington (2021) found the integer factor

\notag Z_0 = \begin{bmatrix}      2 &   3  &   2  &    2\\      1  &   1   &  2   &  1\\      0  &   0   &  1   &  2\\      0  &   0   &  1   &  1      \end{bmatrix}

of W, which is block upper triangular so can be thought of as a block Cholesky factor. Higham, Lettington, and Schmidt (2021) draw on recent research that links the existence of such factorizations to number-theoretic considerations of quadratic forms to show that for the existence of an integer solution Z to A = Z^TZ it is necessary that a certain quadratic equation in n variables has an integer solution. In the case of the Wilson matrix the equation is

2 w^2+x_1^2+x_1 x_2+x_1 x_3+x_2^2+x_2 x_3+x_3^2=952.

The authors solve this equation computationally and find Z_1 and two rational factors:

\notag Z_1=\left[ \begin{array}{cccc}  \frac{1}{2} & 1 & 0 & 1 \\  \frac{3}{2} & 2 & 3 & 3 \\  \frac{1}{2} & 1 & 0 & 0 \\  \frac{3}{2} & 2 & 1 & 0 \\ \end{array} \right], \quad Z_2=\left[ \begin{array}{@{\mskip2mu}rrrr}  \frac{3}{2} & 2 & 2 & 2 \\  \frac{3}{2} & 2 & 2 & 1 \\  \frac{1}{2} & 1 & 1 & 2 \\  -\frac{1}{2} & -1 & 1 & 1 \\ \end{array} \right].

They show that these matrices are the only factors Z\in\frac{1}{16}\mathbb{Z} of W up to left multiplication by integer orthogonal matrices.

Conclusions

The Wilson matrix has provided sterling service throughout the digital computer era as a convenient symmetric positive definite matrix for use in textbook examples and for testing algorithms. The recent discovery of its integer factorization has led to the development of new theory on when general n\times n integer matrices A can be factored as A = Z^TZ (when A is symmetric positive definite) or A = Z^2 (a problem also considered in Higham, Lettington, and Schmidt (2021)), with integer Z.

Olga Taussky Todd wrote in 1961 that “matrices with integral elements have been studied for a very long time and an enormous number of problems arise, both theoretical and practical.” We wonder what else can be learned from the Wilson matrix and other integer test matrices.

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is a Rank-Revealing Factorization?

In many applications a matrix A\in\mathbb{R}^{m\times n} has less than full rank, that is, r = \mathrm{rank}(A) < \min(m,n). Sometimes, r is known, and a full-rank factorization A = GH with G\in\mathbb{R}^{m \times r} and H\in\mathbb{R}^{r \times n}, both of rank r, is given—especially when r = 1 or r = 2. Often, though, the rank r is not known. Moreover, rather than being of exact rank r, A is merely close to a rank r matrix because of errors from various possible sources.

What is usually wanted is a factorization that displays how close A is to having particular ranks and provides an approximation to the range space of a lower rank matrix. The ultimate tool for providing this information is the singular value decomposition (SVD)

\notag     A = U\Sigma V^T, \quad    \Sigma = \mathrm{diag}(\sigma_1,\dots, \sigma_p)\in\mathbb{R}^{m\times n},

where p = \min(m,n), \sigma_1\ge \sigma_2\ge \cdots \ge \sigma_p \ge 0, and U\in\mathbb{R}^{m\times m} and V\in\mathbb{R}^{n\times n} are orthogonal. The Eckart–Young theorem says that

\notag  \min_{\mathrm{rank}(B) = k} \|A-B\|_q =  \begin{cases}      \sigma_{k+1},                                & q = 2,  \\      \Bigl(\sum_{i=k+1}^r \sigma_i^2\Bigr)^{1/2}, & q = F,   \end{cases}

and that the minimum is attained at

\notag    A_k = U \Sigma_k V^T, \quad    \Sigma_k = \mathrm{diag}(\sigma_1, \dots, \sigma_k, 0, \dots, 0),

so A_k is the best rank-k approximation to A in both the 2-norm and the Frobenius norm.

Although the SVD is expensive to compute, it may not be significantly more expensive than alternative factorizations. However, the SVD is expensive to update when a row or column is added to or removed from the matrix, as happens repeatedly in signal processing applications.

Many different definitions of a rank-revealing factorization have been given, and they usually depend on a particular matrix factorization. We will use the following general definition.

Definition 1. A rank-revealing factorization (RRF) of A\in\mathbb{R}^{m\times n} is a factorization

\notag   A = XDY^T, \quad   X\in\mathbb{R}^{m\times p}, \quad   D\in\mathbb{R}^{p\times p}, \quad   Y\in\mathbb{R}^{n\times p},

where p \le \min(m,n), D is diagonal and nonsingular, and X and Y are well conditioned.

An RRF concentrates the rank deficiency and ill condition of A into the diagonal matrix D. An RRF clearly exists, because the SVD is one, with X and Y having orthonormal columns and hence being perfectly conditioned. Justification for this definition comes from a version of Ostrowski’s theorem, which shows that

\notag     \sigma_i(A) = \theta_i \sigma_i(D), \quad i = 1\colon \min(m,n),      \qquad (1)

where \sigma_p(X)\sigma_p(Y) \le \theta_i \le \sigma_1(X) \sigma_1(Y). Hence as long as X and Y are well conditioned, the singular values are good order of magnitude approximations to those of A up a scale factor.

Without loss of generality we can assume that

\notag   D = \mathrm{diag}(d_i), \quad   |d_1| \ge |d_2| \ge \cdots \ge |d_p|

(since XDY^T = XP\cdot P^TDP \cdot P^T Y^T for any permutation matrix P and the second expression is another RRF). For \widetilde{A}_k = X \mathrm{diag}(d_1,\dots,d_k,0,\dots,0)Y^T we have

\notag   \|A - \widetilde{A}_k\| \le \|X\| \|Y\|   \|\mathrm{diag}(0,\dots,0,d_{k+1},\dots,d_p)\|,

so A is within distance of order |d_{k+1}| from the rank-k matrix \widetilde{A}_k, which is the same order as the distance to the nearest rank-k matrix if |d_{k+1}| \approx \sigma_{k+1}.

Definition 2 is a strong requirement, since it requires all the singular values of A to be well approximated by the (scaled) diagonal elements of D. We will investigate below how it compares with another definition of RRF.

Numerical Rank

An RRF helps to determine the numerical rank, which we now define.

Definition 2. For a given \epsilon > 0 the numerical rank of A is the largest integer k such that \sigma_k > \epsilon.

By the Eckart–Young theorem, the numerical rank is the smallest rank attained over all A+E with \|E\|_2 \le \epsilon. For the numerical rank to be meaningful in the sense that it is unchanged if \epsilon is perturbed slightly, we need \epsilon not to be too close to \sigma_k or \sigma_{k+1}, which means that there must be a significant gap between these two singular values.

QR Factorization

One might attempt to compute an RRF by using a QR factorization A = QR, where Q\in\mathbb{R}^{m\times n} has orthonormal columns, R\in\mathbb{R}^{n\times n} is upper triangular, and we assume that m\ge n. In Definition 1, we can take

\notag   X = I, \quad D = \mathrm{diag}(R), \quad Y^T = D^{-1}R. \qquad (*)

However, it is easy to see that QR factorization in its basic form is flawed as a means for computing an RRF. Consider the matrix

\notag   A = \begin{bmatrix}        0 & 1  & 0 &0\\        0 & 0  & 1 &0 \\        0 & 0  & 0 &1\\        0 & 0  & 0 &0   \end{bmatrix},

which is a Jordan block with zero eigenvalue. This matrix is its own QR factorization (R = A), and the prescription (*) gives D = 0, so A \ne XDY^T. The essential problem is that the diagonal of R has no connection with the nonzero singular values of A. What is needed are column permutations: A\Pi = \mathrm{diag}(1,1,1,0) for the permutation matrix \Pi that reorders [a_1,a_2,a_3,a_4] to [a_2,a_3,a_4,a_1], and this is a perfect RRF with X = Y = I.

For a less trivial example, consider the matrix

\notag   A = \left[\begin{array}{rrrr}        1 & 1  &\theta &0\\        1 & -1 & 2 &1 \\        1 & 0  &1+\theta &-1\\        1 &-1  & 2 &-1   \end{array}\right], \quad \theta = 10^{-8}. \qquad (\dagger)

Computing the QR factorization we obtain

R =
  -2.0000e+00   5.0000e-01  -2.5000e+00   5.0000e-01
            0   1.6583e+00  -1.6583e+00  -1.5076e-01
            0            0  -4.2640e-09   8.5280e-01
            0            0            0  -1.4142e+00

The (3,3) element tells us that A is within distance about 4\times 10^{-9} of being rank deficient and so has a singular value bounded above by this quantity, but it does not provide any information about the next larger singular value. Moreover, in (*), \kappa_2(Y) is of order 10^{16} for this factorization. We need any small diagonal elements to be in the bottom right-hand corner, and to achieve this we need to introduce column permutations to move the “dependent columns” to the end.

QR Factorization With Column Pivoting

A common method for computing an RRF is QR factorization with column pivoting, which for a matrix A\in\mathbb{R}^{m\times n} with m\ge n computes a factorization A\Pi = QR, where \Pi is a permutation matrix, Q\in\mathbb{R}^{m\times n} has orthonormal columns, and R\in\mathbb{R}^{n\times n} is upper triangular and satisfies the inequalities

\notag    |r_{kk}|^2 \ge \displaystyle\sum_{i=k}^j |r_{ij}|^2,     \quad j=k+1\colon n, \quad k=1\colon n. \qquad (2)

In particular,

\notag   |r_{11}| \ge |r_{22}| \ge \cdots \ge |r_{nn}|.   \qquad(3)

If |r_{kk}| \ge \epsilon \ge |r_{k+1,k+1}| with \epsilon > 0 then we can write

\notag    R =    \begin{array}[b]{@{\mskip33mu}c@{\mskip-16mu}c@{\mskip-10mu}c@{}}    \scriptstyle k &    \scriptstyle n-k &    \\    \multicolumn{2}{c}{        \left[\begin{array}{c@{~}c@{~}}                  R_{11}& R_{12} \\                    0   & R_{22} \\              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle k \\              \scriptstyle n-k              \end{array}    \end{array},  \qquad(4)

with

\notag   \|R_{22}\|_2 \le \|R_{22}\|_F \le 2^{-1/2}(n-k+1)\epsilon.

Hence R is within 2-norm distance 2^{-1/2}(n-k+1)\epsilon of the rank-k matrix \left[\begin{smallmatrix} R_{11} & R_{12} \\ 0 & 0 \end{smallmatrix}\right]. Note that if Q = [Q_1~Q_2] is partitioned conformally with Q in (4) then

\notag   A\Pi =   \begin{bmatrix}    Q_1 & Q_2   \end{bmatrix}   \begin{bmatrix}    R_{11} & R_{12} \\     0     & R_{22} \\   \end{bmatrix}    = Q_1   \begin{bmatrix}    R_{11} & R_{12}   \end{bmatrix}   + \begin{bmatrix}    0 & Q_2 R_{22}   \end{bmatrix},

so \| A\Pi - Q_1 [R_{11}~R_{12}]\|_2 \le \|R_{22}\|_2, which means that Q_1 provides an O(\epsilon) approximation to the range of A.

To assess how good an RRF this factorization is (with p = n) we write it as

\notag   A = QR\Pi^T = Q D Y^T, \quad D = \mathrm{diag}(r_{ii}),                          \quad Y^T = D^{-1}R \Pi^T. \quad (\#)

Applying (1) gives

\notag     \sigma_i(A) = \theta_i \sigma_i(D), \quad i = 1\colon p, \qquad (5)

where \sigma_n(Y)\le \theta_i \le \sigma_1(Y), since Q has orthonormal columns and so has unit singular values. Now D^{-1}R has unit diagonal and, in view of (2), its off-diagonal elements are bounded by 1. Therefore \sigma_1(Y) = \|Y\|_2 \le \|Y\|_F \le (n(n+1)/2)^{1/2}. On the other hand, \sigma_p(Y)^{-1} \le 2^{n-1} by Theorem 1 in Bounds for the Norm of the Inverse of a Triangular Matrix. Therefore

\notag    2^{1-n} \le \theta_i \le (n(n+1)/2)^{1/2}.

The lower bound is an approximate equality for small \tau for the triangular matrix

\notag        R_n(\theta) = \mathrm{diag}(1,s,\dots,s^{n-1})             \begin{bmatrix} 1 & -c & -c     & \dots & -c \\                               & 1  & -c     & \dots & -c \\                               &    & \ddots &\ddots & \vdots \\                               &    &        &\ddots & -c \\                               &    &        &       &  1 \end{bmatrix},      \quad c=\cos\tau, \quad s=\sin\tau,

devised by Kahan, which is invariant under QR factorization with column pivoting. Therefore QR factorization with column pivoting is not guaranteed to reveal the rank, and indeed it can fail to do so by an exponentially large factor.

For the matrix (\dagger), QR with column pivoting reorders A to A\Pi = [a_3,~a_4,~a_2,~a_1] and yields

R =
  -3.0000e+00   3.3333e-01   1.3333e+00  -1.6667e+00
            0  -1.6997e+00   2.6149e-01   2.6149e-01
            0            0   1.0742e+00   1.0742e+00
            0            0            0   3.6515e-09

This R suggests a numerical rank of 3 for \epsilon = 10^{-8} (say). In fact, this factorization provides a very good RRF, as in (\#) we have \kappa_2(Y) \approx 3.4.

QR Factorization with Other Pivoting Choices

Consider a QR factorization A\Pi = QR with triangular factor partitioned as

\notag    R =    \begin{array}[b]{@{\mskip33mu}c@{\mskip-16mu}c@{\mskip-10mu}c@{}}    \scriptstyle k &    \scriptstyle n-k &    \\    \multicolumn{2}{c}{        \left[\begin{array}{c@{~}c@{~}}                  R_{11}& R_{12} \\                    0   & R_{22} \\              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle k \\              \scriptstyle n-k              \end{array}    \end{array}. \qquad (6)

We have

\notag    \begin{aligned}      \sigma_{\min}(R_{11}) &\le \sigma_k(A),    \quad \qquad (7)\\      \sigma_{\max}(R_{22}) &\ge \sigma_{k+1}(A), ~\qquad (8)    \end{aligned}

where (7) is from singular value interlacing inequalities and (8) follows from the Eckart-Young theorem, since setting R_{22} to zero gives a rank-k matrix. Suppose A has numerical rank k and \sigma_{k+1} \ll \sigma_k. We would like to be able to detect this situation from R, so clearly we need

\notag  \sigma_{\min}(R_{11}) \approx \sigma_k(A), \quad  \sigma_{\max}(R_{22}) \approx \sigma_{k+1}(A). \qquad (9)

In view of the inequalities (7) and (8) this means that we wish to choose \Pi maximize \sigma_{\min}(R_{11}) and minimize \sigma_{\max}(R_{22}).

Some theoretical results are available on the existence of such QR factorizations. First, we give a result that shows that for k = n-1 the approximations in (9) can hold to within a factor n^{1/2}.

Theorem 1. For A\in\mathbb{R}^{m\times n} with m\ge n there exists a permutation \Pi such that A has the QR factorization A\Pi = QR with |r_{nn}| \le n^{1/2}\sigma_n(A) and \sigma_{\min}(R_{11}) \ge n^{-1/2} \sigma_{n-1}(A), where R_{11} = R(1\colon n-1, 1\colon n-1).

Proof. Let Av = \sigma_n u, with \|v\|_2 = \|u\|_2 = 1 and let \Pi^T be such that \widetilde{v}  = \Pi^Tv satisfies |\widetilde{v}_n| = \|\widetilde{v}\|_{\infty}. Then if A\Pi = QR is a QR factorization,

\notag \begin{aligned}    \sigma_n = \| \sigma_n u \|_2             = \| Av \|_2             = \| QR\Pi^Tv \|_2             = \| R\mskip1mu \widetilde{v} \|_2             \ge | r_{nn} \widetilde{v}_n |             \ge n^{-1/2} | r_{nn} |, \end{aligned}

since \|\widetilde{v}\|_2 = 1, which yields the result.

Next, we write \Pi = [\Pi_1~\pi], where \pi\in\mathbb{R}^n, and partition

\notag   R = \begin{bmatrix}    R_{11} & R_{12} \\     0     & R_{22} \\   \end{bmatrix}

with R_{11}\in\mathbb{R}^{(n-1)\times (n-1)}. Then

A\Pi_1 = Q \begin{bmatrix} R_{11} \\ 0 \end{bmatrix}

implies \sigma_{\min}(A\Pi_1) = \sigma_{\min}(R_{11}). On the other hand, if A = U\Sigma V^T is an SVD with U\in\mathbb{R}^{m\times n}, \Sigma = \mathrm{diag}(D_1,\sigma_n)\in\mathbb{R}^{n\times n}, and V = [V_1~v] then

\notag  A\Pi_1 = U\Sigma V^T \Pi_1 =    U    \begin{bmatrix}    D_1    & 0 \\     0     & \sigma_n \\   \end{bmatrix}   \begin{bmatrix}    V_1^T \\ v^T   \end{bmatrix} \Pi_1   = U   \begin{bmatrix}    D_1V_1^T \\ \sigma_n v^T   \end{bmatrix} \Pi_1,

so

\notag   \sigma_{\min}(A\Pi_1) =   \sigma_{\min}\left(   \begin{bmatrix}    D_1V_1^T \\ \sigma_n v^T   \end{bmatrix} \Pi_1 \right)    \ge \sigma_{\min}(D_1V^T\Pi_1)    \ge \sigma_{n-1}\sigma_{\min}(V^T\Pi_1).

Finally, we note that we can partition the orthogonal matrix V^T\Pi_1 as

\notag  V^T\Pi =    \begin{bmatrix}    V_1^T\Pi_1  & V_1^T\pi \\    v^T\Pi_1    & v^T\pi   \end{bmatrix},

and the CS decomposition implies that

\notag    \sigma_{\min}(V_1^T\Pi_1) =    \sigma_{\min}(v^T\pi) =    |v^T\pi| = |\widetilde{v}_n| \ge n^{-1/2}.

Hence \sigma_{\min}(R_{11}) \ge n^{-1/2} \sigma_{n-1}, as required. ~\square

Theorem 1 is a special case of the next result of Hong and Pan (1992).

Theorem 2. For A\in\mathbb{R}^{m\times n} with m\ge n and any k there exists a permutation matrix \Pi such that A has the QR factorization A\Pi = QR where, with R partitioned as in (6),

\notag    \sigma_{\max}(R_{22}) \le f(k,n) \sigma_{k+1}(A), \quad    \sigma_{\min}(R_{11}) \ge f(k,n)^{-1} \sigma_k(A),

where f(k,n) = (k(n-k) + \min(k,n-k))^{1/2}.

The proof of Theorem 2 is constructive and chooses \Pi to move a submatrix of maximal determinant of V_2 to the bottom of V_2, where V_2 comprises the last n-k columns of the matrix of right singular vectors.

Theorem 2 shows the existence of an RRF up to the factor f(k,n) \le (n+1)/2, but it does not provide an efficient algorithm for computing one.

How do the conditions (9) relate to Definition 1? Since algorithms for computing an RRF are usually not specialized to any particular k, it is reasonable to ask that (9) holds for all k. We consider what can be said if the first condition in (9) holds as an equality for all k.

Lemma 3. If R\in\mathbb{R}^{m\times n} is upper triangular and satisfies \sigma_{\min}(R_{11}) = \sigma_k(R) for k=1\colon n-1 in the partitioning (6) then R is diagonal.

Proof. The proof is by induction. Assume that R_{k-1} = \mathrm{diag}(\sigma_1(R),\dots,\sigma_{k-1}(R)). This is clearly true for k = 2, since |r_{11}| = \sigma_1(R). Write

\notag       R_k =       \begin{bmatrix}       R_{k-1} & z \\         0     & \rho       \end{bmatrix}.

Then |\rho| \ge \sigma_{\min}(R_k) by (8). Also \sigma_i(R_k) \le \sigma_i(R) for i = 1\colon k by standard singular value inequalities. We therefore have

\notag \begin{aligned}    \sum_{i=1}^k \sigma_i(R)^2    \ge \sum_{i=1}^k \sigma_i(R_k)^2     = \|R_k\|_F^2     = \|R_{k-1}\|_F^2  + \|z\|_2^2 + \rho^2\\     = \sum_{i=1}^{k-1} \sigma_i(R)^2 + \|z\|_2^2 + \rho^2     \ge \sum_{i=1}^k \sigma_i(R)^2 + \|z\|_2^2. \end{aligned}

It follows that z = 0, and so R_k is diagonal, which completes the induction.

We conclude from the lemma that if the first condition in (9) is an equality for all k then we have a perfect RRF A\Pi = QR with R diagonal. Therefore if the approximations in (9) are reasonably good for all k we should have a reasonably good RRF.

Much work has been done on algorithms that choose the permutation matrix \Pi in a different way to column pivoting or post-process a QR factorization with column pivoting, with the aim of satisfying (9) at reasonable cost. Typically, these algorithms involve estimating singular values and singular vectors. We are not aware of any algorithm that is guaranteed to satisfy (9) and requires only O(n^3) flops.

UTV Decomposition

By applying Householder transformations on the right, a QR factorization with column pivoting can be turned into a complete orthogonal decomposition of A\in\mathbb{R}^{m\times n}, which has the form

\notag      A = U \begin{bmatrix} T & 0 \\ 0 & 0 \end{bmatrix} V^T, \qquad (10)

where T\in\mathbb{R}^{r \times r} is upper triangular and U\in\mathbb{R}^{m\times m} and V\in\mathbb{R}^{n\times n} are orthogonal. Stewart (1998) calls (6) with T upper triangular or lower triangular a UTV decomposition and he defines a rank-revealing UTV decomposition of numerical rank r by

\notag \begin{aligned}      A &= U \begin{bmatrix} T & F \\ 0 & G  \end{bmatrix} V^T,          \qquad T\in\mathbb{R}^{r \times r}, \\         &          \sigma_r(T) \approx \sigma_r(A), \quad          \|F\|_F^2 + \|G\|_F^2 \approx \sigma_{r+1}^2 + \cdots + \sigma_n^2. \end{aligned}

The UTV decomposition is easy to update (when a row is added) and downdate (when a row is removed) using Givens rotations and it is suitable for parallel implementation. Initial determination of the UTV decomposition can be done by applying the updating algorithm as the rows are brought in one at a time.

LU Factorization

Instead of QR factorization we can build an RRF from an LU factorization with pivoting. For A\in\mathbb{R}^{m\times n} with m\ge n, let

\notag     \Pi_1 A \Pi_2 = LU =     \begin{bmatrix}       L_{11} & 0 \\       L_{12} & L_{22}     \end{bmatrix}     \begin{bmatrix}     U_{11} & U_{12}\\       0    & U_{22}     \end{bmatrix},

where \Pi_1 and \Pi_2 are permutation matrices, L and U are m\times n lower and n\times n upper triangular, respectively, and L_{11} and U_{11} are k\times k. Analogously to (7) and (8), we always have \sigma_{\min}(L_{11}U_{11}) \le \sigma_k(A) and \sigma_{\max}(L_{22}U_{22}) \ge \sigma_{k+1}(A). With a suitable pivoting strategy we can hope that \sigma_{\min}(L_{11}U_{11}) \approx \sigma_k(A) and \sigma_{\max}(L_{22}U_{22}) \approx \sigma_{k+1}(A).

A result of Pan (2000) shows that an RRF based on LU factorization always exists up to a modest factor f(k,n). This is analogue for LU factorization of Theorem 2.

Theorem 3 For A\in\mathbb{R}^{m\times n} with m\ge n and any k there exist permutation matrices \Pi_1 and \Pi_2 such that

\notag     \Pi_1 A \Pi_2 = LU =     \begin{bmatrix}     L_{11} &  0    \\     L_{12} & I_{m-k,n-k}     \end{bmatrix}    \begin{array}[b]{@{\mskip33mu}c@{\mskip-16mu}c@{\mskip-10mu}c@{}}    \scriptstyle k &    \scriptstyle n-k &    \\    \multicolumn{2}{c}{        \left[\begin{array}{c@{~}c@{~}}                  U_{11}& U_{12} \\                    0   & U_{22} \\              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle k \\              \scriptstyle n-k              \end{array}    \end{array},

where L_{11} is unit lower triangular, U_{11} is upper triangular, and

\notag    \sigma_{\max}(U_{22}) \le f(k,n) \sigma_{k+1}(A), \quad    \sigma_{\min}(L_{11}U_{11}) \ge f(k,n)^{-1} \sigma_k(A),

where f(k,n) = k(n-k) + 1.

Again the proof is constructive, but the permutations it chooses are too expensive to compute. In practice, complete pivoting often yields a good RRF.

In terms of Definition 1, an RRF has

\notag     X = \Pi_1^TL D, \quad D = \mathrm{diag}(u_{ii}), \quad     Y^T = D^{-1}U\Pi_2. \qquad (\ddagger)

For the matrix (\dagger), the U factor for LU factorization without pivoting is

U =
   1.0000e+00   1.0000e+00   1.0000e-08            0
            0  -2.0000e+00   2.0000e+00   1.0000e+00
            0            0   5.0000e-09  -1.5000e+00
            0            0            0  -2.0000e+00

As for QR factorization without pivoting, an RRF is not obtained from (\ddagger).. However, with complete pivoting we obtain

U =
   2.0000e+00   1.0000e+00  -1.0000e+00   1.0000e+00
            0  -2.0000e+00            0            0
            0            0   1.0000e+00   1.0000e+00
            0            0            0  -5.0000e-09

which yields a very good RRF (\ddagger) with \kappa_2(X) = 3.5 and \kappa_2(Y) = 3.4.

Notes

QR factorization with column pivoting is difficult to implement efficiently, as the criterion for choosing the pivots requires the norms of the active parts of the remaining columns and this requires a significant amount of data movement. In recent years, randomized RRF algorithms have been developed that use projections with random matrices to make pivot decisions based on small sample matrices and thereby reduce the amount of data movement. See, for example, Martinsson et al. (2019).

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is a Pseudo-Orthogonal Matrix?

A matrix Q\in\mathbb{R}^{n\times n} is pseudo-orthogonal if

\notag      Q^T \Sigma Q = \Sigma, \qquad (1)

where \Sigma = \mathrm{diag}(\pm 1) is a signature matrix. A matrix Q satisfying (1) is also known as a J-orthogonal matrix, where J is another notation for a signature matrix. Of course, if \Sigma = I then Q is orthogonal.

It is easy to show that Q^T is also pseudo-orthogonal. Furthermore, Q is clearly nonsingular and it satisfies

\notag      Q = \Sigma Q^{-T}\Sigma. \qquad (2)

Since \Sigma is orthogonal, this equation implies that \|Q\|_\ell = \|Q^{-T}\|_\ell = \|Q^{-1}\|_\ell and hence that

\notag   \kappa_p(Q) = \|Q\|_\ell^2, \quad \ell = 2,F. \qquad(3).

What are some examples of pseudo-orthogonal matrices? For n = 2 and \Sigma = \left[\begin{smallmatrix}1 & 0 \\ 0 & -1 \end{smallmatrix}\right], Q is of the form

\notag   Q =   \begin{bmatrix}    a & b \\ c & d    \end{bmatrix}, \quad   ab - cd = 0, \quad a^2 - c^2 = 1, \quad b^2 - d^2 = -1,

which includes the matrices

\notag     Q = \begin{bmatrix} \cosh \theta & -\sinh\theta \\                         -\sinh\theta & \cosh\theta \end{bmatrix},       \quad \theta\in\mathbb{R}. \qquad (4)

The Lorentz group, representing symmetries of the spacetime of special relativity, corresponds to 4\times 4 matrices with \Sigma = \mathrm{diag}(1,1,1,-1).

Equation (2) shows that Q is similar to the inverse of its transpose and hence (since every matrix is similar to its transpose) similar to its inverse. It follows that if \lambda is an eigenvalue of Q then \lambda^{-1} is also an eigenvalue and it has the same algebraic and geometric multiplicities as \lambda.

By permuting rows and columns in (1) we can arrange that

\notag        \Sigma = \Sigma_{p,q} := \begin{bmatrix} I_p & 0   \\                             0  & -I_q            \end{bmatrix}. \qquad (5)

We assume that \Sigma has this form throughout the rest of this article. This form of \Sigma allows us to conveniently characterize matrices that are both orthogonal and pseudo-orthogonal. Such a matrix must satisfy \Sigma Q = Q\Sigma, which means that Q = \mathrm{diag}(Q_{11},Q_{22}), and any such orthogonal matrix is pseudo-orthogonal.

Applications

Pseudo-orthogonal matrices arise in hyperbolic problems, that is, problems where there is an underlying indefinite scalar product or weight matrix. An example is the problem of downdating the Cholesky factorization, where in the simplest case we have the Cholesky factorization C = R^T\!R of a symmetric positive definite C\in\mathbb{R}^{n\times n} and want the Cholesky factorization of \widetilde{C} = C - zz^T, which is assumed to be symmetric positive definite. A more general downdating problem is that we are given

\notag   A = \begin{array}[b]{cc}        \left[\begin{array}{@{}c@{}}                  A_1\\                  A_2              \end{array}\right]        & \mskip-22mu\          \begin{array}{l}              \scriptstyle p \\              \scriptstyle q          \end{array}    \end{array},    \quad p\ge n,

and the Cholesky factorization A^T\!A = R^T\!R and wish to obtain the Cholesky factor S of A_1^TA_1  = R^T\!R - A_2^TA_2. Note that R and S are n\times n. This task arises when we solve a regression problem after the observations corresponding to A_2 have been removed. The simple case above corresponds to removing one row (q = 1). Assuming that q \ll p, we would like to obtain S cheaply from R, and numerical stability considerations dictate that we should avoid explicitly forming A_1^TA_1. If we can find a pseudo-orthogonal matrix Q such that

\notag       Q \begin{bmatrix} R \\ A_2 \end{bmatrix}         =         \begin{bmatrix} S \\ 0 \end{bmatrix}, \qquad (6)

with \Sigma given by (5) and S\in\mathbb{R}^{n\times n} upper triangular, then

\notag     A_1^TA_1       = \begin{bmatrix} R   \\ A_2 \end{bmatrix}^T \Sigma         \begin{bmatrix} R   \\ A_2 \end{bmatrix}       = \begin{bmatrix} R   \\ A_2 \end{bmatrix}^T Q^T \Sigma Q         \begin{bmatrix} R   \\ A_2 \end{bmatrix}       = \begin{bmatrix} S   \\ 0   \end{bmatrix}^T \Sigma         \begin{bmatrix} S   \\ 0   \end{bmatrix}       = S^T\!S,

so S is the desired Cholesky factor.

The factorization (6) is called a hyperbolic QR factorization and it can be computed by using hyperbolic rotations to zero out the elements of A_2. A 2\times2 hyperbolic rotation has the form (4), and an n\times n hyperbolic rotation is an identity matrix with a 2\times 2 hyperbolic rotation embedded in it at the intersection of rows and columns i and j, for some i and j.

In general, a hyperbolic QR factorization of A\in\mathbb{R}^{m\times n} with m = p+q and p\ge n has the form QA = \left[\begin{smallmatrix} R \\ 0 \end{smallmatrix}\right] with Q pseudo-orthogonal with respect to \Sigma = \Sigma_{p,q} and R \in\mathbb{R}^{n\times n} upper triangular. The factorization exists if A^T\Sigma A is positive definite.

Another hyperbolic problem is the indefinite least squares problem

\notag        \min_x \,(b-Ax)^T \Sigma (b-Ax), \qquad (7)

where A\in\mathbb{R}^{m\times n}, m\ge n, and b\in\mathbb{R}^m are given, and \Sigma = \Sigma_{p,q} with m = p + q. For p=0 or q=0 we have the standard least squares (LS) problem and the quadratic form is definite, while for pq>0 the problem is to minimize a genuinely indefinite quadratic form. This problem arises, for example, in the area of optimization known as H^{\infty} smoothing.

The normal equations for (7) are A^T\Sigma Ax = A^T\Sigma b, and since the Hessian matrix of the quadratic objective function in (7) is A^T\Sigma A it follows that the indefinite least squares problem has a unique solution if and only if A^T\Sigma A is positive definite. To solve the problem we can use a hyperbolic QR factorization QA = \left[\begin{smallmatrix} R \\ 0 \end{smallmatrix}\right] to write

\notag \begin{aligned}     A^T\Sigma A &= A^T Q^T \Sigma Q A     = \begin{bmatrix} R \\ 0 \end{bmatrix}^T           \begin{bmatrix} I_p & 0   \\                           0  & -I_q                           \end{bmatrix}      \begin{bmatrix} R \\ 0 \end{bmatrix}      = R^T\!R, \\   A^T\Sigma b &= A^T Q^T\Sigma Q b          = \begin{bmatrix} R \\ 0 \end{bmatrix}^T \! \Sigma Q b. \end{aligned}

Solving the problem now reduces to solving the triangular system Rx = d, where d comprises the first n components of Qb. The same equation can also be obtained without using the normal equations by substituting the hyperbolic QR factorization into (7).

The Exchange Operator

A simple technique exists for converting pseudo-orthogonal matrices into orthogonal matrices and vice versa. Let A\in\mathbb{R}^{n\times n} with n = p + q, partition

\notag   A  = \mskip5mu    \begin{array}[b]{@{\mskip-20mu}c@{\mskip0mu}c@{\mskip-1mu}c@{}}    & \mskip10mu\scriptstyle p & \scriptstyle q \\       \mskip15mu          \begin{array}{r}              \scriptstyle p \\              \scriptstyle q          \end{array}~    &       \multicolumn{2}{c}{\mskip-15mu          \left[\begin{array}{c@{~}c@{~}}                  A_{11} & A_{12}\\                  A_{21} & A_{22}                \end{array}\right]       }    \end{array}, \qquad (8)

and assume A_{11} is nonsingular. The exchange operator is defined by

\notag    \mathrm{exc}(A) =       \begin{bmatrix}            A_{11}^{-1} & -A_{11}^{-1}A_{12} \\            A_{21}A_{11}^{-1} & A_{22} -A_{21}A_{11}^{-1}A_{12}      \end{bmatrix}.

It is easy to see that the exchange operator is involutory, that is,

\notag   \mathrm{exc}(\mathrm{exc}(A)) = A,

and moreover (recalling that \Sigma is given by (5)) that

\notag     \mathrm{exc}(\Sigma A\Sigma) = \Sigma \mathrm{exc}(A)\Sigma = \mathrm{exc}(A^T)^T.     \qquad(9)

The next result gives a formula for the inverse of \mathrm{exc}(A).

Lemma 1. Let A\in\mathbb{R}^{n\times n} with A_{11} nonsingular. If A is nonsingular and \mathrm{exc}(A^{-1}) exists then \mathrm{exc}(A) is nonsingular and \mathrm{exc}(A)^{-1} = \mathrm{exc}(A^{-1}).

Proof. Consider the equation

\notag    y =    \begin{bmatrix}    y_1 \\ y_2    \end{bmatrix}      =      \begin{bmatrix}            A_{11} & A_{12} \\            A_{21} & A_{22}      \end{bmatrix}    \begin{bmatrix}    x_1 \\ x_2    \end{bmatrix}  =  Ax.

By solving the first equation for x_1 and then eliminating x_1 from the second equation we obtain

\notag   \begin{bmatrix}    x_1 \\ y_2   \end{bmatrix}   =   \mathrm{exc}(A)   \begin{bmatrix}    y_1 \\ x_2   \end{bmatrix}. \qquad (10)

By the same argument applied to x = A^{-1}y, we have

\notag   \begin{bmatrix}    y_1 \\ x_2   \end{bmatrix}   =   \mathrm{exc}(A^{-1})   \begin{bmatrix}    x_1 \\ y_2   \end{bmatrix}.

Hence for any x_1 and y_2 there is a unique y_1 and x_2, which implies by (10) that \mathrm{exc}(A) is nonsingular and that \mathrm{exc}(A)^{-1} = \mathrm{exc}(A^{-1}). ~\square

Now we will show that the exchange operator maps pseudo-orthogonal matrices to orthogonal matrices and vice versa.

Theorem 2. Let A\in\mathbb{R}^{n\times n}. If A is pseudo-orthogonal then \mathrm{exc}(A) is orthogonal. If A is orthogonal and A_{11} is nonsingular then \mathrm{exc}(A) is pseudo-orthogonal.

Proof. If A is pseudo-orthogonal then A_{11}^TA_{11}  = I + A_{21}^TA_{21}, which implies that A_{11} is nonsingular. Since \Sigma A^T\Sigma = A^{-1}, it follows that A^{-1} also has a nonsingular (1,1) block and so \mathrm{exc}(A^{-1}) exists. Furthermore, using Lemma 1, \mathrm{exc}(\Sigma A^T\Sigma) = \mathrm{exc}(A^{-1}) = \mathrm{exc}(A)^{-1}. But (9) shows that \mathrm{exc}(\Sigma A^T\Sigma) = \mathrm{exc}(A)^T, and we conclude that \mathrm{exc}(A) is orthogonal.

Assume now that A is orthogonal with A_{11} nonsingular. Then \mathrm{exc}(A^T) = \mathrm{exc}(A^{-1}) exists and Lemma 1 shows that \mathrm{exc}(A) is nonsingular and \mathrm{exc}(A)^{-1} = \mathrm{exc}(A^{-1}) = \mathrm{exc}(A^T). Hence, using (9),

I = \mathrm{exc}(A^T) \mathrm{exc}(A) =         \Sigma\mathrm{exc}(A)^T\Sigma \cdot \mathrm{exc}(A),

which shows that \mathrm{exc}(A) is pseudo-orthogonal. ~\square

This MATLAB example uses the exchange operator to convert an orthogonal matrix obtained from a Hadamard matrix into a pseudo-orthogonal matrix.

>> p = 2; n = 4;
>> A = hadamard(n)/sqrt(n), Sigma = blkdiag(eye(p),-eye(n-p))
A =
   5.0000e-01   5.0000e-01   5.0000e-01   5.0000e-01
   5.0000e-01  -5.0000e-01   5.0000e-01  -5.0000e-01
   5.0000e-01   5.0000e-01  -5.0000e-01  -5.0000e-01
   5.0000e-01  -5.0000e-01  -5.0000e-01   5.0000e-01
Sigma =
     1     0     0     0
     0     1     0     0
     0     0    -1     0
     0     0     0    -1
>> Q = exc(A,p), Q'*Sigma*Q
Q =
     1     1    -1     0
     1    -1     0    -1
     1     0    -1    -1
     0     1    -1     1
ans =
     1     0     0     0
     0     1     0     0
     0     0    -1     0
     0     0     0    -1

The code uses the function

function X = exc(A,p)
%EXC     Exchange operator.
%   EXC(A,p) is the result of applying the exchange operator to 
%   the square matrix A, which is regarded as a block 2-by-2 
%   matrix with leading block of dimension p.  
%   p defaults to floor(n)/2.

[m,n] = size(A);
if m ~= n, error('Matrix must be square.'), end
if nargin < 2, p = floor(n/2); end

A11 = A(1:p,1:p);
A12 = A(1:p,p+1:n);
A21 = A(p+1:n,1:p);
A22 = A(p+1:n,p+1:n);

X21 = A11\A12;
X = [inv(A11) -X21;
     A21/A11  A22-A21*X21];

Hyperbolic CS Decomposition

For an orthogonal matrix expressed in block 2\times 2 form there is a close relationship between the singular value decompositions (SVDs) of the blocks, as revealed by the CS decomposition (see What Is the CS Decomposition?). An analogous decomposition holds for a pseudo-orthogonal matrix. Let Q\in\mathbb{R}^{n \times n} be pseudo-orthogonal with respect to \Sigma in (5), and suppose that Q is partitioned as

\notag    Q =    \begin{array}[b]{@{\mskip33mu}c@{\mskip-16mu}c@{\mskip-10mu}c@{}}    \scriptstyle p &    \scriptstyle n-p &    \\    \multicolumn{2}{c}{        \left[\begin{array}{c@{~}c@{~}}                  Q_{11}& Q_{12} \\                  Q_{21}& Q_{22} \\              \end{array}\right]}    & \mskip-12mu\          \begin{array}{c}              \scriptstyle p \\              \scriptstyle n-p              \end{array}    \end{array}, \quad p \le \displaystyle\frac{n}{2}.

Then there exist orthogonal matrices U_1,V_1\in\mathbb{R}^{p \times p} and U_2,V_2\in\mathbb{R}^{q \times q} such that

\notag    \begin{bmatrix}  U_1^T & 0\\                          0   & U_2^T    \end{bmatrix}    \begin{bmatrix}  Q_{11} & Q_{12}\\                          Q_{21} & Q_{22}    \end{bmatrix}    \begin{bmatrix}  V_1 & 0\\                          0   & V_2    \end{bmatrix}    =    \begin{array}[b]{@{\mskip35mu}c@{\mskip30mu}c@{\mskip-10mu}c@{}c}    \scriptstyle p &    \scriptstyle p &    \scriptstyle n-2p &    \\    \multicolumn{3}{c}{    \left[\begin{array}{c@{~}|c@{~}c}    C &   -S      & 0   \\    \hline   -S &    C      & 0   \\    0 &    0      & I_{n-2p}    \end{array}\right]}    & \mskip-12mu    \begin{array}{c}    \scriptstyle p \\    \scriptstyle p \\    \scriptstyle n-2p    \end{array}    \end{array}, \qquad (11)

where C = \mathrm{diag}(c_i), S = \mathrm{diag}(s_i), and C^2 - S^2  = I, with c_i > s_i \ge 0 for all i. This is the hyperbolic CS decomposition, and it can be proved by applying the CS decomposition of an orthogonal matrix to \mathrm{exc}(Q).

The leading principal submatrix \left[\begin{smallmatrix}C & -S \\ -S & C \end{smallmatrix}\right] in (11) generalizes the 2\times 2 matrix (4), and in fact it can be permuted into a direct sum of such matrices.

Note that the matrix on the right in (11) is symmetric positive definite. Therefore the singular values of Q are the eigenvalues of that matrix, namely

\notag    c_1 \pm s_1, \dots,  c_p \pm s_p; \quad    1~\mathrm{with~multiplicity~}n - 2p.

Since c_i^2 - s_i^2 = 1 for all i, the first 2p singular values occur in reciprocal pairs, hence the largest and smallest singular values satisfy \sigma_1 = \sigma_n^{-1}\ge 1 (with strict inequality unless p = 0). This gives another proof of (3).

Numerical Stability

While an orthogonal matrix is perfectly conditioned, a pseudo-orthogonal matrix can be arbitrarily ill conditioned, as follows from (3). For example, the MATLAB function gallery('randjorth') produces a random pseudo-orthogonal matrix with a default condition number of sqrt(1/eps).

>> rng(1); A = gallery('randjorth',2,2) % p = 2, n = 4
A =
   2.9984e+03  -4.2059e+02   1.5672e+03  -2.5907e+03
   1.9341e+03  -2.6055e+03   3.1565e+03  -7.5210e+02
   3.1441e+03  -6.2852e+02   1.8157e+03  -2.6427e+03
   1.6870e+03  -2.5633e+03   3.0204e+03  -5.4157e+02
>> cond(A)
ans =
   6.7109e+07

This means that algorithms that use pseudo-orthogonal matrices are potentially numerically unstable. Therefore algorithms need to be carefully constructed and rounding error analysis must be done to ensure that an appropriate form of numerical stability is obtained.

Notes

Pseudo-orthogonal matrices form the automorphism group of the scalar product defined by \langle x,y\rangle = x^T\Sigma y for x,y\in\mathbb{R}^n. More results for pseudo-orthogonal matrices can be obtained as special cases of results for automorphism groups of general scalar products. See, for example, Mackey, Mackey, and Tisseur (2006).

For \Sigma \ne \pm I the set of pseudo-orthogonal matrices is known to have four connected components, a topological property that can be proved using the hyperbolic CS decomposition (Motlaghian, Armandnejad, and Hall, 2018).

One can define pseudo-unitary matrices in an analogous way, as Q\in\mathbb{C}^{n\times n} such that Q^*\Sigma Q = \Sigma. These correspond to the automorphism group of the scalar product \langle x,y\rangle = x^*\Sigma y for x,y\in\mathbb{C}^n. The results we have discussed generalize in a straightforward way to pseudo-unitary matrices.

The exchange operator is also known as the principal pivot transform and as the sweep operator in statistics. Tsatsomeros (2000) gives a survey of its properties

The hyperbolic CS decomposition was derived by Lee (1948) and, according to Lee, was present in work of Autonne (1912).

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is an LU Factorization?

An LU factorization of an n\times n matrix A is a factorization A = LU, where L is unit lower triangular and U is upper triangular. “Unit” means that L has ones on the diagonal. Example:

\notag    \left[\begin{array}{rrrr}      3 & -1 & 1 & 1\\     -1 & 3 & 1 & -1\\     -1 & -1 & 3 & 1\\      1 & 1 & 1 & 3    \end{array}\right]   =    \left[\begin{array}{rrrr}     1 & 0 & 0 & 0\\    -\frac{1}{3} & 1 & 0 & 0\\    -\frac{1}{3} & -\frac{1}{2} & 1 & 0\\     \frac{1}{3} & \frac{1}{2} & 0 & 1    \end{array}\right]    \left[\begin{array}{rrrr}    3 & -1 & 1 & 1\\    0 & \frac{8}{3} & \frac{4}{3} & -\frac{2}{3}\\    0 & 0 & 4 & 1\\    0 & 0 & 0 & 3    \end{array}\right]. \qquad (1)

An LU factorization simplifies the solution of many problems associated with linear systems. In particular, solving a linear system Ax = b reduces to solving the triangular systems Ly = b and Ux = y, since then b = L(Ux).

For a given A, an LU factorization may or may not exist, and if it does exist it may not be unique. Conditions for existence and uniqueness are given in the following result (see Higham, 2002, Thm. 9.1 for a proof). Denote by A_k = A(1\colon k,1\colon k) the leading principal submatrix of A of order k.

Theorem 1. The matrix A\in\mathbb{R}^{n\times n} has a unique LU factorization if and only if A_k is nonsingular for k=1\colon n-1. If A_k is singular for some 1\le k \le n-1 then the factorization may exist, but if so it is not unique.

Note that the (non)singularity of A plays no role in Theorem 1. However, if A is nonsingular and has an LU factorization then the factorization is unique. Indeed if A has LU factorizations A = L_1U_1 = L_2U_2 then the U_i are necessarily nonsingular and so L_2^{-1}L_1 = U_2U_1^{-1}. The left side of this equation is unit lower triangular and the right side is upper triangular; therefore both sides must equal the identity matrix, which means that L_1 = L_2 and U_1 = U_2, as required.

Equating leading principal submatrices in A = LU gives A_k = L_k U_k, which implies that \det(A_k) = \det(U_k) = u_{11} u_{22} \dots u_{kk}. Hence u_{kk} = \det(A_k)/\det(A_{k-1}). In fact, such determinantal formulas hold for all the elements of L and U:

\notag    \begin{aligned}    \ell_{ij} &= \frac{ \det\bigl( A( [1:j-1, \, i], 1:j ) \bigr) }{ \det( A_j ) },             \quad i > j, \\    u_{ij} &= \frac{ \det\bigl( A( 1:i, [1:i-1, \, j] ) \bigr) }                   { \det( A_{i-1} ) },             \quad i \le j.    \end{aligned}

Here, A(u,v), where u and v are vectors of subscripts, denotes the submatrix formed from the intersection of the rows indexed by u and the columns indexed by v.

Relation with Gaussian Elimination

LU factorization is intimately connected with Gaussian elimination. Recall that Gaussian elimination transforms a matrix A^{(1)} = A\in\mathbb{R}^{n\times n} to upper triangular form U = A^{(n)} in n-1 stages. At the kth stage, multiples of row k are added to the later rows to eliminate the elements below the diagonal in column k, using the formulas

\notag     a_{ij}^{(k+1)} = a_{ij}^{(k)} - m_{ik} a_{kj}^{(k)}, \quad                        i = k+1 \colon n, \; j = k+1 \colon n,

where the quantities m_{ik} = a_{ik}^{(k)} / a_{kk}^{(k)} are the multipliers. Of course each a_{kk}^{(k)} must be nonzero for these formulas to be defined, and this is connected with the conditions of Theorem 1, since u_{kk} = a_{kk}^{(k)}. The final U is the upper triangular LU factor, with u_{ij} = a_{ij}^{(i)} for j\ge i, and \ell_{ij} = m_{ij} for i > j, that is, the multipliers make up the L factor (for a proof of these properties see any textbook on numerical linear algebra).

The matrix factorization viewpoint is well established as a powerful paradigm for thinking and computing. Separating the computation of LU factorization from its application is beneficial. For example, given A = LU we saw above how to solve Ax = b. If we need to solve for another right-hand side b_2 we can just solve Ly_2 = b_2 and Ux_2 = y_2, re-using the LU factorization. Similarly, solving A^Tz = c reduces to solving the triangular systems U^T w = c and L^Tz = w.

Computation

An LU factorization can be computed by directly solving for the components of L and U in the equation A = LU. Indeed because L has unit diagonal the first row of U is the same as the first row of A, and a_{k1} = \ell_{k1} u_{11} = \ell_{k1} a_{11} then determines the first column of L. One can go on to determine the kth row of U and the kth row of L, for k = 2\colon n. This leads to the Doolittle method, which involves inner products of partial rows of L and partial columns of U.

Given the equivalence between LU factorization and Gaussian elimination we can also employ the Gaussian elimination equations:

\notag \begin{array}{l} \%~kji~\mathrm{form~of~LU~factorization.}\\ \mbox{for}~k=1:n-1  \\ \qquad \mbox{for}~  j=k+1:n \\ \qquad \qquad \mbox{for}~  i=k+1:n \\ \qquad\qquad\qquad a_{ij}^{(k+1)} = a_{ij}^{(k)} - a_{ik}^{(k)}a_{kj}^{(k)} / a_{kk}^{(k)}\\ \qquad\qquad\mbox{end}\\ \qquad\mbox{end}\\ \mbox{end}\\ \end{array}

This kji ordering of the loops in the factorization is the basis of early Fortran implementations of LU factorization, such as that in LINPACK. The inner loop travels down the columns of A, accessing contiguous elements of A since Fortran stores arrays by column. Interchanging the two inner loops gives the kij ordering, which updates the matrix a row at a time, and is appropriate for a language such as C that stores arrays by row.

The ijk and jik orderings correspond to the Doolittle method. The last two of the 3! = 6 orderings are the ikj and jki orderings, to which we will return later.

Schur Complements

For A\in\mathbb{R}^{n\times n} with \alpha = a_{11} \ne 0 we can write

\notag   A =  \begin{bmatrix}         \alpha & b^T \\           c & D        \end{bmatrix}    =        \begin{bmatrix}         1 & 0 \\           c/\alpha & I_{n-1}        \end{bmatrix}       \begin{bmatrix}         \alpha  & b^T \\          0 & D - cb^T/\alpha        \end{bmatrix} = : L_1U_1. \qquad (2)

The (n-1)\times (n-1) matrix S = D - cb^T/\alpha is called the Schur complement of \alpha in A.

The first row and column of L_1 and U_1 have the correct forms for a unit lower triangular matrix and an upper triangular matrix, respectively. If we can find an LU factorization S = L_2U_2 then

\notag      A =        \begin{bmatrix}         1 & 0 \\           c/\alpha & L_2        \end{bmatrix}       \begin{bmatrix}         \alpha  & b^T \\          0 & U_2        \end{bmatrix}

is an LU factorization of A. Note that this is simply another way to express the kji algorithm above.

For several matrix structures it is immediate that \alpha \ne 0. If we can show that the Schur complement inherits the same structure then it follows by induction that we can compute the factorization for S, and so an LU factorization of A exists. Classes of matrix for which a_{11} \ne 0 and A being in the class implies the Schur complement S is also in the class include

  • symmetric positive definite matrices,
  • M-matrices,
  • matrices (block) diagonally dominant by rows or columns.

(The proofs of these properties are nontrivial.) Note that the matrix (1) is row diagonally dominant, as is its U factor, as must be the case since its rows are contained in Schur complements.

We say that A has upper bandwidth q if a_{ij} = 0 for j>i+q and lower bandwidth p if a_{ij} = 0 for i>j+p. Another use of (2) is to show that L and U inherit the bandwidths of A.

Theorem 2. Let A\in\mathbb{R}^{n\times n} have lower bandwidth p and upper bandwidth q. If A has an LU factorization then L has lower bandwidth p and U has upper bandwidth q.

Proof. In (2), the first column of L_1 and the first row of U_1 have the required structure and S has upper bandwidth q and lower bandwidth p, since c and b have only p and q nonzero components, respectively. The result follows by induction.

Block Implementations

In order to achieve high performance on modern computers with their hierarchical memories, LU factorization is implemented in a block form expressed in terms of matrix multiplication and the solution of multiple right-hand side triangular systems. We describe two block forms of LU factorization. First, consider a block form of (2) with block size p, where A_{11} is p \times p:

\notag   A =  \begin{bmatrix}          A_{11} & A_{12}\\          A_{21} & A_{22}        \end{bmatrix}    =        \begin{bmatrix}         L_{11} & 0 \\           L_{21}& I_{n-p}        \end{bmatrix}       \begin{bmatrix}         U_{11} & U_{12} \\          0 & S        \end{bmatrix}.

Here, S is the Schur complement of A_{11} in A, given by S = A_{22} - A_{21}A_{11}^{-1}A_{12}. This leads to the following algorithm:

  1. Factor A_{11} = L_{11}U_{11}.
  2. Solve L_{11}U_{12} = A_{12} for U_{12}.
  3. Solve L_{21}U_{11} = A_{21} for L_{21}.
  4. Form S = A_{22}-L_{21}U_{12}.
  5. Repeat steps 1–4 on S to obtain S = L_{22}U_{22}.

The factorization on step 1 can be done by any form of LU factorization. This algorithm is known as a right-looking algorithm, since it accesses data to the right of the block being worked on (in particular, at each stage lines 2 and 4 access the last few columns of the matrix).

An alternative algorithm can derived by considering a block 3\times 3 partitioning, in which we assume we have already computed the first block column of L and U:

\notag   A =  \begin{bmatrix}          A_{11} & A_{12} & A_{13}\\          A_{21} & A_{22} & A_{23}\\          A_{31} & A_{32} & A_{33}        \end{bmatrix}    =        \begin{bmatrix}         L_{11} & 0     & 0 \\         L_{21} & L_{22}& 0 \\         L_{31} & L_{32}  & I        \end{bmatrix}       \begin{bmatrix}         U_{11} & U_{12} & \times \\              0 & U_{22} & \times \\              0 &  0     & \times        \end{bmatrix}.

We now compute the middle block column of L and U, comprising p columns, by

  1. Solve L_{11}U_{12} = A_{12} for U_{12}.
  2. Factor A_{22}-L_{21}U_{12} = L_{22}U_{22}.
  3. Solve L_{32}U_{22} = A_{32} - L_{31}U_{12} for L_{32}.
  4. Repartition so that the first two block columns become a single block column and repeat steps 1–4.

This algorithm corresponds to the jki ordering. Note that the Schur complement is updated only a block column at a time. Because the algorithm accesses data only to the left of the block column being worked on, it is known as a left-looking algorithm.

Our description of these block algorithms emphasizes the mathematical ideas. The implementation details, especially for the left-looking algorithm, are not trivial. The optimal choice of block size p will depend on the machine, but p is typically in the range 64512.

An important point is that all these different forms of LU factorization, no matter which ijk ordering or which value of p, carry out the same operations. The only difference is the order in which the operations are performed (and the order in which the data is accessed). Even the rounding errors are the same for all versions (assuming the use of “plain vanilla” floating-point arithmetic).

Rectangular Matrices

Although it is most commonly used for square matrices, LU factorization is defined for rectangular matrices, too. If A\in\mathbb{R}^{m\times n} then the factorization has the form A = LU with L\in\mathbb{R}^{m\times m} lower triangular and U\in\mathbb{R}^{m\times n} upper trapezoidal. The conditions for existence and uniqueness of an LU factorization of A are the same as those for A(1\colon p, 1\colon p), where p = \min(m,n).

Block LU Factorization

Another form of LU factorization relaxes the structure of L and U from triangular to block triangular, with L having identity matrices on the diagonal:

\notag  L = \begin{bmatrix} I     &        &            &    \\                     L_{21} &  I     &            &    \\                     \vdots &        &  \ddots    &    \\                     L_{m1} &  \dots &  L_{m,m-1} &  I \end{bmatrix}, \quad   U = \begin{bmatrix} U_{11} & U_{12} & \dots      &  U_{1m}   \\                            & U_{22} &            &  \vdots   \\                            &        &  \ddots    & U_{m-1,m} \\                            &        &            & U_{mm}    \end{bmatrix}.

Note that U is not, in general, upper triangular.

An example of a block LU factorization is

\notag     A =      \left[ \begin{array}{rr|rr}      0  &  1  &  1  &  1  \\     -1  &  1  &  1  &  1  \\\hline     -2  &  3  &  4  &  2  \\     -1  &  2  &  1  &  3  \\             \end{array}      \right]      =      \left[ \begin{array}{cc|cc}      1  &  0  &  0  &  0  \\      0  &  1  &  0  &  0  \\\hline      1  &  2  &  1  &  0  \\      1  &  1  &  0  &  1  \\             \end{array}      \right]      \left[ \begin{array}{rr|rr}      0  &  1  &  1  &  1  \\     -1  &  1  &  1  &  1  \\\hline      0  &  0  &  1  & -1  \\      0  &  0  & -1  &  1  \\             \end{array}      \right].

LU factorization fails on A because of the zero (1,1) pivot. This block LU factorization corresponds to using the leading 2\times 2 principal submatrix of A to eliminate the elements in the (3\colon 4,1\colon 2) submatrix. In the context of a linear system Ax=b, we have effectively solved for the variables x_1 and x_2 in terms of x_3 and x_4 and then substituted for x_1 and x_2 in the last two equations.

Conditions for the existence of a block LU factorization are analogous to, but less stringent than, those for LU factorization in Theorem 1.

Theorem 3. The matrix A\in\mathbb{R}^{n\times n} has a unique block LU factorization if and only if the first m-1 leading principal block submatrices of A are nonsingular.

The conditions in Theorem 3 can be shown to be satisfied if A is block diagonally dominant by rows or columns.

Note that to solve a linear system Ax = b using a block LU factorization we need to solve Ly = b and Ux = y, but the latter system is not triangular and requires the solution of systems U_{ii}x_i = y_i involving the diagonal blocks of U, which would normally be done by (standard) LU factorization.

Sensitivity

If A has a unique LU factorization then for a small enough perturbation \Delta A an LU factorization A + \Delta A = (L + \Delta L)(U + \Delta U) exists. To first order, this equation is \Delta A = \Delta L U + L \Delta U, which gives

\notag   L^{-1}\Delta A \mskip2mu U^{-1} = L^{-1}\Delta L  + \Delta U \mskip2mu  U^{-1}.

Since \Delta L is strictly lower triangular and \Delta U is upper triangular, we have, to first order,

\notag         \Delta L = L \mskip 1mu \mathrm{tril}( L^{-1}\Delta A U^{-1} ), \quad         \Delta U = \mathrm{triu}( L^{-1}\Delta A U^{-1} )U,

where \mathrm{tril} denotes the strictly lower triangular part and \mathrm{triu} the strictly upper triangular part. Clearly, the sensitivity of the LU factors depends on the inverses of L and U. However, in most situations, such as when we are solving a linear system Ax = b, it is the backward stability of the LU factors, not their sensitivity, that is relevant.

Pivoting and Numerical Stability

Since not all matrices have an LU factorization, we need the option of applying row and column interchanges to ensure that the pivots are nonzero unless the column in question is already in triangular form.

In finite precision computation it is important that computed LU factors \widehat L and \widehat U are numerically stable in the sense that \widehat L \widehat U = A + \Delta A with \|\Delta A\|\le c_n u \|A\|, where c_n is a constant and u is the unit roundoff. For certain matrix properties, such as diagonal dominance by rows or columns, numerical stability is guaranteed, but in general it is necessary to incorporate row interchanges, or row or column interchanges, in order to obtain a stable factorization.

See What Is the Growth Factor for Gaussian Elimination? for details of pivoting strategies and see Randsvd Matrices with Large Growth Factors for some recent research on growth factors.

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

Fifty “What Is” Articles

elisa-rYrawNU0wH0-unsplash_square.jpg
Photo by Elisa on Unsplash

Last week I posted the fiftieth in my “What Is” series of articles. I began the series just over a year ago, in March 2020. The original aim was to provide “brief descriptions of important concepts in numerical analysis and related areas, with a focus on topics that arise in my research”, and the articles were meant to be short, widely accessible, and contain a minimum of mathematical symbols, equations, and citations. I have largely kept to these aims, though for some topics there is a lot to say and I have been more lengthy.

The articles are also available in PDF form on GitHub.

Below is a list of all the “What Is” articles published at the time of writing, in alphabetical order.

If there is a topic you would like me to cover, please put it in the comments below.

  1. What Is a Block Matrix?
  2. What Is a Cholesky Factorization?
  3. What Is a Companion Matrix?
  4. What Is a Condition Number?
  5. What Is a Correlation Matrix?
  6. What is a Diagonally Dominant Matrix?
  7. What Is a Fractional Matrix Power?
  8. What Is a Fréchet Derivative?
  9. What Is a Generalized Inverse?
  10. What Is a Hadamard Matrix?
  11. What Is a Householder Matrix?
  12. What Is a Matrix Function?
  13. What Is a Matrix Square Root?
  14. What Is a Matrix?
  15. What Is a Modified Cholesky Factorization?
  16. What Is a (Non)normal Matrix?
  17. What Is a QR Factorization?
  18. What Is a Random Orthogonal Matrix?
  19. What is a Sparse Matrix?
  20. What Is a Symmetric Positive Definite Matrix?
  21. What Is a Unitarily Invariant Norm?
  22. What Is an M-Matrix?
  23. What Is an Orthogonal Matrix?
  24. What Is Backward Error?
  25. What Is Bfloat16 Arithmetic?
  26. What Is Floating-Point Arithmetic?
  27. What Is IEEE Standard Arithmetic?
  28. What is Numerical Stability?
  29. What Is Rounding?
  30. What Is Stochastic Rounding?
  31. What Is the Adjugate of a Matrix?
  32. What is the Cayley–Hamilton Theorem?
  33. What Is the Complex Step Approximation?
  34. What Is the CS Decomposition?
  35. What Is the Gerstenhaber Problem?
  36. What Is the Growth Factor for Gaussian Elimination?
  37. What Is the Hilbert Matrix?
  38. What is the Kronecker Product?
  39. What Is the Log-Sum-Exp Function?
  40. What Is the Matrix Exponential?
  41. What Is the Matrix Logarithm?
  42. What Is the Matrix Sign Function?
  43. What Is the Matrix Unwinding Function?
  44. What Is the Nearest Positive Semidefinite Matrix?
  45. What Is the Nearest Symmetric Matrix?
  46. What is the Polar Decomposition?
  47. What Is the Sherman–Morrison–Woodbury Formula?
  48. What Is the Singular Value Decomposition?
  49. What Is the Softmax Function?
  50. What Is the Sylvester Equation?