What Is the Spectral Radius of a Matrix?

The spectral radius $\rho(A)$ of a square matrix $A\in\mathbb{C}^{n\times n}$ is the largest absolute value of any eigenvalue of $A$ :

$\notag \rho(A) = \max\{\, |\lambda|: \lambda~ \mbox{is an eigenvalue of}~ A\,\}.$

For Hermitian matrices (or more generally normal matrices, those satisfying $AA^* = A^*A$ ) the spectral radius is just the $2$ -norm, $\rho(A) = \|A\|_2$ . What follows is most interesting for nonnormal matrices.

Two classes of matrices for which the spectral radius is known are as follows.

Unitary matrices ( $U^*U=I$ ): these have all their eigenvalues on the unit circle and so $\rho(U) = 1$ .
Nilpotent matrices ( $A^p = 0$ for some positive integer $p$ ): such matrices have only zero eigenvalues, so $\rho(A) = 0$ , even though $\|A\|$ can be arbitrarily large.

The spectral radius of $A$ is not necessarily an eigenvalue, though it is if $A$ is nonnegative (see below).

Bounds

The spectral radius is bounded above by any consistent matrix norm (one satisfying $\|AB\| \le \|A\|\|B\|$ for all $A$ and $B$ ), for example any matrix $p$ -norm.

Theorem 1. For any $A\in\mathbb{C}^{n\times n}$ and any consistent matrix norm,

$\notag \rho(A) \le \|A\|.$

However, the spectral radius is not a norm and it can be zero when the $p$ -norms are of order $1$ , as illustrated by the nilpotent matrix

$\notag A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \;\Rightarrow\; \rho(A) = 0, \quad \|A\|_p = 1, \; 1 \le p \le \infty.$

Limit of Norms of Powers

The spectral radius can be expressed as a limit of norms of matrix powers.

Theorem 2 (Gelfand). For $A\in\mathbb{C}^{n\times n}$ and any matrix norm, $\rho(A) = \lim_{k\to\infty}\|A^k\|^{1/k}$ .

The theorem implies that for large enough $k$ , $\|A^k\|^{1/k}$ is a good approximation to $\rho(A)$ . The use of this formula with repeated squaring of $A$ has been analyzed by Friedland (1991) and Wilkinson (1965). However, squaring requires $O(n^3)$ flops for a full matrix, so this approach is expensive.

Spectral Radius of a Product

Little can be said about the spectral radius of a product of two matrices. The example

$\notag A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad AB = 0$

shows that we can have $\rho(AB) = 0$ when $\rho(A) = \rho(B) = 1$ . On the other hand, for

$\notag A = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix}, \quad AB = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$

we have $1 = \rho(A)\rho(B) < \rho(AB) = 2$ .

Condition Number Lower Bound

For a nonsingular matrix $A$ , by applying Theorem 1 to $A$ and $A^{-1}$ we obtain

$\notag \begin{aligned}\notag \kappa(A) &= \|A\| \|A^{-1}\| \ge \rho(A) \rho(A^{-1})\\ &= \frac{ \max_i |\lambda_i| } { \min_i |\lambda_i| }. \end{aligned}$

This bound can be very weak for nonnormal matrices.

Power Boundedness

In many situations we wish to know whether the powers of a matrix converge to zero. The spectral radius gives a necessary and sufficient condition for convergence.

Theorem 3. For $A\in\mathbb{C}^{n\times n}$ , $A^k\to0$ as $k\to\infty$ if and only if $\rho(A) < 1$ .

The proof of Theorem 3 is straightforward if $A$ is diagonalizable and it can be done in general using the Jordan canonical form.

Computing the Spectral Radius

In this pseudocode, norm(x) denotes the $2$ -norm of x.

Choose n-vector q_0 such that norm(q_0) = 1.
for k=1,2,...                                         
    z_k = A q_{k-1}            % Apply A.
    q_k = z_k / norm(z_k)      % Normalize.     
    mu_k = q_k^*Aq_k           % Rayleigh quotient.
end

The normalization is to avoid overflow and underflow and the $|\mu_k|$ are approximations to $\rho(A)$ .

If $q_0$ has a nontrivial component in the direction of the eigenvector corresponding to the dominant eigenvalue then the power method converges linearly, with a constant that depends on the ratio of the spectral radius to the magnitude of the next largest eigenvalue in magnitude.

Here is an example where the power method converges quickly, thanks to the substantial ratio of $6.49$ between the spectral radius and next largest eigenvalue in magnitude.

>> rng(1); A = rand(4); eig_abs = abs(eig(A)), q = rand(4,1); 
>> for k = 1:5, q = A*q; q = q/norm(q); mu = q'*A*q; 
>>              fprintf('%1.0f %7.4e\n',k,mu)
>> end
eig_abs =
   1.3567e+00
   2.0898e-01
   2.5642e-01
   1.9492e-01
1 1.4068e+00
2 1.3559e+00
3 1.3580e+00
4 1.3567e+00
5 1.3567e+00

Nonnegative Matrices

For real matrices with nonnegative elements, much more is known about the spectral radius. Perron–Frobenius theory says that if $A\in\mathbb{R}^{n\times n}$ is nonnegative then $\rho(A)$ is an eigenvalue of $A$ and there is a nonnegative eigenvector $x$ such that $Ax = \rho(A)x$ . This is why if you generate a random matrix in MATLAB using the rand function, which produces random matrices with elements on $(0,1)$ , there is always an eigenvalue equal to the spectral radius:

>> rng(1);  A = rand(4); e = sort(eig(A))
e =
  -2.0898e-01
   1.9492e-01
   2.5642e-01
   1.3567e+00

If $A$ is stochastic, that is, it is nonnegative and has unit row sums, then $\rho(A) = 1$ . Indeed $e = [1,1,\dots,1]^T$ is an eigenvector corresponding to the eigenvalue $1$ , so $\rho(A) \ge 1$ , but $\rho(A) \le \|A\|_{\infty} = 1$ , by Theorem 1.

References

Shmuel Friedland, Revisiting Matrix Squaring, Linear Algebra Appl., 154–156, 59-63, 1991.
J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, 1965, pp.~615–617.

What Is a Hessenberg Matrix?

An $n\times n$ upper Hessenberg matrix $H$ has the property that $h_{ij} = 0$ for $i>j+1$ . For $n=4$ , the structure is

$\notag H = \begin{bmatrix} \times & \times & \times & \times \\ \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix}.$

$H$ is an upper triangular matrix with an extra subdiagonal.

A lower Hessenberg matrix $H$ is the transpose of an upper Hessenberg matrix. In the rest of this article, the Hessenberg matrices are upper Hessenberg.

Hessenberg matrices play a key role in the QR algorithm for computing the eigenvalues of a general matrix. The first step of the algorithm is to reduce $A$ to Hessenberg form by unitary Householder transformations and then carry out the QR iteration on the Hessenberg form.

Because $H$ is so close to being triangular, its LU factorization and QR factorization can both be computed in $O(n^2)$ flops. For example, the first stage of LU factorization needs to eliminate just one element, by adding a multiple of row $1$ to row $2$ (assuming $h_{11}\ne 0$ ):

$\notag H = \begin{bmatrix} \times & \times & \times & \times \\ \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix} \quad \to \quad \begin{bmatrix} \times & \times & \times & \times \\ \fbox{0} & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix}.$

Partial pivoting can be used and adds no extra flops. For QR factorization, Givens rotations are used and just two rows need to be rotated on each step.

If the subdiagonal is nonzero, that is, $h_{i+1,i}\ne 0$ for all $i$ , then $H$ is said to be unreduced. In this case, $\mathrm{rank}(H - \lambda I ) \ge n-1$ for any $\lambda$ , which means that there is one linearly independent eigenvector associated with each eigenvalue of $H$ (equivalently, no eigenvalue appears in more than one Jordan block in the Jordan canonical form of $H$ ).

Unitary Hessenberg Matrices

A unitary Hessenberg matrix $H$ with positive subdiagonal elements can be represented in terms of $2n-1$ real parameters (the Schur parametrization), which are essentially the angles in the Givens QR factorization (note that in the QR factorization $H = QR$ , $R$ is triangular and unitary and hence diagonal). The MATLAB function gallery('randhess',n) generates random real orthogonal Hessenberg matrices by choosing the Schur parameters randomly:

>> n = 4; rng(1), H = gallery('randhess',n)
H =
  -8.6714e-01  -9.2330e-02  -4.8943e-01   3.5172e-04
  -4.9807e-01   1.6075e-01   8.5211e-01  -6.1236e-04
            0   9.8267e-01  -1.8538e-01   1.3322e-04
            0            0  -7.1864e-04  -1.0000e+00
>> check_unitary = norm(H'*H-eye(n))
check_unitary =
   1.1757e-16

Examples

A famous Hessenberg matrix with some interesting properties is the Frank Matrix. For example, the eigenvalues appear in reciprocal pairs $(\lambda,1/\lambda)$ . For $n = 5$ :

>> F = gallery('frank',5)
F =
     5     4     3     2     1
     4     4     3     2     1
     0     3     3     2     1
     0     0     2     2     1
     0     0     0     1     1

>> e = eig(F); sort([e 1./e])
ans =
   9.9375e-02   9.9375e-02
   2.8117e-01   2.8117e-01
   1.0000e+00   1.0000e+00
   3.5566e+00   3.5566e+00
   1.0063e+01   1.0063e+01

Another well-known Hessenberg matrix is the Grcar matrix, a Toeplitz matrix of the form

>> G = gallery('grcar',5)
G =
     1     1     1     1     0
    -1     1     1     1     1
     0    -1     1     1     1
     0     0    -1     1     1
     0     0     0    -1     1

It has interesting pseudospectra.

Companion matrices are upper Hessenberg with a unit subdiagonal:

$\notag C = \begin{bmatrix} a_{n-1} & a_{n-2} & \dots &\dots & a_0 \\ 1 & 0 & \dots &\dots & 0 \\ 0 & 1 & \ddots & & \vdots \\ \vdots & & \ddots & 0 & 0 \\ 0 & \dots & \dots & 1 & 0 \end{bmatrix}.$

References

W. B. Gragg, The QR Algorithm for Unitary Hessenberg Matrices, J. Comp. Appl. Math., 16 (1986), pp. 1-8.
Lloyd Trefethen and Mark Embree, Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators, Princeton University Press, Princeton, NJ, USA, 2005. For the Grcar matrix.

What Is an Invariant Subspace?

A subspace $S$ of $\mathbb{C}^n$ is an invariant subspace for $A\in\mathbb{C}^{n \times n}$ if $AS \subseteq S$ , that is, if $x\in S$ implies $Ax \in S$ .

Here are some examples of invariant subspaces.

$\{0\}$ and $\mathbb{C}^n$ are trivially invariant subspaces of any $A$ .
The null space $\mathrm{null}(A) = \{ x: Ax = 0\}$ is an invariant subspace of $A$ because $x\in\mathrm{null}(A)$ implies $Ax = 0\in\mathrm{null}(A)$ .
If $x$ is an eigenvector of $A$ then $\mathrm{span}(x) = \{\, \alpha x: \alpha\in\mathbb{C} \,\}$ is a $1$ -dimensional invariant subspace, since $A\alpha x = \lambda \alpha x \in S$ , where $\lambda$ is the eigenvalue corresponding to $x$ .
The matrix

$\notag \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$

has a one-dimensional invariant subspace $\mathrm{span}(e_1)$ and a two-dimensional invariant subspace $\mathrm{span}(e_1, e_2)$ , where $e_i$ denotes the $i$ th column of the identity matrix.

Let $x_1,x_2, \dots,x_p\in\mathbb{C}^n$ be linearly independent vectors. Then $S = \mathrm{span}(x_1,x_2, \dots,x_p)$ is an invariant subspace of $A$ if and only if $Ax_i\in S$ for $i=1\colon p$ . Writing $X = [x_1,x_2,\dots,x_p]\in\mathbb{C}^{n \times p}$ , this condition can be expressed as

$\notag AX = XB, \qquad(1)$

for some $B\in\mathbb{C}^{p \times p}$ .

If $p = n$ in (1) then $AX = XB$ with $X$ square and nonsingular, so $X^{-1}AX = B$ , that is, $A$ and $B$ are similar.

Eigenvalue Relations

We denote by $\Lambda(A)$ the spectrum (set of eigenvalues) of $A$ and by $A^+$ the pseudoinverse of $A$ .

Theorem.

Let $A\in\mathbb{C}^{n\times n}$ and suppose that (1) holds for some full-rank $X\in\mathbb{C}^{n\times p}$ and $B\in\mathbb{C}^{p\times p}$ . Then $\Lambda(B)\subseteq\Lambda(A)$ . Furthermore, if $(\lambda,x)$ is an eigenpair of $A$ with $x\in\mathrm{range}(X)$ then $(\lambda,X^+x)$ is an eigenpair of $B$ .

Proof. If $(\lambda,z)$ is an eigenpair of $B$ then $AXz = XBz = \lambda Xz$ , and $X z\ne 0$ since the columns of $X$ are independent, so $(\lambda,Xz)$ is an eigenpair of $A$ .

If $(\lambda,x)$ is an eigenpair of $A$ with $x\in\mathrm{range}(X)$ then $x = Xz$ for some $z\ne0$ , and $z = X^+x$ , since $X$ being full rank implies that $X^+X = I$ . Hence

$\notag \lambda x = Ax = AXz = XBz.$

Multiplying on the left by $X^+$ gives $\lambda z = Bz$ , so $(\lambda,z)$ is an eigenpair of $B$ .

Block Triangularization

Assuming that $X$ in (1) has full rank $p$ we can choose $Y\in\mathbb{C}^{p \times (n-p)}$ so that $W = [X,\,Y]$ is nonsingular. Let $W^{-1} = \left[\begin{smallmatrix} G \\ H \end{smallmatrix}\right]$ and note that $W^{-1}W = I$ implies $GX = I$ and $HX = 0$ . We have

$\notag W^{-1}AW = \begin{bmatrix} G \\H \end{bmatrix} [AX,\, AY] = \begin{bmatrix} G \\H \end{bmatrix} [XB,\, AY] = \begin{bmatrix} B & GAY\\ 0 & HAY \end{bmatrix}, \qquad (2)$

which is block upper triangular. This construction is used in the proof of the Schur decomposition with $p=1$ , $x$ an eigenvector of unit $2$ -norm, and $W$ chosen to be unitary.

The Schur Decomposition

Suppose $A\in\mathbb{C}^{n \times n}$ has the Schur decomposition $Q^*AQ = R$ , where $Q$ is unitary and $R$ is upper triangular. Then $AQ = QR$ and writing $Q = [Q_1,\,Q_2]$ , where $Q_1$ is $n\times p$ , and

$\notag R = \begin{bmatrix} R_{11} & R_{12} \\ 0 & R_{22} \\ \end{bmatrix},$

where $R_{11}$ is $p\times p$ , we have $A Q_1 = Q_1 R_{11}$ . Hence $Q_1$ is an invariant subspace of $A$ corresponding to the eigenvalues of $A$ that appear on the diagonal of $R_{11}$ . Since $p$ can take any value from $1$ to $n$ , the Schur decomposition provides a nested sequence of invariant subspaces of $A$ .

Notes and References

Many books on numerical linear algebra or matrix analysis contain material on invariant subspaces, for example

David S. Watkins. Fundamentals of Matrix Computations Third edition, Wiley, New York, USA, 2010.

The ultimate reference is perhaps the book by Gohberg, Lancaster, and Rodman, which has an accessible introduction but is mostly at the graduate textbook or research monograph level.

Israel Gohberg, Peter Lancaster, and Leiba Rodman, Invariant Subspaces of Matrices with Applications, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2006 (unabridged republication of book first published by Wiley in 1986).

What Is a Submatrix?

A submatrix of a matrix is another matrix obtained by forming the intersection of certain rows and columns, or equivalently by deleting certain rows and columns. More precisely, let $A$ be an $m\times n$ matrix and let $1\le i_1 < i_2 < \cdots < i_p \le m$ and $1\le j_1 < j_2 < \cdots < j_q \le n$ . Then the $p\times q$ matrix $B$ with $b_{rs} = a_{i_rj_s}$ is the submatrix of $A$ comprising the elements at the intersection of the rows indexed by $i_1,\dots,i_p$ and the columns indexed by $j_1,\dots,j_q$ . For example, for the matrix

$\notag \begin{bmatrix} ~\fbox{1} & \fbox{2} & 3\mskip2mu \\ ~~4 & 5 & 6\mskip2mu~ \\ ~\fbox{7} & \fbox{8} & 9~\mskip2mu \end{bmatrix} = \begin{bmatrix} ~\fbox{1} & \fbox{2} & 3\mskip2mu~ \\[1pt] ~\fbox{4} & 5 & 6\mskip2mu~ \\ 7 & \fbox{8} & 9\mskip2mu \end{bmatrix},$

shown with four elements highlighted in two different ways,

$\notag \begin{bmatrix} 1 & 2 \\ 7 & 8 \\ \end{bmatrix}$

is a submatrix (the intersection of rows $1$ and $3$ and columns $1$ and $2$ , or what is left after deleting row $2$ and column $3$ ), but

$\notag \begin{bmatrix} 1 & 2 \\ 4 & 8 \\ \end{bmatrix}$

is \emph{not} a submatrix.

Submatrices include the $mn$ matrix elements and the matrix itself, but there are many of intermediate size: an $m\times n$ matrix has $(2^m-1)(2^n-1)$ submatrices in total (counting both square and nonsquare submatrices).

If $p = q$ and $i_k = j_k$ , $k = 1\colon p$ , then $B$ is a principal submatrix of $A$ , which is a submatrix symmetrically located about the diagonal. If, in addition, $i_k = k$ , $k=1\colon p$ , then $B$ is a leading principal submatrix of $A$ , which is one situated in the top left corner of $A$ .

The determinant of a square submatrix is called a minor. The Laplace expansion of the determinant expresses the determinant as a weighted sum of minors.

The Colon Notation

In various programming languages, notably MATLAB, and in numerical linear algebra, a colon notation is used to denote submatrices consisting of contiguous rows and columns.

For integers $p$ and $q$ we denote by $p\colon q$ the sequence $p,p+1,\dots,q$ . Thus $i = 1\colon n$ is another way of writing $i = 1,2,\dots,n$ .

We write $A(p\colon q,r \colon s)$ for the submatrix of $A$ comprising the intersection of rows $p$ to $q$ and columns $r$ to $s$ , that is,

$A(p\colon q,r \colon s) = \begin{bmatrix}a_{pr} & \cdots & a_{ps} \\ \vdots & \ddots & \vdots\cr a_{qr} & \cdots & a_{qs} \end{bmatrix}.$

We can think of $A(p\colon q,r \colon s)$ as a projection of $A$ using the corresponding rows and columns of the identity matrix:

$\notag A(p\colon q,r \colon s) = I(p\colon q,:) \,A \,I(:,r\colon s).$

As special cases, $A(k,\colon)$ denotes the $k$ th row of $A$ and $A(\colon,k)$ the $k$ th column of $A$ .

Here are some examples of using the colon notation to extract submatrices in MATLAB. Rows and columns can be indexed by a range using the colon notation or by specifying the required indices in a vector. The matrix used is from the Anymatrix collection.

>> A = anymatrix('core/beta',5)
A =
     1     2     3     4     5
     2     6    12    20    30
     3    12    30    60   105
     4    20    60   140   280
     5    30   105   280   630

>> A(3:4, [2 4 5]) % Rectangular submatrix.
ans =
    12    60   105
    20   140   280

>> A(1:2,4:5)      % Square, but nonprincipal, submatrix.
ans =
     4     5
    20    30

>> A([3 5],[3 5])  % Principal submatrix.
ans =
    30   105
   105   630

Block Matrices

Submatrices are intimately associated with block matrices, which are matrices in which the elements are themselves matrices. For example, a $4\times 4$ matrix $A$ can be regarded as a block $2\times 2$ matrix, where each element is a $2\times 2$ submatrix of $A$ :

$\notag A = \left[\begin{array}{cc|cc} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\\hline a_{31} & a_{32} & a_{33} & a_{34}\\ a_{41} & a_{42} & a_{43} & a_{44}\\ \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$

where

$\notag A_{11} = A(1\colon2,1\colon2) = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},$

and likewise for the other three blocks.

What Is a Flop?

A flop is one of the elementary arithmetic operations $+$ , $-$ , $*$ , $/$ carried out on floating-point numbers. For example, evaluating the expression $(b - ax)/c$ takes three flops. A square root, which occurs infrequently in numerical computation, is also counted as one flop.

As an example, the computation of the inner product $x^Ty$ of two $n$ -vectors $x$ and $y$ can be written

s = x(1)*y(1)
for i = 2:n
    s = s + x(i)*y(i)
end

which requires $n$ multiplications and $n-1$ additions, or $2n-1$ flops. A matrix multiplication $AB$ of two $n\times n$ matrices involves computing the inner products of every row of $A$ with every column of $B$ , that is $n^2$ inner products, costing $2n^3 - n^2$ flops. As we are usually interested in flop counts for large dimensions we retain only the highest order terms, so we regard $AB$ as costing $2n^3$ flops.

The number of flops required by a numerical computation is one measure of its cost. However, it ignores the cost of data movement, which can be as large as, or even larger, than the cost of floating-point arithmetic for large computations implemented on modern machines with hierarchical memories. Nevertheless, flop counts provide a useful way to compare the cost of competing algorithms when most of the flops occur in similar types of operations, such as matrix multiplications, and they can be used to choose between algorithm variants (López et al., 2022).

This table summarizes some flop counts, where $\alpha$ is a scalar, $x,y$ are $n$ -vectors, and $A,B$ are $n\times n$ matrices, and $A^{-1}$ is computed via LU factorization.

Computation	Cost
$x + \alpha y$	$2n$ flops
$x^Ty$	$2n$ flops
$Ax$	$2n^2$ flops
$AB$	$2n^3$ flops
$A^{-1}$	$2n^3$ flops

As the table suggests, most standard problems involving $n\times n$ matrices can be solved with a cost of order $n^3$ flops or less, so the interest is in the exponent ( $1$ , $2$ , or $3$ ) of the dominant term and the multiplicative constant. For $m\times n$ matrices there can be several potentially dominant terms $m^kn^\ell$ and comparing competing algorithms is more complicated.

Early Usage

Moler and Van Loan give a different definition of “flop” in their classic paper “Nineteen Dubious Ways to Compute the Exponential of a Matrix” (1978), and their definition was adopted in the flop command in the original Fortran version of MATLAB, which counts the number of flops executed in the most recent command or since MATLAB started. In the 1981 MATLAB Users’ Guide, Cleve Moler explains that

One flop is one execution of a Fortran statement like S = S + X(I)*Y(I) or Y(I) = Y(I) + T*X(I). In other words, it consists of one floating point multiplication, together with one floating point addition and the associated indexing and storage reference operations.

This original definition attempted to account for indexing and storage costs in the computer implementation of an algorithm as well as the cost of the floating-point arithmetic. It was motivated by the fact that in linear algebra computations most arithmetic appears in statements of the form s = s + x*y, which appear in evaluating inner products and in adding one multiple of a vector to another.

In the LINPACK Users’ Guide (1979, App. B) flops were used to normalize timing data, but the word “flop” was not used.

The widespread adoption of the flop was ensured by its use in Golub and Van Loan’s book Matrix Computations (1981). In the 1989 second edition of the book a flop was redefined as in our first paragraph and this definition quickly became the standard usage. The Fortran statements given by Moler now involve two flops.

The MATLAB flop function survived until around 2000, when MATLAB started using LAPACK in place of LINPACK for its core linear algebra computations and it became no longer feasible to keep track of the number of flops executed.

It is interesting to note that an operation of the form S + X(I)*Y(I) that was used in the original definition of flop is now supported in the hardware of certain processors. The operation is called a fused multiply-add and is done in the same time as one multiplication or addition and with one rounding error.

Flops as a Measure of Computer Performance

Another usage of “flop”, in the plural, is in measuring computer performance, with “flops” meaning floating-point operations per second and having a prefix specifying a power of ten. Thus we have the terms megaflops, gigaflops, through to exaflops ( $10^{18}$ floating-point operations per second) for today’s fastest computers.

The earliest citation given in the Oxford English Dictionary for this use of the word “flops” is in a 1977 paper by Calahan, who writes

The most common vector performance measure is the number of floating point operations per second (FLOPS), obtained by dividing the number of floating point operations—known from the algorithmic complexity—by the computation time.

The term “MFLOPS” for “million floating point operations per second” is used in an earlier Cray-1 evaluation report (Keller, 1976).

If you are aware of any earlier references please put them in the comments below.

Dictionary Definitions

The word “flop” appears in general dictionaries, but there is some variation in whether it appears under “flop” or “flops”, in capitalization, and whether it means an operation or an execution rate.

Dictionary	Term	Definition
Oxford English Dictionary (online, 2023)	FLOP	“A floating-point operation per second”
Concise Oxford English Dictionary (11th ed., 2004)	flop	“Floating-point operations per second”
The Chambers Dictionary (13th ed., 2014)	flop	“A floating-point operation”
Collins English Dictionary (13th ed., 2018)	flops or FLOPS	“Floating-point operations per second”
The American Heritage Dictionary of the English Language (5th ed., 2018)	flops or flop	“A measure of the speed of a computer in operations per second, especially operations involving floating-point numbers”
Merriam-Webster (online, 2023)	flops	“A unit of measure for calculating the speed of a computer equal to one floating-point operation per second”

Notes and References

I thank Jack Dongarra, Cleve Moler, and Charlie Van Loan for comments on the history of flops.

D. A. Calahan, Algorithmic and Architectural Issues Related to Vector Processors, in: Large Engineering Systems. Proceedings of the International Symposium held at the University of Manitoba, Winnipeg, Manitoba, Canada August 9–12, 1976, Alvin Wexler, ed., Pergamon Press, Oxford, 1977.
T. Keller, CRAY-1 Evaluation. Final Report, Report LA-6456-MS, Los Alamos Scientific Laboratory, Los Alamos, New Mexico, USA, December 1976.
Francisco López, Lars Karlsson, and Paolo Bientinesi, FLOPs as a Discriminant for Dense Linear Algebra Algorithms, in Proceedings of the 51st International Conference on Parallel Processing, ACM Press, 2022.
Cleve B. Moler and Charles F. Van Loan, Nineteen Dubious Ways to Compute the Exponential of a Matrix, SIAM Rev. 20(4), 801–836. 1978.

What Is the Pseudoinverse of a Matrix?

The pseudoinverse is an extension of the concept of the inverse of a nonsingular square matrix to singular matrices and rectangular matrices. It is one of many generalized inverses, but the one most useful in practice as it has a number of special properties.

The pseudoinverse of a matrix $A\in\mathbb{C}^{m\times n}$ is an $n \times m$ matrix $X$ that satisfies the Moore–Penrose conditions

$\notag \begin{array}{rcrc} \mathrm{(1)} & AXA = A, \; & \mathrm{(2)} & XAX=X,\\ \mathrm{(3)} & (AX)^* = AX, \; & \mathrm{(4)} & (XA)^* = XA. \end{array}$

Here, the superscript $*$ denotes the conjugate transpose. It can be shown that there is a unique $X$ satisfying these equations. The pseudoinverse is denoted by $A^+$ ; some authors write $A^\dagger$ .

The most important property of the pseudoinverse is that for any system of linear equations $Ax = b$ (overdetermined or underdetermined) $x = A^+b$ minimizes $\|Ax - b\|_2$ and has the minimum $2$ -norm over all minimizers. In other words, the pseudoinverse provides the minimum $2$ -norm least squares solution to $Ax = b$ .

The pseudoinverse can be expressed in terms of the singular value decomposition (SVD). If $A = U\Sigma V^*$ is an SVD, where the $m\times m$ matrix $U$ and $n\times n$ matrix $V$ are unitary and $\Sigma = \mathrm{diag}(\sigma_1,\dots , \sigma_p)$ with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots =\sigma_p = 0$ (so that $\mathrm{rank}(A) = r$ ) with $p = \min(m,n)$ , then

$\notag A^+ = V\mathrm{diag}(\sigma_1^{-1},\dots,\sigma_r^{-1},0,\dots,0)U^*, \qquad (1)$

where the diagonal matrix is $n\times m$ . This formula gives an easy way to prove many identities satisfied by the pseudoinverse. In MATLAB, the function pinv computes $A^+$ using this formula.

From the Moore–Penrose conditions or (1) it can be shown that $(A^+)^+ = A$ and $(A^*)^+ = (A^+)^*$ .

For full rank $A$ we have the concise formulas

$\notag A^+ = \begin{cases} (A^*A) ^{-1}A^*, & \textrm{if}~\mathrm{rank}(A) = n \le m, \\ A^*(AA^*)^{-1}, & \textrm{if}~ \mathrm{rank}(A) = m \le n. \end{cases} \qquad (2)$

Consequently,

$\notag A^+A = I_n ~~\mathrm{if}~ \mathrm{rank}(A) = n, \qquad AA^+ = I_m ~~\mathrm{if}~ \mathrm{rank}(A) = m.$

Some special cases are worth noting.

The pseudoinverse of a zero $m\times n$ matrix is the zero $n\times m$ matrix.
The pseudoinverse of a nonzero vector $x\in\mathbb{C}^n$ is $x^*/(x^*x)$ .
For $x,y\in\mathbb{C}^n$ , $(xy^*)^+ = (y^*)^+ x^+$ and if $x$ and $y$ are nonzero then $(xy^*)^+ = yx^*/ (x^*x\cdot y^*y)$ .
The pseudoinverse of a Jordan block with eigenvalue $0$ is the transpose:

$\notag \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \\ \end{bmatrix}^+ = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.$

The pseudoinverse differs from the usual inverse in various respects. For example, the pseudoinverse of a triangular matrix is not necessarily triangular (here we are using MATLAB with the Symbolic Math Toolbox):

>> A = sym([1 1 1; 0 0 1; 0 0 1]), X = pinv(A)
A =
[1, 1, 1]
[0, 0, 1]
[0, 0, 1]
X =
[1/2, -1/4, -1/4]
[1/2, -1/4, -1/4]
[  0,  1/2,  1/2]

Furthermore, it is not generally true that $(AB)^+ = B^+A^+$ for $A\in\mathbb{C}^{m\times n}$ and $B\in\mathbb{C}^{n\times p}$ . A sufficient condition for this equality to hold is that $\mathrm{rank}(A) = \mathrm{rank}(B) = n$ .

It is not usually necessary to compute the pseudoinverse, but if it is required it can be obtained using (1) or (2) or from the Newton–Schulz iteration

$\notag X_{k+1} = 2X_k - X_kAX_k, \quad X_0 = \alpha A^*,$

for which $X_k \to A^+$ as $k\to\infty$ if $0 < \alpha < 2/\|A\|_2^2$ . The convergence is at an asymptotically quadratic rate. However, about $2\log_2\kappa_2(A)$ iterations are required to reach the asymptotic phase, where $\kappa_2(A) = \|A\|_2 \|A^+\|_2$ , so the iteration is slow to converge when $A$ is ill conditioned.

Notes and References

The pseudoinverse was first introduced by Eliakim Moore in 1920 and was independently discovered by Roger Penrose in 1955. For more on the pseudoinverse see, for example, Ben-Israel and Greville (2003) or Campbell and Meyer (2009). For analysis of the Newton–Schulz iteration see Pan and Schreiber (1991).

Adi Ben-Israel and Thomas N. E. Greville, Generalized Inverses: Theory and Applications, second edition, Springer-Verlag, New York, 2003
Stephen Campbell and Carl Meyer, Generalized Inverses of Linear Transformations, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2009. published (Originally published by Pitman in 1979.)
Victor Pan and Robert Schreiber, An Improved Newton Iteration for the Generalized Inverse of a Matrix, with Applications, SIAM J. Sci. Statist. Comput. 12 (5), 1109–1130, 1991.

What Is the Numerical Range of a Matrix?

The numerical range of a matrix $A\in\mathbb{C}^{n\times n}$ , also known as the field of values, is the set of complex numbers

$W(A) = \biggl\{\, \displaystyle\frac{z^*Az}{z^*z}: 0\ne z\in\mathbb{C}^n \,\biggr\}.$

The set $W(A)$ is compact and convex (a nontrivial property proved by Toeplitz and Hausdorff), and it contains all the eigenvalues of $A$ . For normal matrices it is the convex hull of the eigenvalues. For a Hermitian matrix, $W(A)$ is a segment of the real axis, while for a skew-Hermitian matrix it is a segment of the imaginary axis.

The following figure plots in blue the boundaries of the the numerical ranges of nine $16\times 16$ matrices, with the eigenvalues shown as black dots. They were plotted using the function fv in the Matrix Computation Toolbox. The matrices are from Anymatrix.

A method for computing the boundary of the numerical range is based on the observation that for any $z\in\mathbb{C}$ ,

$\notag z^*Az = \underbrace{z^* \frac{1}{2}(A+A^*) z}_{\mathrm{\phantom{1234}real\phantom{1234}}} + \underbrace{z^* \frac{1}{2}(A-A^*) z}_{\mathrm{pure~imaginary}} \qquad (1)$

This implies that the real part of $z^*Az/z^*z$ lies between the largest and smallest eigenvalues of the Hermitian matrix $(A+A^*)/2$ , which define a vertical strip in which the numerical range lies. Since $W(\mathrm{e}^{\mathrm{i}\theta} A) = \mathrm{e}^{\mathrm{i}\theta} W(A)$ , we can apply the same reasoning to the rotated matrix $A(\theta) = \mathrm{e}^{\mathrm{i}\theta} A$ , and taking a range of $\theta\in[0,\pi]$ we obtain an approximation the boundary of the numerical range.

The quantity

$\notag \omega(A) = \max \bigl\{\, \mathrm{Re}\, w : w \in W(A) \,\bigr\}$

is called the numerical abscissa, and by (1), $\omega(A)$ is the largest eigenvalue of the Hermitian matrix $(A+A^*)/2$ . The numerical abscissa determines the rate of growth of $\|\mathrm{e}^{tA}\|$ for small positive $t$ .

Associated with the numerical range is the numerical radius

$\notag r(A) = \max\{\, |w| : w \in W(A) \,\}.$

Note that $r(A^*) = r(A)$ . Also, $r(A) \ge \rho(A)$ , where $\rho$ is the spectral radius (the largest absolute value of any eigenvalue), since $W(A)$ contains the eigenvalues of $A$ .

The numerical radius differs by at most a factor $2$ from the $2$ -norm:

$\notag \frac{1}{2} \|A\|_2 \le r(A) \le \|A\|_2. \qquad (2)$

When $A$ is normal, $r(A) = \|A\|_2$ .

The numerical radius is a matrix norm, but not a consistent norm (that is, $r(AB) \le r(A)r(B)$ does not hold in general). However, it is it true that

$\notag r(A^k) \le r(A)^k, \quad k\ge 1.$

Combining this with with the lower bound in (2) gives

$\notag \|A^k\|_2 \le 2 r(A)^k, \quad k \ge 1,$

so if we know $r(A)$ then we can bound $\|A^k\|_2$ for all $k$ .

The numerical radius can be characterized as the solution of an optimization problem over the Hermitian matrices:

$\notag r(A) = - \max\biggl\{\, \lambda_{\min}\biggl( \begin{bmatrix} Z\!\! & A \\ A^*\!\! & -Z \end{bmatrix} \biggr): Z=Z^*\in\mathbb{C}^{n\times n} \,\biggr\}.$

Notes and References

For proofs of the results given here see Horn and Johnson (2013) or Horn and Johnson (1991), and see the latter for details of the algorithm for computing the numerical range. See Benzi (2020) for a discussion of applications of the numerical range in numerical analysis.

Michele Benzi, Some Uses of the Field of Values in Numerical Analysis, Bollettino dell’Unione Matematica Italiana 14, 159–177, 2020.
Roger Horn and Charles Johnson, Topics in Matrix Analysis, Cambridge University Press, 1991. Chapter 1.
Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013. My review of the second edition.

What Is the Schur Complement of a Matrix?

For an $n\times n$ matrix

$\notag A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \qquad (1)$

with nonsingular $(1,1)$ block $A_{11}$ the Schur complement is $A_{22} - A_{21}A_{11}^{-1}A_{12}$ . It is denoted by $A/A_{11}$ . The block with respect to which the Schur complement is taken need not be the $(1,1)$ block. Assuming that $A_{22}$ is nonsingular, the Schur complement of $A_{22}$ in $A$ is $A_{11} - A_{12}A_{22}^{-1}A_{21}$ .

More generally, for an index vector $\alpha = [i_1,i_2,\dots,i_k]$ , where $1\le i_1 < i_2 < \cdots < i_p \le n$ , the Schur complement of $A(\alpha,\alpha)$ in $A$ is $A(\alpha^c,\alpha^c) - A(\alpha^c,\alpha) A(\alpha,\alpha)^{-1} A(\alpha,\alpha^c)$ , where $\alpha^c$ is the complement of $\alpha$ (the vector of indices not in $\alpha$ ). Most often, it is the Schur complement of $A_{11}$ in $A$ that is of interest, and the general case can be reduced to it by row and column permutations that move $A(\alpha,\alpha)$ to the $(1,1)$ block of $A$ .

Schur Complements in Gaussian Elimination

The reduced submatrices in Gaussian elimination are Schur complements. Write an $n\times n$ matrix $A$ with nonzero $(1,1)$ element $\alpha$ as

$\notag A = \begin{bmatrix} \alpha & a^* \\ b & C \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ b/\alpha & I_{n-1} \end{bmatrix} \begin{bmatrix} \alpha & a^* \\ 0 & C - ba^*/\alpha \end{bmatrix}.$

This factorization represents the first step of Gaussian elimination.

After $k$ stages of Gaussian elimination we have computed the following factorization, in which the $(1,1)$ blocks are $k\times k$ :

$\notag \begin{bmatrix} L_{11} & 0 \\ L_{21} & I \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} U_{11} & U_{12} \\ 0 & S \end{bmatrix},$

where the first matrix on the left is the product of the elementary transformations that reduce the first $k$ columns to upper triangular form. Equating $(2,1)$ and $(2,2)$ blocks in this equation gives $L_{21}A_{11} + A_{21} = 0$ and $L_{21}A_{12} + A_{22} = S$ . Hence $S= A_{22} + L_{21}A_{12} = A_{22} + (-A_{21}A_{11}^{-1})A_{12}$ , which is the Schur complement of $A_{11}$ in $A$ .

The next step of the elimination, which zeros out the first column of $S$ below the diagonal, succeeds as long as the $(1,1$ ) element of $S$ is nonzero (or if the whole first column of $S$ is zero, in which case there is nothing to do, and $A$ is singular in this case).

For a number of matrix structures, such as

Hermitian (or real symmetric) positive definite matrices,
totally positive matrices,
matrices diagonally dominant by rows or columns,
$M$ -matrices,

one can show that the Schur complement inherits the structure. For these four structures the $(1,1)$ element of the matrix is nonzero, so the success of Gaussian elimination (or equivalently, the existence of an LU factorization) is guaranteed. In the first three cases the preservation of structure is also the basis for a proof of the numerical stability of Gaussian elimination.

Inverse of a Block Matrix

The Schur complement arises in formulas for the inverse of a block $2\times 2$ matrix. If $A_{11}$ is nonsingular, we can write

$\notag A = \begin{bmatrix}A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix}I & 0 \\ A_{21}A_{11}^{-1} & I \end{bmatrix} \begin{bmatrix}A_{11} & A_{12} \\ 0 & S \end{bmatrix}, \quad S = A_{22} - A_{21}A_{11}^{-1}A_{12}. \qquad (2)$

If $S$ is nonsingular then inverting gives

$\notag \begin{aligned} A^{-1} &= \begin{bmatrix}A_{11}^{-1} & -A_{11}^{-1}A_{12}S^{-1} \\ 0 & S^{-1} \end{bmatrix} \begin{bmatrix}I & 0 \\ -A_{21}A_{11}^{-1} & I \end{bmatrix} \notag\\ &= \begin{bmatrix}A_{11}^{-1} + A_{11}^{-1}A_{12}S^{-1}A_{21}A_{11}^{-1} & -A_{11}^{-1}A_{12}S^{-1} \\ -S^{-1}A_{21}A_{11}^{-1} & S^{-1} \end{bmatrix}. \end{aligned}$

So $S^{-1}$ is the $(2,2)$ block of $A^{-1}$ .

One can obtain an analogous formula for $A^{-1}$ in terms of the Schur complement of $A_{22}$ in $A$ , in which the inverse of the Schur complement is the $(1,1)$ block of $A^{-1}$ . More generally, we have $(A/A(\alpha,\alpha))^{-1} = A^{-1}(\alpha^c,\alpha^c)$ .

Test for Positive Definiteness

For Hermitian matrices the Schur complement provides a test for positive definiteness. Suppose

$\notag A = \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^* & A_{22} \end{bmatrix}$

with $A_{11}$ positive definite. The factorization

$\notag A = \begin{bmatrix} I & 0 \cr A_{12}^*A_{11}^{-1} & I \end{bmatrix} \begin{bmatrix} A_{11} & 0 \cr 0 & S \end{bmatrix} \begin{bmatrix} I & A_{11}^{-1}A_{12} \cr 0 & I \end{bmatrix}, \quad S = A_{22} - A_{12}^*A_{11}^{-1}A_{12},$

shows that $A$ is congruent to the block diagonal matrix $\mathrm{diag}(A_{11},S)$ , and since congruences preserve inertia (the number of positive, zero, and negative eigenvalues), $A$ is positive definite if and only if $\mathrm{diag}(A_{11},S)$ is positive definite, that is, if and only if $S$ is positive definite. The same equivalence holds with “definite” replace by “semidefinite” (with $A_{11}$ still required to be positive definite).

Generalized Schur Complement

For $A$ in (1) with a possibly singular $(1,1)$ block $A_{11}$ we can define the generalized Schur complement $A_{22} - A_{21}A_{11}^+A_{12}$ , where “+” denotes the Moore-Penrose pseudo-inverse. The generalized Schur complement is useful in the context of Hermitian positive semidefinite matrices, as the following result shows.

Theorem 1.

If

$\notag A = \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^* & A_{22} \end{bmatrix} \in\mathbb{C}^{n\times n}$

is Hermitian then $A$ is positive semidefinite if and only if $\mathrm{range}(A_{12}) \subseteq \mathrm{range}(A_{11})$ and $A_{11}$ and the generalized Schur complement $S = A_{22} - A_{12}^*A_{11}^+ A_{12}$ are both positive semidefinite.

If $A_{11}$ is positive definite then the range condition is trivially satisfied.

Notes and References

The term “Schur complement” was coined by Haynsworth in 1968, thereby focusing attention on this important form of matrix and spurring many subsequent papers that explore its properties and applications. The name “Schur” was chosen because a 1917 determinant lemma of Schur says that if $A_{11}$ is nonsingular then

$\notag \det\left( \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \right) = \det(A_{11}) \det(A_{22} - A_{21}A_{11}^{-1} A_{12}) = \det(A_{11}) \det(A/A_{11}),$

which is obtained by taking determinants in (2).

For much more about the Schur complement see Zhang (2005) or Horn and Johnson (2013).

Emilie V. Haynsworth, Determination of the Inertia of a Partitioned Matrix, Linear Algebra Appl. 1(1), 73–81, 1968.
Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013. My review of the second edition.
Fuzhen Zhang, editor, The Schur Complement and Its Applications. Springer-Verlag, New York, 2005, xvi+295 pp.

What Is the Minimal Polynomial of a Matrix?

For a polynomial

$\notag \phi(t) = a_kt^k + \cdots + a_1t + a_0,$

where $a_k\in\mathbb{C}$ for all $k$ , the matrix polynomial obtained by evaluating $\phi$ at $A\in\mathbb{C}^{n \times n}$ is

$\notag \phi(A) = a_kA^k + \cdots + a_1A + a_0 I.$

(Note that the constant term is $a_0 A^0 = a_0 I$ ). The polynomial $\phi$ is monic if $a_k = 1$ .

The characteristic polynomial of a matrix $A\in\mathbb{C}^{n \times n}$ is $p(t) = \det(t I - A)$ , a degree $n$ monic polynomial whose roots are the eigenvalues of $A$ . The Cayley–Hamilton theorem tells us that $p(A) = 0$ , but $p$ may not be the polynomial of lowest degree that annihilates $A$ . The monic polynomial $\psi$ of lowest degree such that $\psi(A) = 0$ is the minimal polynomial of $A$ . Clearly, $\psi$ has degree at most $n$ .

The minimal polynomial divides any polynomial $\phi$ such that $\phi(A) = 0$ , and in particular it divides the characteristic polynomial. Indeed by polynomial long division we can write $\phi(t) = q(t)\psi(t) + r(t)$ , where the degree of $r$ is less than the degree of $\psi$ . Then

$\notag 0 = \phi(A) = q(A) \psi(A) + r(A) = r(A).$

If $r\ne 0$ then we have a contradiction to the minimality of the degree of $\psi$ . Hence $r = 0$ and so $\psi$ divides $\phi$ .

The minimal polynomial is unique. For if $\psi_1$ and $\psi_2$ are two different monic polynomials of minimum degree $m$ such that $\psi_i(A) = 0$ , $i = 1,2$ , then $\psi_3 = \psi_1 - \psi_2$ is a polynomial of degree less than $m$ and $\psi_3(A) = \psi_1(A) - \psi_2(A) = 0$ , and we can scale $\psi_3$ to be monic, so by the minimality of $m$ , $\psi_3 = 0$ , or $\psi_1 = \psi_2$ .

If $A$ has distinct eigenvalues then the characteristic polynomial and the minimal polynomial are equal. When $A$ has repeated eigenvalues the minimal polynomial can have degree less than $n$ . An extreme case is the identity matrix, for which $\psi(t) = t - 1$ , since $\psi(I) = I - I = 0$ . On the other hand, for the Jordan block

$\notag J = \begin{bmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{bmatrix}$

the characteristic polynomial and the minimal polynomial are both equal to $(\lambda - 1)^3$ .

The minimal polynomial has degree less than $n$ when in the Jordan canonical form of $A$ an eigenvalue appears in more than one Jordan block. Indeed it is not hard to show that the minimal polynomial can be written

$\notag \psi(t) = \displaystyle\prod_{i=1}^s (t-\lambda_i)^{n_i},$

where $\lambda_1,\lambda_2,\dots,\lambda_s$ are the distinct eigenvalues of $A$ and $n_i$ is the dimension of the largest Jordan block in which $\lambda_i$ appears. This expression is composed of linear factors (that is, $n_i = 1$ for all $i$ ) if and only if $A$ is diagonalizable.

To illustrate, for the matrix

$\notag A = \left[\begin{array}{cc|c|c|c} \lambda & 1 & & & \\ & \lambda & & & \\\hline & & \lambda & & \\\hline & & & \mu & \\\hline & & & & \mu \end{array}\right]$

in Jordan form (where blank elements are zero), the minimal polynomial is $\psi(t) = (t-\lambda)^2(t-\mu)$ , while the characteristic polynomial is $p(t) = (t-\lambda)^3(t-\mu)^2$ .

What is the minimal polynomial of a rank- $1$ matrix, $A = xy^* \ne 0$ ? Since $A^2 = (y^*x) xy^*$ , we have $q(A) = 0$ for $q(t) = t^2 - (y^*x) t = t^2 - \mathrm{trace}(A) t$ . For any linear polynomial $p(t) = t - a_0$ , $p(A) = xy^* - a_0 I$ , which is nonzero since $xy^*$ has rank $1$ and $I$ has rank $n$ . Hence the minimal polynomial is $q$ .

The minimal polynomial is important in the theory of matrix functions and in the theory of Krylov subspace methods. One does not normally need to compute the minimal polynomial in practice.

What Is Iterative Refinement?

Iterative refinement is a method for improving the quality of an approximate solution $\widehat{x}$ to a linear system of equations $Ax = b$ , where $A$ is an $n\times n$ nonsingular matrix. The basic iterative refinement algorithm is very simple.

Compute the residual $r = b - A\widehat{x}$ .
Solve $Ad = r$ .
Update $\widehat{x} \gets \widehat{x} + d$ .
Repeat from step 1 if necessary.

At first sight, this algorithm seems as expensive as solving for the original $\widehat{x}$ . However, usually the solver is LU factorization with pivoting, $A = LU$ (where we include the permutations in $L$ ). Most of the work is in the LU factorization, which costs $O(n^3)$ flops, and each iteration requires a multiplication with $A$ for the residual and two substitutions to compute $d$ , which total only $O(n^2)$ flops. If the refinement converges quickly then it is inexpensive.

Turning to the error, with a stable LU factorization the initial $\widehat{x}$ computed in floating-point arithmetic of precision $u$ satisfies (omitting constants)

$\notag \displaystyle\frac{\|x- \widehat{x}\|}{\|x\|}\lesssim \kappa_{\infty}(A) u, \qquad (1)$

where $\kappa_{\infty}(A) = \|A\|_{\infty} \|A^{-1}\|_{\infty} \ge 1$ is the matrix condition number in the $\infty$ -norm. We would like refinement to produce a solution accurate to precision $u$ :

$\notag \displaystyle\frac{\|x- \widehat{x}\|}{\|x\|}\approx u. \qquad (2)$

But if the solver cannot compute the initial $\widehat{x}$ accurately when $A$ is ill-conditioned why should it be able to produce an update $d$ that improves $\widehat{x}$ ?

The simplest answer is that when iterative refinement was first used on digital computers the residual $r$ was computed at twice the working precision, which could be done at no extra cost in the hardware. If $\widehat{x}$ is a reasonable approximation then we expect cancellation in forming $r = b - A\widehat{x}$ , so using extra precision in forming $r$ ensures that $r$ has enough correct digits to yield a correction $d$ that improves $\widehat{x}$ . This form of iterative refinement produces a solution satisfying (2) as long as $\kappa_{\infty}(A) < u^{-1}$ .

Here is a MATLAB example, where the working precision is single and residuals are computed in double precision.

n = 8; A = single(gallery('frank',n)); xact = ones(n,1);
b = A*xact; % b is formed exactly for small n.
x = A\b;
fprintf('Initial_error = %4.1e\n', norm(x - xact,inf))
r = single( double(b) - double(A)*double(x) );
d = A\r;
x = x + d;
fprintf('Second error  = %4.1e\n', norm(x - xact,inf))

The output is

Initial_error = 9.1e-04
Second error  = 6.0e-08

which shows that after just one step the error has been brought down from $10^{-4}$ to the level of $u_s\approx 5\times 10^{-8}$ , the unit roundoff for IEEE single precision arithmetic.

Fixed Precision Iterative Refinement

By the 1970s, computers had started to lose the ability to cheaply accumulate inner products in extra precision, and extra precision could not be programmed portably in software. It was discovered, though, that even if iterative refinement is run entirely in one precision it can bring benefits when $\kappa_{\infty}(A) < u^{-1}$ . Specifically,

if the solver is somewhat numerically unstable the instability is cured by the refinement, in that a relative residual satisfying
$\notag \displaystyle\frac{\|b-A\widehat{x}\|_{\infty}}{\|A\|_{\infty} \|\widehat{x}\|_{\infty} + \|b\|_{\infty}} \approx u \qquad(3)$

is produced, and
a relative error satisfying
$\notag \displaystyle\frac{\|x- \widehat{x}\|_{\infty}}{\|x\|_{\infty}} \lesssim \mathrm{cond}(A,x) u \qquad (4)$

is produced, where

$\notag \mathrm{cond}(A,x) = \displaystyle\frac{ \|\, |A^{-1}| |A| |x| \,\|_{\infty} } {\|x\|_{\infty}}.$

The bound (4) is stronger than (1) because $\mathrm{cond}(A,x)$ is no larger than $\kappa_{\infty}(A)$ and can be much smaller, especially if $A$ has badly scaled rows.

Low Precision Factorization

In the 2000s processors became available in which 32-bit single precision arithmetic ran at twice the speed of 64-bit double precision arithmetic. A new usage of iterative refinement was developed in which the working precision is double precision and a double precision matrix $A$ is factorized in single precision.

Factorize $A = LU$ in single precision.
Solve $LUx = b$ by substitution in single precision, obtaining $\widehat{x}$ .
Compute the residual $r = b - A\widehat{x}$ in double precision.
Solve $LUd = r$ in single precision.
Update $\widehat{x} \gets \widehat{x} + d$ in double precision.
Repeat from step 3 if necessary.

Since most of the work is in the single precision factorization, this algorithm is potentially twice as fast as solving $Ax=b$ entirely in double precision arithmetic. The algorithm achieves the same limiting accuracy (4) and limiting residual (3) provided that $\kappa_{\infty}(A) < u_s^{-1} \approx 2\times 10^7$ .

GRMES-IR

A way to weaken this restriction on $\kappa_{\infty}(A)$ is to use a different solver on step 4: solve

$\notag \widetilde{A}d := U^{-1}L^{-1}Ad = U^{-1}L^{-1}r$

by GMRES (a Krylov subspace iterative method) in double precision. Within GMRES, $\widetilde{A}d$ is not formed but is applied to a vector as a multiplication with $A$ followed by substitutions with $L$ and $U$ . As long as GMRES converges quickly for this preconditioned system the speed again from the fast single precision factorization will not be lost. Moreover, this different step 4 results in convergence for $\kappa_{\infty}(A)$ a couple of orders magnitude larger than before, and if residuals are computed in quadruple precision then a limiting accuracy (2) is achieved.

This GMRES-based iterative refinement becomes particularly advantageous when the fast half precision arithmetic now available in hardware is used within the LU factorization, and one can use three or more precisions in the algorithm in order to balance speed, accuracy, and the range of problems that can be solved.

Finally, we note that iterative refinement can also be applied to least squares problems, eigenvalue problems, and the singular value decomposition. See Higham and Mary (2022) for details and references.

References

We give five references, which contain links to the earlier literature.

Erin Carson and Nicholas J. Higham. Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions. SIAM J. Sci. Comput., 40(2):A817–A847,2018.
Azzam Haidar, Harun Bayraktar, Stanimire Tomov, Jack Dongarra, and Nicholas J. Higham. Mixed-Precision Iterative Refinement Using Tensor Cores on GPUs to Accelerate Solution Of linear systems. Proc. Roy. Soc. London A, 476(2243):20200110, 2020.
Nicholas J. Higham and Theo Mary. Mixed Precision Algorithms in Numerical Linear Algebra. Acta Numerica, 31:347–414, 2022.
Nicholas J. Higham and Dennis Sherwood, How to Boost Your Creativity, SIAM News, 55(5):1, 3, 2022. (Explains how developments in iterative refinement 1948–2022 correspond to asking “how might this be different” about each aspect of the algorithm.)
Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo Buttari, and Jack Dongarra. Exploiting the Performance of 32 Bit Floating Point Arithmetic in Obtaining 64 Bit Accuracy (Revisiting Iterative Refinement for Linear Systems). In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, IEEE, November 2006.

Bounds

Limit of Norms of Powers

Spectral Radius of a Product

Condition Number Lower Bound

Power Boundedness

Computing the Spectral Radius

Nonnegative Matrices

References

Related Blog Posts

Share this:

Unitary Hessenberg Matrices

Examples

References

Related Blog Posts

Share this:

Eigenvalue Relations

Block Triangularization

The Schur Decomposition

Notes and References

Related Blog Posts

Share this:

The Colon Notation

Block Matrices

Related Blog Posts

Share this:

Early Usage

Flops as a Measure of Computer Performance

Dictionary Definitions

Notes and References

Related Blog Posts

Share this:

Notes and References

Related Blog Posts

Share this:

Notes and References

Related Blog Posts

Share this:

Schur Complements in Gaussian Elimination

Inverse of a Block Matrix

Test for Positive Definiteness

Generalized Schur Complement

Notes and References

Related Blog Posts

Share this:

Related Blog Posts

Share this:

Fixed Precision Iterative Refinement

Low Precision Factorization

GRMES-IR

References

Related Blog Posts

Share this: