The Big Six Matrix Factorizations

Six matrix factorizations dominate in numerical linear algebra and matrix analysis: for most purposes one of them is sufficient for the task at hand. We summarize them here.

For each factorization we give the cost in flops for the standard method of computation, stating only the highest order terms. We also state the main uses of each factorization.

For full generality we state factorizations for complex matrices. Everything translates to the real case with “Hermitian” and “unitary” replaced by “symmetric” and “orthogonal”, respectively.

The terms “factorization” and “decomposition” are synonymous and it is a matter of convention which is used. Our list comprises three factorization and three decompositions.

Recall that an upper triangular matrix is a matrix of the form

\notag   R =   \begin{bmatrix}   r_{11} & r_{12} & \dots & r_{1n}\\          & r_{22} & \dots & r_{2n}\\          &        & \ddots& \vdots\\          &        &       & r_{nn}   \end{bmatrix},

and a lower triangular matrix is the transpose of an upper triangular one.

Cholesky Factorization

Every Hermitian positive definite matrix A\in\mathbb{C}^{n\times n} has a unique Cholesky factorization A = R^*R, where R\in\mathbb{C}^{n\times n} is upper triangular with positive diagonal elements.

Cost: n^3/3 flops.

Use: solving positive definite linear systems.

LU Factorization

Any matrix A\in\mathbb{C}^{n\times n} has an LU factorization PA = LU, where P is a permutation matrix, L is unit lower triangular (lower triangular with 1s on the diagonal), and U is upper triangular. We can take P = I if the leading principal submatrices A(1\colon k, 1\colon k), k = 1\colon n-1, of A are nonsingular, but to guarantee that the factorization is numerically stable we need A to have particular properties, such as diagonal dominance. In practical computation we normally choose P using the partial pivoting strategy, which almost always ensures numerically stable.

Cost: 2n^3/3 flops.

Use: solving general linear systems.

QR Factorization

Any matrix A\in\mathbb{C}^{m\times n} with m\ge n has a QR factorization A = QR, where Q\in \mathbb{C}^{m\times m} is unitary and R is upper trapezoidal, that is, R = \left[\begin{smallmatrix} R_1 \\ 0\end{smallmatrix}\right], where R_1\in\mathbb{C}^{n\times n} is upper triangular.

Partitioning Q = [Q_1~Q_2], where Q_1\in\mathbb{C}^{m\times n} has orthonormal columns, gives A = Q_1R_1, which is the reduced, economy size, or thin QR factorization.

Cost: 2(n^2m-n^3/3) flops for Householder QR factorization. The explicit formation of Q (which is not usually necessary) requires a further 4(m^2n-mn^2+n^3/3) flops.

Use: solving least squares problems, computing an orthonormal basis for the range space of A, orthogonalization.

Schur Decomposition

Any matrix A\in\mathbb{C}^{n\times n} has a Schur decomposition A = QTQ^*, where Q is unitary and T is upper triangular. The eigenvalues of A appear on the diagonal of T. For each k, the leading k columns of Q span an invariant subspace of A.

For real matrices, a special form of this decomposition exists in which all the factors are real. An upper quasi-triangular matrix R is a block upper triangular with whose diagonal blocks R_{ii} are either 1\times1 or 2\times2. Any A\in\mathbb{R}^{n\times n} has a real Schur decomposition A = Q R Q^T, where Q is real orthogonal and R is real upper quasi-triangular with any 2\times2 diagonal blocks having complex conjugate eigenvalues.

Cost: 25n^3 flops for Q and T (or R) by the QR algorithm; 10n^3 flops for T (or R) only.

Use: computing eigenvalues and eigenvectors, computing invariant subspaces, evaluating matrix functions.

Spectral Decomposition

Every Hermitian matrix A\in\mathbb{C}^{n\times n} has a spectral decomposition A = Q\Lambda Q^*, where Q is unitary and \Lambda = \mathrm{diag}(\lambda_i). The \lambda_i are the eigenvalues of A, and they are real. The spectral decomposition is a special case of the Schur decomposition but is of interest in its own right.

Cost: 9n^3 for Q and D by the QR algorithm, or 4n^3\!/3 flops for D only.

Use: any problem involving eigenvalues of Hermitian matrices.

Singular Value Decomposition

Any matrix A\in\mathbb{C}^{m\times n} has a singular value decomposition (SVD)

\notag   A = U\Sigma V^*, \quad   \Sigma = \mathrm{diag}(\sigma_1,\sigma_2,\dots,\sigma_p)             \in \mathbb{R}^{m\times n},   \quad   p = \min(m,n),

where U\in\mathbb{C}^{m\times m} and V\in\mathbb{C}^{n\times n} are unitary and \sigma_1\ge\sigma_2\ge\cdots\ge\sigma_p\ge0. The \sigma_i are the singular values of A, and they are the nonnegative square roots of the p largest eigenvalues of A^*A. The columns of U and V are the left and right singular vectors of A, respectively. The rank of A is equal to the number of nonzero singular values. If A is real, U and V can be taken to be real. The essential SVD information is contained in the compact or economy size SVD A = U\Sigma V^*, where U\in\mathbb{C}^{m\times r}, \Sigma = \mathrm{diag}(\sigma_1,\dots,\sigma_r), V\in\mathbb{C}^{n\times r}, and r = \mathrm{rank}(A).

Cost: 14mn^2+8n^3 for P(:,1\colon n), \Sigma, and Q by the Golub–Reinsch algorithm, or 6mn^2+20n^3 with a preliminary QR factorization.

Use: determining matrix rank, solving rank-deficient least squares problems, computing all kinds of subspace information.

Discussion

Pivoting can be incorporated into both Cholesky factorization and QR factorization, giving \Pi^T A \Pi = R^*R (complete pivoting) and A\Pi = QR (column pivoting), respectively, where \Pi is a permutation matrix. These pivoting strategies are useful for problems that are (nearly) rank deficient as they force R to have a zero (or small) (2,2) block.

The big six factorizations can all be computed by numerically stable algorithms. Another important factorization is that provided by the Jordan canonical form, but while it is a useful theoretical tool it cannot in general be computed in a numerically stable way.

For further details of these factorizations see the articles below.

These factorizations are precisely those discussed by Stewart (2000) in his article The Decompositional Approach to Matrix Computation, which explains the benefits of matrix factorizations in numerical linear algebra.

What Is a Schur Decomposition?

A Schur decomposition of a matrix A\in\mathbb{C}^{n\times n} is a factorization A = QTQ^*, where Q is unitary and T is upper triangular. The diagonal entries of T are the eigenvalues of A, and they can be made to appear in any order by choosing Q appropriately. The columns of Q are called Schur vectors.

A subspace \mathcal{X} of \mathbb{C}^{n\times n} is an invariant subspace of A if Ax\in\mathcal{X} for all x\in\mathcal{X}. If we partition Q and T conformably we can write

\notag   A [Q_1,~Q_2] = [Q_1,~Q_2]   \begin{bmatrix}   T_{11} & T_{12} \\       0  & T_{22} \\   \end{bmatrix},

which gives A Q_1 = Q_1 T_{11}, showing that the columns of Q_1 span an invariant subspace of A. Furthermore, Q_1^*AQ_1 = T_{11}. The first column of Q is an eigenvector of A corresponding to the eigenvalue \lambda_1 = t_{11}, but the other columns are not eigenvectors, in general. Eigenvectors can be computed by solving upper triangular systems involving T - \lambda I, where \lambda is an eigenvalue.

Write T = D+N, where D = \mathrm{diag}(\lambda_i) and N is strictly upper triangular. Taking Frobenius norms gives \|A\|_F^2 = \|D\|_F^2 + \|N\|_F^2, or

\notag        \|N\|_F^2 = \|A\|_F^2 - \displaystyle\sum_{i=1}^n |\lambda_i|^2.

Hence \|N\|_F is independent of the particular Schur decomposition and it provides a measure of the departure from normality. The matrix A is normal (that is, A^*A = AA^*) if and only if N = 0. So a normal matrix is unitarily diagonalizable: A = QDQ^*.

An important application of the Schur decomposition is to compute matrix functions. The relation f(A) = Qf(T)Q^* shows that computing f(A) reduces to computing a function of a triangular matrix. Matrix functions illustrate what Van Loan (1975) describes as “one of the most basic tenets of numerical algebra”, namely “anything that the Jordan decomposition can do, the Schur decomposition can do better!”. Indeed the Jordan canonical form is built on a possibly ill conditioned similarity transformation while the Schur decomposition employs a perfectly conditioned unitary similarity, and the full upper triangular factor of the Schur form can do most of what the Jordan form’s bidiagonal factor can do.

An upper quasi-triangular matrix is a block upper triangular matrix

\notag   R =   \begin{bmatrix}   R_{11} & R_{12} & \dots & R_{1m}\\          & R_{22} & \dots & R_{2m}\\          &        & \ddots& \vdots\\          &        &       &  R_{mm}   \end{bmatrix}

whose diagonal blocks R_{ii} are either 1\times1 or 2\times2. A real matrix A\in\mathbb{R}^{n \times n} has a real Schur decomposition A = QRQ^T in which in which all the factors are real, Q is orthogonal, and R is upper quasi-triangular with any 2\times2 diagonal blocks having complex conjugate eigenvalues. If A is normal then the 2\times 2 blocks R_{ii} have the form

R_{ii} = \left[\begin{array}{@{}rr@{\mskip2mu}} a & b \\                -b & a \end{array}\right], \quad b \ne 0,

which has eigenvalues a \pm \mathrm{i}b.

The Schur decomposition can be computed by the QR algorithm at a cost of about 25n^3 flops for Q and T, or 10n^3 flops for T only.

In MATLAB, the Schur decomposition is computed with the schur function: the command [Q,T] = schur(A) returns the real Schur form if A is real and otherwise the complex Schur form. The complex Schur form for a real matrix can be obtained with [Q,T] = schur(A,'complex'). The rsf2csf function converts the real Schur form to the complex Schur form. The= ordschur function takes a Schur decomposition and modifies it so that the eigenvalues appear in a specified order along the diagonal of T.

References

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/index-of-what-is-articles/ and in PDF form from the GitHub repository https://github.com/higham/what-is.