What Is a QR Factorization?

A QR factorization of a rectangular matrix A\in\mathbb{R}^{m\times n} with m\ge n is a factorization A = QR with Q\in\mathbb{R}^{m\times m} orthonormal and R\in\mathbb{R}^{m\times n} upper trapezoidal. The R factor has the form R = \left[\begin{smallmatrix}R_1\\ 0\end{smallmatrix}\right], where R_1 is n\times n and upper triangular. Partitioning Q conformably with R we have

\notag    A = QR     =    \begin{array}[b]{@{\mskip-20mu}c@{\mskip0mu}c@{\mskip-1mu}c@{}}    & \mskip10mu\scriptstyle n & \scriptstyle m-n  \\       \mskip15mu          \begin{array}{r}              \scriptstyle m          \end{array}~    &       \multicolumn{2}{c}{\mskip-15mu          \left[\begin{array}{c@{~}c@{~}}                  Q_1 & Q_2                \end{array}\right]       }    \end{array} \mskip-10mu    \begin{array}[b]{@{\mskip-25mu}c@{\mskip-20mu}c@{}}    \scriptstyle n    \\    \multicolumn{1}{c}{        \left[\begin{array}{@{}c@{}}                  R_1\\                  0              \end{array}\right]}    & \mskip-12mu\          \begin{array}{l}              \scriptstyle n \\              \scriptstyle m-n          \end{array}    \end{array}    = Q_1 R_1.

There are therefore two forms of QR factorization:

  • A = QR is the full QR factorization,
  • A = Q_1R_1 is the reduced (also called economy-sized, or thin) QR factorization.

To prove the existence of a QR factorization note that if A has full rank then A^T\!A is symmetric positive definite. Since A = Q_1R_1 implies A^T\!A = R_1^TQ_1^TQ_1R_1 = R_1^TR_1, we can take R_1 to be the Cholesky factor of A^T\!A and then define Q_1 = AR_1^{-1}. The resulting Q_1 has orthonormal columns because

\notag        Q_1^TQ_1 = R_1^{-T} A^T A R_1^{-1}                 = R_1^{-T} R_1^T R_1 R_1^{-1}                  = I.

Therefore when A has full rank there is a unique reduced QR factorization if we require R_1 to have positive diagonal elements. (Without this requirement we can multiply the ith column of Q and the ith row of R by -1 and obtain another QR factorization.)

When A has full rank the columns of Q_1 span the column space (or range) of A. Indeed Ax = Q_1R_1x = Q_1(R_1x) implies \mathrm{range}(A) \subseteq \mathrm{range}(Q_1) while Q_1x = Q_1R_1\cdot R_1^{-1}x =: Ay implies \mathrm{range}(Q_1) \subseteq \mathrm{range}(A), so \mathrm{range}(Q_1) = \mathrm{range}(A). Furthermore, Q^TA = R gives Q_2^TA = 0, so the columns of Q_2 span the null space of A^T.

The QR factorization provides a way of orthonormalizing the columns of a matrix. An alternative is provided by the polar decomposition A = UH, where U has orthonormal columns and H is positive semidefinite. The orthogonal polar factor U is the closest matrix with orthonormal columns to A in any unitarily invariant norm, but it is more expensive to compute than the Q factor.

There are three standard ways of computing a QR factorization.

Gram–Schmidt orthogonalization computes the reduced factorization. It has the disadvantage that in floating-point arithmetic the computed \widehat{Q} is not guaranteed to be orthonormal to the working precision. The modified Gram–Schmidt method (a variation of the classical method) is better behaved numerically in that \|\widehat{Q}^T\widehat{Q} - I\|_F \le c_1(m,n)\kappa_2(A)u for some constant c_1(m,n), where u is the unit roundoff, so the loss of orthogonality is bounded.

Householder QR factorization and Givens QR factorization both construct Q^T as a product of orthogonal matrices that are chosen to reduce A to upper trapezoidal form. In both methods, at the start of the kth stage we have

\notag   \qquad\qquad\qquad\qquad    A^{(k)} = Q_{k-1}^T A =    \begin{array}[b]{@{\mskip35mu}c@{\mskip20mu}c@{\mskip-5mu}c@{}c}    \scriptstyle k-1 &    \scriptstyle 1 &    \scriptstyle n-k &    \\    \multicolumn{3}{c}{    \left[\begin{array}{c@{\mskip10mu}cc}    R_{k-1} & y_k  & B_k \\        0   & z_k  & C_k    \end{array}\right]}    & \mskip-12mu    \begin{array}{c}    \scriptstyle k-1 \\    \scriptstyle m-k+1    \end{array}    \end{array},   \qquad\qquad\qquad\qquad (*)

where R_{k-1} is upper triangular and Q_{k-1} is a product of Householder transformations or Givens rotations. Working on A^{(k)}(k:m,k:n) we now apply a Householder transformation or n-k Givens rotations in order to zero out the last n-k elements of z_k and thereby take the matrix one step closer to upper trapezoidal form.

Householder QR factorization is the method of choice for general matrices, but Givens QR factorization is preferred for structured matrices with a lot of zeros, such as upper Hessenberg matrices and tridiagonal matrices.

Both these methods produce Q in factored form and if the product is explicitly formed they yield a computed \widehat{Q} that is orthogonal to the working precision, that is, \|\widehat{Q}^T\widehat{Q} - I\|_F \le c_2(m,n)u, for some constant c_2.

Modified Gram–Schmidt, Householder QR, and Givens QR all have the property that there exists an exactly orthogonal Q such that the computed \widehat{R} satisfies

\notag   A + \Delta A = Q \widehat{R}, \quad   \|\Delta A\|_F \le c_3(m,n)u \|A\|_F,

for some constant c_3.

Another way of computing a QR factorization is by the technique in the existence proof above, via Cholesky factorization of A^T\!A. This is known as the Cholesky QR algorithm and it has favorable computational cost when m \gg n. In its basic form, this method is not recommended unless A is extremely well conditioned, because the computed \widehat{Q} is far from orthonormal for ill conditioned matrices. The method can be made competitive with the others either by using extra precision or by iterating the process.

Column Pivoting and Rank-Revealing QR Factorization

In practice, we often want to compute a basis for the range of A when A is rank deficient. The basic QR factorization may not do so. Householder QR factorization with column pivoting reveals rank deficiency by incorporating column interchanges. At the kth stage, before applying a Householder transformation to (*), the column of largest 2-norm of C_k, the jth say, is determined, and if its norm exceeds that of z_K then the kth and (k+j)th columns of A^{(k)} are interchanged. The result is a factorization A\Pi = QR, where \Pi is a permutation matrix and R satisfies the inequalities

\notag    |r_{kk}|^2 \ge \displaystyle\sum_{i=k}^j |r_{ij}|^2,     \quad j=k+1\colon n, \quad k=1\colon n.

In particular,

\notag   \qquad\qquad\qquad\qquad\qquad\qquad   |r_{11}| \ge |r_{22}| \ge \cdots \ge |r_{nn}|.   \qquad\qquad\qquad\qquad\qquad\qquad (\dagger)

If A is rank deficient then R has the form

\notag R = \begin{bmatrix}R_{11} & R_{12} \\ 0 & 0 \end{bmatrix},

with R_{11} nonsingular, and the rank of A is the dimension of R_{11}.

Near rank deficiency of A to tends to be revealed by a small trailing diagonal block of R, but this is not guaranteed. Indeed for the Kahan matrix

\notag        U_n(\theta) = \mathrm{diag}(1,s,\dots,s^{n-1})                   \begin{bmatrix} 1 & -c & -c     & \dots & -c \\                               & 1  & -c     & \dots & -c \\                               &    & \ddots &\ddots & \vdots \\                               &    &        &\ddots & -c \\                               &    &        &       &  1                    \end{bmatrix}

where c =\cos\theta and s = \sin\theta, u_{nn} is of order 2^n times larger than the smallest singular value for small \theta and U_n(\theta) is invariant under QR factorization with column pivoting.

In practice, column pivoting reduces the efficiency of Householder QR factorization because it limits the amount of the computation that can be expressed in terms of matrix multiplication. This has motivated the development of methods that select the pivot columns using randomized projections. These methods gain speed and produce factorizations of similar rank-revealing quality, though they give up the inequalities (\dagger).

It is known that at least one permutation matrix \Pi exists such that the QR factorization of A\Pi is rank-revealing. Computing such a permutation is impractical, but heuristic algorithms for producing an approximate rank-revealing factorization are available.

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Leave a comment