# What Is a QR Factorization?

A QR factorization of a rectangular matrix $A\in\mathbb{R}^{m\times n}$ with $m\ge n$ is a factorization $A = QR$ with $Q\in\mathbb{R}^{m\times m}$ orthogonal and $R\in\mathbb{R}^{m\times n}$ upper trapezoidal. The $R$ factor has the form $R = \left[\begin{smallmatrix}R_1\\ 0\end{smallmatrix}\right]$, where $R_1$ is $n\times n$ and upper triangular. Partitioning $Q$ conformably with $R$ we have

$\notag A = QR = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix} = Q_1 R_1, \qquad Q_1\in\mathbb{R}^{m\times n}, \quad Q_2\in\mathbb{R}^{m\times(m-n)}.$

There are therefore two forms of QR factorization:

• $A = QR$ is the full QR factorization,
• $A = Q_1R_1$ is the reduced (also called economy-sized, or thin) QR factorization.

To prove the existence of a QR factorization note that if $A$ has full rank then $A^T\!A$ is symmetric positive definite. Since $A = Q_1R_1$ implies $A^T\!A = R_1^TQ_1^TQ_1R_1 = R_1^TR_1$, we can take $R_1$ to be the Cholesky factor of $A^T\!A$ and then define $Q_1 = AR_1^{-1}$. The resulting $Q_1$ has orthonormal columns because

$\notag Q_1^TQ_1 = R_1^{-T} A^T A R_1^{-1} = R_1^{-T} R_1^T R_1 R_1^{-1} = I.$

Therefore when $A$ has full rank there is a unique reduced QR factorization if we require $R_1$ to have positive diagonal elements. (Without this requirement we can multiply the $i$th column of $Q$ and the $i$th row of $R$ by $-1$ and obtain another QR factorization.)
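As a concrete sketch of this normalization (assuming NumPy is available), the following computes a reduced QR factorization and flips signs to obtain the unique factor with positive diagonal; LAPACK-based routines such as `numpy.linalg.qr` do not guarantee the sign convention on their own:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))  # full rank with probability 1

# Reduced QR: Q1 is 6x4 with orthonormal columns, R1 is 4x4 upper triangular.
Q1, R1 = np.linalg.qr(A)  # mode="reduced" is the default

# Enforce positive diagonal on R1: flip the sign of the i-th column of Q1
# and the i-th row of R1 wherever r_ii < 0. The product Q1 @ R1 is unchanged.
signs = np.sign(np.diag(R1))
signs[signs == 0] = 1
Q1, R1 = Q1 * signs, signs[:, None] * R1

assert np.allclose(Q1 @ R1, A)
assert np.all(np.diag(R1) > 0)
assert np.allclose(Q1.T @ Q1, np.eye(4))
```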

When $A$ has full rank the columns of $Q_1$ span the column space (or range) of $A$. Indeed $Ax = Q_1R_1x = Q_1(R_1x)$ implies $\mathrm{range}(A) \subseteq \mathrm{range}(Q_1)$, while $Q_1x = Q_1R_1(R_1^{-1}x) = Ay$ with $y = R_1^{-1}x$ implies $\mathrm{range}(Q_1) \subseteq \mathrm{range}(A)$, so $\mathrm{range}(Q_1) = \mathrm{range}(A)$. Furthermore, $Q^TA = R$ gives $Q_2^TA = 0$, so the columns of $Q_2$ span the null space of $A^T$.
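These subspace relations are easy to check numerically from a full QR factorization (a NumPy sketch; `mode="complete"` requests the full factorization):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))

# Full QR: Q is 6x6 orthogonal, R is 6x4 upper trapezoidal.
Q, R = np.linalg.qr(A, mode="complete")
Q1, Q2 = Q[:, :4], Q[:, 4:]

# Columns of Q2 span null(A^T): Q2^T A = 0 up to roundoff.
assert np.allclose(Q2.T @ A, 0)
# Columns of Q1 span range(A): projecting A onto them reproduces A.
assert np.allclose(Q1 @ (Q1.T @ A), A)
```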

The QR factorization provides a way of orthonormalizing the columns of a matrix. An alternative is provided by the polar decomposition $A = UH$, where $U$ has orthonormal columns and $H$ is positive semidefinite. The orthogonal polar factor $U$ is the closest matrix with orthonormal columns to $A$ in any unitarily invariant norm, but it is more expensive to compute than the $Q$ factor.
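One way to obtain the polar factors is via the singular value decomposition: if $A = W\Sigma V^T$ is a thin SVD then $U = WV^T$ and $H = V\Sigma V^T$. This is a standard identity rather than the most efficient computational method, but it makes a compact NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))

# Polar decomposition via the SVD: A = W @ diag(s) @ Vt gives
# U = W @ Vt (orthonormal columns) and H = Vt.T @ diag(s) @ Vt
# (symmetric positive semidefinite), with A = U @ H.
W, s, Vt = np.linalg.svd(A, full_matrices=False)
U = W @ Vt
H = Vt.T @ (s[:, None] * Vt)

assert np.allclose(U @ H, A)
assert np.allclose(U.T @ U, np.eye(4))
assert np.allclose(H, H.T)
```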

There are three standard ways of computing a QR factorization.

Gram–Schmidt orthogonalization computes the reduced factorization. It has the disadvantage that in floating-point arithmetic the computed $\widehat{Q}$ is not guaranteed to be orthonormal to the working precision. The modified Gram–Schmidt method (a variation of the classical method) is better behaved numerically in that $\|\widehat{Q}^T\widehat{Q} - I\|_F \le c_1(m,n)\kappa_2(A)u$ for some constant $c_1(m,n)$, where $u$ is the unit roundoff, so the loss of orthogonality is bounded.
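A minimal implementation of modified Gram–Schmidt (a sketch for full-rank $A$, with no safeguards against rank deficiency):

```python
import numpy as np

def mgs(A):
    """Modified Gram-Schmidt reduced QR of a full-rank m-by-n A (m >= n)."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = A.copy()
    R = np.zeros((n, n))
    for k in range(n):
        R[k, k] = np.linalg.norm(Q[:, k])
        Q[:, k] /= R[k, k]
        # Immediately orthogonalize the remaining columns against q_k;
        # classical Gram-Schmidt instead projects each column against
        # the original (not yet updated) vectors, which is less stable.
        R[k, k+1:] = Q[:, k] @ Q[:, k+1:]
        Q[:, k+1:] -= np.outer(Q[:, k], R[k, k+1:])
    return Q, R

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 5))
Q, R = mgs(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(5))
```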

Householder QR factorization and Givens QR factorization both construct $Q^T$ as a product of orthogonal matrices that are chosen to reduce $A$ to upper trapezoidal form. In both methods, at the start of the $k$th stage we have

$\notag A^{(k)} = Q_{k-1}^T A = \begin{bmatrix} R_{k-1} & y_k & B_k \\ 0 & z_k & C_k \end{bmatrix}, \qquad (*)$

where $R_{k-1}$ is $(k-1)\times(k-1)$ and upper triangular, $z_k\in\mathbb{R}^{m-k+1}$, and $Q_{k-1}$ is a product of Householder transformations or Givens rotations. Working on $A^{(k)}(k\colon m,k\colon n)$ we now apply a Householder transformation or $m-k$ Givens rotations in order to zero out the last $m-k$ elements of $z_k$ and thereby take the matrix one step closer to upper trapezoidal form.

Householder QR factorization is the method of choice for general matrices, but Givens QR factorization is preferred for structured matrices with a lot of zeros, such as upper Hessenberg matrices and tridiagonal matrices.
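An unblocked sketch of Householder QR in NumPy, forming $Q$ explicitly for clarity (production codes keep $Q$ in factored form and use blocked updates):

```python
import numpy as np

def householder_qr(A):
    """Householder QR: returns orthogonal Q (m x m) and upper trapezoidal R."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.eye(m)
    R = A.copy()
    for k in range(n):
        z = R[k:, k]
        v = z.copy()
        # Choose the sign of the shift to avoid cancellation in v[0].
        v[0] += np.copysign(np.linalg.norm(z), z[0])
        v /= np.linalg.norm(v)
        # Apply H_k = I - 2 v v^T to the active block of R, and
        # accumulate Q <- Q H_k so that finally A = Q R.
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, R

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 4))
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(6))
assert np.allclose(np.tril(R, -1), 0)
```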

Both these methods produce $Q$ in factored form and if the product is explicitly formed they yield a computed $\widehat{Q}$ that is orthogonal to the working precision, that is, $\|\widehat{Q}^T\widehat{Q} - I\|_F \le c_2(m,n)u$, for some constant $c_2$.

Modified Gram–Schmidt, Householder QR, and Givens QR all have the property that there exists an exactly orthogonal $Q$ such that the computed $\widehat{R}$ satisfies

$\notag A + \Delta A = Q \widehat{R}, \quad \|\Delta A\|_F \le c_3(m,n)u \|A\|_F,$

for some constant $c_3$.

Another way of computing a QR factorization is by the technique in the existence proof above, via Cholesky factorization of $A^T\!A$. This is known as the Cholesky QR algorithm and it has favorable computational cost when $m \gg n$. In its basic form, this method is not recommended unless $A$ is extremely well conditioned, because the computed $\widehat{Q}$ is far from orthonormal for ill conditioned matrices. The method can be made competitive with the others either by using extra precision or by iterating the process.
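The basic Cholesky QR algorithm is only a few lines (a sketch; as noted above, the orthogonality of the computed $\widehat{Q}$ degrades with $\kappa_2(A)^2$, so this is safe only for well conditioned $A$):

```python
import numpy as np

def cholesky_qr(A):
    """Cholesky QR: R1 is the Cholesky factor of A^T A, and Q1 = A R1^{-1}."""
    R1 = np.linalg.cholesky(A.T @ A).T    # upper triangular, positive diagonal
    Q1 = np.linalg.solve(R1.T, A.T).T     # Q1 = A @ inv(R1) via triangular solve
    return Q1, R1

rng = np.random.default_rng(5)
A = rng.standard_normal((100, 5))  # tall and thin: the favorable regime
Q1, R1 = cholesky_qr(A)
assert np.allclose(Q1 @ R1, A)
assert np.allclose(Q1.T @ Q1, np.eye(5))
```

Iterating the process (applying Cholesky QR again to $Q_1$, known as CholeskyQR2) is one way to restore orthogonality to working precision.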

## Column Pivoting and Rank-Revealing QR Factorization

In practice, we often want to compute a basis for the range of $A$ when $A$ is rank deficient. The basic QR factorization may not do so. Householder QR factorization with column pivoting reveals rank deficiency by incorporating column interchanges. At the $k$th stage, before applying a Householder transformation to $(*)$, the column of largest $2$-norm of $C_k$, the $j$th say, is determined, and if its norm exceeds that of $z_k$ then the $k$th and $(k+j)$th columns of $A^{(k)}$ are interchanged. The result is a factorization $A\Pi = QR$, where $\Pi$ is a permutation matrix and $R$ satisfies the inequalities

$\notag |r_{kk}|^2 \ge \displaystyle\sum_{i=k}^j |r_{ij}|^2, \quad j=k+1\colon n, \quad k=1\colon n.$

In particular,

$\notag \qquad\qquad\qquad\qquad\qquad\qquad |r_{11}| \ge |r_{22}| \ge \cdots \ge |r_{nn}|. \qquad\qquad\qquad\qquad\qquad\qquad (\dagger)$

If $A$ is rank deficient then $R$ has the form

$\notag R = \begin{bmatrix}R_{11} & R_{12} \\ 0 & 0 \end{bmatrix},$

with $R_{11}$ nonsingular and upper triangular, and the rank of $A$ is the order of $R_{11}$.
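This rank-revealing behavior can be seen with SciPy, whose `scipy.linalg.qr` supports column pivoting (a sketch on an exactly rank-deficient matrix):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(6)
# An 8x6 matrix of rank 3, built as a product of random factors.
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 6))

# Pivoted QR: A[:, piv] = Q @ R with decreasing |r_kk|.
Q, R, piv = qr(A, pivoting=True)
assert np.allclose(A[:, piv], Q @ R)

# The trailing rows of R vanish (to roundoff), revealing rank 3.
assert np.allclose(R[3:, :], 0)
assert np.min(np.abs(np.diag(R)[:3])) > 1e-8
```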

Near rank deficiency of $A$ tends to be revealed by a small trailing diagonal block of $R$, but this is not guaranteed. Indeed for the Kahan matrix

$\notag U_n(\theta) = \mathrm{diag}(1,s,\dots,s^{n-1}) \begin{bmatrix} 1 & -c & -c & \dots & -c \\ & 1 & -c & \dots & -c \\ & & \ddots &\ddots & \vdots \\ & & &\ddots & -c \\ & & & & 1 \end{bmatrix}$

where $c =\cos\theta$ and $s = \sin\theta$, the element $u_{nn}$ is larger than the smallest singular value by a factor of order $2^n$ for small $\theta$, and $U_n(\theta)$ is invariant under QR factorization with column pivoting.
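A small NumPy experiment illustrates the failure mode: the Kahan matrix is already upper triangular (so it is its own $R$ factor in exact arithmetic), yet its last diagonal element vastly overestimates the smallest singular value. The factor-of-$10^3$ threshold below is a conservative check, not a sharp bound:

```python
import numpy as np

def kahan(n, theta):
    """Kahan's matrix U_n(theta): upper triangular with graded rows."""
    c, s = np.cos(theta), np.sin(theta)
    U = np.triu(-c * np.ones((n, n)), 1) + np.eye(n)  # 1 on diag, -c above
    return np.diag(s ** np.arange(n)) @ U

n = 30
U = kahan(n, theta=np.pi / 6)
sigma_min = np.linalg.svd(U, compute_uv=False)[-1]

# u_nn = s^(n-1) is orders of magnitude larger than sigma_min, so the
# trailing diagonal of R gives no hint of the near rank deficiency.
assert abs(U[-1, -1]) / sigma_min > 1e3
```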

In practice, column pivoting reduces the efficiency of Householder QR factorization because it limits the amount of the computation that can be expressed in terms of matrix multiplication. This has motivated the development of methods that select the pivot columns using randomized projections. These methods gain speed and produce factorizations of similar rank-revealing quality, though they give up the inequalities $(\dagger)$.

It is known that at least one permutation matrix $\Pi$ exists such that the QR factorization of $A\Pi$ is rank-revealing. Computing such a permutation is impractical, but heuristic algorithms for producing an approximate rank-revealing factorization are available.
