What Is a Cholesky Factorization?

The Cholesky factorization of a symmetric positive definite matrix $A$ is the factorization $A = R^T\!R$, where $R$ is upper triangular with positive diagonal elements. It is a generalization of the property that a positive real number has a unique positive square root. The Cholesky factorization always exists and the requirement that the diagonal of $R$ be positive ensures that it is unique.

As an example, the Cholesky factorization of the $4\times 4$ matrix with $(i,j)$ element $\gcd(i,j)$ (gallery('gcdmat',4) in MATLAB) is

$G_4 = \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & 2 & 1 & 2\\ 1 & 1 & 3 & 1\\ 1 & 2 & 1 & 4 \end{array}\right] = \left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & 0 & \sqrt{2} & 0\\ 1 & 1 & 0 & \sqrt{2} \end{array}\right] \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 0 & 1 & 0 & 1\\ 0 & 0 & \sqrt{2} & 0\\ 0 & 0 & 0 & \sqrt{2} \end{array}\right].$
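The factorization above can be checked numerically. The following is a minimal sketch, assuming a MATLAB session in which gallery and chol are available; the variable name res is chosen here purely for illustration.

```matlab
A = gallery('gcdmat',4);      % 4-by-4 matrix with (i,j) element gcd(i,j)
R = chol(A);                  % upper triangular factor with A = R'*R
res = norm(R'*R - A, 1)       % residual, expected to be of the order of eps
d = diag(R)'                  % diagonal of R, expected to be positive
```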

The Cholesky factorization of an $n\times n$ matrix contains $n-1$ other Cholesky factorizations within it: $A_k = R_k^TR_k$, $k=1:n-1$, where $A_k = A(1\colon k,1\colon k)$ is the leading principal submatrix of order $k$ and $R_k = R(1\colon k,1\colon k)$. For example, for $G_4$ with $k = 2$, $\bigl[\begin{smallmatrix}1 & 1\\ 1& 2\end{smallmatrix}\bigr]= \bigl[\begin{smallmatrix}1 & 0\\ 1& 1\end{smallmatrix}\bigr] \bigl[\begin{smallmatrix}1 & 1\\ 0& 1\end{smallmatrix}\bigr]$.
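The nesting property can be observed directly. This is a small sketch under the assumption that chol is applied separately to each leading principal submatrix; the names Rk and diff_k are illustrative only.

```matlab
A = gallery('gcdmat',4);
R = chol(A);
n = size(A,1);
for k = 1:n-1
    Rk = chol(A(1:k,1:k));                 % factor of the leading submatrix A_k
    diff_k = norm(Rk - R(1:k,1:k), 1);     % expected to be zero or of order eps
    fprintf('k = %d, difference = %9.2e\n', k, diff_k)
end
```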

Inverting the Cholesky equation gives $A^{-1} = R^{-1} R^{-T}$, which implies the interesting relation that the $(n,n)$ element of $A^{-1}$ is $r_{nn}^{-2}$. So $G_4^{-1}$ has $(4,4)$ element $1/2$. We also have $\det(A) = \det(R)^2 = (r_{11}\dots r_{nn})^2$, so $\det(G_4) = 4$ for this matrix.
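The relation for the $(n,n)$ element can be spelled out in a few lines. In this short derivation $e_n$ denotes the last column of the identity matrix (notation introduced here for illustration):

$\begin{aligned} (A^{-1})_{nn} &= e_n^T R^{-1} R^{-T} e_n = \|R^{-T}e_n\|_2^2, \\ R^{-T} e_n &= r_{nn}^{-1} e_n \quad \mbox{(forward substitution with the lower triangular matrix $R^T$)}, \\ (A^{-1})_{nn} &= r_{nn}^{-2}. \end{aligned}$

For $G_4$ we have $r_{44} = \sqrt{2}$, which gives the value $1/2$ quoted above.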

The Cholesky factorization is named after André-Louis Cholesky (1875–1918), a French military officer involved in geodesy and surveying in Crete and North Africa, who developed it for solving the normal equations arising in least squares problems.

The existence and uniqueness of the factorization can be proved by induction on $n$, and the proof also serves as a method for computing it. Partition $A\in\mathbb{R}^{n\times n}$ as

$A = \begin{bmatrix} A_{n-1} & c \\ c^T & \alpha \end{bmatrix}.$

We know that $A_{n-1}$ is positive definite (any principal submatrix of a positive definite matrix is easily shown to be positive definite). Assume that $A_{n-1}$ has a unique Cholesky factorization $A_{n-1} = R_{n-1}^TR_{n-1}$ and define the upper triangular matrix

$R = \begin{bmatrix} R_{n-1} & r \\ 0 & \beta \end{bmatrix}.$

Then

$R^T\!R = \begin{bmatrix} R_{n-1}^T R_{n-1} & R_{n-1}^T r \\ r^T R_{n-1} & r^Tr + \beta^2 \end{bmatrix},$

which equals $A$ if and only if

$\begin{aligned} R_{n-1}^T r &= c, \\ r^Tr + \beta^2 &= \alpha. \end{aligned}$

The first equation has a unique solution since $R_{n-1}$ is nonsingular. Then the second equation gives $\beta^2 = \alpha - r^Tr$. It remains to check that there is a unique real, positive $\beta$ satisfying this equation. From the inequality

$0 < \det(A) = \det(R^T) \det(R) = \det(R_{n-1})^2 \beta^2$

we see that $\beta^2>0$, hence there is a unique $\beta>0$. This completes the inductive step.
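The inductive step translates almost line by line into a bordered form of the computation. The following is a sketch only, not the method used by any particular library; the names c, r, and beta2 mirror the symbols in the proof, and the error message is an assumption of this illustration.

```matlab
A = gallery('gcdmat',4);
n = size(A,1);
R = zeros(n);
R(1,1) = sqrt(A(1,1));                   % base case n = 1
for k = 2:n
    c = A(1:k-1,k);                      % column above the diagonal
    r = R(1:k-1,1:k-1)' \ c;             % solve R_{k-1}'*r = c
    beta2 = A(k,k) - r'*r;               % alpha - r'*r
    if beta2 <= 0
        error('Matrix is not positive definite (stage %d).', k)
    end
    R(1:k-1,k) = r;
    R(k,k) = sqrt(beta2);                % the unique positive beta
end
res = norm(R'*R - A, 1)                  % expected to be of the order of eps
```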

The quantity $\alpha - r^Tr = \alpha - c^T A_{n-1}^{-1}c$ is the Schur complement of $A_{n-1}$ in $A$. The essential reason why Cholesky factorization works is that the Schur complements of a positive definite matrix are themselves positive definite. Indeed we have the congruence

$\begin{bmatrix} I & 0 \\ -c^T A_{n-1}^{-1} & 1 \end{bmatrix} \begin{bmatrix} A_{n-1} & c \\ c^T & \alpha \end{bmatrix} \begin{bmatrix} I & -A_{n-1}^{-1}c\\ 0 & 1 \end{bmatrix} = \begin{bmatrix} A_{n-1} & 0 \\ 0 & \alpha - c^T A_{n-1}^{-1} c \end{bmatrix},$

and since congruences preserve definiteness it follows that $\alpha - c^T A_{n-1}^{-1} c > 0$.
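As a numerical illustration, the Schur complement can be compared with the square of the last diagonal element of $R$. This sketch assumes the $G_4$ example from earlier; the names schur and beta2 are illustrative. In exact arithmetic both quantities equal 2 for this matrix, since $r_{44} = \sqrt{2}$.

```matlab
A = gallery('gcdmat',4);
n = size(A,1);
R = chol(A);
c = A(1:n-1,n);
alpha = A(n,n);
schur = alpha - c'*(A(1:n-1,1:n-1)\c)    % Schur complement of A_{n-1} in A
beta2 = R(n,n)^2                         % expected to agree with schur up to rounding
```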

In textbooks it is common to see element-level equations for computing the Cholesky factorization, which come from directly solving the matrix equation $A = R^T\!R$. If we equate elements on both sides, taking them a column at a time, starting with $a_{11} = r_{11}^2$, then the following algorithm is obtained:

$\begin{array}{l} 1~ \mbox{for}~j=1:n \\ 2~ \quad \mbox{for}~ i=1:j-1 \\ 3~ \qquad r_{ij} = \bigl(a_{ij} - \sum_{k=1}^{i-1} r_{ki}r_{kj} \bigr) /r_{ii} \\ 4~ \quad \mbox{end} \\ 5~ \quad r_{jj} = \bigl(a_{jj} - \sum_{k=1}^{j-1} r_{kj}^2\bigr)^{1/2} \\ 6~ \mbox{end} \end{array}$
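A direct transcription of the algorithm above into MATLAB might look as follows. This is a sketch for illustration rather than an efficient implementation; the positivity check and its error message are additions assumed here, and the variable s is a name introduced for the quantity under the square root.

```matlab
A = gallery('gcdmat',4);
n = size(A,1);
R = zeros(n);
for j = 1:n
    for i = 1:j-1
        % Off-diagonal element r_ij; the sum is empty (zero) when i = 1.
        R(i,j) = (A(i,j) - sum(R(1:i-1,i).*R(1:i-1,j))) / R(i,i);
    end
    s = A(j,j) - sum(R(1:j-1,j).^2);     % quantity under the square root
    if s <= 0
        error('Factorization failed at column %d.', j)
    end
    R(j,j) = sqrt(s);
end
res = norm(R'*R - A, 1)                  % expected to be of the order of eps
```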

What happens if this algorithm is executed on a general (indefinite) symmetric matrix, that is, one that has both positive and negative eigenvalues? Line 5 will either attempt to take the square root of a negative number for some $j$, or it will produce $r_{jj} = 0$, and on the next iteration of the loop we will have a division by zero. In floating-point arithmetic it is possible for the algorithm to fail for a positive definite matrix, but only if it is numerically singular: the algorithm is guaranteed to run to completion if the condition number $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$ is safely less than $u^{-1}$, where $u$ is the unit roundoff.
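Both failure modes can be seen on small examples. The two indefinite matrices $M_1$ and $M_2$ below are chosen here purely for illustration; applying the algorithm to them by hand gives

$\begin{aligned} M_1 = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}: &\quad r_{11} = 1, \quad r_{12} = 2, \quad r_{22} = (1 - 2^2)^{1/2} = (-3)^{1/2}, \\ M_2 = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}: &\quad r_{11} = 1, \quad r_{12} = 1, \quad r_{22} = (1 - 1^2)^{1/2} = 0, \quad r_{13} = 0, \quad r_{23} = (1 - r_{12}r_{13})/r_{22} = 1/0. \end{aligned}$

For $M_1$ the square root of a negative number is attempted at $j = 2$; for $M_2$ a zero diagonal element is produced at $j = 2$ and the division by zero occurs at $j = 3$.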

The MATLAB function chol normally returns an error message if the factorization fails. But a second output argument can be requested, which is set to the number of the stage on which the factorization failed, or to zero if the factorization succeeded. In the case of failure, the partially computed $R$ factor, returned in the first argument, can be used to compute a direction of negative curvature for $A$, which is a vector $z$ such that $z^T\!Az < 0$. Here is an example. The code

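n = 8;
A = gallery('lehmer',n) - 0.3*eye(n); % Indefinite matrix.
[R,p] = chol(A) % p > 0: failure at stage p; R'*R = A(1:p-1,1:p-1).
% Direction of negative curvature built from the partial factor R.
z = [-R\(R'\A(1:p-1,p)); 1; zeros(n-p,1)];
neg_curve = z'*A*z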

produces the output

R =
   8.3666e-01   5.9761e-01   3.9841e-01
            0   5.8554e-01   7.3193e-01
            0            0   7.4536e-02
p =
     4
neg_curve =
  -9.1437e+00
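
The reason this $z$ works follows from the Schur complement discussed earlier. In the short derivation below, $A_{p-1} = A(1\colon p-1,1\colon p-1) = R^T\!R$ is the block that was factorized successfully, $c = A(1\colon p-1,p)$, $\alpha = a_{pp}$, and $w = A_{p-1}^{-1}c$ (the symbol $w$ is introduced here for illustration), so that $z = [-w^T, 1, 0]^T$:

$\begin{aligned} z^T\!Az &= w^T A_{p-1} w - 2c^Tw + \alpha \\ &= c^TA_{p-1}^{-1}c - 2c^TA_{p-1}^{-1}c + \alpha \\ &= \alpha - c^TA_{p-1}^{-1}c. \end{aligned}$

This is the Schur complement that stage $p$ requires to be positive, so it is nonpositive when the factorization fails at stage $p$; in the example it is the value reported as neg_curve. The code forms $w$ as R\(R'\A(1:p-1,p)) because $A_{p-1}^{-1} = R^{-1}R^{-T}$.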

Cholesky factorization has excellent numerical stability. The computed factor $\widehat{R}$ satisfies

$A + \Delta A = \widehat{R}^T\widehat{R}, \quad \|\Delta A\|_2 \le c_nu\|A\|_2,$

where $c_n$ is a constant. Unlike for LU factorization there is no possibility of element growth; indeed for $i\le j$,

$r_{ij}^2 \le \displaystyle \sum_{k=1}^j r_{kj}^2 = a_{jj},$

so the elements of $R$ are nicely bounded relative to those of $A$.
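Both properties can be examined on a test matrix. The sketch below assumes the symmetric positive definite Lehmer matrix from gallery as the example; the names rel_res and col_sums are illustrative.

```matlab
A = gallery('lehmer',8);                 % symmetric positive definite test matrix
R = chol(A);
rel_res = norm(R'*R - A)/norm(A)         % expected to be a modest multiple of eps
col_sums = sum(R.^2,1);                  % sum over k of r_kj^2 for each column j
gap = max(abs(col_sums - diag(A)'))      % column sums should match diag(A) up to rounding
```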

Once we have a Cholesky factorization we can use it to solve a linear system $Ax = b$ by solving the lower triangular system $R^Ty = b$ and then the upper triangular system $Rx = y$. The computed solution $\widehat{x}$ can be shown to satisfy

$(A+\Delta A)\widehat{x} = b, \quad \|\Delta A\|_2 \le d_nu\|A\|_2$

where $d_n$ is another constant. Thus $\widehat{x}$ has a small backward error.
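The two triangular solves can be sketched as follows. The right-hand side b and the test matrix are assumptions made for this illustration, and the final line evaluates the normwise relative residual $\|b - A\widehat{x}\|_2/(\|A\|_2\|\widehat{x}\|_2)$ as a measure of the backward error.

```matlab
A = gallery('lehmer',8);                 % symmetric positive definite test matrix
b = ones(8,1);                           % illustrative right-hand side
R = chol(A);
y = R'\b;                                % lower triangular solve R'*y = b
x = R\y;                                 % upper triangular solve R*x = y
rel_res = norm(b - A*x)/(norm(A)*norm(x))   % expected to be of the order of eps
```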

Finally, what if $A$ is only symmetric positive semidefinite, that is, $x^T\!Ax \ge 0$ for all $x$, so that $A$ is possibly singular? There always exists a Cholesky factorization (we can take the $R$ factor in the QR factorization of the positive semidefinite square root of $A$) but it may not be unique. For example, we have

$A = \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 1\\ 1 & 1 & 2 & 2\\ 1 & 1 & 2 & 4\\ \end{array}\right] = \left[\begin{array}{cccc} 1 & 0 & 0 & 0\\ 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ 1 & 1 & x & y\\ \end{array}\right] \left[\begin{array}{cccc} 1 & 1 & 1 & 1\\ 0 & 0 & 1 & 1\\ 0 & 0 & 0 & x\\ 0 & 0 & 0 & y\\ \end{array}\right] = R^T\!R$

for any $x$ and $y$ such that $x^2 + y^2 = 2$. Note that $A$ has rank 3 but $R$ has two zero diagonal elements. When $A$ is positive semidefinite of rank $r$, what is usually wanted is a Cholesky factorization in which $R$ is zero in its last $n-r$ rows. Such a factorization can be obtained by using complete pivoting, which at each stage permutes the largest remaining diagonal element into the pivot position. This gives a factorization

$\Pi^T\mskip-5mu A\Pi = R^T\!R, \quad R = \begin{bmatrix} R_{11} & R_{12} \\ 0 & 0 \end{bmatrix}, \quad R_{11}\in\mathbb{R}^{r \times r},$

where $\Pi$ is a permutation matrix. For example, with $\Pi$ the identity matrix with its columns in reverse order,

$\Pi^T\mskip-5mu A\Pi = \left[\begin{array}{cccc} 4 & 2 & 1 & 1\\ 2 & 2 & 1 & 1\\ 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 1 \end{array}\right] = \left[\begin{array}{cccc} 2 & 0 & 0 & 0\\ 1 & 1 & 0 & 0\\ \frac{1}{2} & \frac{1}{2} & \frac{\sqrt{2}}{2} & 0\\[2pt] \frac{1}{2} & \frac{1}{2} & \frac{\sqrt{2}}{2} & 0 \end{array}\right] \left[\begin{array}{cccc} 2 & 1 & \frac{1}{2} & \frac{1}{2}\\[2pt] 0 & 1 & \frac{1}{2} & \frac{1}{2}\\[2pt] 0 & 0 & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\\ 0 & 0 & 0 & 0 \end{array}\right],$

which clearly displays the rank of $A$.
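
Complete pivoting can be sketched in a few lines of MATLAB. This is an illustrative outer-product implementation written for this example, not a built-in MATLAB function; the names S, piv, tol, and rank_est are hypothetical, and the stopping tolerance tol is an assumption rather than a recommended choice.

```matlab
A = [1 1 1 1; 1 1 1 1; 1 1 2 2; 1 1 2 4];
n = size(A,1);
R = zeros(n);
S = A;                          % working copy, overwritten by Schur complements
piv = 1:n;                      % permutation vector: A(piv,piv) = R'*R
tol = n*eps(max(diag(A)));      % assumed threshold for a zero pivot
rank_est = 0;
for k = 1:n
    [pmax, q] = max(diag(S(k:n,k:n)));   % largest remaining diagonal element
    if pmax <= tol, break, end
    q = q + k - 1;
    % Symmetric interchange of rows and columns k and q.
    S([k q],:) = S([q k],:);  S(:,[k q]) = S(:,[q k]);
    R(:,[k q]) = R(:,[q k]);
    piv([k q]) = piv([q k]);
    % Compute row k of R and update the Schur complement.
    R(k,k) = sqrt(S(k,k));
    R(k,k+1:n) = S(k,k+1:n)/R(k,k);
    S(k+1:n,k+1:n) = S(k+1:n,k+1:n) - R(k,k+1:n)'*R(k,k+1:n);
    rank_est = k;
end
rank_est                               % number of nonzero rows of R
res = norm(R'*R - A(piv,piv), 1)       % expected to be of the order of eps
```

Here A(piv,piv) plays the role of $\Pi^T\mskip-5mu A\Pi$. For the matrix above, working through the steps by hand gives piv = [4 3 2 1] and the $R$ factor displayed in the example, with its last row zero.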

References

This is a minimal set of references, which contain further useful references within.

Claude Brezinski, The Life and Work of André Cholesky, Numer. Algorithms 43, 279–288, 2006.
Nicholas J. Higham, Cholesky factorization, WIREs Comp. Stat. 1(2), 251–254, 2009.
Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002. Chapter 10.

One thought on “What Is a Cholesky Factorization?”

Zhibin says:

May 19, 2023 at 8:47 am

Thank you for your blog and this article; it has been a great help to me.
This post says “the algorithm is guaranteed to run to completion if the condition number $\kappa_2(A)$ is safely less than $u^{-1}$”. But you have used Wilkinson’s bound in other articles and books, namely $20n^{3/2}\kappa_2(A)u < 1$. Did I miss something?
