Newton methods for minimizing a function generate a sequence of points , where the step from to is along a search direction determined from a linear system , where is the gradient and is an approximation to the Hessian matrix . The equation shows that is a descent direction if , and in order to guarantee that this condition holds for all we need to be positive definite. But even if is the exact Hessian, positive definiteness is not guaranteed far from a minimizer. We can modify the method to ensure definiteness of , as with quasi-Newton methods. Or we can perturb the matrix, if necessary, to make it positive definite. Modified Cholesky factorization perturbs and factorizes the matrix at the same time. It is useful in other situations, too, such as in constructing preconditioners and in bounding the distance to a positive semidefinite matrix.
A modified Cholesky factorization of a symmetric matrix is a factorization , where is a permutation matrix, is unit lower triangular, and is diagonal or block diagonal and positive definite. It follows that is a positive definite matrix.
A natural way to compute a modified Cholesky factorization is to modify the Cholesky factorization algorithm. Cholesky factorization fails when it tries to take the square root of a negative quantity or divide by zero. We can avoid both possibilities by increasing nonpositive pivots when they are encountered. This corresponds to making a diagonal perturbation to and computing a Cholesky factorization . However, choosing a suitable is more difficult than it might seem.
Consider the matrix
Since Cholesky factorization generates the same sequence of Schur complements as Gaussian elimination, it suffices to consider Gaussian elimination. The diagonal elements of are the square roots of the pivots. After one step of elimination the reduced matrix is
and the trailing matrix (a Schur complement) is clearly indefinite because the entry, which is the next pivot, is negative. We can make the (2,2) entry positive by adding to it:
The next stage of the factorization can complete and it yields
The trailing submatrix has elements of order . Not only will a perturbation of order be required to the element to allow the Cholesky factorization to continue, but the Cholesky factor will have elements of order so numerical stability will likely be lost. Yet the smallest eigenvalue of is of order , so it should have been possible to make only an perturbation to in order for the factorization to succeed.
This example shows that if we are to increase a pivot element then we need a more sophisticated strategy that takes account of the size of the resulting elements of the factors and the effect on later stages of the factorization.
A modified Cholesky factorization should satisfy, as far as possible, four objectives.
- If is “sufficiently positive definite” then is zero.
- If is indefinite, is not much larger than
for some appropriate norm.
- The matrix is reasonably well conditioned.
- The cost of the algorithm is the same as the cost of standard Cholesky factorization, that is, flops for an matrix.
Gill and Murray (1974) gave the first modified Cholesky algorithm, which computes with diagonal and . It was refined by Gill, Murray, and Wright in 1981. Schnabel and Eskow (1990) gave an algorithm that attempts to produce smaller values of , partly by exploiting eigenvalue bounds obtained from Gershgorin’s theorem. That algorithm was subsequently improved by Schnabel and Eskow (1999).
A different approach was taken by Cheng and Higham (1998), building on an earlier idea by Bunch and Sorensen. This approach computes a block factorization , were is a permutation matrix, is unit lower triangular, and is block diagonal with diagonal blocks of size or . The pivoting strategy is the symmetric rook pivoting strategy of Ashcraft, Grimes, and Lewis (1998), which has the key property of producing a bounded factor. The cost of pivoting is typically comparisons but can be as large as in the worst case. Cheng and Higham compute the perturbation of minimal Frobenius norm such that has eigenvalues greater than or equal to , where is a parameter. The modified Cholesky factorization is then .
A significant advantage of the block approach is that it is modular: any implementation of the factorization can be used and the modification is simply inexpensive postprocessing of the factor. The other algorithms are not simple modifications of an factorization and are not straightforward to implement in an efficient way as a block algorithm. Note that in the block approach is a full matrix and it is not explicitly computed.
Modified Cholesky software is not widely available in libraries. Implementations of the Cheng–Higham algorithm are available in
- the NAG Library routine f01mdf (
- the MATLAB codes in the repository https://github.com/higham/modified-cholesky.
We take the matrix above with :
It has eigenvalues
-1.0050e+00 -2.3744e-01 1.0000e+00 4.2325e+00
The Gill–Murray–Wright algorithm computes as the diagonal matrix with diagonal elements
0 2.0200e+00 2.0000e+00 0
while the Schnabel–Eskow algorithm (1999) computes with diagonal elements
1.0000e+00 1.0050e+00 1.0050e+00 1.0000e+00
For the Cheng–Higham algorithm with (where is the unit roundoff), the perturbed matrix is
1.0000e+00 1.0000e+00 1.0000e+00 0 1.0000e+00 1.4950e+00 1.4975e+00 9.9749e-01 1.0000e+00 1.4975e+00 1.5000e+00 1.0025e+00 0 9.9749e-01 1.0025e+00 2.0100e+00
The Frobenius norms of the perturbations to are , , and , respectively, and the 2-norm condition numbers are , , and . The large condition number for the Cheng–Higham algorithm is caused by the value of the parameter . With , the perturbed matrix is
1.0000e+00 1.0000e+00 1.0000e+00 0 1.0000e+00 1.5453e+00 1.4475e+00 9.9724e-01 1.0000e+00 1.4475e+00 1.5497e+00 1.0027e+00 0 9.9724e-01 1.0027e+00 2.1100e+00
at Frobenius norm distance from and with -norm condition number . For comparison, the symmetric matrix with all eigenvalues greater than or equal to that is closest to in the Frobenius norm is at a distance from .
In general, there is no clear ordering of the different modified Cholesky methods in terms of their ability to satisfy the four criteria.
This is a minimal set of references, which contain further useful references within.
- Sheung Hun Cheng and Nicholas Higham, A Modified Cholesky Algorithm Based on a Symmetric Indefinite Factorization, SIAM J. Matrix Anal. Appl. 19(4), 1097–1110, 1998.
- Haw-Ren Fang and Dianne O’Leary, Modified Cholesky Algorithms: A Catalog with New Approaches, Math. Program. 115, 319–349, 2008
- Philip Gill, Walter Murray, and Margaret Wright, Practical Optimization, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2020. Republication of book first published by Academic Press, 1981.
- Thomas McSweeney, Modified Cholesky Decomposition and Applications, M.Sc. Thesis, The University of Manchester, 2017.
- Robert Schnabel and Elizabeth Eskow, A Revised Modified Cholesky Factorization Algorithm, SIAM J. Optim. 9(4), 1135–1148, 1999.
Related Blog Posts
This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.
2 thoughts on “What Is a Modified Cholesky Factorization?”
When you give the cost of the Cholesky factorization, do you mean o(n^3) (little-o), rather than O(n^3) (big-O) ?
That was indeed a typo: n^3/3 + O(n^3) now changed to n^/3 + O(n^2).