Iterative refinement is a method for improving the quality of an approximate solution to a linear system of equations
, where
is an
nonsingular matrix. The basic iterative refinement algorithm is very simple.
- Compute the residual
.
- Solve
.
- Update
.
- Repeat from step 1 if necessary.
At first sight, this algorithm seems as expensive as solving for the original . However, usually the solver is LU factorization with pivoting,
(where we include the permutations in
). Most of the work is in the LU factorization, which costs
flops, and each iteration requires a multiplication with
for the residual and two substitutions to compute
, which total only
flops. If the refinement converges quickly then it is inexpensive.
Turning to the error, with a stable LU factorization the initial computed in floating-point arithmetic of precision
satisfies (omitting constants)
where is the matrix condition number in the
-norm. We would like refinement to produce a solution accurate to precision
:
But if the solver cannot compute the initial accurately when
is ill-conditioned why should it be able to produce an update
that improves
?
The simplest answer is that when iterative refinement was first used on digital computers the residual was computed at twice the working precision, which could be done at no extra cost in the hardware. If
is a reasonable approximation then we expect cancellation in forming
, so using extra precision in forming
ensures that
has enough correct digits to yield a correction
that improves
. This form of iterative refinement produces a solution satisfying (2) as long as
.
Here is a MATLAB example, where the working precision is single and residuals are computed in double precision.
n = 8; A = single(gallery('frank',n)); xact = ones(n,1); b = A*xact; % b is formed exactly for small n. x = A\b; fprintf('Initial_error = %4.1e\n', norm(x - xact,inf)) r = single( double(b) - double(A)*double(x) ); d = A\r; x = x + d; fprintf('Second error = %4.1e\n', norm(x - xact,inf))
The output is
Initial_error = 9.1e-04 Second error = 6.0e-08
which shows that after just one step the error has been brought down from to the level of
, the unit roundoff for IEEE single precision arithmetic.
Fixed Precision Iterative Refinement
By the 1970s, computers had started to lose the ability to cheaply accumulate inner products in extra precision, and extra precision could not be programmed portably in software. It was discovered, though, that even if iterative refinement is run entirely in one precision it can bring benefits when . Specifically,
- if the solver is somewhat numerically unstable the instability is cured by the refinement, in that a relative residual satisfying
is produced, and
- a relative error satisfying
is produced, where
The bound (4) is stronger than (1) because is no larger than
and can be much smaller, especially if
has badly scaled rows.
Low Precision Factorization
In the 2000s processors became available in which 32-bit single precision arithmetic ran at twice the speed of 64-bit double precision arithmetic. A new usage of iterative refinement was developed in which the working precision is double precision and a double precision matrix is factorized in single precision.
- Factorize
in single precision.
- Solve
by substitution in single precision, obtaining
.
- Compute the residual
in double precision.
- Solve
in single precision.
- Update
in double precision.
- Repeat from step 3 if necessary.
Since most of the work is in the single precision factorization, this algorithm is potentially twice as fast as solving entirely in double precision arithmetic. The algorithm achieves the same limiting accuracy (4) and limiting residual (3) provided that
.
GRMES-IR
A way to weaken this restriction on is to use a different solver on step 4: solve
by GMRES (a Krylov subspace iterative method) in double precision. Within GMRES, is not formed but is applied to a vector as a multiplication with
followed by substitutions with
and
. As long as GMRES converges quickly for this preconditioned system the speed again from the fast single precision factorization will not be lost. Moreover, this different step 4 results in convergence for
a couple of orders magnitude larger than before, and if residuals are computed in quadruple precision then a limiting accuracy (2) is achieved.
This GMRES-based iterative refinement becomes particularly advantageous when the fast half precision arithmetic now available in hardware is used within the LU factorization, and one can use three or more precisions in the algorithm in order to balance speed, accuracy, and the range of problems that can be solved.
Finally, we note that iterative refinement can also be applied to least squares problems, eigenvalue problems, and the singular value decomposition. See Higham and Mary (2022) for details and references.
References
We give five references, which contain links to the earlier literature.
- Erin Carson and Nicholas J. Higham. Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions. SIAM J. Sci. Comput., 40(2):A817–A847,2018.
- Azzam Haidar, Harun Bayraktar, Stanimire Tomov, Jack Dongarra, and Nicholas J. Higham. Mixed-Precision Iterative Refinement Using Tensor Cores on GPUs to Accelerate Solution Of linear systems. Proc. Roy. Soc. London A, 476(2243):20200110, 2020.
- Nicholas J. Higham and Theo Mary. Mixed Precision Algorithms in Numerical Linear Algebra. Acta Numerica, 31:347–414, 2022.
- Nicholas J. Higham and Dennis Sherwood, How to Boost Your Creativity, SIAM News, 55(5):1, 3, 2022. (Explains how developments in iterative refinement 1948–2022 correspond to asking “how might this be different” about each aspect of the algorithm.)
- Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo Buttari, and Jack Dongarra. Exploiting the Performance of 32 Bit Floating Point Arithmetic in Obtaining 64 Bit Accuracy (Revisiting Iterative Refinement for Linear Systems). In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, IEEE, November 2006.
Related Blog Posts
- A Multiprecision World (2017)
- Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions (2017)
- What Is an LU Factorization? (2021)
- What Is IEEE Standard Arithmetic? (2020)
This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.