What Is Backward Error?

Backward error is a measure of error associated with an approximate solution to a problem. Whereas the forward error is the distance between the approximate and true solutions, the backward error is how much the data must be perturbed to produce the approximate solution.

For a function f from \mathbb{R}^n to \mathbb{R}^n and an approximation y to f(x), the backward error in y is the smallest \Delta x, in some appropriate measure of size, such that y = f(x+\Delta x). There can be many \Delta x satisfying this equation, so the backward error is the solution to a minimization problem. Using a vector norm and measuring perturbations in a relative sense, we can define the backward error in y as

\eta(y) = \min\{\, \epsilon : y = f(x+\Delta x), \; \|\Delta x\| \le \epsilon \|x\| \,\}.

[Figure: solid lines denote exact mappings; the dashed line shows the mapping that was actually computed.]
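
As a quick numerical illustration of the definition (a hedged Python sketch, not part of the original post), consider the scalar function f(x) = \sqrt{x}. Here the perturbed data is uniquely determined: y = f(x + \Delta x) forces x + \Delta x = y^2, so \eta(y) = |y^2 - x|/|x| exactly.

```python
import numpy as np

# Backward error of a computed square root y ~ sqrt(x).
# Since y = sqrt(x + dx) forces x + dx = y^2, the relative
# backward error is |y^2 - x| / |x|. (Forming y**2 incurs one
# more rounding, which is negligible at this level of illustration.)
x = 2.0
y = np.sqrt(x)                    # computed (rounded) square root
eta = abs(y**2 - x) / abs(x)
print(f"eta(y) = {eta:.2e}")      # of the order of the unit roundoff, ~1e-16
```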


The errors in question are usually, but not always, rounding errors; backward error can also be a useful way to characterize truncation errors (for example, in deriving algorithms based on Padé approximation for computing matrix functions).

As an example, for the inner product u^Tv of two vectors the backward error of an approximation w can be defined as

\eta(w) = \min\{\, \epsilon : w = (u + \Delta u)^Tv, \; \|\Delta u\|_2 \le \epsilon \|u\|_2 \,\},

where \|u\|_2 = (u^Tu)^{1/2}. It can be shown that

\eta(w) = \displaystyle\frac{ |w - u^Tv| }{ \|u\|_2 \|v\|_2 }.

The definition of \eta(w) is clearly unsymmetric in that u is perturbed but v is not. If v is perturbed instead of u then the same formula is obtained. If both u and v are perturbed then the constraint in the definition of \eta(w) becomes nonlinear in \Delta u and \Delta v and no explicit formula is available for \eta(w).
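
The formula is easy to try out numerically. The following hedged Python sketch (my illustration, not code from the post) computes an inner product in single precision and uses a double precision result as a proxy for the exact u^Tv; the resulting \eta(w) is of the order of the unit roundoff for single precision, 2^{-24} \approx 6.0\times 10^{-8}.

```python
import numpy as np

# eta(w) = |w - u^T v| / (||u||_2 ||v||_2) for a single precision
# inner product, with a double precision dot product standing in
# for the exact value of u^T v.
rng = np.random.default_rng(42)
n = 10_000
u = rng.standard_normal(n).astype(np.float32)
v = rng.standard_normal(n).astype(np.float32)

w = u @ v                                            # computed in single precision
exact = u.astype(np.float64) @ v.astype(np.float64)  # (near-)exact reference
eta = abs(float(w) - exact) / (np.linalg.norm(u) * np.linalg.norm(v))
print(f"eta(w) = {eta:.2e}")     # compare with 2**-24 ~ 6.0e-8
```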

For some problems a backward error may not exist. An example is computation of the outer product A = uv^T of two n-vectors, which has rank 1. In floating-point arithmetic the computed matrix \widehat{A} is unlikely to be of rank 1, so \widehat{A} = (u + \Delta u)(v + \Delta v)^T is not in general possible for any \Delta u and \Delta v. In this situation one can consider a mixed backward–forward error that perturbs \widehat{A} as well as u and v.
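
This phenomenon is easy to observe. In the following hedged sketch (again my own illustration, not from the post) the outer product is formed in single precision and its singular values are computed in double precision: the second singular value is nonzero, at rounding-error level, so the computed matrix has rank greater than 1.

```python
import numpy as np

# fl(u v^T) rounds each entry u_i * v_j once, so the computed outer
# product is a rank-1 matrix plus entrywise rounding perturbations.
rng = np.random.default_rng(1)
n = 50
u = rng.standard_normal(n).astype(np.float32)
v = rng.standard_normal(n).astype(np.float32)

A_hat = np.outer(u, v)                    # computed in single precision
s = np.linalg.svd(A_hat.astype(np.float64), compute_uv=False)
print(f"sigma_1 = {s[0]:.3e}, sigma_2 = {s[1]:.3e}")
# sigma_2 is tiny but generically nonzero, so A_hat cannot be written
# as (u + du)(v + dv)^T for any perturbations du, dv.
```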

Backward error analysis refers to rounding error analysis in which a bound is obtained for a suitably defined backward error. If the backward error can be shown to be small then y is the solution to a nearby problem. Indeed, if the backward error can be shown to be of the same size as any uncertainties in the data then y is as good a solution as can be expected.

Backward error analysis was developed and popularized by James Wilkinson in the 1950s and 1960s. He first used it in the context of computing zeros of polynomials, but the method’s biggest successes came when he applied it to linear algebra computations.

Backward error analysis has also been used in the context of the numerical solution of differential equations, where it is used in different forms known as defect control and shadowing.

The forward error of y\approx f(x) is bounded in terms of the backward error \eta(y) by

\displaystyle\frac{\|y - f(x)\|}{\|f(x)\|} \le \mathrm{cond}(f,x)\, \eta(y) + O(\eta(y)^2),

in view of the definition of condition number. Consequently, we have the rule of thumb that

\mbox{forward error} \lesssim \mbox{condition number} \times \mbox{backward error}.
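
Here is a hedged illustration of the rule of thumb for the linear system Ax = b (a Python sketch of mine, not code from the post), using the Rigal–Gaches formula \|b - A\widehat{x}\|/(\|A\| \|\widehat{x}\| + \|b\|) for the normwise relative backward error of an approximate solution \widehat{x}, with the ill conditioned Hilbert matrix as test data.

```python
import numpy as np

# Backward error, condition number, and forward error for Ax = b.
n = 10
i = np.arange(n)
A = 1.0 / (i[:, None] + i[None, :] + 1.0)   # Hilbert matrix: very ill conditioned
x_true = np.ones(n)
b = A @ x_true            # b is rounded, so x_true is only a reference solution

x_hat = np.linalg.solve(A, b)
r = b - A @ x_hat
beta = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_hat)
                            + np.linalg.norm(b))     # Rigal-Gaches backward error
fwd = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)

print(f"backward error   = {beta:.2e}")              # ~ unit roundoff
print(f"condition number = {np.linalg.cond(A, 2):.2e}")
print(f"forward error    = {fwd:.2e}")
# forward error <~ condition number * backward error, as the rule of thumb says
```

The backward error comes out at the level of the unit roundoff, so \widehat{x} is the exact solution of a slightly perturbed system, while the forward error is larger by a factor of roughly the condition number.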



This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

One thought on “What Is Backward Error?”

  1. This blog post series is a great idea, thanks Nick! Backward error analysis is one of the most elegant ideas in numerical analysis. Something that helped me to “demystify” this type of analysis was to realize that for some common problems (such as linear systems) the global optimization problem is simply a standard linear least-squares problem (when using the Frobenius norm). For other problems, one can linearize first via Taylor expansion and then solve the linear least-squares problem. This gives a first-order estimate of the backward error and is perhaps not very elegant, but it is probably sufficient for the last two inequalities stated in your blog post.
