What Is the Complex Step Approximation?

In many situations we need to evaluate the derivative of a function but we do not have an explicit formula for the derivative. The complex step approximation approximates the derivative (and the function value itself) from a single function evaluation. The catch is that it involves complex arithmetic.

For an analytic function f we have the Taylor expansion

\notag     \qquad\qquad\qquad\qquad  f(x + \mathrm{i}h) = f(x) + \mathrm{i}h f'(x) - h^2\displaystyle\frac{f''(x)}{2}                             + O(h^3),    \qquad\qquad\qquad\qquad(*)

where \mathrm{i} = \sqrt{-1} is the imaginary unit. Assume that f maps the real line to the real line and that x and h are real. Then equating real and imaginary parts in (*) gives \mathrm{Re} f(x+\mathrm{i}h) = f(x) + O(h^2) and \mathrm{Im} f(x+\mathrm{i}h) = hf'(x) + O(h^3). This means that for small h, the approximations

\notag     f(x) \approx \mathrm{Re} f(x+\mathrm{i}h), \quad     f'(x) \approx \mathrm{Im}  \displaystyle\frac{f(x+\mathrm{i}h)}{h}

both have error O(h^2). So a single evaluation of f at a complex argument gives, for small h, a good approximation to f'(x), as well as a good approximation to f(x) if we need it.

The usual way to approximate derivatives is with finite differences, for example by the forward difference approximation

\notag       f'(x) \approx \displaystyle\frac{f(x+h) - f(x)}{h}.

This approximation has error O(h) so it is less accurate than the complex step approximation for a given h, but more importantly it is prone to numerical cancellation. For small h, f(x+h) and f(x) agree to many significant digits and so in floating-point arithmetic the difference approximation suffers a loss of significant digits. Consequently, as h decreases the error in the computed approximation eventually starts to increase. As numerical analysis textbooks explain, the optimal choice of h that balances truncation error and rounding errors is approximately

\notag   h_{\mathrm{opt}} = 2\Bigl|\displaystyle\frac{u f(x)}{f''(x))} \Bigr|^{1/2},

where u is the unit roundoff. The optimal error is therefore of order u^{1/2}.

A simple example illustrate these ideas. For the function f(x) = \mathrm{e}^x with x = 1, we plot in the figure below the relative error for the finite difference, in blue, and the relative error for the complex step approximation, in orange, for h ranging from about 10^{-5} to 10^{-11}. The dotted lines show u and u^{1/2}. The computations are in double precision (u \approx 1.1\times 10^{-16}). The finite difference error decreases with h until it reaches about h_{\mathrm{opt}} = 2.1\times 10^{-8}; thereafter the error grows, giving the characteristic V-shaped error curve. The complex step error decreases steadily until it is of order u for h \approx u^{1/2}, and for each h it is about the square of the finite difference error, as expected from the theory.

cstep_ex.jpg

Remarkably, one can take h extremely small in the complex step approximation (e.g., h = 10^{-100}) without any ill effects from roundoff.

The complex step approximation carries out a form of approximate automatic differentiation, with the variable h functioning like a symbolic variable that propagates through the computations in the imaginary parts.

The complex step approximation applies to gradient vectors and it can be extended to matrix functions. If f is analytic and maps real n\times n matrices to real n\times n matrices and A and E are real then (Al-Mohy and Higham, 2010)

\notag     L_f(A,E) \approx \mathrm{Im} \displaystyle\frac{f(A+\mathrm{i}hE)}{h},

where L_f(A,E) is the Fréchet derivative of f at A in the direction E. It is important to note that the method used to evaluate f must not itself use complex arithmetic (as methods based on the Schur decomposition do); if it does, then the interaction of those complex terms with the much smaller \mathrm{i}hE term can lead to damaging subtractive cancellation.

The complex step approximation has also been extended to higher derivatives by using “different imaginary units” in different components (Lantoine et al., 2012).

Here are some applications where the complex step approximation has been used.

  • Sensitivity analysis in engineering applications (Giles et al., 2003).
  • Approximating gradients in deep learning (Goodfellow et al., 2016).
  • Approximating the exponential of an operator in option pricing (Ackerer and Filipović, 2019).

Software has been developed for automatically carrying out the complex step method—for example, by Shampine (2007).

The complex step approximation has been rediscovered many times. The earliest published appearance that we are aware of is in a paper by Squire and Trapp (1998), who acknowledge earlier work of Lyness and Moler on the use of complex variables to approximate derivatives.

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

This entry was posted in what-is. Bookmark the permalink.

2 Responses to What Is the Complex Step Approximation?

  1. Orr Shalit says:

    This is the coolest new thing I learned in blog post in I-don’t-know-how-long.
    Thanks!

  2. George Constantinides says:

    Neat! Brings to mind dual numbers but without nilpotency (https://en.wikipedia.org/wiki/Dual_number).

Leave a Reply to Orr Shalit Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s