Jack Williams (1943–2015)

williams-jack2.jpg

Jack Williams passed away on November 13th, 2015, at the age of 72.

Jack obtained his PhD from the University of Oxford Computing Laboratory in 1968 and spent two years as a Lecturer in Mathematics at the University of Western Australia in Perth. He was appointed Lecturer in Numerical Analysis at the University of Manchester in 1971.

He was a member of the Numerical Analysis Group (along with Christopher Baker, Ian Gladwell, Len Freeman, George Hall, Will McLewin, and Joan Walsh) that, together with numerical analysis colleagues at UMIST, took the subject forward at Manchester from the 1970s onwards.

Jack’s main research area was approximation theory, focusing particularly on Chebyshev approximation of real and complex functions. He also worked on stiff ordinary differential equations (ODEs). His early work on Chebyshev approximation in the complex plane by polynomials and rationals was particularly influential and is among his most-cited. Example contributions are

J. Williams (1972). Numerical Chebyshev approximation by interpolating rationals. Math. Comp., 26(117), 199–206.

S. Ellacott and J. Williams (1976). Rational Chebyshev approximation in the complex plane. SIAM J. Numer. Anal., 13(3), 310–323.

His later work on discrete Chebyshev approximation was of particular interest to me as it involved linear systems with Chebyshev-Vandermonde coefficient matrices, which I, and a number of other people, worked on a few years later:

M. Almacany, C. B. Dunham and J. Williams (1984). Discrete Chebyshev approximation by interpolating rationals. IMA J. Numer. Anal. 4, 467–477.

On the differential equations side, Jack wrote the opening chapter “Introduction to discrete variable methods” of the proceedings of a summer school organized jointly by the University of Liverpool and the University of Manchester in 1975 and published in G. Hall and J. M. Watt, eds, Modern Numerical Methods for Ordinary Differential Equations, Oxford University Press, 1976. This book’s timely account of the state of the art, covering stiff and nonstiff problems, boundary value problems, delay-differential equations, and integral equations, was very influential, as indicated by its 549 citations on Google Scholar. Jack contributed articles on ODEs and PDEs to three later Liverpool–Manchester volumes (1979, 1981, 1986).

Jack’s interests in approximation theory and differential equations were combined in his later work on parameter estimation in ODEs, where a theory of Chebyshev approximation applied to solutions of parameter-dependent ODEs was established, as exemplified by

J. Williams and Z. Kalogiratou (1993). Least squares and Chebyshev fitting for parameter estimation in ODEs. Adv. Comp. Math., 1(3), 357–366.

More details on Jack’s publications can be found at his MathSciNet author profile (subscription required). Some of his later unpublished technical reports from the 1990s can be accessed from the list of Numerical Analysis Reports of the Manchester Centre for Computational Mathematics.

Jack spent a sabbatical year in the Department of Computer Science at the University of Toronto, 1976–1977, at the invitation of Professor Tom Hull. Over a number of years several visits between Manchester and Toronto were made in both directions by numerical analysts in the two departments.

It’s a fact of academic life that seminars can be boring and even impenetrable. Jack could always be relied on to ask insightful questions, whatever the topic, thereby improving the experience of everyone in the room.

Jack was an excellent lecturer, who taught at all levels from first year undergraduate through to Masters courses. He was confident, polished, and entertaining, and always took care to emphasize practicalities along with the theory. He had the charisma—and the loud voice!—to keep the attention of any audience, no matter how large it might be.

He studied Spanish at the Instituto Cervantes in Manchester, gaining an A-level in 1989 and a Diploma Básico de Español como Lengua Extranjera from the Spanish Ministerio de Educación y Ciencia in 1992. He subsequently set up a four-year degree in Mathematics with Spanish, linking Manchester with Universidad Complutense de Madrid.

Jack was promoted to Senior Lecturer in 1996 and took early retirement in 2000. He continued teaching in the department right up until the end of the 2014/2015 academic year.

I benefited greatly from Jack’s advice and support both as a postgraduate student and when I began as a lecturer. My office was next to his, and from time to time I would hear strains of classical guitar, which he studied seriously and sometimes practiced during the day. For many years I shared pots of tea with him in the Senior Common Room at the refectory, where a group of mathematics colleagues met for lunchtime discussions.

Jack was gregarious, ever cheerful, and a good friend to many of his colleagues. He will be sadly missed.

890700.jpg
George Hall, Jack Williams and Des Higham at the Conference on Computational Ordinary Differential Equations, London, England, 1989.

Faster SVD via Polar Decomposition

prof-svd.jpg
The car number plate of Gene Golub, who did so much to promote the SVD. Photo credit: P. M. Kroonenberg.

The singular value decomposition (SVD) is one of the most important tools in matrix theory and matrix computations. It is described in many textbooks and is provided in all the standard numerical computing packages. I wrote a two-page article about the SVD for The Princeton Companion to Applied Mathematics, which can be downloaded in pre-publication form as an EPrint.

The polar decomposition is a close cousin of the SVD. While it is probably less well known, it also deserves recognition as a fundamental theoretical and computational tool. The polar decomposition is the natural generalization to matrices of the polar form z = r \mathrm{e}^{\mathrm{i}\theta} for complex numbers, where r\ge0, \mathrm{i} is the imaginary unit, and \theta\in(-\pi,\pi]. The generalization to an m\times n matrix is A = UH, where U is m\times n with orthonormal columns and H is n\times n and Hermitian positive semidefinite. Here, U plays the role of \mathrm{e}^{\mathrm{i}\theta} in the scalar case and H the role of r.

It is easy to prove existence and uniqueness of the polar decomposition when A has full rank. Since A = UH implies A^*A = HU^*UH = H^2, we see that H must be the Hermitian positive definite square root of the Hermitian positive definite matrix A^*A. Therefore we set H = (A^*A)^{1/2}, after which U = AH^{-1} is forced. It just remains to check that this U has orthonormal columns: U^*U = H^{-1}A^*AH^{-1} = H^{-1}H^2H^{-1} = I.
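As a concrete illustration, here is a minimal NumPy sketch that mirrors this construction, forming H = (A^*A)^{1/2} from an eigendecomposition of A^*A and then U = AH^{-1}. It is meant only to illustrate the proof; explicitly forming A^*A and inverting H is not how one would compute the polar decomposition in practice.

```python
# Illustrative sketch of the existence proof: H = (A^*A)^{1/2}, U = A H^{-1}.
# Assumes A has full column rank; not a numerically recommended algorithm.
import numpy as np

def polar_via_sqrt(A):
    d, V = np.linalg.eigh(A.conj().T @ A)    # A^*A = V diag(d) V^*, with d > 0
    H = (V * np.sqrt(d)) @ V.conj().T        # Hermitian positive definite square root
    U = A @ np.linalg.inv(H)                 # U = A H^{-1} is then forced
    return U, H

A = np.random.randn(5, 3)
U, H = polar_via_sqrt(A)
print(np.linalg.norm(U.conj().T @ U - np.eye(3)))  # ~1e-15: orthonormal columns
print(np.linalg.norm(A - U @ H))                   # ~1e-15: A = UH
```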

Many applications of the polar decomposition stem from a best approximation property: for any m\times n matrix A the nearest matrix with orthonormal columns is the polar factor U, for distance measured in the 2-norm, the Frobenius norm, or indeed any unitarily invariant norm. This result is useful in applications where a matrix that should be unitary turns out not to be so because of errors of various kinds: one simply replaces the matrix by its unitary polar factor.
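A minimal sketch of this use, with the unitary polar factor obtained here from an SVD (one standard route; the QDWH approach discussed below is another):

```python
# Sketch: restore orthonormality by replacing a perturbed orthogonal matrix
# with its unitary polar factor, computed here from an SVD.
import numpy as np

def nearest_orthonormal(A):
    P, _, Vh = np.linalg.svd(A, full_matrices=False)
    return P @ Vh                               # unitary polar factor of A

Q, _ = np.linalg.qr(np.random.randn(4, 4))      # an exactly orthogonal matrix...
Qnoisy = Q + 1e-3 * np.random.randn(4, 4)       # ...contaminated by errors
U = nearest_orthonormal(Qnoisy)
print(np.linalg.norm(Qnoisy.T @ Qnoisy - np.eye(4)))  # noticeably nonzero
print(np.linalg.norm(U.T @ U - np.eye(4)))             # ~1e-15
```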

However, a more trivial property of the polar decomposition is also proving to be important. Suppose we are given A = UH and we compute the eigenvalue decomposition H = QDQ^*, where D is diagonal with the eigenvalues of H on its diagonal and Q is a unitary matrix of eigenvectors. Then A = UH = UQDQ^* = (UQ)DQ^* \equiv P\Sigma Q^* is an SVD! My PhD student Pythagoras Papadimitriou and I proposed using this relation to compute the SVD in 1994, and obtained speedups by a factor of six over the LAPACK SVD code on the Kendall Square KSR1, a shared memory parallel machine of the time.
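The identity is easy to check numerically. The following sketch uses SciPy’s polar routine and NumPy’s Hermitian eigensolver purely for illustration; it is not the large-scale algorithm described below.

```python
# Sketch: assemble an SVD of A from a polar decomposition A = UH and an
# eigendecomposition H = Q D Q^*, so that A = (UQ) D Q^*.
import numpy as np
from scipy.linalg import polar

A = np.random.randn(6, 4)
U, H = polar(A)                      # A = U @ H
d, Q = np.linalg.eigh(H)             # H = Q @ diag(d) @ Q^T
idx = np.argsort(d)[::-1]            # reorder so the singular values descend
d, Q = d[idx], Q[:, idx]
P = U @ Q
print(np.linalg.norm(A - P @ np.diag(d) @ Q.T))   # ~1e-15: P diag(d) Q^T is an SVD
```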

Yuji Nakatsukasa and I recently revisited this idea. In a 2013 paper in the SIAM Journal on Scientific Computing we showed that on modern architectures it is possible to compute the SVD via the polar decomposition in a way that is both numerically stable and potentially much faster than the standard Golub–Reinsch SVD algorithm. Our algorithm has two main steps.

  1. Compute the polar decomposition by an accelerated Halley algorithm called QDWH devised by Nakatsukasa, Bai, and Gygi (2010), for which the main computational kernel is QR factorization.
  2. Compute the eigendecomposition of the Hermitian polar factor by a spectral divide and conquer algorithm. This algorithm repeatedly applies QDWH to the current block to compute an invariant subspace corresponding to the positive or negative eigenvalues and thereby divides the problem into two smaller pieces.

The polar decomposition is fundamental to both steps of the algorithm. While the total number of flops required is greater than for the standard SVD algorithm, the new algorithm has lower communication costs and so should be faster on parallel computing architectures once communication costs are sufficiently greater than the costs of floating point arithmetic. Sukkari, Ltaief, and Keyes have recently shown that on a multicore architecture enhanced with multiple GPUs the new QDWH-based algorithm is indeed faster than the standard approach. Another interesting feature of the new algorithm is that it has been found experimentally to have better accuracy.

The Halley iteration that underlies the QDWH algorithm for the polar decomposition has cubic convergence. A version of QDWH with order of convergence seventeen, which requires just two iterations to converge to double-precision accuracy, has been developed by Nakatsukasa and Freund (2015), and is aimed particularly at parallel architectures. This is a rare example of an iteration with a very high order of convergence actually being of practical use.
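For readers who want to see the shape of the underlying iteration, here is a minimal sketch of the fixed-weight Halley iteration X_{k+1} = X_k(3I + X_k^*X_k)(I + 3X_k^*X_k)^{-1} for the unitary polar factor. It is not QDWH itself: QDWH chooses the weights dynamically and is implemented via QR factorizations rather than the explicit inverses used here for brevity.

```python
# Fixed-weight Halley iteration for the unitary polar factor (illustrative only;
# QDWH uses dynamically weighted iterations and QR-based updates).
import numpy as np

def polar_halley(A, tol=1e-14, maxiter=100):
    X = A / np.linalg.norm(A, 2)               # scale so all singular values lie in (0, 1]
    I = np.eye(A.shape[1])
    for _ in range(maxiter):
        G = X.conj().T @ X
        Xnew = X @ (3 * I + G) @ np.linalg.inv(I + 3 * G)
        converged = np.linalg.norm(Xnew - X, 'fro') <= tol * np.linalg.norm(Xnew, 'fro')
        X = Xnew
        if converged:
            break
    H = X.conj().T @ A                          # Hermitian factor, so that A = X H
    return X, (H + H.conj().T) / 2              # symmetrize to clean up rounding errors

A = np.random.randn(5, 3)
U, H = polar_halley(A)
print(np.linalg.norm(A - U @ H))                # ~1e-15
print(np.linalg.norm(U.conj().T @ U - np.eye(3)))
```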

Numerical Linear Algebra and Matrix Analysis

Matrix analysis and numerical linear algebra are two very active, and closely related, areas of research. Matrix analysis can be defined as the theory of matrices with a focus on aspects relevant to other areas of mathematics, while numerical linear algebra (also called matrix computations) is concerned with the construction and analysis of algorithms for solving matrix problems, as well as related topics such as problem sensitivity and rounding error analysis.

My article Numerical Linear Algebra and Matrix Analysis for The Princeton Companion to Applied Mathematics gives a selective overview of these two topics. The table of contents is as follows.

1 Nonsingularity and Conditioning
2 Matrix Factorizations
3 Distance to Singularity and Low-Rank Perturbations
4 Computational Cost
5 Eigenvalue Problems
  5.1 Bounds and Localization
  5.2 Eigenvalue Sensitivity
  5.3 Companion Matrices and the Characteristic Polynomial
  5.4 Eigenvalue Inequalities for Hermitian Matrices
  5.5 Solving the Non-Hermitian Eigenproblem
  5.6 Solving the Hermitian Eigenproblem
  5.7 Computing the SVD
  5.8 Generalized Eigenproblems
6 Sparse Linear Systems
7 Overdetermined and Underdetermined Systems
  7.1 The Linear Least Squares Problem
  7.2 Underdetermined Systems
  7.3 Pseudoinverse
8 Numerical Considerations
9 Iterative Methods
10 Nonnormality and Pseudospectra
11 Structured Matrices
  11.1 Nonnegative Matrices
  11.2 M-Matrices
12 Matrix Inequalities
13 Library Software
14 Outlook

The article can be downloaded in pre-publication form as an EPrint.

Corless and Fillion’s A Graduate Introduction to Numerical Methods from the Viewpoint of Backward Error Analysis

cofi13_cover.jpg

I acquired this book when it first came out in 2013 and have been dipping into it from time to time ever since. At 868 pages long, the book contains a lot of material and I have only sampled a small part of it. In this post I will not attempt to give a detailed review but rather will explain the distinctive features of the book and why I like it.

As the title suggests, the book is pitched at graduate level, but it will also be useful for advanced undergraduate courses. The book covers all the main topics of an introductory numerical analysis course: floating point arithmetic, interpolation, nonlinear equations, numerical linear algebra, quadrature, numerical solution of differential equations, and more.

In order to stand out in the crowded market of numerical analysis textbooks, a book needs to offer something different. This one certainly does.

  • The concepts of backward error and conditioning are used throughout—not just in the numerical linear algebra chapters.
  • Complex analysis, and particularly the residue theorem, is exploited throughout the book, with contour integration used as a fundamental tool in deriving interpolation formulas. I was pleased to see section 11.3.2 on the complex step approximation to the derivative of a real-valued function, which provides an interesting alternative to finite differences (a small sketch follows this list). Appendix B, titled “Complex Numbers”, provides in just 8 pages excellent advice on the practical usage of complex numbers and functions of a complex variable that would be hard to find in complex analysis texts. For example, it has a clear discussion of branch cuts, making use of Kahan’s counterclockwise continuity principle (eschewing Riemann surfaces, which have “almost no traction in the computer world”), and makes use of the unwinding number introduced by Corless, Hare, and Jeffrey in 1996.
  • The barycentric formulation of Lagrange interpolation is used extensively, possibly for the first time in a numerical analysis text. This approach was popularized by Berrut and Trefethen in their 2004 SIAM Review paper, and my proof of the numerical stability of the formulas has helped it to gain popularity. Polynomial interpolation and rational interpolation are both covered.
  • Both numerical and symbolic computation are employed—whichever is the most appropriate tool for the topic or problem at hand. Corless is well known for his contributions to symbolic computation and to Maple, but he is equally at home in the world of numerics. Chebfun is also used in a number of places. In addition, section 11.7 gives a 2-page treatment of automatic differentiation.
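As promised above, here is a minimal sketch of the complex step approximation (my own illustration, not code from the book): for a real-analytic f, f'(x) \approx \mathrm{Im}\,f(x+\mathrm{i}h)/h, and because there is no subtraction of nearly equal quantities, h can be taken extremely small.

```python
# Complex step approximation to the derivative of a real-analytic function:
# f'(x) ~ Im(f(x + ih))/h. No subtractive cancellation, so h can be tiny.
import numpy as np

def complex_step_derivative(f, x, h=1e-20):
    return np.imag(f(x + 1j * h)) / h

print(complex_step_derivative(np.sin, 1.0))           # cos(1) = 0.5403..., essentially full accuracy
print((np.sin(1.0 + 1e-8) - np.sin(1.0)) / 1e-8)       # forward difference: only ~8 correct digits
```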

This is a book that one can dip into at any page and quickly find something that is interesting and beyond standard textbook content. Not many numerical analysis textbooks include the Lambert W function, a topic on which Corless is uniquely qualified to write. (I note that Corless and Jeffrey wrote an excellent article on the Lambert W function for The Princeton Companion to Applied Mathematics.) And not so many use pseudospectra.

I like Notes and References sections and this book has lots of them, with plenty of detail, including references that I was unaware of.

As regards the differential equation content, it includes initial and boundary value problems for ODEs, as well as delay differential equations (DDEs) and PDEs. The DDE chapter uses the MATLAB dde23 and ddesd functions for illustration and, like the other differential equation chapters, discusses conditioning.

The book would probably have benefited from editing to reduce its length. The index is thorough, but many long entries need breaking up into subentries. Navigation of the book would be easier if algorithms, theorems, definitions, remarks, etc., had been numbered in one sequence instead of as separate sequences.

Part of the book’s charm is its sometimes unexpected content. How many numerical analysis textbooks recommend reading a book on the programming language Forth (a small, reverse Polish notation-based language popular on microcomputers when I was a student)? And how many would point out the 1994 “rediscovery” of the trapezoidal rule in an article in the journal Diabetes Care? (Google “Tai’s model” for some interesting responses to that article.)

I bought the book from SpringerLink via the MyCopy feature, whereby any book available electronically via my institution’s subscription can be bought in (monochrome) hard copy for 24.99 euros, dollars, or pounds (the same price in each currency!).

I give the last word to John Butcher, who concludes the Foreword with “I love this book.”

Publication Peculiarities: Sequences of Papers

This is the third post in my sequence on publication peculiarities.

It is not unusual to see a sequence of related papers with similar titles, sometimes labelled “Part I”, “Part II” etc. Here I present two sequences of papers with intriguing titles and interesting stories behind them.

Computing the Logarithm of a Complex Number

The language Algol 60 did not have complex numbers as a built-in data type, so it was necessary to write routines to implement complex arithmetic. The following sequence of papers appeared in Communications of the ACM in the 1960s and concerns writing an Algol 60 code to evaluate the logarithm of a complex number.

J. R. Herndon (1961). Algorithm 48: Logarithm of a complex number. Comm. ACM, 4(4), 179.

A. P. Relph (1962). Certification of Algorithm 48: Logarithm of a complex number. Comm. ACM, 5(6), 347.

M. L. Johnson and W. Sangren (1962). Remark on Algorithm 48: Logarithm of a complex number. Comm. ACM, 5(7), 391.

D. S. Collens (1964). Remark on remarks on Algorithm 48: Logarithm of a complex number. Comm. ACM, 7(8), 485.

D. S. Collens (1964). Algorithm 243: Logarithm of a complex number: Rewrite of Algorithm 48. Comm. ACM, 7(11), 660.

“Remark on remarks”, “rewrite”—what are the reasons for this sequence of papers?

The first paper, by Herndon, gives a short code (7 lines in total) that uses the arctan function to find the argument of a complex number x+iy as \arctan(y/x). Relph notes that the code fails when the real part is zero and that, because it adds \pi to the \arctan, the computed imaginary part lies in the wrong range, which should be (-\pi,\pi] for the principal logarithm. Moreover, the original code incorrectly uses log (log to base 10) instead of ln (the natural logarithm).

It would appear that at this time codes were not always run and tested before publication, presumably because of the lack of an available compiler. Indeed Herndon’s paper was published in the April 1961 issue of CACM, and the first Algol 60 compilers had only become available the year before, according to this Wikipedia timeline.

Johnson and Sangren give more discussion about division by zero and obtaining the correct signs.

In his first paper, Collens notes that the Johnson and Sangren code wrongly gives \log 0 = 0 and has a missing minus sign in one statement.

Finally, Collens gives a rewritten algorithm that addresses the previously noted deficiencies. It appears to have been run, since some output is shown.
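In modern terms, the fix that the sequence converges on is simple: use a two-argument arctangent so that the argument lands in (-\pi,\pi] even when the real part is zero, take the natural logarithm of the modulus, and reject the argument z = 0. A minimal sketch in Python, purely illustrative and not a transcription of Algorithm 243:

```python
# Principal logarithm of x + iy: log|z| + i arg(z) with arg(z) in (-pi, pi].
# A modern illustrative sketch, not the Algol 60 code discussed above.
import math

def complex_log(x, y):
    if x == 0.0 and y == 0.0:
        raise ValueError("log 0 is undefined")      # the case Collens flagged
    return math.log(math.hypot(x, y)), math.atan2(y, x)

print(complex_log(-1.0, 0.0))   # (0.0, pi)
print(complex_log(0.0, 2.0))    # zero real part handled: (log 2, pi/2)
```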

This sequence of papers from the early days of digital computing emphasizes that even for what might seem to be a trivial problem it is not straightforward to design correct, reliable algorithms and codes.

I am working on logarithms and other multivalued functions of matrices, for which many additional complications are present.

Slow Manifolds

Edward Lorenz is well known for introducing the Lorenz equations, discovering the Lorenz attractor, and describing the “butterfly effect”. His sequence of papers

E. N. Lorenz (1986). On the existence of a slow manifold. J. Atmos. Sci., 43(15), 1547–1557.

E. N. Lorenz and V. Krishnamurthy (1987). On the nonexistence of a slow manifold. J. Atmos. Sci., 44(20), 2940–2950.

E. N. Lorenz (1992). The slow manifold—What is it? J. Atmos. Sci., 49(24), 2449–2451.

seems to suggest a rather confused line of research!

However, inspection of the papers reveals the reasoning behind the choice of titles. The first paper discusses whether or not a slow manifold exists and shows that this question is nontrivial. The second paper shows that a slow manifold does not exist for one particular model. The third paper shows that the apparent contradiction to the second paper’s result by another author’s 1991 proof of the existence of a slow manifold for the same model can be explained by the use of different definitions of slow manifold.