# What Is a Vandermonde Matrix?

A Vandermonde matrix is defined in terms of scalars $x_1$, $x_2$, …, $x_n\in\mathbb{C}$ by

$\notag V = V(x_1,x_2,\dots,x_n) = \begin{bmatrix} 1 & 1 & \dots & 1 \\ x_1 & x_2 & \dots & x_n \\ \vdots &\vdots & & \vdots \\ x_1^{n-1} & x_2^{n-1} & \dots & x_n^{n-1} \end{bmatrix} \in \mathbb{C}^{n\times n}.$

The $x_i$ are called points or nodes. Note that while we have indexed the nodes from $1$, they are usually indexed from $0$ in papers concerned with algorithms for solving Vandermonde systems.

Vandermonde matrices arise in polynomial interpolation. Suppose we wish to find a polynomial $p_{n-1}(x) = a_nx^{n-1} + a_{n-1}x^{n-2} + \cdots + a_1$ of degree at most $n-1$ that interpolates to the data $(x_i,f_i)_{i=1}^n$, that is, $p_{n-1}(x_i) = f_i$, $i=1\colon n$. These equations are equivalent to

$\notag V^Ta = f \quad \mathrm{(dual)},$

where $a = [a_1,a_2,\dots,a_n]^T$ is the vector of coefficients. This is known as the dual problem. We know from polynomial interpolation theory that there is a unique interpolant if the $x_i$ are distinct, so this is the condition for $V$ to be nonsingular.

The problem

$\notag Vy = b \quad \mathrm{(primal)}$

is called the primal problem, and it arises when we determine the weights for a quadrature rule: given moments $b_i$ find weights $y_i$ such that $\sum_{j=1}^n y_j^{} x_j^{\,i-1} = b_i$, $i=1\colon n$.

## Determinant

The determinant of $V$ is a function of the $n$ points $x_i$. If $x_i = x_j$ for some $i\ne j$ then $V$ has identical $i$th and $j$th columns, so is singular. Hence the determinant must have a factor $x_i - x_j$. Consequently, we have

$\notag \det( V(x_1,x_2,\dots,x_n) ) = c \displaystyle\prod_{i,j = 1\atop i > j}^n (x_i - x_j),$

where, since both sides have degree $n(n-1)/2$ in the $x_i$, $c$ is a constant. But $\det(V)$ contains a term $x_2 x_3^2 \dots x_n^{n-1}$ (from the main diagonal), so $c = 1$. Hence

$\notag \det(V) = \displaystyle\prod_{i,j = 1\atop i > j}^n (x_i - x_j). \qquad (1)$

This formula confirms that $V$ is nonsingular precisely when the $x_i$ are distinct.

## Inverse

Now assume that $V$ is nonsingular and let $V^{-1} = W = (w_{ij})_{i,j=1}^n$. Equating elements in the $i$th row of $WV = I$ gives

$\sum_{j=1}^n w_{ij} x_k^{\mskip1mu j-1} = \delta_{ik}, \quad k=1\colon n,$

where $\delta_{ij}$ is the Kronecker delta (equal to $1$ if $i=j$ and $0$ otherwise). These equations say that the polynomial $\sum_{j=1}^n w_{ij} x^{\mskip1mu j-1}$ takes the value $1$ at $x = x_i$ and $0$ at $x = x_k$, $k\ne i$. It is not hard to see that this polynomial is the Lagrange basis polynomial:

$\notag \sum_{j=1}^n w_{ij} x^{j-1} = \displaystyle\prod_{k=1\atop k\ne i}^n \left( \frac{x-x_k}{x_i-x_k} \right) =: \ell_i(x). \qquad (2)$

We deduce that

$\notag w_{ij} = \displaystyle\frac{ (-1)^{n-j} \sigma_{n-j}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_n) } { \displaystyle\prod_{k=1 \atop k\ne i}^n (x_i-x_k) }, \qquad (3)$

where $\sigma_k(y_1,\dots,y_n)$ denotes the sum of all distinct products of $k$ of the arguments $y_1,\dots,y_n$ (that is, $\sigma_k$ is the $k$th elementary symmetric function).

From (1) and (3) we see that if the $x_i$ are real and positive and arranged in increasing order $0 < x_1 < x_2 < \cdots 0$ and $V^{-1}$ has a checkerboard sign pattern: the $(i,j)$ element has sign $(-1)^{i+j}$.

Note that summing (2) over $i$ gives

$\notag \displaystyle\sum_{j=1}^n x^{j-1} \sum_{i=1}^n w_{ij} = \sum_{i=1}^n \ell_i(x) = 1,$

where the second equality follows from the fact that $\sum_{i=1}^n \ell_i(x)$ is a degree $n-1$ polynomial that takes the value $1$ at the $n$ distinct points $x_i$. Hence

$\notag \displaystyle\sum_{i=1}^n w_{ij} = \delta_{j1},$

so the elements in the $j$th column of the inverse sum to $1$ for $j = 1$ and $0$ for $j\ge 2$.

## Example

To illustrate the formulas above, here is an example, with $x_i = (i-1)/(n-1)$ and $n = 5$:

$\notag V = \left[\begin{array}{ccccc} 1 & 1 & 1 & 1 & 1\\ 0 & \frac{1}{4} & \frac{1}{2} & \frac{3}{4} & 1\\[\smallskipamount] 0 & \frac{1}{16} & \frac{1}{4} & \frac{9}{16} & 1\\[\smallskipamount] 0 & \frac{1}{64} & \frac{1}{8} & \frac{27}{64} & 1\\[\smallskipamount] 0 & \frac{1}{256} & \frac{1}{16} & \frac{81}{256} & 1 \end{array}\right], \quad V^{-1} = \left[\begin{array}{ccccc} 1 & -\frac{25}{3} & \frac{70}{3} & -\frac{80}{3} & \frac{32}{3}\\[\smallskipamount] 0 & 16 & -\frac{208}{3} & 96 & -\frac{128}{3}\\ 0 & -12 & 76 & -128 & 64\\[\smallskipamount] 0 & \frac{16}{3} & -\frac{112}{3} & \frac{224}{3} & -\frac{128}{3}\\[\smallskipamount] 0 & -1 & \frac{22}{3} & -16 & \frac{32}{3} \end{array}\right],$

for which $\det(V) = 9/32768$.

## Conditioning

Vandermonde matrices are notorious for being ill conditioned. The ill conditioning stems from the monomials being a poor basis for the polynomials on the real line. For arbitrary distinct points $x_i$, Gautschi showed that $V_n = V(x_1, x_2, \dots, x_n)$ satisfies

$\notag \displaystyle\max_i \displaystyle\prod_{j\ne i} \frac{ \max(1,|x_j|) }{ |x_i-x_j| } \le \|V_n^{-1}\|_{\infty} \le \displaystyle\max_i \prod_{j\ne i} \frac{ 1+|x_j| }{ |x_i-x_j| },$

with equality on the right when $x_j = |x_j| e^{\mathrm{i}\theta}$ for all $j$ with a fixed $\theta$ (in particular, when $x_j\ge0$ for all $j$). Note that the upper and lower bounds differ by at most a factor $2^{n-1}$. It is also known that for any set of real points $x_i$,

$\notag \kappa_2(V_n) \ge \Bigl(\displaystyle\frac{2}{n}\Bigr)^{1/2} \, (1+\sqrt{2})^{n-2}$

and that for $x_i = 1/i$ we have $\kappa_{\infty}(V_n) > n^{n+1}$, where the lower bound is an extremely fast growing function of the dimension!

These exponential lower bounds are alarming, but they do not necessarily rule out the use of Vandermonde matrices in practice. One of the reasons is that there are specialized algorithms for solving Vandermonde systems whose accuracy is not dependent on the condition number $\kappa$, and which in some cases can be proved to be highly accurate. The first such algorithm is an $O(n^2)$ operation algorithm for solving $V_ny =b$ of Björck and Pereyra (1970). There is now a long list of generalizations of this algorithm in various directions, including for confluent Vandermonde-like matrices (Higham, 1990), as well as for more specialized problems (Demmel and Koev, 2005) and more general ones (Bella et al., 2009). Another important observation is that the exponential lower bounds are for real nodes. For complex nodes $V_n$ can be much better conditioned. Indeed when the $x_i$ are the roots of unity, $V_n/\sqrt{n}$ is the unitary Fourier matrix and so $V_n$ is perfectly conditioned.

## Generalizations

Two ways in which Vandermonde matrices have been generalized are by allowing confluency of the points $x_i$ and by replacing the monomials by other polynomials. Confluency arises when the $x_i$ are not distinct. If we assume that equal $x_i$ are contiguous then a confluent Vandermonde matrix is obtained by “differentiating” the previous column for each of the repeated points. For example, with points $x_1, x_1, x_1, x_2, x_2$ we obtain

$\notag \begin{bmatrix} 1 & 0 & 0 & 1 & 0 \\ x_1 & 1 & 0 & x_2 & 1 \\ x_1^2 & 2x_1 & 2 & x_2^2 & 2x_2 \\ x_1^3 & 3x_1^2 & 6x_1 & x_2^3 & 3x_2^2 \\ x_1^4 & 4x_1^3 & 12x_1^2 & x_2^4 & 4x_2^3 \end{bmatrix}. \qquad (4)$

The transpose of a confluent Vandermonde matrix arises in Hermite interpolation; it is nonsingular if the points corresponding to the “nonconfluent columns” are distinct (that is, if $x_1 \ne x_2$ in the case of (4)).

A Vandermonde-like matrix is defined in terms of a set of polynomials $\{p_i(x)\}_{i=0}^n$ with $p_i$ having degree $i$:

$\notag \begin{bmatrix} p_0(x_1) & p_0(x_2) & \dots & p_0(x_n)\\ p_1(x_1) & p_1(x_2) & \dots & p_1(x_n)\\ \vdots & \vdots & \dots & \vdots\\ p_{n-1}(x_1) & p_{n-1}(x_2) & \dots & p_{n-1}(x_n)\\ \end{bmatrix}.$

Of most interest are polynomials that satisfy a three-term recurrence, in particular, orthogonal polynomials. Such matrices can be much better conditioned than general Vandermonde matrices.

## Notes

Algorithms for solving confluent Vandermonde-like systems and their rounding error analysis are described in the chapter “Vandermonde systems” of Higham (2002).

Gautschi has written many papers on the conditioning of Vandermonde matrices, beginning in 1962. We mention just his most recent paper on this topic: Gautschi (2011).

## References

This is a minimal set of references, which contain further useful references within.

## One thought on “What Is a Vandermonde Matrix?”

1. A few interesting tidbits regarding Vandermonde matrices.

Since det V has a nice closed form Vandermonde systems are one of the few places where one can meaningfully apply Cramer’s rule.

If a set of nodes maximize |det V| they are known as Fekete points. In one dimension the Fekete points are the Gauss-Legendre-Lobatto nodes (GLL); a result which I believe was first shown by Fejér. The proof is tedious. A cottage industry exists around trying to find Fekete points in higher dimensional domains with applications to multi-variate Lagrange interpolation.