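The Cayley–Hamilton theorem says that a square matrix $A$ satisfies its characteristic equation, that is, $p(A) = 0$, where $p(t) = \det(tI - A)$ is the characteristic polynomial. This statement is not simply the substitution “$p(A) = \det(A - A) = 0$”, which is not valid since $t$ must remain a scalar inside the $\det$ term. Rather, for an $n\times n$ matrix $A$, the characteristic polynomial has the form

$$p(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0,$$

and the Cayley–Hamilton theorem says that

$$p(A) = A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = 0.$$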

Various proofs of the theorem are available, of which we give two. The first is the most natural for anyone familiar with the Jordan canonical form. The second is more elementary but less obvious.
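
Before turning to the proofs, the statement can be sanity-checked numerically. The following is a minimal sketch, not part of the original article: it assumes NumPy, uses `np.poly` to obtain the characteristic polynomial coefficients of a small test matrix (chosen here purely for illustration), and evaluates the polynomial at the matrix by Horner's rule. The helper name `char_poly_at_matrix` is hypothetical.

```python
import numpy as np

def char_poly_at_matrix(A):
    """Evaluate p(A), where p is the characteristic polynomial of A,
    using Horner's rule with matrix arguments."""
    n = A.shape[0]
    # np.poly returns the coefficients in descending order,
    # starting with the leading coefficient 1.
    coeffs = np.poly(A)
    P = np.zeros((n, n), dtype=complex)
    for c in coeffs:
        # The scalar coefficient enters as c * I, never as a bare scalar.
        P = P @ A + c * np.eye(n)
    return P

# Illustrative test matrix (an assumption, not from the article).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
residual = np.linalg.norm(char_poly_at_matrix(A))
print(residual)  # expected to be at rounding-error level
```

Note how the constant term is multiplied by the identity matrix, which is exactly the point made above about the scalar substitution being invalid.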

## First Proof

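Consider a $4\times 4$ Jordan block with eigenvalue $\lambda$:

$$J = \begin{bmatrix} \lambda & 1 & 0 & 0 \\ 0 & \lambda & 1 & 0 \\ 0 & 0 & \lambda & 1 \\ 0 & 0 & 0 & \lambda \end{bmatrix}.$$

We have

$$J - \lambda I = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad (J - \lambda I)^2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad (J - \lambda I)^3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$

and then $(J - \lambda I)^4 = 0$. In general, for an $n\times n$ Jordan block $J$ with eigenvalue $\lambda$, $(J - \lambda I)^k$ is zero apart from a $k$th superdiagonal of ones for $k \le n-1$, and $(J - \lambda I)^n = 0$.

Let $A$ have the Jordan canonical form $A = ZJZ^{-1}$, where $J = \mathrm{diag}(J_1, J_2, \dots, J_k)$ and each $J_i$ is an $m_i\times m_i$ Jordan block with eigenvalue $\lambda_i$. The characteristic polynomial of $A$ can be factorized as $p(t) = (t - \lambda_1)^{m_1}(t - \lambda_2)^{m_2}\cdots(t - \lambda_k)^{m_k}$. Note that $A^j = ZJ^jZ^{-1}$ for all $j \ge 0$, and hence $q(A) = Zq(J)Z^{-1}$ for any polynomial $q$. Then

$$Z^{-1}p(A)Z = p(J) = \mathrm{diag}\bigl(p(J_1), p(J_2), \dots, p(J_k)\bigr),$$

and $p(J_i)$ is zero because it contains a factor $(J_i - \lambda_i I)^{m_i}$ and this factor is zero, as noted above. Hence $Z^{-1}p(A)Z = 0$ and therefore $p(A) = 0$.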
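
The key fact used in this proof, that subtracting the eigenvalue from a Jordan block leaves a nilpotent matrix whose powers push the superdiagonal of ones upward until it disappears, can be observed directly. This is an illustrative sketch assuming NumPy; the helper `jordan_block` and the particular eigenvalue and block size are choices made for the example.

```python
import numpy as np

def jordan_block(lam, m):
    """m-by-m Jordan block: lam on the diagonal, ones on the superdiagonal."""
    return lam * np.eye(m) + np.diag(np.ones(m - 1), k=1)

lam, m = 2.0, 4  # illustrative values
N = jordan_block(lam, m) - lam * np.eye(m)

powers_match = True
for k in range(1, m + 1):
    Nk = np.linalg.matrix_power(N, k)
    if k < m:
        # Ones on the k-th superdiagonal, zeros elsewhere.
        expected = np.diag(np.ones(m - k), k=k)
    else:
        # The m-th power vanishes entirely.
        expected = np.zeros((m, m))
    powers_match = powers_match and np.array_equal(Nk, expected)

print(powers_match)
```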

## Second Proof

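Recall that the adjugate $\mathrm{adj}(A)$ of an $n\times n$ matrix $A$ is the transposed matrix of cofactors, where a cofactor is a signed sum of products of $n-1$ entries of $A$, and that $A\,\mathrm{adj}(A) = \det(A)I$. With $B = tI - A$, each entry of $\mathrm{adj}(B)$ is a polynomial of degree at most $n-1$ in $t$, so, for $t$ not an eigenvalue of $A$, $B^{-1} = \mathrm{adj}(B)/\det(B)$ can be written

$$(tI - A)^{-1} = \frac{\mathrm{adj}(B)}{\det(B)} = \frac{t^{n-1}C_{n-1} + \cdots + tC_1 + C_0}{p(t)},$$

for some matrices $C_{n-1}$, …, $C_0$ not depending on $t$. Rearranging, we obtain

$$(tI - A)\bigl(t^{n-1}C_{n-1} + \cdots + tC_1 + C_0\bigr) = p(t)I,$$

and equating coefficients of $t^n$, …, $t^0$ gives

$$\begin{aligned} C_{n-1} &= I, \\ C_{n-2} - AC_{n-1} &= a_{n-1}I, \\ &\;\;\vdots \\ C_0 - AC_1 &= a_1 I, \\ -AC_0 &= a_0 I. \end{aligned}$$

Premultiplying the first equation by $A^n$, the second by $A^{n-1}$, and so on, and adding, the left-hand sides telescope to zero, which gives

$$0 = A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = p(A),$$

as required. This proof is by Buchheim (1884).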
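
The coefficient-matching step in this proof can be traced numerically. The sketch below (an illustration assuming NumPy, with a hypothetical helper `adjugate_coefficients` and an arbitrary test matrix) builds the matrix coefficients of the adjugate of `t*I - A` from all but the last of the matched equations, then checks that the final equation holds, which is equivalent to the theorem.

```python
import numpy as np

def adjugate_coefficients(A):
    """Return (a, C) where a[k] is the coefficient of t^k in det(tI - A)
    and C[k] is the matrix coefficient of t^k in adj(tI - A)."""
    n = A.shape[0]
    a = np.poly(A)[::-1]          # ascending order; a[n] == 1
    C = [None] * n
    C[n - 1] = np.eye(n)          # coefficient of t^n
    for k in range(n - 1, 0, -1):
        # Coefficient of t^k: C[k-1] - A C[k] = a[k] I.
        C[k - 1] = A @ C[k] + a[k] * np.eye(n)
    return a, C

# Illustrative test matrix (an assumption, not from the article).
A = np.array([[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 1.0]])
a, C = adjugate_coefficients(A)

# Remaining equation (coefficient of t^0): -A C[0] = a[0] I.
last_eq_residual = np.linalg.norm(-A @ C[0] - a[0] * np.eye(3))
print(last_eq_residual)  # expected to be at rounding-error level
```

The recurrence determines every coefficient matrix from the first $n$ equations, so the last equation is a genuine consistency condition rather than a definition.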

## Applications and Generalizations

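A common use of the Cayley–Hamilton theorem is to show that $A^{-1}$ is expressible as a linear combination of $I$, $A$, …, $A^{n-1}$. Indeed, for a nonsingular $A$, multiplying $p(A) = 0$ by $A^{-1}$ and rearranging implies that

$$A^{-1} = -\frac{1}{a_0}\bigl(A^{n-1} + a_{n-1}A^{n-2} + \cdots + a_1 I\bigr),$$

since $a_0 = p(0) = (-1)^n\det(A) \ne 0$.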
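
As a small illustration of the inverse as a polynomial in the matrix, the sketch below assumes NumPy and a hypothetical helper `inverse_via_cayley_hamilton`. It is meant only to demonstrate the identity on a well-conditioned example, not as a recommended way to compute inverses.

```python
import numpy as np

def inverse_via_cayley_hamilton(A):
    """Inverse of a nonsingular matrix as a polynomial of degree n-1 in A."""
    n = A.shape[0]
    coeffs = np.poly(A)           # descending order, constant term last
    a0 = coeffs[-1]
    if a0 == 0:
        raise np.linalg.LinAlgError("constant term is zero: A is singular")
    # Horner evaluation using every coefficient except the constant term.
    Q = np.zeros((n, n))
    for c in coeffs[:-1]:
        Q = Q @ A + c * np.eye(n)
    return -Q / a0

# Illustrative test matrix (an assumption, not from the article).
A = np.array([[4.0, 1.0], [2.0, 3.0]])
X = inverse_via_cayley_hamilton(A)
print(np.allclose(X @ A, np.eye(2)))
```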

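Similarly, $A^k$ for any $k \ge n$ can be expressed as a linear combination of $I$, $A$, …, $A^{n-1}$. An interesting implication is that any matrix power series is actually a polynomial in the matrix. Thus the matrix exponential $e^A = I + A + A^2/2! + \cdots$ can be written $e^A = c_{n-1}A^{n-1} + \cdots + c_1 A + c_0 I$ for some scalars $c_{n-1}$, …, $c_0$. However, the $c_i$ depend on $A$, which reduces the usefulness of the polynomial representation. A rare example of an explicit expression of this form is Rodrigues’s formula for the exponential of a skew-symmetric matrix $A \in \mathbb{R}^{3\times 3}$:

$$e^A = I + \frac{\sin\theta}{\theta}A + \frac{1 - \cos\theta}{\theta^2}A^2,$$

where $\theta = \sqrt{\|A\|_F^2/2}$.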
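
Rodrigues's formula can be compared against a truncated Taylor series for the exponential. The sketch below assumes NumPy; the function names, the number of series terms, and the test vector defining the skew-symmetric matrix are all choices made for this example. The parameter in the formula is taken as the Frobenius norm of the matrix divided by the square root of two.

```python
import numpy as np

def rodrigues_exp(A):
    """Exponential of a 3x3 real skew-symmetric matrix via Rodrigues's formula."""
    theta = np.sqrt(np.linalg.norm(A, "fro") ** 2 / 2)
    I = np.eye(3)
    if theta == 0:
        return I                  # A is the zero matrix
    return (I + (np.sin(theta) / theta) * A
              + ((1 - np.cos(theta)) / theta**2) * (A @ A))

def taylor_exp(A, terms=30):
    """Reference value: partial sum of the exponential power series."""
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k       # term is now A^k / k!
        E = E + term
    return E

# Illustrative skew-symmetric matrix built from an arbitrary vector.
a, b, c = 0.3, -0.5, 0.8
A = np.array([[0.0, -c, b], [c, 0.0, -a], [-b, a, 0.0]])
E = rodrigues_exp(A)
print(np.allclose(E, taylor_exp(A)))
```

The result should also be orthogonal, since the exponential of a real skew-symmetric matrix is a rotation.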

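Cayley used the Cayley–Hamilton theorem to find square roots of a $2\times 2$ matrix. If $X^2 = A$ then applying the theorem to $X$ gives $X^2 - \mathrm{trace}(X)X + \det(X)I = 0$, or, since $\det(X) = \sqrt{\det(A)}$,

$$A - \mathrm{trace}(X)X + \sqrt{\det(A)}\,I = 0, \qquad (*)$$

which gives

$$X = \frac{A + \sqrt{\det(A)}\,I}{\mathrm{trace}(X)}.$$

Taking the trace in $(*)$ gives the equation $\mathrm{trace}(A) - \mathrm{trace}(X)^2 + 2\sqrt{\det(A)} = 0$ for $\mathrm{trace}(X)$, leading to

$$X = \frac{A + \sqrt{\det(A)}\,I}{\sqrt{\mathrm{trace}(A) + 2\sqrt{\det(A)}}}.$$

With appropriate choices of signs for the square roots this formula gives all four square roots of $A$ when $A$ has distinct eigenvalues, but otherwise the formula can break down.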
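
The four sign choices can be enumerated in a few lines. This sketch assumes NumPy and the standard-library `cmath` module (so that negative or complex arguments to the square roots are handled); the function `sqrt_2x2`, its sign parameters, and the test matrix are hypothetical names and values introduced for illustration.

```python
import cmath
import numpy as np

def sqrt_2x2(A, sign_det=1, sign_trace=1):
    """A square root X = (A + d I) / tau of a 2x2 matrix, where
    d = +/- sqrt(det A) plays the role of det(X) and
    tau = +/- sqrt(trace(A) + 2 d) plays the role of trace(X)."""
    d = sign_det * cmath.sqrt(np.linalg.det(A))
    tau = sign_trace * cmath.sqrt(np.trace(A) + 2 * d)
    if abs(tau) == 0:
        # This is one way the formula can break down.
        raise ZeroDivisionError("trace(X) = 0: the formula is not applicable")
    return (A + d * np.eye(2)) / tau

# Illustrative matrix with distinct eigenvalues 2 and 5.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
roots = [sqrt_2x2(A, s1, s2) for s1 in (1, -1) for s2 in (1, -1)]
all_valid = all(np.allclose(X @ X, A) for X in roots)
print(all_valid)
```

The returned matrices have complex dtype even when the square root is real, a consequence of using `cmath` for the scalar square roots.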

Expressions obtained from the Cayley–Hamilton theorem are of little practical use for general matrices, because algorithms that compute the coefficients of the characteristic polynomial are typically numerically unstable.

The Cayley–Hamilton theorem has been generalized in various directions. The theorem can be interpreted as saying that the powers $A^k$ for all nonnegative $k$ generate a vector space of dimension at most $n$. Gerstenhaber (1961) proved that if $A$ and $B$ are two commuting $n\times n$ matrices then the matrices $A^iB^j$, for all nonnegative $i$ and $j$, generate a vector space of dimension at most $n$. It is conjectured that this result extends to three matrices.
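
The dimension bound for a commuting pair can be explored on small examples. In this sketch (assuming NumPy; the helper `span_dimension` and the test matrices are illustrative choices) the products of powers are flattened into vectors and the dimension of their span is computed as a numerical rank. Only exponents below the matrix dimension are used, since by the Cayley–Hamilton theorem higher powers add nothing to the span.

```python
import numpy as np

def span_dimension(A, B):
    """Dimension of the span of A^i B^j over exponents 0 <= i, j < n."""
    n = A.shape[0]
    mp = np.linalg.matrix_power
    vecs = [(mp(A, i) @ mp(B, j)).ravel()
            for i in range(n) for j in range(n)]
    return np.linalg.matrix_rank(np.array(vecs))

# Two commuting 3x3 matrices, neither a polynomial in the other.
A = np.zeros((3, 3)); A[0, 2] = 1.0
B = np.zeros((3, 3)); B[1, 2] = 1.0
commute = np.array_equal(A @ B, B @ A)

dim_pair = span_dimension(A, B)             # span of I, A, B
dim_single = span_dimension(A, np.eye(3))   # powers of A alone
print(commute, dim_pair, dim_single)
```

Numerical rank is only reliable here because the example matrices have small integer entries; for general floating-point data the rank decision would need more care.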

## Historical Note

The Cayley–Hamilton theorem appears in the 1858 memoir in which Cayley introduced matrix algebra. Cayley gave a proof for $n = 2$ and stated that he had verified the result for $n = 3$, adding “I have not thought it necessary to undertake the labour of a formal proof of the theorem in the general case of a matrix of any degree.” Hamilton had proved the result for quaternions in 1853. Cayley actually discovered a more general version of the Cayley–Hamilton theorem, which appears in an 1857 letter to Sylvester but not in any of his published work: if the square matrices $A$ and $B$ commute and $f(x,y) = \det(xA - yB)$ then $f(B,A) = 0$.

## References

- Arthur Buchheim, Mathematical Notes, Messenger Math. 13, 62–66, 1884.
- Arthur Cayley, A Memoir on the Theory of Matrices, Philos. Trans. Roy. Soc. London 148, 17–37, 1858.
- Tony Crilly, Cayley’s Anticipation of a Generalised Cayley–Hamilton Theorem, Historia Mathematica 5, 211–219, 1978.

## Related Blog Posts

- What Is the Adjugate of a Matrix? (2020)
- What Is the Gerstenhaber Problem? (2020)
- What Is the Matrix Exponential? (2020)
- What Is the Square Root of a Matrix? (2020)

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.