When Cayley introduced matrix algebra in 1858, he did much more than merely arrange numbers in a rectangular array. His definitions of addition, multiplication, and inversion produced an algebraic structure that has proved to be immensely useful, and which still holds many mysteries today.
An matrix has
parameters, so the vector space
of real matrices has dimension
, with a basis given by the matrices
, where
is the vector that is zero except for a
in the
th entry. In his original paper, Cayley noticed the important property that the powers of a particular matrix
can never span
. The Cayley–Hamilton theorem says that
satisfies its own characteristic equation, that is,
where
is the characteristic polynomial of
. This means that
can be expressed as a linear combination of
,
, …,
, so the powers of
span a vector space of dimension at most
.
Gerstenhaber proved a generalization of this property in 1961: if and
are two commuting
matrices then the matrices
, for all nonnegative
and
, generate a vector space of dimension at most
. This result is much more difficult to prove than the Cayley–Hamilton theorem. Gerstenhaber’s proof was based on algebraic geometry, but purely matrix-theoretic proofs have been given.
A natural question, called the Gerstenhaber problem, is: does this result extend to three matrices, that is, does the vector space
have dimension at most for any
matrices
,
, and
that commute with each other? (We can limit the powers to
by the Cayley–Hamilton theorem.) The problem is defined over any field, but here I focus on the reals.
Before considering the three matrix case let us note that the answer is known to be “no” for four commuting matrices
,
,
, and
. Indeed let
and note that all possible products of two of these matrices are zero, so the matrices commute pairwise. Yet are clearly five linearly independent matrices. Hence Gerstenhaber’s result does not extend to four matrices. It also does not extend to five or more matrices because it is known that the failure of the result for one value of
implies failure for all larger
. The question, then, is whether the three matrix case is like the two matrix case or the case for four or more matrices.
A great deal of effort has been put into proving or disproving the Gerstenhaber problem, so far without success. Here are two known facts.
- The result holds for all
.
- By a 1905 result of Schur, the dimension of
is at most
, which is less than
but still much bigger than
. (Here,
is the floor function.)
One approach to investigating this problem is to look for a counterexample computationally. For some , choose three commuting
matrices
,
, and
, select
monomials
form the matrix
where converts a matrix into a vector by stacking the columns on top of each other, then compute
, which is a lower bound on
, and check whether it is greater than
.
This simple-minded approach has some obvious difficulties. How do we choose ,
, and
? How do we choose the powers? How do we avoid overflow and underflow and compute a reliable rank, given that we might be dealing with large powers of large matrices?
Holbrook and O’Meara (2020), who have written several papers on the Gerstenhaber problem, which they call the GP problem, state that they “firmly believe the GP will turn out to have a negative answer” and they have developed a sophisticated approach to searching for a case with , which they call a “Eureka”. They first note that
,
and
can be assumed to be nilpotent. This means that all three matrices must be defective, because a nondefective nilpotent matrix is zero. Next they note that since commuting matrices are simultaneously unitarily triangularizable,
,
, and
can be assumed to be strictly upper triangular. Then they note that
can be assumed to be in Weyr canonical form.
The Weyr canonical form is a dual of the Jordan canonical form in which the Jordan matrix is replaced by a Weyr matrix, which is a direct sum of basic Weyr matrices. A basic Weyr matrix has one distinct eigenvalue and is upper block bidiagonal with diagonal blocks that are multiples of the identity matrix and superdiagonal blocks that are rectangular identity matrices. The difference between Jordan and Weyr matrices is illustrated by the example
where the Jordan matrix and Weyr matrix
are related by
for some permutation matrix
. There is an elegant way of relating the block sizes in the Jordan and Weyr matrices via a Young diagram. Form an array whose
th column contains
dots, where
is the size of the
th diagonal block of
:
Then the number of dots in the th row is the size of the
th diagonal block in the Weyr form.
The reason for using the Weyr form is that whereas any matrix that commutes with a Jordan matrix has Toeplitz blocks, any matrix that commutes with a Weyr matrix is block upper triangular and is uniquely determined by the first block row. By choosing a Weyr form for , commuting matrices
and
can be built up in a systematic way.
Thanks to a result of O’Meara (2020) it suffices to compute modulo a prime , so the computations can be done in exact arithmetic, removing the need to worry about rounding errors or overflow.
Holbrook and O’Meara have mostly tried matrices up to dimension 50, but they feel that “the Loch Ness monster probably lives in deeper water, closer to ”. Their MATLAB codes (40 nicely commented but not optimized M-files) are available on request from the address given in their preprint. If you have the time to spare you might want to experiment with the codes and try to find a Eureka.
References
This is a minimal set of references, which contain further useful references within.
- Robert M. Corless and Steven E. Thornton, The Weyr Canonical Form, 2016 (PDF poster).
- John Holbrook and Kevin O’Meara, A Computing Strategy and Programs to Resolve the Gerstenhaber Problem for Commuting Triples of Matrices, ArXiv:2006.085882020, 2020.
- John Holbrook and Kevin O’Meara, Commuting Triples of Matrices Vs the Computer, IMAGE (The Bulletin of the International Linear Algebra Society), to appear.
- Kevin O’Meara, The Gerstenhaber Problem for Commuting Triples of Matrices is “Decidable”, Comm. Algebra 48, 453–466, 2020.
- Kevin O’Meara, John Clark, and Charles Vinsonhaler, Advanced Topics in Linear Algebra. Weaving Matrix Problems Through the Weyr Form, Oxford University Press, Oxford, UK. 2011. Chapter 5.
- B. A. Sethuraman, the Algebra Generated by Three Commuting Matrices, Math. Newsl. 21, 62–67, 2011.
This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.
Hello Nick. I confess I do not like your counterexample of 4 matrices. It seems to me the only thing going on in it is that we have a convention which breaks some of the patterns of algebra in which we assign the value of X^0 to be I, the identity matrix, when X is a nonzero square matrix. I should think this ought to be explicitly excluded from consideration here because this convention does not respect the nature of the properties under consideration. One does not derive that X^0 = I from matrix algebra — it is stipulated by fiat because it is consistent with other algebraic facts.
The same would be true for sets of 3 matrices except for the combinatorial constraints on matrices of order 0, which is a “gotcha” on sloppily oversimplified statements of the result. I don’t know the original statement but perhaps it is the original statement which is oversimplified and doesn’t anticipate such degenerate constructions.
The fact that $A^0 = I$ is mentioned in Cayley’s original 1858 paper on matrix algebra. It must hold, as a consequence of the definition of inverse and index laws: $I = A^{-1} A = A^{-1 +1} = A^0$.
A more abstract statement would be that the subalgebra generated by the matrices has the given dimension. And the unit is always included in a subalgebra, by definition.
Shouldn’t the $(3,3)$ entry in $J(\lambda)$ be $\lambda$ and not $1$?
Thanks – corrected.