The eigenvalues of Hermitian matrices satisfy a wide variety of inequalities. We present some of the most useful and explain their implications. Proofs are omitted, but as Parlett (1998) notes, the proofs of the Courant–Fischer, Weyl, and Cauchy results are all consequences of the elementary fact that if the sum of the dimensions of two subspaces of exceeds then the subspaces have a nontrivial intersection.
The eigenvalues of a Hermitian matrix are real and we order them . Note that in some references, such as Horn and Johnson (2013), the reverse ordering is used, with the largest eigenvalue. When it is necessary to specify what matrix is an eigenvalue of we write : the th largest eigenvalue of . All the following results also hold for symmetric matrices over .
The function is the quadratic form for evaluated on the unit sphere, since . As is Hermitian it has a spectral decomposition , where is unitary and . Then
from which is it clear that
with equality when is an eigenvector corresponding to and , respectively, This characterization of the extremal eigenvalues of as the extrema of is due to Lord Rayleigh (John William Strutt), and is called a Rayleigh quotient. The intermediate eigenvalues correspond to saddle points of .
The Courant–Fischer theorem (1905) states that every eigenvalue of a Hermitian matrix is the solution of both a min-max problem and a max-min problem over suitable subspaces of .
For a Hermitian ,
Note that the equalities are special cases of these characterizations.
In general there is no useful formula for the eigenvalues of a sum of Hermitian matrices. However, the Courant–Fischer theorem yields the upper and lower bounds
from which it follows that
This inequality shows that the eigenvalues of a Hermitian matrix are well conditioned under perturbation. We can rewrite the inequality in the symmetric form
If is positive semidefinite then (1) gives
while if is positive definite then strict inequality holds for all . These bounds are known as the Weyl monotonicity theorem.
Weyl’s inequalities (1912) bound the eigenvalues of in terms of those of and .
For Hermitian and ,
The Weyl inequalities yield much information about the effect of low rank perturbations. Consider a positive semidefinite rank- perturbation . Inequality (3) with gives
(which also follows from (1)). Inequality (3) with , combined with (2), gives
These inequalities confine each eigenvalue of to the interval between two adjacent eigenvalues of ; the eigenvalues of are said to interlace those of . The following figure illustrates the case , showing a possible configuration of the eigenvalues of and of .
A specific example, in MATLAB, is
>> n = 4; eig_orig = 5:5+n-1 >> D = diag(eig_orig); eig_pert = eig(D + ones(n))' eig_orig = 5 6 7 8 eig_pert = 5.2961e+00 6.3923e+00 7.5077e+00 1.0804e+01
Since and the trace is the sum of the eigenvalues, we can write
where the are nonnegative and sum to . If we greatly increase , the norm of the perturbation, then most of the increase in the eigenvalues is concentrated in the largest, since (5) bounds how much the smaller eigenvalues can change:
>> eig_pert = eig(D + 100*ones(n))' eig_pert = 5.3810e+00 6.4989e+00 7.6170e+00 4.0650e+02
More generally, if has positive eigenvalues and negative eigenvalues then (3) with gives
while (4) with gives
So the inertia of (the number of negative, zero, and positive eigenvalues) determines how far the eigenvalues can move as measured relative to the indexes of the eigenvalues of .
An important implication of the last two inequalities is for the case , for which we have
Exactly eigenvalues appear in one of these inequalities and appear in both. Therefore of the eigenvalues are equal to and so only eigenvalues can differ from . So perturbing the identity matrix by a Hermitian matrix of rank changes at most of the eigenvalues. (In fact, it changes exactly eigenvalues, as can be seen from a spectral decomposition.)
Finally, if has rank then and and so taking in (3) and in (4) gives
Cauchy Interlace Theorem
The Cauchy interlace theorem relates the eigenvalues of successive leading principal submatrices of a Hermitian matrix. We denote the leading principal submatrix of of order by .
For a Hermitian ,
The theorem says that the eigenvalues of interlace those of for all . Two immediate implications are that (a) if is Hermitian positive definite then so are all its leading principal submatrices and (b) appending a row and a column to a Hermitian matrix does not decrease the largest eigenvalue or increase the smallest eigenvalue.
Since eigenvalues are unchanged under symmetric permutations of the matrix, the theorem can be reformulated to say that the eigenvalues of any principal submatrix of order interlace those of . A generalization to principal submatrices of order is given in the next result.
If is a principal submatrix of order of a Hermitian then
It follows by taking to be a unit vector in the formula that for all . And of course the trace of is the sum of the eigenvalues: . These relations are the first and last in a sequence of inequalities relating sums of eigenvalues to sums of diagonal elements obtained by Schur in 1923.
For a Hermitian ,
where is the set of diagonal elements of arranged in decreasing order: .
These inequalities say that the vector of eigenvalues majorizes the ordered vector of diagonal elements.
An interesting special case is a correlation matrix, a symmetric positive semidefinite matrix with unit diagonal, for which the inequalities are
and . Here is an illustration in MATLAB.
>> n = 5; rng(1); A = gallery('randcorr',n); >> e = sort(eig(A)','descend'), partial_sums = cumsum(e) e = 2.2701e+00 1.3142e+00 9.5280e-01 4.6250e-01 3.6045e-04 partial_sums = 2.2701e+00 3.5843e+00 4.5371e+00 4.9996e+00 5.0000e+00
Ky Fan (1949) proved a majorization relation between the eigenvalues of , , and :
For , the inequality is the same as the upper bound of (1), and for it is an equality: .
For a Hermitian and a nonsingular , the transformation is a congruence transformation. Sylvester’s law of inertia says that congruence transformations preserve the inertia. A result of Ostrowski (1959) goes further by providing bounds on the ratios of the eigenvalues of the original and transformed matrices.
For a Hermitian and ,
If is unitary then and so Ostrowski’s theorem reduces to the fact that a congruence with a unitary matrix is a similarity transformation and so preserves eigenvalues. The theorem shows that the further is from being unitary the greater the potential change in the eigenvalues.
Ostrowski’s theorem can be generalized to the situation where is rectangular (Higham and Cheng, 1998).
The results we have described are strongly interrelated. For example, the Courant–Fischer theorem and the Cauchy interlacing theorem can be derived from each other, and Ostrowski’s theorem can be proved using the Courant–Fischer Theorem.
- Rajendra Bhatia, Linear Algebra to Quantum Cohomology: The Story of Alfred Horn’s Inequalities, Amer. Math. Monthly 108(4), 289–318, 2001.
- Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013. My review of the second edition.
- Nicholas J. Higham and Sheung Hun Cheng, Modifying the Inertia of Matrices Arising in Optimization, Linear Algebra Appl. 275–276, 261-279, 1998.
- Beresford Parlett, The Symmetric Eigenvalue Problem, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA 1998.