The eigenvalues of Hermitian matrices satisfy a wide variety of inequalities. We present some of the most useful and explain their implications. Proofs are omitted, but as Parlett (1998) notes, the proofs of the Courant–Fischer, Weyl, and Cauchy results are all consequences of the elementary fact that if the sum of the dimensions of two subspaces of exceeds
then the subspaces have a nontrivial intersection.
The eigenvalues of a Hermitian matrix are real and we order them
. Note that in some references, such as Horn and Johnson (2013), the reverse ordering is used, with
the largest eigenvalue. When it is necessary to specify what matrix
is an eigenvalue of we write
: the
th largest eigenvalue of
. All the following results also hold for symmetric matrices over
.
Quadratic Form
The function is the quadratic form
for
evaluated on the unit sphere, since
. As
is Hermitian it has a spectral decomposition
, where
is unitary and
. Then
from which is it clear that
with equality when is an eigenvector corresponding to
and
, respectively, This characterization of the extremal eigenvalues of
as the extrema of
is due to Lord Rayleigh (John William Strutt), and
is called a Rayleigh quotient. The intermediate eigenvalues correspond to saddle points of
.
Courant–Fischer Theorem
The Courant–Fischer theorem (1905) states that every eigenvalue of a Hermitian matrix is the solution of both a min-max problem and a max-min problem over suitable subspaces of
.
Theorem (Courant–Fischer).
For a Hermitian
,
Note that the equalities are special cases of these characterizations.
In general there is no useful formula for the eigenvalues of a sum of Hermitian matrices. However, the Courant–Fischer theorem yields the upper and lower bounds
from which it follows that
This inequality shows that the eigenvalues of a Hermitian matrix are well conditioned under perturbation. We can rewrite the inequality in the symmetric form
If is positive semidefinite then (1) gives
while if is positive definite then strict inequality holds for all
. These bounds are known as the Weyl monotonicity theorem.
Weyl’s Inequalities
Weyl’s inequalities (1912) bound the eigenvalues of in terms of those of
and
.
Theorem (Weyl).
For Hermitian
and
,
The Weyl inequalities yield much information about the effect of low rank perturbations. Consider a positive semidefinite rank- perturbation
. Inequality (3) with
gives
(which also follows from (1)). Inequality (3) with , combined with (2), gives
These inequalities confine each eigenvalue of to the interval between two adjacent eigenvalues of
; the eigenvalues of
are said to interlace those of
. The following figure illustrates the case
, showing a possible configuration of the eigenvalues
of
and
of
.
A specific example, in MATLAB, is
>> n = 4; eig_orig = 5:5+n-1 >> D = diag(eig_orig); eig_pert = eig(D + ones(n))' eig_orig = 5 6 7 8 eig_pert = 5.2961e+00 6.3923e+00 7.5077e+00 1.0804e+01
Since and the trace is the sum of the eigenvalues, we can write
where the are nonnegative and sum to
. If we greatly increase
, the norm of the perturbation, then most of the increase in the eigenvalues is concentrated in the largest, since (5) bounds how much the smaller eigenvalues can change:
>> eig_pert = eig(D + 100*ones(n))' eig_pert = 5.3810e+00 6.4989e+00 7.6170e+00 4.0650e+02
More generally, if has
positive eigenvalues and
negative eigenvalues then (3) with
gives
while (4) with gives
So the inertia of (the number of negative, zero, and positive eigenvalues) determines how far the eigenvalues can move as measured relative to the indexes of the eigenvalues of
.
An important implication of the last two inequalities is for the case , for which we have
Exactly eigenvalues appear in one of these inequalities and
appear in both. Therefore
of the eigenvalues are equal to
and so only
eigenvalues can differ from
. So perturbing the identity matrix by a Hermitian matrix of rank
changes at most
of the eigenvalues. (In fact, it changes exactly
eigenvalues, as can be seen from a spectral decomposition.)
Finally, if has rank
then
and
and so taking
in (3) and
in (4) gives
Cauchy Interlace Theorem
The Cauchy interlace theorem relates the eigenvalues of successive leading principal submatrices of a Hermitian matrix. We denote the leading principal submatrix of of order
by
.
Theorem (Cauchy).
For a Hermitian
,
The theorem says that the eigenvalues of interlace those of
for all
. Two immediate implications are that (a) if
is Hermitian positive definite then so are all its leading principal submatrices and (b) appending a row and a column to a Hermitian matrix does not decrease the largest eigenvalue or increase the smallest eigenvalue.
Since eigenvalues are unchanged under symmetric permutations of the matrix, the theorem can be reformulated to say that the eigenvalues of any principal submatrix of order interlace those of
. A generalization to principal submatrices of order
is given in the next result.
Theorem.
If
is a principal submatrix of order
of a Hermitian
then
Majorization Results
It follows by taking to be a unit vector
in the formula
that
for all
. And of course the trace of
is the sum of the eigenvalues:
. These relations are the first and last in a sequence of inequalities relating sums of eigenvalues to sums of diagonal elements obtained by Schur in 1923.
Theorem (Schur).
For a Hermitian
,
where
is the set of diagonal elements of
arranged in decreasing order:
.
These inequalities say that the vector of eigenvalues majorizes the ordered vector
of diagonal elements.
An interesting special case is a correlation matrix, a symmetric positive semidefinite matrix with unit diagonal, for which the inequalities are
and . Here is an illustration in MATLAB.
>> n = 5; rng(1); A = gallery('randcorr',n); >> e = sort(eig(A)','descend'), partial_sums = cumsum(e) e = 2.2701e+00 1.3142e+00 9.5280e-01 4.6250e-01 3.6045e-04 partial_sums = 2.2701e+00 3.5843e+00 4.5371e+00 4.9996e+00 5.0000e+00
Ky Fan (1949) proved a majorization relation between the eigenvalues of ,
, and
:
For , the inequality is the same as the upper bound of (1), and for
it is an equality:
.
Ostrowski’s Theorem
For a Hermitian and a nonsingular
, the transformation
is a congruence transformation. Sylvester’s law of inertia says that congruence transformations preserve the inertia. A result of Ostrowski (1959) goes further by providing bounds on the ratios of the eigenvalues of the original and transformed matrices.
Theorem (Ostrowski).
For a Hermitian
and
,
where
.
If is unitary then
and so Ostrowski’s theorem reduces to the fact that a congruence with a unitary matrix is a similarity transformation and so preserves eigenvalues. The theorem shows that the further
is from being unitary the greater the potential change in the eigenvalues.
Ostrowski’s theorem can be generalized to the situation where is rectangular (Higham and Cheng, 1998).
Interrelations
The results we have described are strongly interrelated. For example, the Courant–Fischer theorem and the Cauchy interlacing theorem can be derived from each other, and Ostrowski’s theorem can be proved using the Courant–Fischer Theorem.
References
- Rajendra Bhatia, Linear Algebra to Quantum Cohomology: The Story of Alfred Horn’s Inequalities, Amer. Math. Monthly 108(4), 289–318, 2001.
- Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013. My review of the second edition.
- Nicholas J. Higham and Sheung Hun Cheng, Modifying the Inertia of Matrices Arising in Optimization, Linear Algebra Appl. 275–276, 261-279, 1998.
- Beresford Parlett, The Symmetric Eigenvalue Problem, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA 1998.
This is a very nice overview; thanks! Tangential to what Parlett is stating, most if not even all of these results can be concluded from the Eckart–Young–Mirsky theorem in the spectral norm.
Can you elaborate? I mention Eckart–Young–Mirsky in https://nhigham.com/2021/02/02/what-is-a-unitarily-invariant-norm/