When a nonsingular $n\times n$ matrix $A$ is perturbed by a matrix of rank $k$, the inverse also undergoes a rank-$k$ perturbation. More precisely, if $E$ has rank $k$ and $B = A + E$ is nonsingular then the identity $A^{-1} - B^{-1} = A^{-1}(B - A)B^{-1} = A^{-1}EB^{-1}$ shows that

$$\mathrm{rank}(A^{-1} - B^{-1}) = \mathrm{rank}(A^{-1}EB^{-1}) = \mathrm{rank}(E) = k.$$

The Sherman–Morrison–Woodbury formula provides an explicit formula for the inverse of the perturbed matrix $B$.
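This rank identity is easy to check numerically. Here is a minimal NumPy sketch; the sizes and random data are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 2
A = rng.standard_normal((n, n))                                # generically nonsingular
E = rng.standard_normal((n, k)) @ rng.standard_normal((k, n))  # rank-k perturbation
B = A + E                                                      # assumed nonsingular

# rank(A^{-1} - B^{-1}) = rank(A^{-1} E B^{-1}) = rank(E) = k
print(np.linalg.matrix_rank(np.linalg.inv(A) - np.linalg.inv(B)))  # prints 2
```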
Sherman–Morrison Formula
We will begin with the simpler case of a rank-$1$ perturbation: $B = A + uv^*$, where $u$ and $v$ are $n$-vectors, and we consider first the case where $A = I$. We might expect that $(I + uv^*)^{-1} = I + \theta uv^*$ for some $\theta$ (consider a binomial expansion of the inverse). Multiplying out, we obtain

$$(I + uv^*)(I + \theta uv^*) = I + (1 + \theta + \theta v^*u)uv^*,$$

so the product equals the identity matrix when $\theta = -1/(1 + v^*u)$. The condition that $I + uv^*$ be nonsingular is $v^*u \ne -1$ (as can also be seen from $\det(I + uv^*) = 1 + v^*u$, derived in What Is a Block Matrix?). So

$$(I + uv^*)^{-1} = I - \frac{uv^*}{1 + v^*u}.$$

For the general case write $A + uv^* = A(I + A^{-1}uv^*)$. Inverting this equation and applying the previous result gives

$$(A + uv^*)^{-1} = A^{-1} - \frac{A^{-1}uv^*A^{-1}}{1 + v^*A^{-1}u}, \qquad (1)$$

subject to the nonsingularity condition $v^*A^{-1}u \ne -1$. This is known as the Sherman–Morrison formula. It explicitly identifies the rank-$1$ change to the inverse.
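As a numerical sanity check of (1), here is a short NumPy sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
u = rng.standard_normal(n)
v = rng.standard_normal(n)

Ainv = np.linalg.inv(A)
# Sherman-Morrison: (A + u v^*)^{-1} = A^{-1} - (A^{-1} u)(v^* A^{-1}) / (1 + v^* A^{-1} u)
B_inv = Ainv - np.outer(Ainv @ u, v @ Ainv) / (1 + v @ Ainv @ u)

print(np.allclose(B_inv, np.linalg.inv(A + np.outer(u, v))))  # True
```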
As an example, if we take $u = te_i$ and $v = e_j$ (where $e_k$ is the $k$th column of the identity matrix) then, writing $A^{-1} = (\alpha_{ij})$, we have

$$(A + te_ie_j^*)^{-1} = A^{-1} - \frac{tA^{-1}e_ie_j^*A^{-1}}{1 + t\alpha_{ji}}.$$

The Frobenius norm of the change to $A^{-1}$ is

$$\frac{|t|\,\|A^{-1}e_i\|_2\,\|e_j^*A^{-1}\|_2}{|1 + t\alpha_{ji}|}.$$

If $t$ is sufficiently small then this quantity is approximately maximized for $i$ and $j$ such that the product of the norms of the $i$th column and $j$th row of $A^{-1}$ is maximized. For an upper triangular matrix, $i = n$ and $j = 1$ are likely to give the maximum, which means that the inverse of an upper triangular matrix is likely to be most sensitive to perturbations in the $(n,1)$ element of the matrix.
To illustrate, we can take an upper triangular matrix $T$ and compute the matrix whose $(i,j)$ element is $\|(T + te_ie_j^*)^{-1} - T^{-1}\|_F$ for a fixed small $t$. As our analysis suggests, the $(n,1)$ entry of this matrix is the largest, so the $(n,1)$ element of $T$ is the most sensitive to perturbation.
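Here is a sketch of such an experiment in NumPy. The particular matrix $T$ (chosen so that $T^{-1}$ has growing off-diagonal entries) and the value of $t$ are illustrative choices:

```python
import numpy as np

n, t = 4, 1e-3
T = np.eye(n) - np.triu(np.ones((n, n)), 1)   # illustrative upper triangular matrix
Tinv = np.linalg.inv(T)

# S[i, j] = || (T + t e_i e_j^T)^{-1} - T^{-1} ||_F
S = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = t
        S[i, j] = np.linalg.norm(np.linalg.inv(T + E) - Tinv, "fro")

# The largest entry is in position (n, 1), as the analysis predicts.
print(np.unravel_index(S.argmax(), S.shape))  # (3, 0) in 0-based indexing
```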
Sherman–Morrison–Woodbury Formula
Now consider a perturbation $UV^*$, where $U$ and $V$ are $n\times k$. This perturbation has rank at most $k$, and its rank is $k$ if $U$ and $V$ are both of rank $k$. If $I + V^*A^{-1}U$ is nonsingular then $A + UV^*$ is nonsingular and

$$(A + UV^*)^{-1} = A^{-1} - A^{-1}U(I + V^*A^{-1}U)^{-1}V^*A^{-1}, \qquad (2)$$

which is the Sherman–Morrison–Woodbury formula. The significance of this formula is that $I + V^*A^{-1}U$ is $k\times k$, so if $k \ll n$ and $A^{-1}$ is known then it is much cheaper to evaluate the right-hand side than to invert $A + UV^*$ directly. In practice, of course, we rarely invert matrices, but rather exploit factorizations of them. If we have an LU factorization of $A$ then we can use it in conjunction with the Sherman–Morrison–Woodbury formula to solve $(A + UV^*)x = b$ in $O(kn^2)$ flops, as opposed to the $O(n^3)$ flops required to factorize $A + UV^*$ from scratch.
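The following NumPy/SciPy sketch illustrates this solution process, reusing an LU factorization of $A$; the sizes and data are arbitrary:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(2)
n, k = 500, 3
A = rng.standard_normal((n, n))
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))
b = rng.standard_normal(n)

lu = lu_factor(A)                 # O(n^3) once; each later solve with A is O(n^2)

# Solve (A + U V^*) x = b via Sherman-Morrison-Woodbury:
# x = A^{-1} b - A^{-1} U (I + V^* A^{-1} U)^{-1} V^* A^{-1} b
Ainv_b = lu_solve(lu, b)          # O(n^2)
Ainv_U = lu_solve(lu, U)          # O(k n^2)
S = np.eye(k) + V.T @ Ainv_U      # k-by-k matrix, cheap to form and solve with
x = Ainv_b - Ainv_U @ np.linalg.solve(S, V.T @ Ainv_b)

print(np.allclose((A + U @ V.T) @ x, b))  # True
```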
The Sherman–Morrison–Woodbury formula is straightforward to verify, by showing that the product of the two sides is the identity matrix. How can the formula be derived in the first place? Consider any two matrices $F$ and $G$ such that $FG$ and $GF$ are both defined. The associative law for matrix multiplication gives $F(GF) = (FG)F$, or $(I + FG)F = F(I + GF)$, which can be written as $F(I + GF)^{-1} = (I + FG)^{-1}F$ when $I + FG$ and $I + GF$ are nonsingular. Postmultiplying by $G$ gives

$$F(I + GF)^{-1}G = (I + FG)^{-1}FG = I - (I + FG)^{-1},$$

that is,

$$(I + FG)^{-1} = I - F(I + GF)^{-1}G.$$

Setting $F = U$ and $G = V^*$ gives the special case of the Sherman–Morrison–Woodbury formula with $A = I$, and the general formula follows from $A + UV^* = A(I + A^{-1}UV^*)$.
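This identity, which replaces an $n\times n$ inverse by a $k\times k$ one, is easy to check numerically for rectangular $F$ and $G$; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 7, 3
F = rng.standard_normal((n, k))
G = rng.standard_normal((k, n))

# (I + FG)^{-1} = I - F (I + GF)^{-1} G
lhs = np.linalg.inv(np.eye(n) + F @ G)
rhs = np.eye(n) - F @ np.linalg.inv(np.eye(k) + G @ F) @ G
print(np.allclose(lhs, rhs))  # True
```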
General Formula
We will give a different derivation of an even more general formula using block matrices. Consider the block matrix

$$X = \begin{bmatrix} A & U \\ V^* & -W^{-1} \end{bmatrix},$$

where $A$ is $n\times n$, $U$ and $V$ are $n\times k$, and $W$ is $k\times k$. We will obtain a formula for $(A + UWV^*)^{-1}$ by looking at $X^{-1}$.

It is straightforward to verify that

$$\begin{bmatrix} A & U \\ V^* & -W^{-1} \end{bmatrix} = \begin{bmatrix} I & 0 \\ V^*A^{-1} & I \end{bmatrix} \begin{bmatrix} A & 0 \\ 0 & -(W^{-1} + V^*A^{-1}U) \end{bmatrix} \begin{bmatrix} I & A^{-1}U \\ 0 & I \end{bmatrix}.$$

Hence

$$X^{-1} = \begin{bmatrix} I & -A^{-1}U \\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & 0 \\ 0 & -(W^{-1} + V^*A^{-1}U)^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\ -V^*A^{-1} & I \end{bmatrix}.$$

In the $(1,1)$ block of $X^{-1}$, namely $A^{-1} - A^{-1}U(W^{-1} + V^*A^{-1}U)^{-1}V^*A^{-1}$, we see the right-hand side of a Sherman–Morrison–Woodbury-like formula, but it is not immediately clear how this relates to $(A + UWV^*)^{-1}$. Let

$$P = \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix},$$

and note that $P^{-1} = P$. Then

$$PXP = \begin{bmatrix} -W^{-1} & V^* \\ U & A \end{bmatrix},$$

and applying the above formula (appropriately renaming the blocks) gives, with $\times$ denoting a block whose value does not matter,

$$(PXP)^{-1} = PX^{-1}P = \begin{bmatrix} \times & \times \\ \times & (A + UWV^*)^{-1} \end{bmatrix}.$$

Hence $(X^{-1})_{11} = (A + UWV^*)^{-1}$. Equating our two formulas for the $(1,1)$ block of $X^{-1}$ gives

$$(A + UWV^*)^{-1} = A^{-1} - A^{-1}U(W^{-1} + V^*A^{-1}U)^{-1}V^*A^{-1}, \qquad (3)$$

provided that $W^{-1} + V^*A^{-1}U$ is nonsingular.
To see one reason why this formula is useful, suppose that the matrix $A$ and its perturbation are symmetric and we wish to preserve symmetry in our formulas. The Sherman–Morrison–Woodbury formula requires us to write the perturbation as $UU^*$, so the perturbation must be positive semidefinite. In (3), however, we can write an arbitrary symmetric perturbation as $UWU^*$, with $W$ symmetric but possibly indefinite, and obtain a symmetric formula.
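To illustrate, the following sketch applies (3) with a symmetric positive definite $A$ and a symmetric but indefinite $W$ (both are arbitrary test data); the computed update is itself symmetric:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 6, 2
A0 = rng.standard_normal((n, n))
A = A0 + A0.T + 10 * np.eye(n)    # symmetric positive definite test matrix
U = rng.standard_normal((n, k))
W = np.diag([1.0, -1.0])          # symmetric but indefinite weight matrix

Ainv = np.linalg.inv(A)
# (A + U W U^*)^{-1} = A^{-1} - A^{-1} U (W^{-1} + U^* A^{-1} U)^{-1} U^* A^{-1}
M = np.linalg.inv(W) + U.T @ Ainv @ U
B_inv = Ainv - Ainv @ U @ np.linalg.solve(M, U.T @ Ainv)

print(np.allclose(B_inv, np.linalg.inv(A + U @ W @ U.T)))  # True
print(np.allclose(B_inv, B_inv.T))                         # symmetry is preserved
```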
The matrix $-(W^{-1} + V^*A^{-1}U)$ is the Schur complement of $A$ in $X$. Consequently the inversion formula (3) is intimately connected with the theory of Schur complements. By manipulating the block matrices in different ways it is possible to derive variations of (3). We mention just the simple rewriting

$$(A + UWV^*)^{-1} = A^{-1} - A^{-1}UW(I + V^*A^{-1}UW)^{-1}V^*A^{-1}, \qquad (4)$$

which is valid even if $W$ is singular, as long as $I + V^*A^{-1}UW$ is nonsingular. Note that the formula is not symmetric when $V = U$ and $W = W^*$. This variant can also be obtained by replacing $U$ by $UW$ in the Sherman–Morrison–Woodbury formula (2).
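A quick numerical check of (4) with a singular $W$, where (3) cannot be used (random test data):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 6, 3
A = rng.standard_normal((n, n))
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))
W = np.diag([2.0, 1.0, 0.0])      # singular weight matrix: W^{-1} does not exist

Ainv = np.linalg.inv(A)
# (A + U W V^*)^{-1} = A^{-1} - A^{-1} U W (I + V^* A^{-1} U W)^{-1} V^* A^{-1}
S = np.eye(k) + V.T @ Ainv @ U @ W
B_inv = Ainv - Ainv @ U @ W @ np.linalg.solve(S, V.T @ Ainv)

print(np.allclose(B_inv, np.linalg.inv(A + U @ W @ V.T)))  # True
```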
Historical Note
Formulas for the change in a matrix inverse under low-rank perturbations have a long history. They have been rediscovered on multiple occasions, sometimes appearing without comment within other formulas. Equation (3) is given by Duncan (1944), which is the earliest appearance in print that I am aware of. For discussions of the history of these formulas see Henderson and Searle (1981) or Puntanen and Styan (2005).
References
This is a minimal set of references, which contain further useful references within.
- W. J. Duncan, LXXVIII. Some Devices for the Solution of Large Sets of Simultaneous Linear Equations, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 35, 660–670, 1944.
- H. V. Henderson and S. R. Searle, On Deriving the Inverse of a Sum of Matrices, SIAM Rev. 23(1), 53–60, 1981.
- S. Puntanen and G. P. H. Styan, Historical Introduction: Issai Schur and the Early Development of the Schur Complement, pages 1–16 in F. Zhang, ed., The Schur Complement and Its Applications, Springer-Verlag, New York, 2005.
Related Blog Posts
- What Is a Block Matrix? (2020)
This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.