What Is a Generalized Inverse?

The matrix inverse is defined only for square nonsingular matrices. A generalized inverse is an extension of the concept of inverse that applies to square singular matrices and rectangular matrices. There are many definitions of generalized inverses, all of which reduce to the usual inverse when the matrix is square and nonsingular.

A large class of generalized inverses of an m \times n matrix A can be defined in terms of the Moore–Penrose conditions, in which X is n\times m:

\begin{array}{rcrc}  \mathrm{(1)}   &    AXA = A,  \; & \mathrm{(2)} & XAX=X,\\  \mathrm{(3)} & (AX)^* = AX, \; & \mathrm{(4)} & (XA)^* = XA. \end{array}

Here, the superscript * denotes the conjugate transpose. A 1-inverse is any X satisfying condition (1), a (1,3)-inverse is any X satisfying conditions (1) and (3), and so on for any subset of the four conditions.

Condition (1) implies that if Ax = b then A(Xb) = A (XAx) = AXAx = Ax = b, so Xb solves the equation, meaning that any 1-inverse is an equation-solving inverse. Condition (2) implies that X = 0 if A = 0.
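The equation-solving property is easy to check numerically. The following is a minimal sketch in Python with NumPy, not part of the original post. It assumes the standard construction X = A^+ + Z - A^+AZAA^+ (with Z arbitrary) as a convenient way to generate a 1-inverse that differs from the pseudoinverse; the matrix, the seed, and the variable names are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small singular matrix (rank 2), chosen for illustration.
A = np.array([[1.0, -1.0, -1.0],
              [0.0,  0.0, -1.0],
              [0.0,  0.0,  0.0]])

A_pinv = np.linalg.pinv(A)

# Assumed construction: for any Z, X = A^+ + Z - A^+ A Z A A^+ satisfies AXA = A.
Z = rng.standard_normal((3, 3))
X = A_pinv + Z - A_pinv @ A @ Z @ A @ A_pinv

# Condition (1): AXA = A.
cond1 = np.allclose(A @ X @ A, A)

# A consistent system: b lies in the range of A by construction.
x0 = np.array([1.0, -1.0, 2.0])
b = A @ x0

# Xb is a solution of Ax = b, although it need not equal x0.
x = X @ b
solves = np.allclose(A @ x, b)

print(cond1, solves)
```

Note that Xb is only guaranteed to solve the system when b is in the range of A; for an inconsistent system condition (1) alone says nothing about the residual.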

A (1,3)-inverse can be shown to provide a least squares solution to an inconsistent linear system. A (1,4)-inverse can be shown to provide the minimum 2-norm solution of a consistent linear system (where the 2-norm is defined by \|x\|_2 = (x^*x)^{1/2}).
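Both properties can be illustrated with a short NumPy sketch. It is not from the original post, and it assumes two standard constructions: X = A^+ + (I - A^+A)Z is a (1,3)-inverse and X = A^+ + Z(I - AA^+) is a (1,4)-inverse, for any Z. The matrices and right-hand sides are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[1.0, -1.0, -1.0],
              [0.0,  0.0, -1.0],
              [0.0,  0.0,  0.0]])
I = np.eye(3)
A_pinv = np.linalg.pinv(A)
Z = rng.standard_normal((3, 3))

# Assumed (1,3)-inverse: A @ X13 equals A @ A_pinv, which is Hermitian.
X13 = A_pinv + (I - A_pinv @ A) @ Z
# Assumed (1,4)-inverse: X14 @ A equals A_pinv @ A, which is Hermitian.
X14 = A_pinv + Z @ (I - A @ A_pinv)

# Inconsistent system: b has a component outside the range of A.
b = np.array([1.0, 2.0, 3.0])
x_ls = X13 @ b
# A least squares solution satisfies the normal equations A^*(Ax - b) = 0.
normal_eq_residual = A.T @ (A @ x_ls - b)

# Consistent system: c = A @ x0, so x0 is one solution among many.
x0 = np.array([1.0, 1.0, 1.0])
c = A @ x0
x_min = X14 @ c

print(np.linalg.norm(normal_eq_residual))
print(np.linalg.norm(x_min), np.linalg.norm(x0))
```

The first printed value should be at rounding-error level, and the norm of x_min should not exceed that of x0.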

In general, a matrix satisfying only one, two, or three of the Moore–Penrose conditions is not unique. But there is a unique matrix satisfying all four of the conditions, and it is called the Moore–Penrose pseudoinverse, denoted by A^+ or A^{\dagger}. For any system of linear equations Ax = b, x = A^+b minimizes \|Ax - b\|_2 and has the minimum 2-norm over all minimizers.
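As an illustrative check (a sketch, not part of the original post), the following Python code uses NumPy's `np.linalg.pinv` and `np.linalg.lstsq` to confirm the four conditions and the minimum-norm least squares property for one small singular matrix. The helper name `penrose_conditions` and the data are hypothetical.

```python
import numpy as np

A = np.array([[1.0, -1.0, -1.0],
              [0.0,  0.0, -1.0],
              [0.0,  0.0,  0.0]])
b = np.array([1.0, 2.0, 3.0])   # not in the range of A: inconsistent system

X = np.linalg.pinv(A)

def penrose_conditions(A, X):
    """Return booleans for Moore-Penrose conditions (1)-(4), up to rounding."""
    AX, XA = A @ X, X @ A
    return (np.allclose(AX @ A, A),
            np.allclose(XA @ X, X),
            np.allclose(AX.conj().T, AX),
            np.allclose(XA.conj().T, XA))

checks = penrose_conditions(A, X)

# x = A^+ b should agree with the minimum-norm least squares solution.
x_pinv = X @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# Adding a null vector of A leaves the residual unchanged but increases the norm.
null_vec = np.array([1.0, 1.0, 0.0])   # A @ null_vec == 0
x_other = x_pinv + null_vec

print(checks)
print(np.linalg.norm(x_pinv), np.linalg.norm(x_other))
```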

The pseudoinverse can be expressed in terms of the singular value decomposition (SVD). If A = U\Sigma V^* is an SVD, where the m\times m matrix U and n\times n matrix V are unitary, and the m\times n matrix \Sigma = \mathrm{diag}(\sigma_1,\dots , \sigma_p), p = \min(m,n), has \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots =\sigma_p = 0 (so that \mathrm{rank}(A) = r), then

A^+ = V\mathrm{diag}(\sigma_1^{-1},\dots,\sigma_r^{-1},0,\dots,0)U^*.

Here, the diagonal matrix is n\times m. In MATLAB, the function pinv computes A^+ using this formula. If \mathrm{rank}(A) = n then the concise formula A^+ = (A^*A)^{-1}A^* holds.
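The SVD formula translates almost directly into code. The sketch below is written in Python with NumPy rather than MATLAB and is not the implementation of any library's pinv. The function name `pinv_via_svd` and the relative cutoff `rtol`, which decides which computed singular values are treated as zero, are assumptions made for illustration.

```python
import numpy as np

def pinv_via_svd(A, rtol=1e-12):
    """Pseudoinverse from the SVD.

    rtol is an assumed cutoff: singular values not exceeding
    rtol * sigma_1 are treated as zero.
    """
    # Economy-size SVD: U is m x p, s has length p, Vh is p x n, p = min(m, n).
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    keep = s > rtol * s.max(initial=0.0)
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # V * diag(s_inv) * U^*, an n x m matrix.
    return (Vh.conj().T * s_inv) @ U.conj().T

rng = np.random.default_rng(2)

# A random 5 x 3 matrix has full column rank with probability 1.
B = rng.standard_normal((5, 3))
B_plus = pinv_via_svd(B)

# For full column rank, A^+ = (A^* A)^{-1} A^*; a linear solve avoids an explicit inverse.
normal_eq_form = np.linalg.solve(B.conj().T @ B, B.conj().T)

print(np.allclose(B_plus, normal_eq_form))
```

The choice of cutoff matters in practice: in floating-point arithmetic the trailing singular values of a rank-deficient matrix are rarely exactly zero, so some tolerance is needed to decide the numerical rank.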

For square matrices, the Drazin inverse is the unique matrix A^D such that

A^D A A^D = A^D, \quad A A ^D = A^D A, \quad        A^{k+1} A^D = A^k,

where k = \mathrm{index}(A). The first condition is the same as the second of the Moore–Penrose conditions, but the second and third have a different flavour. The index of a matrix A is the smallest nonnegative integer k such that \mathrm{rank}(A^k) = \mathrm{rank}(A^{k+1}); it is characterized as the dimension of the largest Jordan block of A with eigenvalue zero.
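The rank-based definition of the index suggests a simple computation for small examples. This is a sketch in Python with NumPy, with a hypothetical function name `matrix_index`. It relies on numerical rank decisions from `np.linalg.matrix_rank`, which can be unreliable for ill-conditioned matrices, so it should be read as an illustration of the definition and not as a robust algorithm.

```python
import numpy as np

def matrix_index(A, tol=None):
    """Smallest k >= 0 with rank(A^k) == rank(A^(k+1)), using numerical ranks."""
    n = A.shape[0]
    power = np.eye(n, dtype=A.dtype)   # A^0
    rank_prev = n                      # rank(A^0) = n
    for k in range(n + 1):
        power = power @ A              # power is now A^(k+1)
        rank_next = np.linalg.matrix_rank(power, tol=tol)
        if rank_next == rank_prev:
            return k
        rank_prev = rank_next
    # In exact arithmetic the loop always returns, because index(A) <= n.
    return n

A = np.array([[1.0, -1.0, -1.0],
              [0.0,  0.0, -1.0],
              [0.0,  0.0,  0.0]])

# A single 3 x 3 Jordan block with eigenvalue zero.
J = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])

print(matrix_index(A), matrix_index(J), matrix_index(np.eye(3)))
```

A nonsingular matrix has index 0, and a single nilpotent Jordan block of dimension 3 has index 3, in agreement with the Jordan block characterization.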

If \mathrm{index}(A)=1 then A^D is also known as the group inverse of A and is denoted by A^\#. The Drazin inverse is an equation-solving inverse precisely when \mathrm{index}(A)\le 1, for then AA^DA=A, which is the first of the Moore–Penrose conditions.

The Drazin inverse can be represented explicitly as follows. If

A = P \begin{bmatrix}                B & 0 \\                0 & N               \end{bmatrix} P^{-1},

where P and B are nonsingular and N has only zero eigenvalues, then

A^D = P \begin{bmatrix}                 B^{-1} & 0 \\                 0      & 0                 \end{bmatrix} P^{-1}.

Here are the pseudoinverse and the Drazin inverse for a particular matrix with index 2:

A = \left[\begin{array}{rrr} 1 & -1 & -1\\[3pt] 0 & 0 & -1\\[3pt] 0 & 0 & 0 \end{array}\right], \quad A^+ = \left[\begin{array}{rrr} \frac{1}{2} & -\frac{1}{2} & 0\\[3pt] -\frac{1}{2} & \frac{1}{2} & 0\\[3pt] 0 & -1 & 0 \end{array}\right], \quad A^D = \left[\begin{array}{rrr} 1 & -1 & 0\\[3pt] 0 & 0 & 0\\[3pt] 0 & 0 & 0 \end{array}\right].
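These two matrices can be checked numerically. The sketch below, in Python with NumPy, is not from the original post. For the Drazin inverse it does not use the block representation above but swaps in a different known formula from the generalized inverse literature, A^D = A^k (A^{2k+1})^+ A^k, valid for any k \ge \mathrm{index}(A). The function name `drazin_inverse` and the default choice k = n are assumptions, and forming high powers of A can be numerically poor for larger or badly scaled matrices.

```python
import numpy as np

def drazin_inverse(A, k=None):
    """Drazin inverse via A^D = A^k (A^(2k+1))^+ A^k for any k >= index(A).

    If k is not given, k = n is used, which is safe because index(A) <= n.
    """
    n = A.shape[0]
    if k is None:
        k = n
    Ak = np.linalg.matrix_power(A, k)
    return Ak @ np.linalg.pinv(np.linalg.matrix_power(A, 2 * k + 1)) @ Ak

A = np.array([[1.0, -1.0, -1.0],
              [0.0,  0.0, -1.0],
              [0.0,  0.0,  0.0]])

# The matrices stated in the text.
A_plus_stated = np.array([[ 0.5, -0.5, 0.0],
                          [-0.5,  0.5, 0.0],
                          [ 0.0, -1.0, 0.0]])
A_D_stated = np.array([[1.0, -1.0, 0.0],
                       [0.0,  0.0, 0.0],
                       [0.0,  0.0, 0.0]])

A_plus = np.linalg.pinv(A)
A_D = drazin_inverse(A)

# The three defining conditions of the Drazin inverse, with k = index(A) = 2.
k = 2
drazin_ok = (np.allclose(A_D @ A @ A_D, A_D)
             and np.allclose(A @ A_D, A_D @ A)
             and np.allclose(np.linalg.matrix_power(A, k + 1) @ A_D,
                             np.linalg.matrix_power(A, k)))

print(np.allclose(A_plus, A_plus_stated), np.allclose(A_D, A_D_stated), drazin_ok)
```

Since the index is 2 here, A A^D A differs from A, so this A^D is not an equation-solving inverse, whereas A^+ does satisfy A A^+ A = A.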

Applications

The Moore–Penrose pseudoinverse is intimately connected with orthogonality, whereas the Drazin inverse has spectral properties related to those of the original matrix. The pseudoinverse occurs in all kinds of least squares problems. Applications of the Drazin inverse include population modelling, Markov chains, and singular systems of linear differential equations. It is not usually necessary to compute generalized inverses, but they are valuable theoretical tools.

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

3 Responses to What Is a Generalized Inverse?

  1. Cleve Moler says:

    Beautiful “What is” about the pseudoinverse. Well written, concise, accurate. It is an art to write an article about anything — a news story, for example — that is both concise and accurate.

    I have one quibble. As you probably know, I am interested in mathematical typography. I see that you are not using MathJax or MathML. Why not? You don’t have mathematics, you have pictures of mathematics. All of the math, even the A’s and X’s, are little .png files. The inline math doesn’t have the proper baseline. The displayed math looks OK in my browser, but it is pixelated when you print it or enlarge it. I know you are interested in this topic as well. How have you decided to use whatever mathematical typesetting you are using?

    Should I submit this, minus this question, as a comment?

    I hope you and your family are well,

    — Cleve

  2. Nick Higham says:

    Thanks, Cleve. I’m using WordPress.com and it doesn’t support MathJax, apparently because it regards the required Javascript as a security risk. If I were to host my own WordPress installation (“WordPress.org”) I could install a MathJax plugin. I prefer to stick with WordPress.com for its ease of use. The relatively poor typesetting of math is why I’m making every “What Is” post available as a PDF file (see the end of each post). I actually write the posts in Emacs Org mode and export them to both WordPress (using Org2blog) and LaTeX.

  3. FWIW on easy docs with Math (not suggesting you change your blog) this might be of interest to some people: https://casual-effects.com/markdeep/

    Nice idea for a blog series. Thanks for doing it.
