What Is the Determinant of a Matrix?

The determinant of an n\times n matrix A is defined by

\notag   \det(A) = \displaystyle\sum_j (-1)^{\mathop{\mathrm{sgn}}j}                   a_{1,j_1}a_{2,j_2} \dots a_{n,j_n}, \qquad (1)

where the sum is over all n! permutations j = (j_1,j_2,\dots,j_n) of the sequence (1,2,\dots,n) and \mathop{\mathrm{sgn}}j is the number of inversions in j, that is, the number of pairs (j_k,j_\ell) with k  j_\ell. Each term in the sum is a signed product of n entries of A and the product contains one entry taken from each row and one from each column.

The determinant is sometimes written with vertical bars, as |A|.

Three fundamental properties are

\notag \begin{aligned} \det(\alpha A) &= \alpha^n \det(A)\; \mathrm{for~any~scalar~}\alpha,\qquad(2)\\ \det(A^T) &= \det(A), \qquad(3)\\ \det(AB) &= \det(A)\det(B) \mathrm{~for~} n\times n~ A \mathrm{~and~} B.\qquad(4) \end{aligned}

The first property is immediate, the second can be proved using properties of permutations, and the third is proved in texts on linear algebra and matrix theory.

An alternative, recursive expression for the determinant is the Laplace expansion

\notag   \det(A) = \displaystyle\sum_{j=1}^n (-1)^{i+j} a_{ij} \det (A_{ij}).  \qquad(5)

for any i\in\{1,2,\dots,n\}, where A_{ij} denotes the (n-1)\times (n-1) submatrix of A obtained by deleting row i and column j, and \det(a) = a for a scalar a. This formula is called the expansion by minors because \det (A_{ij}) is a minor of A.

For some types of matrices the determinant is easy to evaluate. If T is triangular then \det(T) = \prod_{i=1}^n t_{ii}. If Q is unitary then Q^*Q = I implies |\det(Q)| = 1 on using (3) and (4). An explicit formula exists for the determinant of a Vandermonde matrix.

The determinant of A is connected with the eigenvalues \lambda_i of A via the property \det(A) = \prod_{i=1}^n \lambda_i. Since the eigenvalues are the roots of the characteristic polynomial \det(tI - A), this relation follows by setting t = 0 in the expression

\notag   \det(tI - A) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0                = \displaystyle\prod_{i=1}^n (t - \lambda_i).

For n=2, the determinant is

\notag     \det\biggl( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \biggr)      = ad - bc,

but already for n=3 the determinant is tedious to write down. If one must compute \det(A), the formulas (1) and (5) are too expensive unless n is very small: they have an exponential cost. The best approach is to use a factorization of A involving factors that are triangular or orthogonal, so that the determinants of the factors are easily computed. If PA = LU is an LU factorization, with P a permutation matrix, L unit lower triangular, and U upper triangular, then \det(A) = \det(P) \prod_{i=1}^n u_{ii} = \pm \prod_{i=1}^n u_{ii}. As this expression indicates, the determinant is prone to overflow and underflow in floating-point arithmetic, so it may be preferable to compute \log(|\det(A)|) = \sum_{i=1}^n \log|u_{ii}|.

The determinant features in the formula

\notag   A^{-1} = \displaystyle\frac{\mathrm{adj}(A)}{\det(A)}

for the inverse, where \mathrm{adj}(A) is the adjugate of A (recall that \mathrm{adj}(A) has (i,j) element (-1)^{i+j} \det(A_{ji})). More generally, Cramer’s rule says that the components of the solution to a linear system Ax = b are given by x_i = \det(A_i(b))/\det(A), where A_i(b) denotes A with its ith column replaced by b. While mathematically elegant, Cramer’s rule is of no practical use, as it is both expensive and numerically unstable in finite precision arithmetic.

Inequalities

A celebrated bound for the determinant of a Hermitian positive definite matrix H\in\mathbb{C}^{n\times n} is Hadamard’s inequality. Note that for such H, \det(H) is real and positive (being the product of the eigenvalues, which are real and positive) and the diagonal elements are also real and positive (since h_{ii} = e_i^*He_i > 0).

Theorem 1 (Hadamard’s inequality). For a Hermitian positive definite matrix H\in\mathbb{C}^{n\times n},

\notag     \det(H) \le \displaystyle\prod_{i=1}^n h_{ii},

with equality if and only if H is diagonal.

Theorem 1 is easy to prove using a Cholesky factorization.

The following corollary can be obtained by applying Theorem 1 to H = A^*A or by using a QR factorization of A.

Corollary 2. For A = [a_1,a_2,\dots,a_n] \in\mathbb{C}^{n\times n},

\notag     \det(A) \le \displaystyle\prod_{i=1}^n \|a_i\|_2,

with equality if and only if the columns of A are orthogonal.

Obviously, one can apply the corollary to A^* and obtain the analogous bound with column norms replaced by row norms.

The determinant of A\in\mathbb{R}^{n\times n} can be interpreted as the volume of the parallelepiped \{\, \sum_{i=1}^n t_ia_i : 0 \le t_i \le 1, ~ i = 1\colon n\,\}, whose sides are the columns of A. Corollary 2 says that for columns of given lengths the volume is maximized when the columns are orthogonal.

Nearness to Singularity and Conditioning

The determinant characterizes nonsingularity: A is singular if and only if \det(A) = 0. It might be tempting to use |\det(A)| as a measure of how close a nonsingular matrix A is to being singular, but this measure is flawed, not least because of the sensitivity of the determinant to scaling. Indeed if Q is unitary then \det(\alpha Q) = \alpha^n \det(Q) can be given any value by a suitable choice of \alpha, yet \alpha Q is perfectly conditioned: \kappa_2(\alpha Q) = 1, where \kappa(A) = \|A\| \|A^{-1}\| is the condition number.

To deal with the poor scaling one might normalize the determinant: in view of Corollary 2,

\notag   \psi(A) = \displaystyle\frac{\prod_{i=1}^n \|a_i\|_2} {\det(A)}

satisfies \psi(A) \ge 1 and \psi(A) = 1 if and only if the columns of A are orthogonal. Birkhoff (1975) calls \psi the Hadamard condition number. In general, \psi is not related to the condition number \kappa, but if A has columns of unit 2-norm then it can be shown that \kappa_2(A) < 2\psi(A) (Higham, 2002, Prob. 14.13). Dixon (1984) shows that for classes of n\times n random matrices A_n that include matrices with elements independently drawn from a normal distribution with mean 0, the probability that the inequality

n^{1/4 - \epsilon} \mathrm{e}^{n/2}     < \psi(A_n) < n^{1/4 + \epsilon} \mathrm{e}^{n/2}

holds tends to 1 as n\to\infty for any \epsilon > 0, so \psi(A_n) \approx n^{1/4}\mathrm{e}^{n/2} for large n. This exponential growth is much faster than the growth of \kappa, for which Edelman (1998) showed that for the standard normal distribution, \mathbb{E}(\log(\kappa_2(A_n))) \approx \log n + 1.537, where \mathbb{E} denotes the mean value. This MATLAB example illustrates these points.

>> rng(1); n = 50; A = randn(n); 
>> psi = prod(sqrt(sum(A.*A)))/abs(det(A)), kappa2 = cond(A)
psi =
   5.3632e+10
kappa2 =
   1.5285e+02
>> ratio = psi/(n^(0.25)*exp(n/2))
ratio =
   2.8011e-01

The relative distance from A to the set of singular matrices is equal to the reciprocal of the condition number.

Theorem 3 (Gastinel, Kahan). For A\in\mathbb{C}^{n\times n} and any subordinate matrix norm,

\notag   \min \left\{ \displaystyle\frac{\|\Delta A\|}{\|A\|} : A+\Delta A\mathrm{~singular} \right\}  = \displaystyle\frac{1}{\kappa(A)}.

Notes

Determinants came before matrices, historically. Most linear algebra textbooks make significant use of determinants, but a lot can be done without them. Axler (1995) shows how the theory of eigenvalues can be developed without using determinants.

Determinants have little application in practical computations, but they are a useful theoretical tool in numerical analysis, particularly for proving nonsingularity.

There is a large number of formulas and identities for determinants. Sir Thomas Muir collected many of them in his five-volume magnum opus The Theory of Determinants in the Historical Order of Development, published between 1890 and 1930. Brualdi and Schneider (1983) give concise derivations of many identities using Gaussian elimination, bringing out connections between the identities.

The quantity obtained by modifying the definition (1) of determinant to remove the (-1)^{\mathop{\mathrm{sgn}}j} term is the permanent. The permanent arises in combinatorics and quantum mechanics and is much harder to compute than the determinant: no algorithm is known for computing the permanent in p(n) operations for a polynomial p.

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Leave a comment