What is a Diagonally Dominant Matrix?

Matrices arising in applications often have diagonal elements that are large relative to the off-diagonal elements. In the context of a linear system this corresponds to relatively weak interactions between the different unknowns. We might expect a matrix with a large diagonal to be assured of certain properties, such as nonsingularity. However, to ensure nonsingularity it is not enough for each diagonal element to be the largest in its row. For example, the matrix

\notag  \left[\begin{array}{rrr}     3 & -1 & -2\\    -2 &  3 & -1\\    -2 & -1 &  3   \end{array}\right] \qquad (1)

is singular because [1~1~1]^T is a null vector. A useful definition of a matrix with large diagonal requires a stronger property.

A matrix A\in\mathbb{C}^{n\times n} is diagonally dominant by rows if

\notag        |a_{ii}| \ge \displaystyle\sum_{j\ne i} |a_{ij}|, \quad i=1\colon n. \qquad (2)

It is strictly diagonally dominant by rows if strict inequality holds in (2) for all i. A is (strictly) diagonally dominant by columns if A^T is (strictly) diagonally dominant by rows.

Diagonal dominance on its own is not enough to ensure nonsingularity, as the matrix (1) shows. Strict diagonal dominance does imply nonsingularity, however.

Theorem 1.

If A\in\mathbb{C}^{n\times n} is strictly diagonally dominant by rows or columns then it is nonsingular.

Proof. Since A is nonsingular if and only if A^T is nonsingular, it suffices to consider diagonal dominance by rows. For any nonzero x let y = Ax and choose k so that |x_k| = \|x\|_{\infty}. Then the kth equation of y = Ax can be written

\notag    a_{kk}x_k = y_k - \displaystyle\sum_{j\ne k} a_{kj}x_j,

which gives

\notag    |a_{kk}|\|x\|_{\infty} = |a_{kk}||x_k|     \le |y_k| + \displaystyle\sum_{j\ne k} |a_{kj}||x_j|     \le |y_k| + \|x\|_\infty \displaystyle\sum_{j\ne k} |a_{kj}|.

Using (2), we have

\notag    |y_k| \ge \|x\|_{\infty} \Bigl(|a_{kk}| - \displaystyle\sum_{j\ne k} |a_{kj}|\Bigr) > 0.    \qquad (3)

Therefore y\ne0 and so A is nonsingular. ~\square

Diagonal dominance plus two further conditions is enough to ensure nonsingularity. We need the notion of irreducibility. A matrix A\in\mathbb{R}^{n\times n} is irreducible if there does not exist a permutation matrix P such that

\notag       P^TAP = \begin{bmatrix} A_{11} & A_{12} \\                                  0   & A_{22} \end{bmatrix}

with A_{11} and A_{22} square matrices. Irreducibility is equivalent to the directed graph of A being strongly connected.

Theorem 2.

If A\in\mathbb{C}^{n\times n} is irreducible and diagonally dominant by rows with strict inequality in (2) for some i then it is nonsingular.

Proof. The proof is by contradiction. Suppose there exists x\ne 0 such that Ax = 0. Define

\notag   G = \{\, j: |x_j| = \|x\|_{\infty} \,\},   \quad   H = \{\, j: |x_j| < \|x\|_{\infty} \,\}.

The ith equation of Ax = 0 can be written

\notag       a_{ii}x_i = - \displaystyle\sum_{j\ne i} a_{ij}x_j                 = - \displaystyle\sum_{j\in G \atop j\ne i } a_{ij}x_j                   - \displaystyle\sum_{j\in H \atop j\ne i } a_{ij}x_j. \qquad (4)

Hence for i = r\in G,

\notag |a_{rr}| \le \displaystyle\sum_{j\in G \atop j\ne r } |a_{rj}|       + \displaystyle\sum_{j\in H \atop j\ne r } |a_{rj}|\frac{|x_j|}{\|x\|_\infty}.

The set H is nonempty, because if it were empty then we would have |x_j| = \|x\|_\infty for all j and if there is strict inequality in (2) for i = m, then putting i = m in (4) would give |a_{mm}| \le \sum_{j\ne m} |a_{mj}| |x_j|/|x_m|             =  \sum_{j\ne m} |a_{mj}|, which is a contradiction. Hence as long as a_{rj}\ne0 for some j\in H, we obtain |a_{rr}| <  \sum_{j\ne r } |a_{rj}|, which contradicts the diagonal dominance. Therefore we must have a_{rj}= 0 for all j\in H and all r\in G. This means that all the rows indexed by G have zeros in the columns indexed by H, which means that A is reducible. This is a contradiction, so A must be nonsingular. ~\square

The obvious analogue of Theorem 2 holds for column diagonal dominance.

As an example, the n\times n symmetric tridiagonal matrix (minus the second difference matrix)

\notag  T_n = \left[\begin{array}{@{\mskip 5mu}c*{4}{@{\mskip 15mu} r}@{\mskip 5mu}}      2 &   -1  &          &         & \\     -1 &    2  &  -1      &         & \\        &    -1 &   2      &  \ddots & \\        &       &  \ddots  &  \ddots & -1\\        &       &          &  -1     & 2            \end{array}\right], \qquad (5)

is row diagonally dominant with strict inequality in the first and last diagonal dominance relations. It can also be shown to be irreducible and so it is nonsingular by Theorem 2. If we replace t_{11} or t_{nn} by 1, then T remains nonsingular by the same argument. What if we replace both t_{11} and t_{nn} by 1? We can answer this question by using an observation of Strang. If we define the rectangular matrix

\notag  L_n = \begin{bmatrix}                1  &      &        &  \\                -1 &  1   &        &  \\                   & -1   & \ddots &  \\                   &      & \ddots & 1 \\                   &      &        &  -1     \end{bmatrix} \in\mathbb{R}^{(n+1)\times n}

then T_n = L_n^T L_n and

\notag \widetilde{T}_{n+1}  = \begin{bmatrix}                        1 &-1      &        &      & \\                       -1 & 2      & \ddots &      & \\                          & \ddots & \ddots &  -1  & \\                          &        &   -1    &  2  & -1\\                          &        &         &  -1 & 1                    \end{bmatrix}                          = L_n L_n^T \in \mathbb{R}^{(n+1) \times (n+1)}.

Since in general AB and BA have the same nonzero eigenvalues, we conclude that \Lambda(\widetilde{T}_{n+1})  = \Lambda(T_n) \cup \{0\}, where \Lambda(\cdot) denotes the spectrum. Hence T_n is symmetric positive definite and \widetilde{T}_n is singular and symmetric positive semidefinite.

Relation to Gershgorin’s Theorem

Theorem 1 can be used to obtain information about the location of the eigenvalues of a matrix. Indeed if \lambda is an eigenvalue of A then A - \lambda I is singular and hence cannot be strictly diagonally dominant, by Theorem 1. So |a_{ii}-\lambda| > \sum_{j\ne i} |a_{ij}| cannot be true for all i. Gershgorin’s theorem is simply a restatement of this fact.

Theorem 3 (Gershgorin’s theorem).

The eigenvalues of A\in\mathbb{C}^{n\times n} lie in the union of the n discs in the complex plane

\notag      D_i = \Big\{ z\in\mathbb{C}: |z-a_{ii}| \le \displaystyle\sum_{j\ne i}      |a_{ij}|\Big\}, \quad i=1\colon n.

If A is symmetric with positive diagonal elements and satisfies the conditions of Theorem 1 or Theorem 2 then it is positive definite. Indeed the eigenvalues are real and so in Gershgorin’s theorem the discs are intervals and a_{ii} - z \le |z-a_{ii}| \le \sum_{j\ne i}^n |a_{ij}|, so z \ge |a_{ii}| - \sum_{j\ne i}^n |a_{ij}| \ge 0, so the eigenvalues are nonnegative, and hence positive since nonzero. This provides another proof that the matrix T_n in (5) is positive definite.

Generalized Diagonal Dominance

In some situations A is not diagonally dominant but a row or column scaling of it is. For example, the matrix

\notag   A = \begin{bmatrix}         1   & 1   & 0 \\         2/3 & 2   & 1/4 \\         2/3 & 1/2 & 1       \end{bmatrix}

is not diagonally dominant by rows or columns but

\notag   A \, \mathrm{diag}(3,2,4)    = \begin{bmatrix}         3   & 2   & 0 \\         2   & 4   & 1   \\         2   & 1   & 4       \end{bmatrix}

is strictly diagonally dominant by rows.

A matrix A\in\mathbb{C}^{n\times n} is generalized diagonally dominant by rows if AD is diagonally dominant by rows for some diagonal matrix D = \mathrm{diag}(d_i) with d_i > 0 for all i, that is, if

\notag      |a_{ii}|d_i \ge \displaystyle\sum_{j\ne i} |a_{ij}|d_j, \quad i=1\colon n. \qquad (6)

It is easy to see that if A is irreducible and there is strictly inequality in (6) for some i then A is nonsingular by Theorem 2.

It can be shown that A is generalized diagonally dominant by rows if and only if it is an H-matrix, where an H-matrix is a matrix for which the comparison matrix M(A), defined by

\notag   M(A) = (m_{ij}), \quad m_{ij} =    \begin{cases} |a_{ii}|, & i=j, \\                 -|a_{ij}|, & i\ne j,     \end{cases}

is an M-matrix (see What Is an M-Matrix?).

Block Diagonal Dominance

A matrix A\in\mathbb{C}^{n\times n} is block diagonally dominant by rows if, for a given norm and block m\times m partitioning A = (A_{ij}), the diagonal blocks A_{jj} are all nonsingular and

\notag     \displaystyle\sum_{j\ne i} \|A_{ij}\| \le  \|A_{ii}^{-1}\|^{-1}, \quad i = 1\colon m.   \label{bdd}

A is block diagonally dominant by columns if A^T is block diagonally dominant by rows. If the blocks are all 1\times 1 then block diagonal dominance reduces to the usual notion of diagonal dominance. Block diagonal dominance holds for certain block tridiagonal matrices arising in the discretization of PDEs.

Analogues of Theorems 1 and 2 giving conditions under which block diagonal dominance implies nonsingularity are given by Feingold and Varga (1962).

Bounding the Inverse

If a matrix is strictly diagonally dominant then we can bound its inverse in terms of the minimum amount of diagonal dominance. For full generality, we state the bound in terms of generalized diagonal dominance.

Theorem 4.

If A\in\mathbb{C}^{n\times n} and AD is strictly diagonally dominant by rows for a diagonal matrix D = \mathrm{diag}(d_i) with d_i > 0 for all i, then

\notag   \|A^{-1}\|_\infty \le \displaystyle\frac{\|D\|_{\infty}}{\alpha},

where \alpha = \min_i (|a_{ii}|d_i - \sum_{j\ne i} |a_{ij}|d_j).

Proof. Assume first that D = I. Let y satisfy \|A^{-1}\|_{\infty} = \|A^{-1}y\|_{\infty} / \|y\|_{\infty} and let x = A^{-1}y. Applying (3) gives \|A^{-1}\|_{\infty} = \|x\|_{\infty} / \|y\|_{\infty} \le \alpha^{-1}. The result is obtained on applying this bound to AD and using \|A^{-1}\|_{\infty} \le \|D\|_{\infty} \|(AD)^{-1}\|_{\infty}. ~\square.

Another bound for A^{-1} when A is strictly diagonally dominant by rows can be obtained by writing A = D(I - E), where D = \mathrm{diag}(a_{ii}), e_{ii} = 0, and e_{ij} = -a_{ij}/a_{ii} for i\ne j. It is easy to see that \|E\|_\infty < 1, which gives another proof that A is nonsingular. Then

\notag  \begin{aligned}   |A^{-1}| &= |(I-E)^{-1}D^{-1}|            = |I + E + E^2 + \cdots | |D^{-1}|\\            &\le (I + |E| + |E|^2 + \cdots ) |D|^{-1}\\             &= (I - |E|)^{-1} |D|^{-1}\\             &= M(A)^{-1}.  \end{aligned}

This bound implies that M(A)^{-1} \ge 0, so in view of its sign pattern M(A) is an M-matrix, which essentially proves one direction of the H-matrix equivalence in the previous section. The same bound holds if A is diagonally dominant by columns, by writing A = (I-E)D.

An upper bound also holds for block diagonal dominance.

Theorem 5.

If A\in\mathbb{C}^{n\times n} is block diagonally dominant by rows then

\notag   \|A^{-1}\|_\infty \le \displaystyle\frac{1}{\alpha}.

where \alpha = \min_i ( \|A_{ii}^{-1}\|^{-1} - \sum_{j\ne i} \|A_{ij}\| ).

It is interesting to note that the inverse of a strictly row diagonally dominant matrix enjoys a form of diagonal dominance, namely that the largest element in each column is on the diagonal.

Theorem 6.

If A\in\mathbb{C}^{n\times n} is strictly diagonally dominant by rows then B = A^{-1} satisfies |b_{ij}| < |b_{jj}| for all i\ne j.

Proof. For i\ne j we have \sum_{k=1}^n a_{ik}b_{kj} = 0. Let \beta_j = \max_k |b_{kj}|. Taking absolute values in a_{ii}b_{ij} = -\sum_{k\ne i}a_{ik}b_{kj} gives

\notag  |a_{ii}||b_{ij}| \le \beta_j \sum_{k\ne i} |a_{ik}| < \beta_j |a_{ii}|,

or |b_{ij}| < \beta_j, since a_{ii} \ne 0. This inequality holds for all i\ne j, so we must have \beta_j = |b_{jj}|, which gives the result.

Historical Remarks

Theorems 1 and 2 have a long history and have been rediscovered many times. Theorem 1 was first stated by Lévy (1881) with additional assumptions. In a short but influential paper, Taussky (1949) pointed out the recurring nature of the theorems and gave simple proofs (our proof of Theorem 2 is Taussky’s). Schneider (1977) attributes the surge in interest in matrix theory in the 1950s and 1960s to Taussky’s paper and a few others by her, Brauer, Ostrowski, and Wielandt. The history of Gershgorin’s theorem (published in 1931) is intertwined with that of Theorems 1 and 2; see Varga’s 2004 book for details.

Theorems 4 and 5 are from Varah (1975) and Theorem 6 is from Ostrowski (1952).

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

5 thoughts on “What is a Diagonally Dominant Matrix?

  1. Hello Nick,

    Nice post. There is an important class of diagonally dominant (DD) matrices that just miss being M-matrices. I’ll refer to them as Q matrices, the name bestowed upon them by probabilists in their study of continuous-time Markov chains. Like M-matrices, the diagonal elements are positive and the off-diagonal elements are non-positive. But they are singular. Are you aware of a specific name for this class of DD matrices outside of Q matrices?

    1. Hi Rich. What you are describing sounds like minus a transition intensity matrix, which has zero row sums and which comes up as a generator for a Markov chain.

Leave a comment