What is a Diagonally Dominant Matrix?

Matrices arising in applications often have diagonal elements that are large relative to the off-diagonal elements. In the context of a linear system this corresponds to relatively weak interactions between the different unknowns. We might expect a matrix with a large diagonal to be assured of certain properties, such as nonsingularity. However, to ensure nonsingularity it is not enough for each diagonal element to be the largest in its row. For example, the matrix

$\notag \left[\begin{array}{rrr} 3 & -1 & -2\\ -2 & 3 & -1\\ -2 & -1 & 3 \end{array}\right] \qquad (1)$

is singular because $[1~1~1]^T$ is a null vector. A useful definition of a matrix with large diagonal requires a stronger property.
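We can confirm this numerically. A minimal NumPy check (the script is ours, for illustration):

```python
import numpy as np

# The matrix (1): each diagonal element is the largest in absolute value
# in its row, yet the matrix is singular.
A = np.array([[ 3.0, -1.0, -2.0],
              [-2.0,  3.0, -1.0],
              [-2.0, -1.0,  3.0]])

ones = np.ones(3)
print(A @ ones)                  # [0. 0. 0.]: [1 1 1]^T is a null vector
print(np.linalg.matrix_rank(A))  # 2, confirming singularity
```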

A matrix $A\in\mathbb{C}^{n\times n}$ is diagonally dominant by rows if

$\notag |a_{ii}| \ge \displaystyle\sum_{j\ne i} |a_{ij}|, \quad i=1\colon n. \qquad (2)$

It is strictly diagonally dominant by rows if strict inequality holds in (2) for all $i$. $A$ is (strictly) diagonally dominant by columns if $A^T$ is (strictly) diagonally dominant by rows.
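Definition (2) translates directly into a few lines of NumPy. The helper function below is ours, written only to illustrate the definition; applied to the matrix (1) it shows dominance holding with equality in every row, so the dominance is not strict:

```python
import numpy as np

# Check definition (2): row diagonal dominance, optionally strict.
def is_diag_dominant_by_rows(A, strict=False):
    d = np.abs(np.diag(A))
    off = np.sum(np.abs(A), axis=1) - d   # off-diagonal absolute row sums
    return np.all(d > off) if strict else np.all(d >= off)

A = np.array([[ 3.0, -1.0, -2.0],
              [-2.0,  3.0, -1.0],
              [-2.0, -1.0,  3.0]])
print(is_diag_dominant_by_rows(A))               # True: (2) holds with equality
print(is_diag_dominant_by_rows(A, strict=True))  # False: no row is strict
```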

Diagonal dominance on its own is not enough to ensure nonsingularity, as the matrix (1) shows. Strict diagonal dominance does imply nonsingularity, however.

Theorem 1.

If $A\in\mathbb{C}^{n\times n}$ is strictly diagonally dominant by rows or columns then it is nonsingular.

Proof. Since $A$ is nonsingular if and only if $A^T$ is nonsingular, it suffices to consider diagonal dominance by rows. For any nonzero $x$ let $y = Ax$ and choose $k$ so that $|x_k| = \|x\|_{\infty}$. Then the $k$th equation of $y = Ax$ can be written

$\notag a_{kk}x_k = y_k - \displaystyle\sum_{j\ne k} a_{kj}x_j,$

which gives

$\notag |a_{kk}|\|x\|_{\infty} = |a_{kk}||x_k| \le |y_k| + \displaystyle\sum_{j\ne k} |a_{kj}||x_j| \le |y_k| + \|x\|_\infty \displaystyle\sum_{j\ne k} |a_{kj}|.$

Rearranging and using the strict inequality in (2), we have

$\notag |y_k| \ge \|x\|_{\infty} \Bigl(|a_{kk}| - \displaystyle\sum_{j\ne k} |a_{kj}|\Bigr) > 0. \qquad (3)$

Therefore $y\ne0$ and so $A$ is nonsingular. $~\square$

Diagonal dominance plus two further conditions is enough to ensure nonsingularity. We need the notion of irreducibility. A matrix $A\in\mathbb{R}^{n\times n}$ is irreducible if there does not exist a permutation matrix $P$ such that

$\notag P^TAP = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}$

with $A_{11}$ and $A_{22}$ square matrices. Irreducibility is equivalent to the directed graph of $A$ being strongly connected.
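Strong connectivity can be tested without forming any permutations: a standard characterization (not stated in the text, but classical) is that $A$ is irreducible if and only if $(I + |A|)^{n-1}$ has all positive entries. A sketch of this test in NumPy:

```python
import numpy as np

# Irreducibility test: A is irreducible iff the directed graph of A is
# strongly connected, iff (I + |A|)^(n-1) is entrywise positive.
def is_irreducible(A):
    n = A.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + np.abs(A), n - 1)
    return bool(np.all(M > 0))

# Reducible: block upper triangular, so no path from node 2 back to node 1.
R = np.array([[1.0, 2.0],
              [0.0, 3.0]])
print(is_irreducible(R))   # False

S = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])
print(is_irreducible(S))   # True
```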

Theorem 2.

If $A\in\mathbb{C}^{n\times n}$ is irreducible and diagonally dominant by rows with strict inequality in $(2)$ for some $i$ then it is nonsingular.

Proof. Suppose, for a contradiction, that there exists $x\ne 0$ such that $Ax = 0$. Define

$\notag G = \{\, j: |x_j| = \|x\|_{\infty} \,\}, \quad H = \{\, j: |x_j| < \|x\|_{\infty} \,\}.$

The $i$th equation of $Ax = 0$ can be written

$\notag a_{ii}x_i = - \displaystyle\sum_{j\ne i} a_{ij}x_j = - \displaystyle\sum_{j\in G \atop j\ne i } a_{ij}x_j - \displaystyle\sum_{j\in H \atop j\ne i } a_{ij}x_j. \qquad (4)$

Hence for $i = r\in G$,

$\notag |a_{rr}| \le \displaystyle\sum_{j\in G \atop j\ne r } |a_{rj}| + \displaystyle\sum_{j\in H \atop j\ne r } |a_{rj}|\frac{|x_j|}{\|x\|_\infty}.$

The set $H$ is nonempty: if $H$ were empty then $|x_j| = \|x\|_\infty$ for all $j$, and taking $i = m$ in (4), where $m$ is an index for which strict inequality holds in (2), would give $|a_{mm}| \le \sum_{j\ne m} |a_{mj}| |x_j|/|x_m| = \sum_{j\ne m} |a_{mj}|$, a contradiction. Now since $|x_j|/\|x\|_\infty < 1$ for $j\in H$, if $a_{rj}\ne0$ for some $j\in H$ then the inequality above gives $|a_{rr}| < \sum_{j\ne r } |a_{rj}|$, which contradicts the diagonal dominance. Therefore we must have $a_{rj}= 0$ for all $j\in H$ and all $r\in G$, that is, every row indexed by $G$ has zeros in all the columns indexed by $H$. Since $G$ and $H$ are both nonempty, this means that $A$ is reducible, a contradiction. Hence $A$ must be nonsingular. $~\square$

The obvious analogue of Theorem 2 holds for column diagonal dominance.

As an example, the $n\times n$ symmetric tridiagonal matrix (minus the second difference matrix)

$\notag T_n = \left[\begin{array}{@{\mskip 5mu}c*{4}{@{\mskip 15mu} r}@{\mskip 5mu}} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & -1 & 2 & \ddots & \\ & & \ddots & \ddots & -1\\ & & & -1 & 2 \end{array}\right], \qquad (5)$

is row diagonally dominant with strict inequality in the first and last diagonal dominance relations. It can also be shown to be irreducible, so it is nonsingular by Theorem 2. If we replace $t_{11}$ or $t_{nn}$ by $1$, then $T_n$ remains nonsingular by the same argument. What if we replace both $t_{11}$ and $t_{nn}$ by $1$? We can answer this question by using an observation of Strang. If we define the rectangular matrix

$\notag L_n = \begin{bmatrix} 1 & & & \\ -1 & 1 & & \\ & -1 & \ddots & \\ & & \ddots & 1 \\ & & & -1 \end{bmatrix} \in\mathbb{R}^{(n+1)\times n}$

then $T_n = L_n^T L_n$ and

$\notag \widetilde{T}_{n+1} = \begin{bmatrix} 1 &-1 & & & \\ -1 & 2 & \ddots & & \\ & \ddots & \ddots & -1 & \\ & & -1 & 2 & -1\\ & & & -1 & 1 \end{bmatrix} = L_n L_n^T \in \mathbb{R}^{(n+1) \times (n+1)}.$

Since in general $AB$ and $BA$ have the same nonzero eigenvalues, we conclude that $\Lambda(\widetilde{T}_{n+1}) = \Lambda(T_n) \cup \{0\}$, where $\Lambda(\cdot)$ denotes the spectrum. Now $T_n = L_n^TL_n$ is symmetric positive semidefinite, and since it is nonsingular it is positive definite, while $\widetilde{T}_{n+1}$ is singular and symmetric positive semidefinite.
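All of these claims about $T_n$ are easy to verify numerically for a small $n$. The following NumPy sketch (ours) checks the diagonal dominance pattern, Strang's factorization $T_n = L_n^TL_n$, and the spectral relation between $T_n$ and $\widetilde{T}_{n+1}$:

```python
import numpy as np

# Properties of T_n in (5) for a small n: row diagonal dominance (strict
# only in rows 1 and n), T_n = L_n^T L_n, and
# Lambda(Ttilde_{n+1}) = Lambda(T_n) union {0}.
n = 6
T = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

d = np.abs(np.diag(T))
off = np.sum(np.abs(T), axis=1) - d
print(np.all(d >= off))          # True: diagonally dominant by rows
print(d > off)                   # strict only in the first and last rows

# The (n+1) x n bidiagonal L_n: unit diagonal, -1 subdiagonal.
L = np.eye(n+1, n) - np.eye(n+1, n, k=-1)
print(np.allclose(L.T @ L, T))   # True: T_n = L_n^T L_n

Ttilde = L @ L.T                 # (n+1) x (n+1), singular
ev_T = np.sort(np.linalg.eigvalsh(T))
ev_Tt = np.sort(np.linalg.eigvalsh(Ttilde))
print(np.allclose(ev_Tt, np.concatenate(([0.0], ev_T))))  # True
```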

Relation to Gershgorin’s Theorem

Theorem 1 can be used to obtain information about the location of the eigenvalues of a matrix. Indeed if $\lambda$ is an eigenvalue of $A$ then $A - \lambda I$ is singular and hence cannot be strictly diagonally dominant, by Theorem 1. So $|a_{ii}-\lambda| > \sum_{j\ne i} |a_{ij}|$ cannot be true for all $i$. Gershgorin’s theorem is simply a restatement of this fact.

Theorem 3 (Gershgorin’s theorem).

The eigenvalues of $A\in\mathbb{C}^{n\times n}$ lie in the union of the $n$ discs in the complex plane

$\notag D_i = \Big\{ z\in\mathbb{C}: |z-a_{ii}| \le \displaystyle\sum_{j\ne i} |a_{ij}|\Big\}, \quad i=1\colon n.$

If $A$ is symmetric with positive diagonal elements and satisfies the conditions of Theorem 1 or Theorem 2 then it is positive definite. Indeed the eigenvalues are real, so the Gershgorin discs are intervals, and any eigenvalue $z$ satisfies $a_{ii} - z \le |z-a_{ii}| \le \sum_{j\ne i} |a_{ij}|$ for some $i$, so that $z \ge a_{ii} - \sum_{j\ne i} |a_{ij}| \ge 0$. Hence the eigenvalues are nonnegative, and they are positive since they are nonzero ($A$ being nonsingular). This provides another proof that the matrix $T_n$ in (5) is positive definite.
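Gershgorin's theorem is easy to check numerically. In the following sketch the matrix is chosen arbitrarily by us; every computed eigenvalue lies in at least one disc (a tiny tolerance guards against rounding):

```python
import numpy as np

# Gershgorin's theorem: every eigenvalue lies in some disc centred at
# a_ii with radius the i-th off-diagonal absolute row sum.
A = np.array([[ 4.0, 1.0, 0.5],
              [ 1.0, 3.0, 1.0],
              [-0.5, 1.0, 5.0]])

centres = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centres)

for lam in np.linalg.eigvals(A):
    print(np.any(np.abs(lam - centres) <= radii + 1e-10))  # True for each eigenvalue
```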

Generalized Diagonal Dominance

In some situations $A$ is not diagonally dominant but a row or column scaling of it is. For example, the matrix

$\notag A = \begin{bmatrix} 1 & 1 & 0 \\ 2/3 & 2 & 1/4 \\ 2/3 & 1/2 & 1 \end{bmatrix}$

is not diagonally dominant by rows or columns but

$\notag A \, \mathrm{diag}(3,2,4) = \begin{bmatrix} 3 & 2 & 0 \\ 2 & 4 & 1 \\ 2 & 1 & 4 \end{bmatrix}$

is strictly diagonally dominant by rows.
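This scaling example is straightforward to reproduce; the helper function here is ours:

```python
import numpy as np

# The scaling example from the text: A is not strictly diagonally
# dominant by rows, but A*diag(3,2,4) is.
A = np.array([[1.0, 1.0, 0.0 ],
              [2/3, 2.0, 0.25],
              [2/3, 0.5, 1.0 ]])
D = np.diag([3.0, 2.0, 4.0])

def strictly_dd_rows(B):
    d = np.abs(np.diag(B))
    return bool(np.all(d > np.sum(np.abs(B), axis=1) - d))

print(strictly_dd_rows(A))       # False
print(strictly_dd_rows(A @ D))   # True
```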

A matrix $A\in\mathbb{C}^{n\times n}$ is generalized diagonally dominant by rows if $AD$ is diagonally dominant by rows for some diagonal matrix $D = \mathrm{diag}(d_i)$ with $d_i > 0$ for all $i$, that is, if

$\notag |a_{ii}|d_i \ge \displaystyle\sum_{j\ne i} |a_{ij}|d_j, \quad i=1\colon n. \qquad (6)$

It is easy to see that if $A$ satisfies (6), is irreducible, and strict inequality holds in (6) for some $i$, then $A$ is nonsingular by Theorem 2 applied to $AD$ (which is irreducible when $A$ is, since scaling the columns by positive numbers does not change the zero pattern).

It can be shown that $A$ is generalized diagonally dominant by rows if and only if it is an $H$-matrix, where an $H$-matrix is a matrix for which the comparison matrix $M(A)$, defined by

$\notag M(A) = (m_{ij}), \quad m_{ij} = \begin{cases} |a_{ii}|, & i=j, \\ -|a_{ij}|, & i\ne j, \end{cases}$

is an $M$-matrix (see What Is an M-Matrix?).
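For the example above we can form $M(A)$ and test the $M$-matrix property via nonnegativity of the inverse, which is one standard characterization of a nonsingular $M$-matrix (the code and the use of this particular test are ours):

```python
import numpy as np

# Build the comparison matrix M(A) and check that its inverse is
# entrywise nonnegative, so that M(A) is a (nonsingular) M-matrix
# and hence A is an H-matrix.
def comparison_matrix(A):
    M = -np.abs(A)
    np.fill_diagonal(M, np.abs(np.diag(A)))
    return M

A = np.array([[1.0, 1.0, 0.0 ],
              [2/3, 2.0, 0.25],
              [2/3, 0.5, 1.0 ]])
M = comparison_matrix(A)
print(np.all(np.linalg.inv(M) >= 0))   # True: A is an H-matrix
```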

Block Diagonal Dominance

A matrix $A\in\mathbb{C}^{n\times n}$ is block diagonally dominant by rows if, for a given norm and block $m\times m$ partitioning $A = (A_{ij})$, the diagonal blocks $A_{jj}$ are all nonsingular and

$\notag \displaystyle\sum_{j\ne i} \|A_{ij}\| \le \|A_{ii}^{-1}\|^{-1}, \quad i = 1\colon m. \qquad (7)$

$A$ is block diagonally dominant by columns if $A^T$ is block diagonally dominant by rows. If the blocks are all $1\times 1$ then block diagonal dominance reduces to the usual notion of diagonal dominance. Block diagonal dominance holds for certain block tridiagonal matrices arising in the discretization of PDEs.
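The block row condition can be tested directly in the $\infty$-norm. The sketch below assumes, for simplicity, a partitioning into equal square blocks (the definition allows any block partitioning); the function and example are ours:

```python
import numpy as np

# Block diagonal dominance by rows in the infinity norm, for an
# equal-block-size partitioning of A into (m x m) blocks of size b.
def block_dd_rows(A, b):
    m = A.shape[0] // b
    blocks = A.reshape(m, b, m, b).swapaxes(1, 2)   # blocks[i, j] is A_ij
    for i in range(m):
        lhs = sum(np.linalg.norm(blocks[i, j], np.inf)
                  for j in range(m) if j != i)
        rhs = 1.0 / np.linalg.norm(np.linalg.inv(blocks[i, i]), np.inf)
        if lhs > rhs:
            return False
    return True

# A 2 x 2 block example with strongly dominant diagonal blocks.
A = np.block([[4*np.eye(2),    -0.5*np.eye(2)],
              [-0.5*np.eye(2),  4*np.eye(2)  ]])
print(block_dd_rows(A, 2))   # True
```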

Analogues of Theorems 1 and 2 giving conditions under which block diagonal dominance implies nonsingularity are given by Feingold and Varga (1962).

Bounding the Inverse

If a matrix is strictly diagonally dominant then we can bound its inverse in terms of the minimum amount of diagonal dominance. For full generality, we state the bound in terms of generalized diagonal dominance.

Theorem 4.

If $A\in\mathbb{C}^{n\times n}$ and $AD$ is strictly diagonally dominant by rows for a diagonal matrix $D = \mathrm{diag}(d_i)$ with $d_i > 0$ for all $i$, then

$\notag \|A^{-1}\|_\infty \le \displaystyle\frac{\|D\|_{\infty}}{\alpha},$

where $\alpha = \min_i (|a_{ii}|d_i - \sum_{j\ne i} |a_{ij}|d_j)$.

Proof. Assume first that $D = I$. Let $y$ satisfy $\|A^{-1}\|_{\infty} = \|A^{-1}y\|_{\infty} / \|y\|_{\infty}$ and let $x = A^{-1}y$. Applying (3) gives $\|y\|_\infty \ge |y_k| \ge \alpha \|x\|_\infty$, so $\|A^{-1}\|_{\infty} = \|x\|_{\infty} / \|y\|_{\infty} \le \alpha^{-1}$. The general result is obtained on applying this bound to $AD$ and using $\|A^{-1}\|_{\infty} \le \|D\|_{\infty} \|(AD)^{-1}\|_{\infty}$. $~\square$
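A numerical illustration of Theorem 4 with $D = I$ (the matrix here is ours, chosen to be strictly diagonally dominant by rows):

```python
import numpy as np

# Verify ||A^{-1}||_inf <= 1/alpha for a strictly row diagonally
# dominant matrix, where alpha is the minimum dominance margin.
A = np.array([[ 4.0, 1.0, -1.0],
              [ 0.5, 3.0,  1.0],
              [-1.0, 1.0,  5.0]])

d = np.abs(np.diag(A))
alpha = np.min(d - (np.sum(np.abs(A), axis=1) - d))
bound = 1.0 / alpha

norm_inv = np.linalg.norm(np.linalg.inv(A), np.inf)
print(norm_inv <= bound)   # True
```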

Another bound for $A^{-1}$ when $A$ is strictly diagonally dominant by rows can be obtained by writing $A = D(I - E)$, where $D = \mathrm{diag}(a_{ii})$, $e_{ii} = 0$, and $e_{ij} = -a_{ij}/a_{ii}$ for $i\ne j$. It is easy to see that $\|E\|_\infty < 1$, which gives another proof that $A$ is nonsingular. Then

$\notag \begin{aligned} |A^{-1}| &= |(I-E)^{-1}D^{-1}| = |I + E + E^2 + \cdots | \, |D|^{-1}\\ &\le (I + |E| + |E|^2 + \cdots ) |D|^{-1}\\ &= (I - |E|)^{-1} |D|^{-1}\\ &= M(A)^{-1}. \end{aligned}$

This bound implies that $M(A)^{-1} \ge 0$, so in view of its sign pattern $M(A)$ is an $M$-matrix, which essentially proves one direction of the $H$-matrix equivalence in the previous section. The same bound holds if $A$ is diagonally dominant by columns, by writing $A = (I-E)D$.
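The entrywise bound $|A^{-1}| \le M(A)^{-1}$ is easily checked numerically; we reuse a strictly row diagonally dominant matrix of our own choosing (with a tiny tolerance for rounding):

```python
import numpy as np

# Entrywise check of |A^{-1}| <= M(A)^{-1} for a strictly row
# diagonally dominant matrix.
A = np.array([[ 4.0, 1.0, -1.0],
              [ 0.5, 3.0,  1.0],
              [-1.0, 1.0,  5.0]])

M = -np.abs(A)
np.fill_diagonal(M, np.abs(np.diag(A)))   # the comparison matrix M(A)

print(np.all(np.abs(np.linalg.inv(A)) <= np.linalg.inv(M) + 1e-12))  # True
```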

An upper bound also holds for block diagonal dominance.

Theorem 5.

If $A\in\mathbb{C}^{n\times n}$ is block diagonally dominant by rows with strict inequality for every $i$, so that $\alpha > 0$, then

$\notag \|A^{-1}\|_\infty \le \displaystyle\frac{1}{\alpha},$

where $\alpha = \min_i ( \|A_{ii}^{-1}\|^{-1} - \sum_{j\ne i} \|A_{ij}\| )$.

It is interesting to note that the inverse of a strictly row diagonally dominant matrix enjoys a form of diagonal dominance, namely that the largest element in each column is on the diagonal.

Theorem 6.

If $A\in\mathbb{C}^{n\times n}$ is strictly diagonally dominant by rows then $B = A^{-1}$ satisfies $|b_{ij}| < |b_{jj}|$ for all $i\ne j$.

Proof. For $i\ne j$ we have $\sum_{k=1}^n a_{ik}b_{kj} = 0$. Let $\beta_j = \max_k |b_{kj}|$. Taking absolute values in $a_{ii}b_{ij} = -\sum_{k\ne i}a_{ik}b_{kj}$ gives

$\notag |a_{ii}||b_{ij}| \le \beta_j \sum_{k\ne i} |a_{ik}| < \beta_j |a_{ii}|,$

or $|b_{ij}| < \beta_j$, since $a_{ii} \ne 0$. This inequality holds for all $i\ne j$, so we must have $\beta_j = |b_{jj}|$, which gives the result.
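Theorem 6 is also easy to observe in practice. For a strictly row diagonally dominant matrix of our choosing, the largest entry in absolute value in each column of the inverse sits on the diagonal:

```python
import numpy as np

# Theorem 6: for strictly row diagonally dominant A, the largest entry
# (in absolute value) in each column of B = A^{-1} is the diagonal one.
A = np.array([[ 4.0, 1.0, -1.0],
              [ 0.5, 3.0,  1.0],
              [-1.0, 1.0,  5.0]])

B = np.linalg.inv(A)
print(np.all(np.argmax(np.abs(B), axis=0) == np.arange(3)))   # True
```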

Historical Remarks

Theorems 1 and 2 have a long history and have been rediscovered many times. Theorem 1 was first stated by Lévy (1881) with additional assumptions. In a short but influential paper, Taussky (1949) pointed out the recurring nature of the theorems and gave simple proofs (our proof of Theorem 2 is Taussky’s). Schneider (1977) attributes the surge in interest in matrix theory in the 1950s and 1960s to Taussky’s paper and a few others by her, Brauer, Ostrowski, and Wielandt. The history of Gershgorin’s theorem (published in 1931) is intertwined with that of Theorems 1 and 2; see Varga’s 2004 book for details.

Theorems 4 and 5 are from Varah (1975) and Theorem 6 is from Ostrowski (1952).

References

This is a minimal set of references, which contain further useful references within.