What Is a Stochastic Matrix?

A stochastic matrix is an n\times n matrix with nonnegative entries and unit row sums. If A\in\mathbb{R}^{n\times n} is stochastic then Ae = e, where e = [1,1,\dots,1]^T is the vector of ones. This means that e is an eigenvector of A corresponding to the eigenvalue 1.

The identity matrix is stochastic, as is any permutation matrix. Here are some other examples of stochastic matrices:

\notag \begin{aligned}   A_n &= n^{-1}ee^T, \quad \textrm{in particular}~~     A_3 = \begin{bmatrix}    \frac{1}{3} & \frac{1}{3} & \frac{1}{3}\\[3pt]    \frac{1}{3} & \frac{1}{3} & \frac{1}{3}\\[3pt]    \frac{1}{3} & \frac{1}{3} & \frac{1}{3}    \end{bmatrix}, \qquad (1)\\   B_n &= \frac{1}{n-1}(ee^T -I), \quad \textrm{in particular}~~     B_3 = \begin{bmatrix}              0 & \frac{1}{2} & \frac{1}{2}\\[2pt]    \frac{1}{2} & 0           & \frac{1}{2}\\[2pt]    \frac{1}{2} & \frac{1}{2} & 0    \end{bmatrix},    \qquad (2)\\   C_n &= \begin{bmatrix}             1         &                   &           &    \\      \frac{1}{2}      &  \frac{1}{2}      &           &    \\      \vdots           &  \vdots           &\ddots     &    \\      \frac{1}{n}      &  \frac{1}{n}      &\cdots     &  \frac{1}{n}      \end{bmatrix}.   \qquad (3)\\ \end{aligned}

For any matrix A, the spectral radius \rho(A) = \max\{ |\lambda| : \lambda \mathrm{~is~an~eigenvalue~of~} A \} is bounded by \rho(A) \le \|A\| for any norm. For a stochastic matrix, taking the \infty-norm (the maximum row sum of absolute values) gives \rho(A) \le 1, so since we know that 1 is an eigenvalue, \rho(A) = 1. It can be shown that 1 is a semisimple eigenvalue, that is, if there are k eigenvalues equal to 1 then there are k linearly independent eigenvectors corresponding to 1 (Meyer, 2000, p. 696).

A lower bound on the spectral radius can be obtained from Gershgorin’s theorem. The ith Gershgorin disc is defined by |\lambda - a_{ii}| \le \sum_{j\ne i} |a_{ij}| = 1 - a_{ii}, which implies |\lambda| \ge 2a_{ii}-1. Every eigenvalue \lambda lies in the union of the n discs and so must satisfy

\notag         2\min_i a_{ii}-1 \le |\lambda| \le 1.

The lower bound is positive if A is strictly diagonally dominant by rows.

If A and B are stochastic then AB is nonnegative and ABe = Ae = e, so AB is stochastic. In particular, any positive integer power of A is stochastic. Does A^k converge as k\to\infty? The answer is that it does, and the limit is stochastic, as long as 1 is the only eigenvalue of modulus 1, and this will be the case if all the elements of A are positive (by Perron’s theorem). The simplest example of non-convergence is the stochastic matrix

\notag    A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},

which has eigenvalues 1 and -1. Since A^2 = I, all even powers are equal to I and all odd powers are equal to A. For the matrix (1), A_n^k = A_n for all k, while for (2), B_n^k \to A_n as k\to\infty. For (3), C_n^k converges to the matrix with 1 in every entry of the first column and zeros everywhere else.

Stochastic matrices arise in Markov chains. A transition matrix for a finite homogeneous Markov chain is a matrix whose (i,j) element is the probability of moving from state i to state j over a time step. It has nonnegative entries and the rows sum to 1, so it is a stochastic matrix. In applications including finance and healthcare, a transition matrix may be estimated for a certain time period, say one year, but a transition matrix for a shorter period, say one month, may be needed. If A is a transition matrix for a time period p then a stochastic pth root of A, which is a stochastic solution X of the equation X^p = A, is a transition matrix for a time period a factor p smaller. Therefore p = 12 (years to months) and p = 7 (weeks to days) are among the values of interest. Unfortunately, a stochastic pth root may not exist. For example, the matrix

\notag \begin{bmatrix}    0 & 1 & 0 \\    0 & 0 & 1 \\    0 & 0 & 1 \end{bmatrix}

has no pth roots at all, let alone stochastic ones. Yet many stochastic matrices do have stochastic roots. The matrix (3) has a stochastic pth root for all p, as shown by Higham and Lin (2011), who give a detailed analysis of pth roots of stochastic matrices. For example, to four decimal places,

\notag  C_6^{1/7} = \begin{bmatrix} 1.000  &        &        &        &        &        \\ 0.094  & 0.906  &        &        &        &        \\ 0.043  & 0.102  & 0.855  &        &        &        \\ 0.027  & 0.050  & 0.103  & 0.820  &        &        \\ 0.019  & 0.032  & 0.052  & 0.103  & 0.795  &        \\ 0.014  & 0.023  & 0.034  & 0.053  & 0.102  & 0.774  \\ \end{bmatrix}.

A stochastic matrix is sometime called a row-stochastic matrix to distinguish it from a column-stochastic matrix, which is a nonnegative matrix for which e^TA = e^T (so that A^T is row-stochastic). A matrix that is both row-stochastic and column-stochastic is called doubly stochastic. A permutation matrix is an example of a doubly stochastic matrix. If U is a unitary matrix then the matrix with a_{ij} = |u_{ij}|^2 is doubly stochastic. A magic square scaled by the magic sum is also doubly stochastic. For example,

>> M = magic(4), A = M/sum(M(1,:)) % A is doubly stochastic.
M =
    16     2     3    13
     5    11    10     8
     9     7     6    12
     4    14    15     1
A =
   4.7059e-01   5.8824e-02   8.8235e-02   3.8235e-01
   1.4706e-01   3.2353e-01   2.9412e-01   2.3529e-01
   2.6471e-01   2.0588e-01   1.7647e-01   3.5294e-01
   1.1765e-01   4.1176e-01   4.4118e-01   2.9412e-02
>> [sum(A) sum(A')]
ans =
     1     1     1     1     1     1     1     1
>> eig(A)'
ans =
   1.0000e+00   2.6307e-01  -2.6307e-01   8.5146e-18

References

  • Nicholas J. Higham and Lijing Lin, On pth Roots of Stochastic Matrices, Linear Algebra Appl. 435, 448–463, 2011.
  • Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2000.

Related Blog Posts

Leave a comment