What Is a Fractional Matrix Power?

A pth root of an n\times n matrix A is a matrix X such that X^p = A, and it can be written X = A^{1/p}. For a rational number r = j/k (where j and k are integers), defining A^r is more difficult: is it (A^j)^{1/k} or (A^{1/k})^j? These two possibilities can be different even for n = 1. More generally, how can we define A^\alpha for an arbitrary real number \alpha?

Recall, first, that for a nonzero complex scalar z we define z^\alpha = \mathrm{e}^{\alpha\log z}, where \log is the principal logarithm: the one taking values in the strip \{\, z : -\pi <  \mathop{\mathrm{Im}} z \le \pi \,\}. We can generalize this definition to matrices. For a nonsingular matrix A we define

\notag        A^\alpha = \mathrm{e}^{\alpha\log A}.

Here the logarithm is the principal matrix logarithm, the matrix function built on the principal scalar logarithm, and so the eigenvalues of \log A lie in the strip \{\, z : -\pi <  \mathop{\mathrm{Im}} z \le \pi \,\}. When \alpha \in [-1,1], the eigenvalues of A^\alpha, which are \lambda^\alpha where \lambda is an eigenvalue of A, lie in the segment \{\, z: -|\alpha|\pi < \arg z \le |\alpha|\pi\,\} of the complex plane.

The most important special case is \alpha = 1/p for a positive integer p, in which case

\notag        A^{1/p} = \mathrm{e}^{\frac{1}{p}\log A}.

We can check that

\notag    \bigl(A^{1/p}\bigr)^p = \bigl(\mathrm{e}^{\frac{1}{p}\log A}\bigr)^p                          = \mathrm{e}^{\log A} = A,

so the definition does indeed produce a pth root. The matrix A^{1/p} is called the principal pth root.

Returning to the case of rational p, we note that

\notag    A^{j/k} = \mathrm{e}^{\frac{j}{k}\log A}            = \Bigl(\mathrm{e}^{\frac{1}{k}\log A}\Bigr)^j            = (A^{1/k})^j,

but (A^{j})^{1/k} can be a different matrix. In general, it is not true that (A^\alpha)^\beta = (A^\beta)^\alpha for real \alpha and \beta, although for symmetric positive definite matrices this identity does hold because the eigenvalues are real and positive.

An integral expression for A^\alpha valid for \alpha \in(0,1) is

\notag    A^\alpha  = \displaystyle\frac{\sin(\alpha\pi)} {\alpha\pi} A \int_0^{\infty}(t^{1/\alpha}I+A)^{-1}\,\mathrm{d}t,    \quad \alpha \in(0,1).

Another representation for A = t (I - B) with \rho(B) < 1 is given by the binomial expansion

\notag     A^{\alpha} =  t^{\alpha}          \displaystyle\sum_{j=0}^\infty {\alpha \choose j}(-B)^j,     \quad \rho(B) < 1.

For 2\times 2 real matrices of the form

\notag A =  \begin{bmatrix}        a & b \\        c & a      \end{bmatrix}

with bc < 0 there is an explicit formula for A^\alpha. It is easy to see that A has eigenvalues \lambda_{\pm} = a \pm \mathrm{i} d, where d = (-bc)^{1/2}. Let \theta = \arg(\lambda_+) \in (0,\pi) and r = |\lambda_+|. It can be shown that

\notag A^\alpha = \displaystyle\frac{r^\alpha}{d}        \begin{bmatrix}            d\cos(\alpha\theta) & b\sin(\alpha\theta) \\            c\sin(\alpha\theta) & d\cos(\alpha\theta)        \end{bmatrix}.

Computation

The formula A^\alpha = \mathrm{e}^{\alpha \log A} can be used computationally, but it is somewhat indirect in that one must approximate both the exponential and the logarithm. A more direct algorithm based on the Schur decomposition and Padé approximation of the power function is developed by Higham and Lin (2013). MATLAB code is available from MathWorks File Exchange.

If A is diagonalizable, so that A = XDX^{-1} for some nonsingular X with D = \mathrm{diag}(\lambda_i), then A^\alpha = X D^\alpha X^{-1} = X \mathrm{diag}(\lambda_i^\alpha) X^{-1}. This formula is safe to use computationally only if X is well conditioned. For defective (non-diagonalizable) A we can express A^\alpha in terms of the Jordan canonical form, but this expression is not useful computationally because the Jordan canonical form cannot be reliably computed.

Inverse Function

If X = A^{1/2} then A = X^2, by the definition of square root. If X = A^{\alpha} does it follow that A = X^{1/\alpha}? Clearly, the answer is “no” in general because, for example, X = A^2 does not imply A = X^{1/2}.

Using the matrix unwinding function it can be shown that (A^{\alpha})^{1/\alpha} = A for \alpha \in [-1,1]. Hence the function g(A) = A^{1/\alpha} is the inverse function of f(A) = A^{\alpha} for \alpha\in[-1,1].

Backward Error

How can we check the quality of an approximation X to A^{\alpha}? For \alpha = 1/p we can check the residual A - X^p, but for real p there is no natural residual. Instead we can look at the backward error.

For an approximation X to A^\alpha, a backward error is a matrix \Delta A such that X = (A + \Delta A)^\alpha. Assume that A and X are nonsingular and that \alpha \in [-1,1]. Then, as shown in the previous section, X = (A + \Delta A)^\alpha implies \Delta A = X^{1/\alpha} - A. Hence the normwise relative backward error is

\notag      \eta(X) = \displaystyle\frac{ \|X^{1/\alpha} - A \|}{\|A\|}, \quad \alpha\in[-1,1].

Applications with Stochastic Matrices

An important application of fractional matrix powers is in discrete-time Markov chains, which arise in areas including finance and medicine. A transition matrix for a Markov process is a matrix whose (i,j) element is the probability of moving from state i to state j over a time step. It has nonnegative entries and the rows sum to 1, so it is a stochastic matrix. In practice, a transition matrix may be estimated for a certain time period, say one year, but a transition matrix for a shorter period, say one month, may be needed. If A is a transition matrix for a time period p then a stochastic pth root of A is a transition matrix for a time period a factor p smaller. Therefore p = 12 (years to months) and p = 7 (weeks to days) are among the values of interest. Unfortunately, A^{1/p} is not necessarily a stochastic matrix. Moreover, A can have a stochastic pth root that is not A^{1/p}. For example, the stochastic matrix

\notag    A = \left[\begin{array}{ccc}        0  & 1 & 0\\        0  & 0 & 1\\        1  & 0 & 0\\        \end{array} \right]

has principal square root

\notag    A^{1/2} = \frac{1}{3} \left[\begin{array}{rrr}        2  & 2 & -1 \\        -1 & 2 & 2 \\        2 & -1 & 2        \end{array} \right],

but A^{1/2} is not stochastic because of the negative elements. The square root

\notag         Y = \left[\begin{array}{ccc}          0  & 0  &  1\\          1  & 0  &  0\\          0  & 1  &  0\\          \end{array} \right]

is stochastic, though. (Interestingly, A is also a square root of Y!)

A wide variety configurations is possible as regards existence, nature (primary or nonprimary), and number of stochastic roots. Higham and Lin (2011) delineate the various possibilities that can arise. They note that the stochastic lower triangular matrix

\notag    A_n = \begin{bmatrix}             1         &                   &           &    \\      \frac{1}{2}      &  \frac{1}{2}      &           &    \\      \vdots           &  \vdots           &\ddots     &    \\      \frac{1}{n}      &  \frac{1}{n}      &\cdots     &  \frac{1}{n}    \end{bmatrix}

has a stochastic pth root, namely A_n^{1/p}, for all p. For example, to three significant figures,

\notag  A_6^{1/3} = \begin{bmatrix} 1.000  &        &        &        &        &        \\ 0.206  & 0.794  &        &        &        &        \\ 0.106  & 0.201  & 0.693  &        &        &        \\ 0.069  & 0.111  & 0.190  & 0.630  &        &        \\ 0.050  & 0.075  & 0.109  & 0.181  & 0.585  &        \\ 0.039  & 0.056  & 0.076  & 0.107  & 0.172  & 0.550  \\ \end{bmatrix}.

The existence of stochastic roots of stochastic matrices is connected with the embeddability problem, which asks when a nonsingular stochastic matrix A can be written A = \mathrm{e}^Q for some Q with q_{ij} \ge 0 for i\ne j and \sum_{j=1}^n q_{ij} = 0, i=1\colon n. Kingman showed in 1962 that this condition holds if and only if for every positive integer p there exists a stochastic X_p such that A = X_p^p.

Applications in Fractional Differential Equations

Fractional matrix powers arise in the numerical solution of differential equations of fractional order, especially partial differential equations involving fractional Laplace operators. Here, the problem may be one of computing A^\alpha b, in which case for large problems it is preferable to directly approximate A^\alpha b, for example by Krylov methods or quadrature methods, rather than to explicitly compute A^\alpha.

References

This is a minimal set of references, which contain further useful references within.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s