What Is a Block Matrix?

A matrix is a rectangular array of numbers treated as a single object. A block matrix is a matrix whose elements are themselves matrices, which are called submatrices. By allowing a matrix to be viewed at different levels of abstraction, the block matrix viewpoint enables elegant proofs of results and facilitates the development and understanding of numerical algorithms.

A block matrix is defined in terms of a partitioning, which breaks a matrix into contiguous pieces. The most common and important case is for an $n\times n$ matrix to be partitioned as a block $2\times 2$ matrix (two block rows and two block columns). For $n = 4$, partitioning into $2\times 2$ blocks gives

$\notag A = \left[\begin{array}{cc|cc} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\\hline a_{31} & a_{32} & a_{33} & a_{34}\\ a_{41} & a_{42} & a_{43} & a_{44}\\ \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$

where

$\notag A_{11} = A(1\colon2,1\colon2) = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},$

and similarly for the other blocks. The diagonal blocks in a partitioning of a square matrix are usually square (but not necessarily so), and they do not have to be of the same dimensions. This same $4\times 4$ matrix could be partitioned as

$\notag A = \left[\begin{array}{c|ccc} a_{11} & a_{12} & a_{13} & a_{14}\\\hline a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34}\\ a_{41} & a_{42} & a_{43} & a_{44}\\ \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},$

where $A_{11} = (a_{11})$ is a scalar, $A_{21}$ is a column vector, and $A_{12}$ is a row vector.
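These partitionings are convenient to express in code, where blocks are simply submatrix slices. A minimal NumPy sketch (the matrix entries and split points are arbitrary choices for illustration):

```python
import numpy as np

# An arbitrary 4x4 matrix for illustration.
A = np.arange(1.0, 17.0).reshape(4, 4)

# 2x2 block partitioning: A11 = A(1:2, 1:2), etc.
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]

# Reassembling the blocks recovers A.
B = np.block([[A11, A12], [A21, A22]])
print(np.array_equal(A, B))  # True
```

The second partitioning above corresponds to the splits `A[:1, :1]`, `A[:1, 1:]`, `A[1:, :1]`, and `A[1:, 1:]`.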

The sum $C = A + B$ of two block matrices $A = (A_{ij})$ and $B = (B_{ij})$ of the same dimension is obtained by adding blockwise, $C_{ij} = A_{ij}+B_{ij}$, as long as $A_{ij}$ and $B_{ij}$ have the same dimensions for all $i$ and $j$; the result has the same block structure.

The product $C = AB$ of an $m\times n$ matrix $A = (A_{ij})$ and an $n\times p$ matrix $B = (B_{ij})$ can be computed as $C_{ij} = \sum_k A_{ik}B_{kj}$ as long as the products $A_{ik}B_{kj}$ are all defined. In this case the matrices $A$ and $B$ are said to be conformably partitioned for multiplication. Here, $C$ has as many block rows as $A$ and as many block columns as $B$. For example,

$\notag AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11} B_{11} + A_{12} B_{21} & A_{11} B_{12} + A_{12} B_{22} \\ A_{21} B_{11} + A_{22} B_{21} & A_{21} B_{12} + A_{22} B_{22} \end{bmatrix}$

as long as all eight products $A_{ik}B_{kj}$ are defined.
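As a numerical check of the blockwise product rule, the sketch below partitions two rectangular matrices conformably (the sizes and split points are arbitrary) and compares the block formula with the ordinary product:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 7))
B = rng.standard_normal((7, 3))

# Conformable partitioning: the column split of A (4 + 3)
# must match the row split of B.
A11, A12 = A[:2, :4], A[:2, 4:]
A21, A22 = A[2:, :4], A[2:, 4:]
B11, B12 = B[:4, :2], B[:4, 2:]
B21, B22 = B[4:, :2], B[4:, 2:]

# The block formula C_ij = sum_k A_ik B_kj.
C = np.block([[A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
              [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22]])
print(np.allclose(C, A @ B))  # True
```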

Block matrix notation is an essential tool in numerical linear algebra. Here are some examples of its usage.

Matrix Factorization

For an $n\times n$ matrix $A$ with nonzero $(1,1)$ element $\alpha$ we can write

$\notag A = \begin{bmatrix} \alpha & b^T \\ c & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ c/\alpha & I_{n-1} \end{bmatrix} \begin{bmatrix} \alpha & b^T \\ 0 & D - cb^T/\alpha \end{bmatrix} =: L_1U_1.$

The first row and column of $L_1$ have the correct form for a unit lower triangular matrix, and likewise the first row and column of $U_1$ have the correct form for an upper triangular matrix. If we can find an LU factorization $L_2U_2$ of the $(n-1)\times (n-1)$ Schur complement $S = D - cb^T/\alpha$ then

$\notag A = \begin{bmatrix} 1 & 0 \\ c/\alpha & L_2 \end{bmatrix} \begin{bmatrix} \alpha & b^T \\ 0 & U_2 \end{bmatrix}$

is an LU factorization of $A$. This construction is the basis of an inductive proof of the existence of an LU factorization (provided all the pivots are nonzero), and it also yields an algorithm for computing the factorization.
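The construction translates directly into a recursive algorithm. The following NumPy sketch (the function name is ours; no pivoting is done, so all pivots are assumed nonzero) computes an LU factorization this way:

```python
import numpy as np

def block_lu(A):
    """LU factorization via the recursive Schur complement construction.
    A sketch: assumes all pivots are nonzero; no pivoting is performed."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return np.ones((1, 1)), A.copy()
    alpha = A[0, 0]
    b = A[0, 1:]                      # the row vector b^T
    c = A[1:, 0]                      # the column vector c
    D = A[1:, 1:]
    S = D - np.outer(c, b) / alpha    # Schur complement
    L2, U2 = block_lu(S)              # recurse on the (n-1) x (n-1) block
    L = np.eye(n)
    L[1:, 0] = c / alpha
    L[1:, 1:] = L2
    U = np.zeros((n, n))
    U[0, 0] = alpha
    U[0, 1:] = b
    U[1:, 1:] = U2
    return L, U

A = np.array([[4., 3., 2.], [6., 3., 1.], [2., 1., 5.]])
L, U = block_lu(A)
print(np.allclose(L @ U, A))  # True
```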

The same type of construction applies to other factorizations, such as Cholesky factorization, QR factorization, and the Schur decomposition.

Matrix Inverse

A useful formula for the inverse of a nonsingular block triangular matrix

$\notag T = \begin{bmatrix} T_{11} & T_{12} \\ 0 & T_{22} \end{bmatrix}$

is

$\notag T^{-1} = \begin{bmatrix} T_{11}^{-1} & - T_{11}^{-1}T_{12}T_{22}^{-1}\\ 0 & T_{22}^{-1} \end{bmatrix},$

which has the special case

$\notag \begin{bmatrix} I & X \\ 0 & I \end{bmatrix}^{-1} = \begin{bmatrix} I & -X\\ 0 & I \end{bmatrix}.$

If $T$ is upper triangular then so are $T_{11}$ and $T_{22}$. By taking the dimension of $T_{11}$ to be the integer nearest to $n/2$, this formula can be used to construct a divide and conquer algorithm for computing $T^{-1}$.
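A sketch of that divide and conquer algorithm in NumPy (the function name is ours, and no checks for singularity are made):

```python
import numpy as np

def tri_inv(T):
    """Invert an upper triangular matrix by divide and conquer,
    using the block formula T^{-1} = [X11, -X11 T12 X22; 0, X22].
    A sketch: assumes nonzero diagonal entries."""
    n = T.shape[0]
    if n == 1:
        return np.array([[1.0 / T[0, 0]]])
    k = (n + 1) // 2                  # dimension of T11: about n/2
    T11, T12, T22 = T[:k, :k], T[:k, k:], T[k:, k:]
    X11 = tri_inv(T11)
    X22 = tri_inv(T22)
    X12 = -X11 @ T12 @ X22            # the (1,2) block of the inverse
    X = np.zeros_like(T)
    X[:k, :k], X[:k, k:], X[k:, k:] = X11, X12, X22
    return X

T = np.triu(np.arange(1.0, 37.0).reshape(6, 6))   # arbitrary test matrix
print(np.allclose(tri_inv(T) @ T, np.eye(6)))     # True
```

The two half-size inverses are independent, so this recursion also parallelizes naturally.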

We note that $\det(T) = \det(T_{11}) \det(T_{22})$, a fact that will be used in the next section.

Determinantal Formulas

Block matrices provide elegant proofs of many results involving determinants. For example, consider the equations

$\notag \begin{bmatrix} I & -A \\ 0 & I \end{bmatrix} \begin{bmatrix} I+AB & 0 \\ B & I \end{bmatrix} = \begin{bmatrix} I & -A\\ B & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ B & I+BA \end{bmatrix} \begin{bmatrix} I & -A \\ 0 & I \end{bmatrix},$

which hold for any $A$ and $B$ such that $AB$ and $BA$ are defined. Taking determinants gives the formula $\det(I + AB) = \det(I + BA)$. In particular we can take $A = x$, $B = y^T$, for $n$-vectors $x$ and $y$, giving $\det(I + xy^T) = 1 + y^Tx$.
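Both identities are easy to confirm numerically; the sizes below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))   # I + AB is 4x4, I + BA is 6x6
B = rng.standard_normal((6, 4))

d1 = np.linalg.det(np.eye(4) + A @ B)
d2 = np.linalg.det(np.eye(6) + B @ A)
print(np.isclose(d1, d2))  # True

# Rank-1 special case: det(I + x y^T) = 1 + y^T x.
x, y = rng.standard_normal(5), rng.standard_normal(5)
print(np.isclose(np.linalg.det(np.eye(5) + np.outer(x, y)), 1 + y @ x))  # True
```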

Constructing Matrices with Required Properties

We can sometimes build a matrix with certain desired properties by a block construction. For example, if $X$ is an $n\times n$ involutory matrix ($X^2 = I$) then

$\notag \begin{bmatrix} X & I \\ 0 & -X \end{bmatrix}$

is a (block triangular) $2n\times 2n$ involutory matrix. And if $A$ and $B$ are any two $n\times n$ matrices then

$\notag \begin{bmatrix} I - BA & B \\ 2A-ABA & AB-I \end{bmatrix}$

is involutory.
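A quick numerical confirmation of both constructions, using a Householder reflector $X = I - 2vv^T/(v^Tv)$ as a convenient involutory matrix for the first one:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
I = np.eye(n)

# A Householder reflector is involutory: X @ X = I.
v = rng.standard_normal((n, 1))
X = I - 2 * (v @ v.T) / (v.T @ v)
Y = np.block([[X, I], [np.zeros((n, n)), -X]])
print(np.allclose(Y @ Y, np.eye(2 * n)))  # True

# The second construction works for any two n x n matrices A and B.
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = np.block([[I - B @ A, B],
              [2 * A - A @ B @ A, A @ B - I]])
print(np.allclose(M @ M, np.eye(2 * n)))  # True
```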

The Anti Block Diagonal Trick

For $n\times n$ matrices $A$ and $B$ consider the anti block diagonal matrix

$\notag X = \begin{bmatrix} 0 & A \\ B & 0 \end{bmatrix}.$

Note that

$\notag X^2 = \begin{bmatrix} AB & 0 \\ 0 & BA \end{bmatrix}, \quad X^{-1} = \begin{bmatrix} 0 & B^{-1} \\ A^{-1} & 0 \end{bmatrix},$

where the formula for $X^{-1}$ requires $A$ and $B$ to be nonsingular.

Using these properties one can show a relation between the matrix sign function and the principal matrix square root:

$\notag \mathrm{sign}\left( \begin{bmatrix} 0 & A \\ I & 0 \end{bmatrix} \right) = \begin{bmatrix} 0 & A^{1/2} \\ A^{-1/2} & 0 \end{bmatrix}.$

This allows one to derive iterations for computing the matrix square root and its inverse from iterations for computing the matrix sign function.
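For instance, applying the Newton iteration for the matrix sign function, $S \leftarrow (S + S^{-1})/2$, to the block matrix above yields the principal square root and its inverse together. A sketch, assuming $A$ is symmetric positive definite so that the iteration is safely convergent:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
R = rng.standard_normal((n, n))
A = R @ R.T + n * np.eye(n)       # symmetric positive definite test matrix

# Newton iteration for sign([[0, A], [I, 0]]).
S = np.block([[np.zeros((n, n)), A],
              [np.eye(n), np.zeros((n, n))]])
for _ in range(50):
    S = (S + np.linalg.inv(S)) / 2

sqrtA = S[:n, n:]                 # the (1,2) block converges to A^{1/2}
invsqrtA = S[n:, :n]              # the (2,1) block converges to A^{-1/2}
print(np.allclose(sqrtA @ sqrtA, A))           # True
print(np.allclose(sqrtA @ invsqrtA, np.eye(n)))  # True
```

In practice one would iterate only on the blocks, halving the cost, which is how such square root iterations are actually derived.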

It is easy to derive explicit formulas for all the powers of $X$, and hence for any power series evaluated at $X$. In particular, we have the formula

$\notag \mathrm{e}^X = \left[\begin{array}{cc} \cosh\sqrt{AB} & A (\sqrt{BA})^{-1} \sinh \sqrt{BA} \\[\smallskipamount] B(\sqrt{AB})^{-1} \sinh \sqrt{AB} & \cosh\sqrt{BA} \end{array}\right],$

where $\sqrt{Y}$ denotes any square root of $Y$. With $B = I$, this formula arises in the solution of the ordinary differential equation initial value problem $y'' + Ay = 0$, $y(0)=y_0$, $y'(0)=y'_0$.
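A quick check of the formula in the simplest case of $1\times 1$ blocks, computing $\mathrm{e}^X$ independently via an eigendecomposition (the scalar values are arbitrary):

```python
import numpy as np

a, b = 2.0, 0.5
s = np.sqrt(a * b)
X = np.array([[0.0, a], [b, 0.0]])

# e^X via eigendecomposition: X = V diag(w) V^{-1}, so e^X = V diag(e^w) V^{-1}.
w, V = np.linalg.eig(X)
eX = (V * np.exp(w)) @ np.linalg.inv(V)

# The block formula with 1x1 blocks A = a, B = b.
expected = np.array([[np.cosh(s), a * np.sinh(s) / s],
                     [b * np.sinh(s) / s, np.cosh(s)]])
print(np.allclose(eX, expected))  # True
```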

The best known instance of the trick is when $B = A^T$. The eigenvalues of

$\notag X = \begin{bmatrix} 0 & A \\ A^T & 0 \end{bmatrix}$

are plus and minus the singular values of $A$, together with $|m-n|$ additional zeros if $A$ is $m\times n$ with $m \ne n$, and the eigenvectors of $X$ and the singular vectors of $A$ are also related. Consequently, by applying results or algorithms for symmetric matrices to $X$ one obtains results or algorithms for the singular value decomposition of $A$.
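This correspondence is easy to verify numerically (the sizes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# The symmetric matrix X = [[0, A], [A^T, 0]].
X = np.block([[np.zeros((m, m)), A],
              [A.T, np.zeros((n, n))]])

eig = np.sort(np.linalg.eigvalsh(X))
sv = np.linalg.svd(A, compute_uv=False)

# Eigenvalues of X: +/- the singular values of A, plus |m - n| zeros.
expected = np.sort(np.concatenate([sv, -sv, np.zeros(abs(m - n))]))
print(np.allclose(eig, expected))  # True
```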

References

This is a minimal set of references, which contain further useful references within.

• Gene Golub and Charles F. Van Loan, Matrix Computations, fourth edition, Johns Hopkins University Press, Baltimore, MD, USA, 2013.
• Nicholas J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008. (Sections 1.5 and 1.6 for the theory of matrix square roots.)
• Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, 2013.