What Is a Block Matrix?

A matrix is a rectangular array of numbers treated as a single object. A block matrix is a matrix whose elements are themselves matrices, which are called submatrices. By allowing a matrix to be viewed at different levels of abstraction, the block matrix viewpoint enables elegant proofs of results and facilitates the development and understanding of numerical algorithms.

A block matrix is defined in terms of a partitioning, which breaks a matrix into contiguous pieces. The most common and important case is for an n\times n matrix to be partitioned as a block 2\times 2 matrix (two block rows and two block columns). For n = 4, partitioning into 2\times 2 blocks gives

\notag   A = \left[\begin{array}{cc|cc}         a_{11} & a_{12} & a_{13} & a_{14}\\         a_{21} & a_{22} & a_{23} & a_{24}\\\hline         a_{31} & a_{32} & a_{33} & a_{34}\\         a_{41} & a_{42} & a_{43} & a_{44}\\         \end{array}\right]    =  \begin{bmatrix}         A_{11} & A_{12} \\         A_{21} & A_{22}        \end{bmatrix},

where

\notag  A_{11} =   A(1\colon2,1\colon2) = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},

and similarly for the other blocks. The diagonal blocks in a partitioning of a square matrix are usually square (but not necessarily so), and they do not have to be of the same dimensions. This same 4\times 4 matrix could be partitioned as

\notag   A = \left[\begin{array}{c|ccc}         a_{11} & a_{12} & a_{13} & a_{14}\\\hline         a_{21} & a_{22} & a_{23} & a_{24}\\         a_{31} & a_{32} & a_{33} & a_{34}\\         a_{41} & a_{42} & a_{43} & a_{44}\\         \end{array}\right]    =  \begin{bmatrix}         A_{11} & A_{12} \\         A_{21} & A_{22}        \end{bmatrix},

where A_{11} = (a_{11}) is a scalar, A_{21} is a column vector, and A_{12} is a row vector.
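In software, partitioning is just submatrix indexing. As a quick NumPy illustration (with an arbitrary 4\times 4 matrix), the blocks of the first partitioning above can be extracted by slicing and reassembled with np.block:

```python
import numpy as np

A = np.arange(1.0, 17.0).reshape(4, 4)   # an arbitrary 4 x 4 matrix

# Block 2 x 2 partitioning: A11 = A(1:2,1:2), etc. (0-based slicing in NumPy).
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]

# Reassembling the blocks recovers A.
assert np.array_equal(np.block([[A11, A12], [A21, A22]]), A)
```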

The sum C = A + B of two block matrices A = (A_{ij}) and B = (B_{ij}) of the same dimensions is obtained by adding blockwise, provided that A_{ij} and B_{ij} have the same dimensions for all i and j; the result has the same block structure, with C_{ij} = A_{ij}+B_{ij}.

The product C = AB of an m\times n matrix A = (A_{ij}) and an n\times p matrix B = (B_{ij}) can be computed as C_{ij} = \sum_k A_{ik}B_{kj} as long as the products A_{ik}B_{kj} are all defined. In this case the matrices A and B are said to be conformably partitioned for multiplication. Here, C has as many block rows as A and as many block columns as B. For example,

\notag   AB =        \begin{bmatrix}         A_{11} & A_{12} \\         A_{21} & A_{22}        \end{bmatrix}        \begin{bmatrix}         B_{11} & B_{12} \\         B_{21} & B_{22}        \end{bmatrix}   =  \begin{bmatrix}         A_{11} B_{11} + A_{12} B_{21} & A_{11} B_{12} + A_{12} B_{22} \\         A_{21} B_{11} + A_{22} B_{21} & A_{21} B_{12} + A_{22} B_{22}        \end{bmatrix}

as long as all eight products A_{ik}B_{kj} are defined.
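As a numerical sanity check, here is a NumPy sketch (the dimensions and split points are arbitrary) verifying that the blockwise product agrees with the usual product when the partitioning is conformable:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))          # m x n
B = rng.standard_normal((6, 5))          # n x p

# Conformable partitioning: the column split of A matches the row split of B.
A11, A12, A21, A22 = A[:2, :3], A[:2, 3:], A[2:, :3], A[2:, 3:]
B11, B12, B21, B22 = B[:3, :2], B[:3, 2:], B[3:, :2], B[3:, 2:]

C = np.block([[A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
              [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22]])
assert np.allclose(C, A @ B)             # blockwise product equals A @ B
```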

Block matrix notation is an essential tool in numerical linear algebra. Here are some examples of its usage.

Matrix Factorization

For an n\times n matrix A with nonzero (1,1) element \alpha we can write

\notag   A =  \begin{bmatrix}         \alpha & b^T \\           c & D        \end{bmatrix}    =        \begin{bmatrix}         1 & 0 \\           c/\alpha & I_{n-1}        \end{bmatrix}       \begin{bmatrix}         \alpha  & b^T \\          0 & D - cb^T/\alpha        \end{bmatrix} =: L_1U_1.

The first row and column of L_1 have the correct form for a unit lower triangular matrix and likewise the first row and column of U_1 have the correct form for an upper triangular matrix. If we can find an LU factorization D - cb^T/\alpha = L_2U_2 of the (n-1)\times (n-1) Schur complement D - cb^T/\alpha, then U_1 = \mathrm{diag}(1,L_2)\,\widetilde{U}_1, where \widetilde{U}_1 = \begin{bmatrix} \alpha & b^T \\ 0 & U_2 \end{bmatrix} is upper triangular, and so A = L_1\,\mathrm{diag}(1,L_2)\cdot \widetilde{U}_1 is an LU factorization of A, the first factor being unit lower triangular. This construction is the basis of an inductive proof of the existence of an LU factorization (provided all the pivots are nonzero) and it also yields an algorithm for computing it.
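The construction translates directly into a short recursive routine. The following NumPy sketch (illustrative only: no pivoting, so all pivots are assumed nonzero, and no attempt is made at efficiency) computes an LU factorization this way:

```python
import numpy as np

def lu_recursive(A):
    # LU factorization without pivoting, built from the block 2 x 2 construction.
    # Assumes every pivot encountered is nonzero.
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return np.array([[1.0]]), A.copy()
    alpha, b, c, D = A[0, 0], A[0, 1:], A[1:, 0], A[1:, 1:]
    L2, U2 = lu_recursive(D - np.outer(c, b) / alpha)   # factorize the Schur complement
    L = np.block([[np.ones((1, 1)),       np.zeros((1, n - 1))],
                  [(c / alpha)[:, None],  L2                  ]])
    U = np.block([[np.array([[alpha]]),   b[None, :]          ],
                  [np.zeros((n - 1, 1)),  U2                  ]])
    return L, U

A = np.array([[4.0, 3.0, 2.0], [6.0, 3.0, 1.0], [2.0, 1.0, 5.0]])
L, U = lu_recursive(A)
assert np.allclose(L @ U, A)
```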

The same type of construction applies to other factorizations, such as Cholesky factorization, QR factorization, and the Schur decomposition.

Matrix Inverse

A useful formula for the inverse of a nonsingular block triangular matrix

\notag      T = \begin{bmatrix}         T_{11} & T_{12} \\           0 & T_{22}        \end{bmatrix}

is

\notag      T^{-1}        =        \begin{bmatrix}      T_{11}^{-1} & - T_{11}^{-1}T_{12}T_{22}^{-1}\\               0  & T_{22}^{-1}        \end{bmatrix},

which has the special case

\notag        \begin{bmatrix}         I & X \\           0 & I        \end{bmatrix}^{-1}    =        \begin{bmatrix}              I   & -X\\               0  & I        \end{bmatrix}.

If T is upper triangular then so are T_{11} and T_{22}. By taking the dimension of T_{11} to be the nearest integer to n/2, this formula can be used to construct a divide and conquer algorithm for computing T^{-1}, as in the sketch below.
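Here is a minimal NumPy sketch of such a divide and conquer inversion (illustrative only: it recurses down to 1\times 1 blocks and makes no attempt to be efficient):

```python
import numpy as np

def inv_triu(T):
    # Invert a nonsingular upper triangular T by divide and conquer, using
    # inv(T) = [[inv(T11), -inv(T11) @ T12 @ inv(T22)], [0, inv(T22)]].
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    if n == 1:
        return np.array([[1.0 / T[0, 0]]])
    k = (n + 1) // 2                              # split near n/2
    X11, X22 = inv_triu(T[:k, :k]), inv_triu(T[k:, k:])
    return np.block([[X11,                   -X11 @ T[:k, k:] @ X22],
                     [np.zeros((n - k, k)),  X22                   ]])

T = np.triu(np.random.default_rng(1).standard_normal((6, 6))) + 3 * np.eye(6)
assert np.allclose(inv_triu(T) @ T, np.eye(6))
```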

We note that \det(T) = \det(T_{11}) \det(T_{22}), a fact that will be used in the next section.

Determinantal Formulas

Block matrices provide elegant proofs of many results involving determinants. For example, consider the equations

\notag        \begin{bmatrix}         I & -A \\           0 & I        \end{bmatrix}        \begin{bmatrix}         I+AB & 0 \\           B & I        \end{bmatrix}      =        \begin{bmatrix}               I  & -A\\               B  & I        \end{bmatrix}      =        \begin{bmatrix}         I & 0 \\         B & I+BA        \end{bmatrix}        \begin{bmatrix}         I & -A \\         0 & I        \end{bmatrix},

which hold for any A and B such that AB and BA are defined. Taking determinants gives the formula \det(I + AB) = \det(I + BA). In particular we can take A = x, B = y^T, for n-vectors x and y, giving \det(I + xy^T) = 1 + y^Tx.
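These identities are easy to confirm numerically; here is a quick NumPy check with A and B of different shapes (the random matrices are purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))

# det(I + AB) = det(I + BA), even though AB is 3 x 3 and BA is 5 x 5.
assert np.isclose(np.linalg.det(np.eye(3) + A @ B),
                  np.linalg.det(np.eye(5) + B @ A))

# Rank-1 special case: det(I + x y^T) = 1 + y^T x.
x, y = rng.standard_normal(5), rng.standard_normal(5)
assert np.isclose(np.linalg.det(np.eye(5) + np.outer(x, y)), 1 + y @ x)
```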

Constructing Matrices with Required Properties

We can sometimes build a matrix with certain desired properties by a block construction. For example, if X is an n\times n involutory matrix (X^2 = I) then

\notag        \begin{bmatrix}         X & I \\         0 & -X        \end{bmatrix}

is a (block triangular) 2n\times 2n involutory matrix. And if A and B are any two n\times n matrices then

\notag        \begin{bmatrix}               I - BA & B \\               2A-ABA & AB-I        \end{bmatrix}

is involutory.
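Both constructions are easily checked numerically; here is a NumPy sketch with arbitrary choices of X, A, and B:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

X = np.diag([1.0, -1.0, 1.0, -1.0])        # a simple involutory matrix: X @ X = I
T = np.block([[X, np.eye(n)], [np.zeros((n, n)), -X]])
assert np.allclose(T @ T, np.eye(2 * n))   # block triangular involutory matrix

A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
M = np.block([[np.eye(n) - B @ A, B                  ],
              [2 * A - A @ B @ A, A @ B - np.eye(n)  ]])
assert np.allclose(M @ M, np.eye(2 * n))   # involutory for any A and B
```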

The Anti Block Diagonal Trick

For n\times n matrices A and B consider the anti block diagonal matrix

\notag     X = \begin{bmatrix}               0 & A \\               B & 0        \end{bmatrix}.

Note that

\notag     X^2 = \begin{bmatrix}               AB & 0 \\               0 & BA        \end{bmatrix}, \quad     X^{-1} = \begin{bmatrix}               0 & B^{-1} \\               A^{-1} & 0        \end{bmatrix},

where the formula for X^{-1} requires A and B to be nonsingular.

Using these properties one can show a relation between the matrix sign function and the principal matrix square root:

\notag     \mathrm{sign}\left(        \begin{bmatrix} 0 & A \\ I & 0 \end{bmatrix}                 \right)         = \begin{bmatrix} 0 & A^{1/2} \\ A^{-1/2} & 0 \end{bmatrix}.

This allows one to derive iterations for computing the matrix square root and its inverse from iterations for computing the matrix sign function.
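For example, applying the Newton iteration for the matrix sign function, X_{k+1} = (X_k + X_k^{-1})/2, to the block matrix above and reading off the blocks gives the Denman–Beavers iteration for A^{1/2} and A^{-1/2}. A minimal NumPy sketch (unscaled, and assuming A has no eigenvalues on the closed negative real axis):

```python
import numpy as np

def db_sqrt(A, iters=25):
    # Denman-Beavers iteration: Y_k -> A^{1/2}, Z_k -> A^{-1/2}.
    # Obtained by applying X <- (X + inv(X))/2 to the block matrix [[0, A], [I, 0]].
    Y, Z = np.asarray(A, dtype=float), np.eye(A.shape[0])
    for _ in range(iters):
        Y, Z = (Y + np.linalg.inv(Z)) / 2, (Z + np.linalg.inv(Y)) / 2
    return Y, Z

A = np.array([[4.0, 1.0], [1.0, 3.0]])     # symmetric positive definite example
Y, Z = db_sqrt(A)
assert np.allclose(Y @ Y, A) and np.allclose(Y @ Z, np.eye(2))
```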

It is easy to derive explicit formulas for all the powers of X, and hence for any power series evaluated at X. In particular, we have the formula

\notag   \mathrm{e}^X = \left[\begin{array}{cc}                   \cosh\sqrt{AB} & A (\sqrt{BA})^{-1} \sinh \sqrt{BA}                        \\[\smallskipamount]                       B(\sqrt{AB})^{-1} \sinh \sqrt{AB} &                      \cosh\sqrt{BA}                  \end{array}\right],

where \sqrt{Y} denotes any square root of Y. With B = I, this formula arises in the solution of the ordinary differential equation initial value problem y'' + Ay = 0, y(0)=y_0, y'(0)=y'_0.
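The formula is easy to verify numerically with SciPy's matrix function routines. In this sketch A and B are chosen close to the identity so that AB and BA have no eigenvalues on the negative real axis and the principal square roots are real:

```python
import numpy as np
from scipy.linalg import expm, sqrtm, coshm, sinhm, inv

rng = np.random.default_rng(4)
n = 3
A = np.eye(n) + 0.3 * rng.standard_normal((n, n))
B = np.eye(n) + 0.3 * rng.standard_normal((n, n))

X = np.block([[np.zeros((n, n)), A], [B, np.zeros((n, n))]])
sAB, sBA = sqrtm(A @ B), sqrtm(B @ A)

E = np.block([[coshm(sAB),                A @ inv(sBA) @ sinhm(sBA)],
              [B @ inv(sAB) @ sinhm(sAB), coshm(sBA)               ]])
assert np.allclose(expm(X), E)
```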

The best known instance of the trick is the case B = A^T. The eigenvalues of

\notag     X = \begin{bmatrix}               0 & A \\               A^T & 0        \end{bmatrix}

are plus and minus the singular values of A, together with |m-n| additional zeros if A is m\times n with m \ne n, and the eigenvectors of X and the singular vectors of A are also related. Consequently, by applying results or algorithms for symmetric matrices to X one obtains results or algorithms for the singular value decomposition of A.
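A quick NumPy check of this eigenvalue relation, with an arbitrary rectangular A:

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 5, 3
A = rng.standard_normal((m, n))

X = np.block([[np.zeros((m, m)), A], [A.T, np.zeros((n, n))]])
eigs = np.sort(np.linalg.eigvalsh(X))            # X is symmetric

sv = np.linalg.svd(A, compute_uv=False)
expected = np.sort(np.concatenate([sv, -sv, np.zeros(abs(m - n))]))
assert np.allclose(eigs, expected)               # +/- singular values plus |m - n| zeros
```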

References

This is a minimal set of references, which contain further useful references within.

  • Gene Golub and Charles F. Van Loan, Matrix Computations, fourth edition, Johns Hopkins University Press, Baltimore, MD, USA, 2013.
  • Nicholas J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008. (Sections 1.5 and 1.6 for the theory of matrix square roots.)
  • Roger A. Horn and Charles R. Johnson, Matrix Analysis, second edition, Cambridge University Press, Cambridge, UK, 2013.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.
