In numerical linear algebra a blocked algorithm organizes a computation so that it works on contiguous chunks of data. A blocked algorithm and the corresponding unblocked algorithm (with blocks of size ) are mathematically equivalent, but the blocked algorithm is generally more efficient on modern computers.
A simple example of blocking is in computing an inner product of vectors
. The unblocked algorithm
- for
-
- end
can be expressed in blocked form, with block size , as
- for
-
% Compute by the unblocked algorithm.
- end
The sums of terms in line 2 of the blocked algorithm are independent and could be evaluated in parallel, whereas the unblocked form is inherently sequential.
To see the full benefits of blocking we need to consider an algorithm operating on matrices, of which matrix multiplication is the most important example. Suppose we wish to compute the product of
matrices
and
. The natural computation is, from the definition of matrix multiplication, the “point algorithm”
- for
- for
- for
-
- end
- end
- end
Let ,
, and
be partitioned into blocks of size
, where
is assumed to be an integer. The blocked computation is
- for
- for
- for
-
- end
- end
- end
On a computer with a hierarchical memory the blocked form can be much more efficient than the point form if the blocks fit into the high speed memory, as much less data transfer is required. Indeed line 5 of the blocked algorithm performs flops on about
data, whereas the point algorithm performs
flops on
data on line 5, or
flops on
data if we combine lines 4–6 into a vector inner product. It is the
flops-to-data ratio that gives the blocked algorithm its advantage, because it masks the memory access costs.
The LAPACK (first released in 1992) was the first program library to systematically use blocked algorithms for a wide range of linear algebra computations.
An extra advantage that blocked algorithms have over unblocked algorithms is a reduction in the error constant in a rounding error bound by a factor or more and, typically, a reduction in the actual error (Higham, 2021).
The adjective “block” is sometimes used in place of “blocked”, but we prefer to reserve “block” for the meaning described in the next section.
Block Algorithms
A block algorithm is a generalization of a scalar algorithm in which the basic scalar operations become matrix operations (,
, and
become
,
, and
). It usually computes a block factorization, in which a matrix property based on the nonzero structure becomes the corresponding property blockwise; in particular, the scalars 0 and 1 become the zero matrix and the identity matrix, respectively.
An example of a block factorization is block LU factorization. For a matrix
an LU factorization can be written in the form
A block LU factorization has the form
Clearly, these are different factorizations. In general, a block LU factorization has the form with
block lower triangular, with identity matrices on the diagonal, and
block upper triangular. A blocked algorithm for computing the LU factorization and a block algorithm for computing the block LU factorization are readily derived.
The adjective “block” also applies to a variety of matrix properties, for which there are often special-purpose block algorithms. For example, the matrix
is a block tridiagonal matrix, and a block Toeplitz matrix has constant block diagonals:
One can define block diagonal dominance as a generalization of diagonal dominance. A block Householder matrix generalizes a Householder matrix: it is a perturbation of the identity matrix by a matrix of rank greater than or equal to .
References
This is a minimal set of references, which contain further useful references within.
- Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002. (Chapter 13, “Block LU Factorization”.)
- Nicholas J. Higham, Numerical Stability of Algorithms at Extreme Scale and Low Precisions, MIMS EPrint 2021.14, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, September 2021.
Related Blog Posts
- Can We Solve Linear Algebra Problems at Extreme Scale and Low Precisions? (2021)
- What Is a Block Matrix? (2020)
- What Is an LU Factorization? (2021)
This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.