# What Is a Blocked Algorithm?

In numerical linear algebra a blocked algorithm organizes a computation so that it works on contiguous chunks of data. A blocked algorithm and the corresponding unblocked algorithm (with blocks of size $1$) are mathematically equivalent, but the blocked algorithm is generally more efficient on modern computers.

A simple example of blocking is in computing an inner product $s = x^Ty$ of vectors $x,y\in\mathbb{R}^n$. The unblocked algorithm

1. $s = x_1y_1$
2. for $i = 2\colon n$
3.    $s = s + x_iy_i$
4. end

can be expressed in blocked form, with block size $b$, as

1. for $i=1\colon n/b$
2.    $s_i = \sum_{j=(i-1)b+1}^{ib} x_jy_j$ % Compute by the unblocked algorithm.
3. end
4. $s = \sum_{i=1}^{n/b} s_i$

The sums of $b$ terms in line 2 of the blocked algorithm are independent and could be evaluated in parallel, whereas the unblocked form is inherently sequential.
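The blocked inner product is easy to express in code. Here is a minimal pure Python sketch (the function names and block size are illustrative):

```python
def inner_unblocked(x, y):
    # Sequential accumulation: s = x_1*y_1, then s = s + x_i*y_i.
    s = 0.0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s

def inner_blocked(x, y, b):
    # Partial sums over blocks of size b; the blocks are independent
    # and could be evaluated in parallel.
    n = len(x)
    partial = [inner_unblocked(x[i:i + b], y[i:i + b]) for i in range(0, n, b)]
    return sum(partial)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
print(inner_unblocked(x, y), inner_blocked(x, y, 2))  # both give 56.0
```

Python's slicing handles a final short block automatically, so here $b$ need not divide $n$ exactly.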

To see the full benefits of blocking we need to consider an algorithm operating on matrices, of which matrix multiplication is the most important example. Suppose we wish to compute the product $C = AB$ of $n\times n$ matrices $A$ and $B$. The natural computation is, from the definition of matrix multiplication, the “point algorithm”

1. $C = 0$
2. for $i=1\colon n$
3.    for $j=1\colon n$
4.      for $k=1\colon n$
5.        $c_{ij} = c_{ij} + a_{ik}b_{kj}$
6.      end
7.   end
8. end

Let $A = (A_{ij})$, $B = (B_{ij})$, and $C = (C_{ij})$ be partitioned into blocks of size $b$, where $r = n/b$ is assumed to be an integer. The blocked computation is

1. $C = 0$
2. for $i=1\colon r$
3.    for $j=1\colon r$
4.      for $k=1\colon r$
5.        $C_{ij} = C_{ij} + A_{ik}B_{kj}$
6.      end
7.    end
8. end

On a computer with a hierarchical memory the blocked form can be much more efficient than the point form if the blocks fit into the high speed memory, as much less data transfer is required. Indeed line 5 of the blocked algorithm performs $O(b^3)$ flops on $O(b^2)$ data, whereas the point algorithm performs $O(1)$ flops on $O(1)$ data on line 5, or $O(n)$ flops on $O(n)$ data if we combine lines 4–6 into a vector inner product. It is the $O(b)$ flops-to-data ratio that gives the blocked algorithm its advantage, because it masks the memory access costs.
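For concreteness, here is a minimal pure Python sketch of the blocked matrix multiplication (illustrative function name; assumes the block size $b$ divides $n$). An optimized implementation would of course perform the block update with a tuned kernel rather than scalar loops:

```python
def matmul_blocked(A, B, b):
    # Blocked C = A*B for n-by-n matrices stored as lists of rows,
    # with block size b dividing n. The inner loops perform the
    # block update C_{ij} = C_{ij} + A_{ik} B_{kj} of line 5.
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, b):
        for j0 in range(0, n, b):
            for k0 in range(0, n, b):
                for i in range(i0, i0 + b):
                    for j in range(j0, j0 + b):
                        s = C[i][j]
                        for k in range(k0, k0 + b):
                            s += A[i][k] * B[k][j]
                        C[i][j] = s
    return C
```

Taking $b = 1$ recovers the point algorithm, and any valid block size gives the same result.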

LAPACK (first released in 1992) was the first program library to use blocked algorithms systematically for a wide range of linear algebra computations.

An extra advantage that blocked algorithms have over unblocked algorithms is a reduction in the error constant in a rounding error bound by a factor $b$ or more and, typically, a reduction in the actual error (Higham, 2021).

The adjective “block” is sometimes used in place of “blocked”, but we prefer to reserve “block” for the meaning described in the next section.

## Block Algorithms

A block algorithm is a generalization of a scalar algorithm in which the basic scalar operations become matrix operations ($\alpha+\beta$, $\alpha\beta$, and $\alpha/\beta$ become $A+B$, $AB$, and $AB^{-1}$). It usually computes a block factorization, in which a matrix property based on the nonzero structure becomes the corresponding property blockwise; in particular, the scalars 0 and 1 become the zero matrix and the identity matrix, respectively.

An example of a block factorization is block LU factorization. For a $4\times 4$ matrix $A$ an LU factorization can be written in the form

$\notag A = \left[ \begin{tabular}{cc|cc} 1 & & & \\ x & 1 & & \\ \hline x & x & 1 & \\ x & x & x & 1 \end{tabular} \right] \left[ \begin{tabular}{cc|cc} x & x & x & x \\ & x & x & x \\ \hline & & x & x \\ & & & x \end{tabular} \right].$

A block LU factorization has the form

$\notag A = \left[ \begin{tabular}{cc|cc} 1 & 0 & & \\ 0 & 1 & & \\ \hline x & x & 1 & 0 \\ x & x & 0 & 1 \end{tabular} \right] \left[ \begin{tabular}{cc|cc} x & x & x & x \\ x & x & x & x \\ \hline & & x & x \\ & & x & x \end{tabular} \right].$

Clearly, these are different factorizations. In general, a block LU factorization has the form $A = LU$ with $L$ block lower triangular, with identity matrices on the diagonal, and $U$ block upper triangular. A blocked algorithm for computing the LU factorization and a block algorithm for computing the block LU factorization are readily derived.
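To illustrate, here is a sketch in Python of one step of block LU factorization for a matrix partitioned into four $2\times 2$ blocks, using small hypothetical helper functions and assuming $A_{11}$ is nonsingular:

```python
def mat_mul(A, B):
    # Dense matrix product for matrices stored as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inv2(A):
    # Inverse of a 2-by-2 matrix (assumed nonsingular).
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def block_lu_2x2(A11, A12, A21, A22):
    # One step of block LU: L21 = A21 * A11^{-1} and the Schur
    # complement S = A22 - L21 * A12, giving
    # [A11 A12; A21 A22] = [I 0; L21 I] * [A11 A12; 0 S].
    L21 = mat_mul(A21, inv2(A11))
    S = mat_sub(A22, mat_mul(L21, A12))
    return L21, S
```

Recursing on the Schur complement $S$ yields the full block LU factorization.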

The adjective “block” also applies to a variety of matrix properties, for which there are often special-purpose block algorithms. For example, the matrix

$\notag \begin{bmatrix} A_{11} & A_{12} & & & \\ A_{21} & A_{22} & A_{23} & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \ddots & A_{n-1,n} \\ & & & A_{n,n-1} & A_{n,n} \end{bmatrix}$

is a block tridiagonal matrix, and a block Toeplitz matrix has constant block diagonals:

$\notag \begin{bmatrix} A_1 & A_2 & \dots & \dots & A_n \\ A_{-1} & A_1 & A_{2} & & \vdots \\ \vdots & A_{-1} & \ddots & \ddots & \vdots \\ \vdots & & \ddots & \ddots & A_2 \\ A_{1-n}& \dots & \dots & A_{-1} & A_1 \end{bmatrix}.$

One can define block diagonal dominance as a generalization of diagonal dominance. A block Householder matrix generalizes a Householder matrix: it is a perturbation of the identity matrix by a matrix of rank greater than or equal to $1$.

## References

This is a minimal set of references, which contain further useful references within.

## Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

# What Is a Matrix Norm?

A matrix norm is a function $\|\cdot\| : \mathbb{C}^{m\times n} \to \mathbb{R}$ satisfying

• $\|A\| \ge 0$ with equality if and only if $A=0$ (nonnegativity),
• $\|\alpha A\| =|\alpha| \|A\|$ for all $\alpha\in\mathbb{C}$, $A\in\mathbb{C}^{m\times n}$ (homogeneity),
• $\|A+B\| \le \|A\|+\|B\|$ for all $A,B\in\mathbb{C}^{m\times n}$ (the triangle inequality).

These are analogues of the defining properties of a vector norm.

An important class of matrix norms is the subordinate matrix norms. Given a vector norm on $\mathbb{C}^n$ the corresponding subordinate matrix norm on $\mathbb{C}^{m\times n}$ is defined by

$\notag \|A\| = \displaystyle\max_{x\ne0} \frac{ \|Ax\| }{ \|x\| },$

or, equivalently,

$\|A\| = \displaystyle\max_{\|x\| = 1} \|Ax\|.$

For the $1$-, $2$-, and $\infty$-vector norms it can be shown that

\notag \begin{alignedat}{2} \|A\|_1 &= \max_{1\le j\le n}\sum_{i=1}^m |a_{ij}|, &&\qquad \text{``max column sum'',} \\ \|A\|_{\infty} &= \max_{1\le i\le m}\sum_{j=1}^n |a_{ij}|, &&\qquad \text{``max row sum'',} \\ \|A\|_2 &= \sigma_{\max}(A), &&\qquad \text{spectral norm,} \end{alignedat}

where $\sigma_{\max}(A)$ denotes the largest singular value of $A$. To remember the formulas for the $1$- and $\infty$-norms, note that $1$ is a vertical symbol (for columns) and $\infty$ is a horizontal symbol (for rows). For the general Hölder $p$-norm no explicit formula is known for $\|A\|_p$ for $p\ne 1,2,\infty$.

An immediate implication of the definition of subordinate matrix norm is that

$\notag \|AB\| \le \|A\| \|B\|$

whenever the product is defined. A norm satisfying this condition is called consistent or submultiplicative.

Another commonly used norm is the Frobenius norm,

$\notag \|A\|_F = \biggl( \displaystyle\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 \biggr)^{1/2} = \bigl( \mathrm{trace}(A^*\!A) \bigr)^{1/2}.$

The Frobenius norm is not subordinate to any vector norm (since $\|I_n\|_F = n^{1/2}$, whereas $\|I_n\| = 1$ for any subordinate norm), but it is consistent.
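The explicit formulas above are straightforward to evaluate. Here is a small pure Python sketch (illustrative function names) computing the $1$-, $\infty$-, and Frobenius norms of a matrix stored as a list of rows:

```python
def norm1(A):
    # Maximum column sum of absolute values (1 is "vertical").
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    # Maximum row sum of absolute values (infinity is "horizontal").
    return max(sum(abs(a) for a in row) for row in A)

def norm_fro(A):
    # Frobenius norm: square root of the sum of squared entries.
    return sum(a * a for row in A for a in row) ** 0.5

A = [[1.0, -2.0], [3.0, 4.0]]
print(norm1(A), norm_inf(A))  # 6.0 7.0
```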

The $2$-norm and the Frobenius norm are unitarily invariant: they satisfy $\|UAV\| = \|A\|$ for any unitary matrices $U$ and $V$. For the Frobenius norm the invariance follows easily from the trace formula.

As for vector norms, all matrix norms are equivalent in that any two norms differ from each other by at most a constant. The following table shows the optimal constants $\alpha_{pq}$ such that $\|A\|_p \le \alpha_{pq} \|A\|_q$ for $A\in\mathbb{C}^{m\times n}$ and the norms mentioned above. Each inequality in the table is attained for some $A$.

| $p\backslash q$ | $1$ | $2$ | $\infty$ | $F$ |
| --- | --- | --- | --- | --- |
| $1$ | $1$ | $m^{1/2}$ | $m$ | $m^{1/2}$ |
| $2$ | $n^{1/2}$ | $1$ | $m^{1/2}$ | $1$ |
| $\infty$ | $n$ | $n^{1/2}$ | $1$ | $n^{1/2}$ |
| $F$ | $n^{1/2}$ | $\min(m,n)^{1/2}$ | $m^{1/2}$ | $1$ |

For any $A\in\mathbb{C}^{n\times n}$ and any consistent matrix norm,

$\notag \rho(A) \le \|A\|, \qquad\qquad (1)$

where $\rho$ is the spectral radius (the largest absolute value of any eigenvalue). To prove this inequality, let $\lambda$ be an eigenvalue of $A$ and $x$ the corresponding eigenvector (necessarily nonzero), and form the matrix $X=[x,x,\dots,x] \in \mathbb{C}^{n\times n}$. Then $AX = \lambda X$, so $|\lambda| \|X\| = \|AX\| \le \|A\| \, \|X\|$, giving $|\lambda|\le\|A\|$ since $X \ne 0$. For a subordinate norm it suffices to take norms in the equation $Ax = \lambda x$.

A useful relation is

$\notag \|A\|_2 \le \bigl( \|A\|_1 \|A\|_{\infty}\bigr)^{1/2}, \qquad\qquad (2)$

which follows from $\|A\|_2^2 = \|A^*A\|_2 = \rho(A^*A) \le \| A^*A \|_{\infty} \le \|A^*\|_{\infty} \|A\|_{\infty} = \|A\|_1 \|A\|_{\infty}$, where the first two equalities are obtained from the singular value decomposition and we have used (1).

## Mixed Subordinate Matrix Norms

A more general subordinate matrix norm can be defined by taking different vector norms in the numerator and denominator:

$\notag \|A\|_{\alpha,\beta} = \displaystyle\max_{x\ne 0} \frac{ \|Ax\|_{\beta} }{ \|x\|_{\alpha} }.$

Some authors denote this norm by $\|A\|_{\alpha\to\beta}$.

A useful characterization of $\|A\|_{\alpha,\beta}$ is given in the next result. Recall that $\|\cdot\|^D$ denotes the dual of the vector norm $\|\cdot\|$.

Theorem 1. For $A\in\mathbb{C}^{m\times n}$,

$\notag \|A\|_{\alpha,\beta} = \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha}.$

Proof. We have

\notag \begin{aligned} \displaystyle \max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha} &= \max_{x\in\mathbb{C}^n} \frac{1}{\|x\|_\alpha} \max_{y\in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D} \\ &= \max_{x\in\mathbb{C}^n} \frac{\| Ax \|_\beta }{\|x\|_\alpha} = \|A\|_{\alpha,\beta}, \end{aligned}

where the second equality follows from the definition of dual vector norm and the fact that the dual of the dual norm is the original norm.

We can now obtain a connection between the norms of $A$ and $A^*$. Here, $\|A^*\|_{\beta^D,\alpha^D}$ denotes $\max_{x\ne 0} \|A^*x\|_{\alpha^D} / \|x\|_{\beta^D}$.

Theorem 2. If $A\in\mathbb{C}^{m\times n}$ then $\|A\|_{\alpha,\beta} = \|A^*\|_{\beta^D,\alpha^D}$.

Proof. Using Theorem 1, we have

$\notag \|A\|_{\alpha,\beta} = \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha} = \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m} \frac{\mathrm{Re}\, x^*(A^*y)}{\|x\|_\alpha \|y\|_{\beta^D}} = \|A^*\|_{\beta^D,\alpha^D}. \quad\square$

If we take the $\alpha$– and $\beta$-norms to be the same $p$-norm then we have $\|A\|_p = \|A^*\|_q$, where $p^{-1} + q^{-1} = 1$ (giving, in particular, $\|A\|_2 = \|A^*\|_2$ and $\|A\|_1 = \|A^*\|_\infty$, which are easily obtained directly).

Now we give explicit formulas for the $(\alpha,\beta)$ norm when $\alpha$ or $\beta$ is $1$ or $\infty$ and the other is a general norm.

Theorem 3. For $A\in\mathbb{C}^{m\times n}$,

\notag \begin{alignedat}{2} \|A\|_{1,\beta} &= \max_j \| A(:,j) \|_{\beta}, &\qquad\qquad& (3) \\ \|A\|_{\alpha,\infty} &= \max_i \|A(i,:)^*\|_{\alpha}^D, &\qquad\qquad& (4) \end{alignedat}

For $A\in\mathbb{R}^{m\times n}$,

$\notag \|A\|_{\infty,\beta} = \displaystyle\max_{z\in Z} \|Az\|_{\beta}, \qquad\qquad (5)$

where

$Z = \{ z\in\mathbb{R}^n: z_i = \pm 1~\mathrm{for~all}~i \,\},$

and if $A$ is symmetric positive semidefinite then

$\notag \|A\|_{\infty,1} = \displaystyle\max_{z\in Z} z^T\!Az.$

Proof. For (3),

$\notag \|Ax\|_{\beta} = \Bigl\| \sum_j x_j A(\colon,j) \Bigr\|_{\beta} \le \displaystyle\max_j \| A(\colon,j) \|_{\beta} \sum_j |x_j|,$

with equality for $x=e_k$, where the maximum is attained for $j=k$. For (4), using the Hölder inequality,

$\|Ax\|_{\infty} = \displaystyle\max_i | A(i,\colon) x | \le \max_i \bigl( \|A(i,\colon)^*\|_{\alpha}^D \|x\|_{\alpha} \bigr) = \|x\|_{\alpha} \max_i \|A(i,\colon)^*\|_{\alpha}^D.$

Equality is attained for an $x$ that gives equality in the Hölder inequality involving the $k$th row of $A$, where the maximum is attained for $i=k$.

Turning to (5), we have $\|A\|_{\infty,\beta} = \max_{ \|x\|_\infty =1} \|Ax\|_\beta$. The unit cube $\{\, x\in\mathbb{R}^n: -e \le x \le e\,\}$, where $e = [1,1,\dots,1]^T$, is a convex polyhedron, so any point within it is a convex combination of the vertices, which are the elements of $Z$. Hence $\|x\|_\infty = 1$ implies

$\notag x = \displaystyle\sum_{z\in Z} \lambda_z z, \quad \mathrm{where} \quad \lambda_z \ge 0, \quad \sum_{z\in Z} \lambda_z = 1$

and then

$\notag \|Ax\|_\beta = \biggl\| \displaystyle\sum_{z\in Z} \lambda_z Az \biggr\|_\beta \le \max_{z\in Z} \|Az\|_\beta.$

Hence $\max_{\|x\|_\infty = 1} \|Ax\|_\beta \le \max_{z\in Z} \|Az\|_\beta$, but trivially $\max_{z\in Z} \|Az\|_\beta \le \max_{\|x\|_\infty = 1} \|Ax\|_\beta$ and (5) follows.

Finally, if $A$ is symmetric positive semidefinite let $\xi_z = \mathrm{sign}(Az) \in Z$. Then, using a Cholesky factorization $A = R^T\!R$ (which exists even if $A$ is singular) and the Cauchy–Schwarz inequality,

\notag \begin{aligned} \max_{z\in Z} \|Az\|_1 &= \max_{z\in Z} \xi_z^T Az = \max_{z\in Z} \xi_z^T R^T\!Rz\\ &= \max_{z\in Z} (R\xi_z)^T Rz \le \max_{z\in Z} \|R\xi_z\|_2 \|Rz\|_2\\ &\le \max_{z\in Z} \|Rz\|_2^2\\ &= \max_{z\in Z} z^T\!Az. \end{aligned}

Conversely, for $z\in Z$ we have

$\notag z^T\!Az = |z^T\!Az| \le |z|^T|Az| = e^T |Az| = \|Az\|_1,$

so $\max_{z\in Z} z^T\!Az \le \max_{z\in Z} \|Az\|_1$. Hence $\max_{z\in Z} z^T\!Az = \max_{z\in Z} \|Az\|_1 = \|A\|_{\infty,1}$, using (5). $\square$
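Formula (5) can be evaluated directly for small $n$ by enumerating the $2^n$ sign vectors in $Z$. The following pure Python sketch (illustrative names) does this for $\beta = 1$ and checks the quadratic form characterization on a symmetric positive semidefinite example:

```python
from itertools import product

def norm_inf_1(A):
    # ||A||_{inf,1} = max over sign vectors z in Z of ||Az||_1, by (5).
    # Enumerating the 2^n sign vectors is exponential in n, consistent
    # with this norm being NP-hard to compute in general.
    n = len(A[0])
    best = 0.0
    for z in product((-1.0, 1.0), repeat=n):
        Az = [sum(aij * zj for aij, zj in zip(row, z)) for row in A]
        best = max(best, sum(abs(v) for v in Az))
    return best

# For a symmetric positive semidefinite matrix, maximizing the
# quadratic form z^T A z over Z gives the same value.
A = [[1.0, 2.0], [2.0, 5.0]]   # B^T B for B = [1 2; 0 1], so PSD
quad = max(sum(zi * sum(aij * zj for aij, zj in zip(row, z))
               for zi, row in zip(z, A))
           for z in product((-1.0, 1.0), repeat=2))
print(norm_inf_1(A), quad)  # both 10.0
```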

As special cases of (3) and (4) we have

\notag \begin{aligned} \|A\|_{1,\infty} &= \max_{i,j} |a_{ij}|, \qquad\qquad\qquad (6) \\ \|A\|_{2,\infty} &= \max_i \|A(i,:)^*\|_2, \\ \|A\|_{1,2} &= \max_j \|A(:,j)\|_2. \end{aligned}

We also obtain by using Theorem 2 and (5), for $A\in\mathbb{R}^{m\times n}$,

$\notag \|A\|_{2,1} = \displaystyle\max_{z\in Z} \|A^Tz\|_2.$

The $(2,\infty)$-norm has recently found use in statistics (Cape, Tang, and Priebe, 2019), the motivation being that because it satisfies

$\notag \|A\|_{2,\infty} \le \|A\|_2 \le m^{1/2} \|A\|_{2,\infty},$

the $(2,\infty)$-norm can be much smaller than the $2$-norm when $m \gg n$ and so can be a better norm to use in bounds. The $(2,\infty)$– and $(\infty,2)$-norms are used by Rebrova and Vershynin (2018) in bounding the $2$-norm of a random matrix after zeroing a submatrix. They note that the $2$-norm of a random $n\times n$ matrix involves maximizing over infinitely many random variables, while the $(\infty,2)$-norm and $(2,\infty)$-norm involve only $2^n$ and $n$ random variables, respectively.

The $(\alpha,\beta)$ norm is not consistent, but for any vector norm $\gamma$ we have

\notag \begin{aligned} \|AB\|_{\alpha,\beta} &= \max_{x\ne0} \frac{ \|ABx\|_\beta }{ \|x\|_\alpha} = \max_{x\ne0} \frac{ \|A(Bx)\|_\beta }{ \|Bx\|_\gamma} \frac{ \|Bx\|_\gamma}{ \|x\|_\alpha} \le \max_{x\ne0} \frac{ \|A(Bx)\|_\beta }{ \|Bx\|_\gamma} \max_{x\ne0} \frac{ \|Bx\|_\gamma}{ \|x\|_\alpha}\\ &\le \|A\|_{\gamma,\beta} \|B\|_{\alpha,\gamma}. \end{aligned}

## Matrices With Constant $p$-Norms

Let $A$ be a nonnegative square matrix whose row sums and column sums are all equal to $\mu$. This class of matrices includes magic squares and doubly stochastic matrices. We have $\|A\|_1 = \|A\|_{\infty} = \mu$, so $\|A\|_2 \le \mu$ by (2). But $Ae = \mu e$ for the vector $e$ of $1$s, so $\mu$ is an eigenvalue of $A$ and hence $\|A\|_2 \ge \mu$ by (1). Hence $\|A\|_1 = \|A\|_2 = \|A\|_{\infty} = \mu$. In fact, $\|A\|_p = \mu$ for all $p\in[1,\infty]$ (see Higham (2002, Prob. 6.4) and Stoer and Witzgall (1962)).
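A quick check in Python (a sketch, using a $3\times 3$ magic square with $\mu = 15$):

```python
A = [[8, 1, 6], [3, 5, 7], [4, 9, 2]]  # 3-by-3 magic square, mu = 15

col_sums = [sum(row[j] for row in A) for j in range(3)]
row_sums = [sum(row) for row in A]
print(col_sums, row_sums)  # all sums equal 15

# Since A is nonnegative, the 1- and inf-norms are the largest
# column sum and row sum, both equal to mu = 15.
print(max(col_sums), max(row_sums))  # 15 15
```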

## Computing Norms

The problem of computing or estimating $\|A\|_{\alpha,\beta}$ may be difficult for two reasons: we may not have an explicit formula for the norm, or $A$ may be given implicitly, as $A = f(B)$, $A = B^{-1}$, or $A = BC$, for example, and we may wish to compute the norm without forming $A$. Both difficulties are overcome by a generalization of the power method proposed by Tao in 1975 for arbitrary norms and independently by Boyd in 1984 for $\alpha$ and $\beta$ both $p$-norms (see Higham, 2002, Sec.$\,$15.2). The power method accesses $A$ only through matrix–vector products with $A$ and $A^*$, so $A$ does not need to be explicitly available.
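To give the flavor of the method, here is a simplified pure Python sketch of the power method for the case where $\alpha$ and $\beta$ are the same $p$-norm with $1 < p < \infty$ (illustrative names; a fixed number of iterations replaces the usual convergence test, and the result is a lower bound on $\|A\|_p$):

```python
def dual_vec(y, p):
    # A vector v with unit q-norm (1/p + 1/q = 1) and v^T y = ||y||_p.
    # Assumes y is nonzero.
    norm = sum(abs(t) ** p for t in y) ** (1.0 / p)
    return [(abs(t) / norm) ** (p - 1) * (1.0 if t >= 0 else -1.0)
            for t in y]

def pnorm_est(A, p, iters=50):
    # Power method sketch: estimates ||A||_p from below using only
    # products with A and its transpose.
    q = p / (p - 1.0)
    m, n = len(A), len(A[0])
    x = [1.0 / n ** (1.0 / p)] * n            # ||x||_p = 1
    est = 0.0
    for _ in range(iters):
        y = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        est = sum(abs(t) ** p for t in y) ** (1.0 / p)   # ||Ax||_p
        v = dual_vec(y, p)
        z = [sum(A[i][j] * v[i] for i in range(m)) for j in range(n)]
        x = dual_vec(z, q)                     # next iterate
    return est
```

For $A = \mathrm{diag}(1,2)$ the estimate converges to $\|A\|_2 = 2$.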

Let us focus on the case where $\alpha$ and $\beta$ are the same $p$-norm. The power method is implemented in the Matrix Computation Toolbox as pnorm and we use it here to estimate the $\pi$-norm and the $99$-norm, which we compare with the $1$-, $2$-, and $\infty$-norms.

```
>> A = gallery('frank',4)
A =
     4     3     2     1
     3     3     2     1
     0     2     2     1
     0     0     1     1
>> [norm(A,1) norm(A,2) pnorm(A,pi), pnorm(A,99), norm(A,inf)]
ans =
   8.0000e+00   7.6237e+00   8.0714e+00   9.8716e+00   1.0000e+01
```


The plot in the top half of the following figure shows the estimated $p$-norm for the matrix gallery('binomial',8) for $p \in[1,25]$. This shape of curve is typical, because, as the plot in the lower half indicates, $\log(\|A\|_p)$ is a convex function of $1/p$ for $p\ge 1$ (see Higham, 2002, Sec.$\,$6.3).

The power method for the $(1,\infty)$ norm, which is the max norm in (6), is investigated by Higham and Relton (2016) and extended to estimate the $k$ largest values $a_{ij}$ or $|a_{ij}|$.

## A Norm Related to Grothendieck’s Inequality

A norm on $\mathbb{C}^{n\times n}$ can be defined by

\notag \begin{aligned} \|A\|^{(t)} = \max \biggl\{ \bigg|\sum_{i,j=1}^n a_{ij} y^*_jx_i \bigg| : x_i,y_j \in\mathbb{C}^t, \; \|x_i\|_2 \le 1, \; \|y_j\|_2 \le 1, \; i,j=1\colon n \, \biggr\}. \end{aligned}

Note that

$\notag \|A\|^{(1)} = \displaystyle\max_{0\ne x,y\in\mathbb{C}^n} \frac{|y^*Ax|}{\|y\|_\infty \|x\|_\infty} = \|A\|_{\infty,1}.$

A famous inequality of Grothendieck states that $\|A\|^{(n)} \le c \|A\|^{(1)}$ for all $A$, for some constant $c$ independent of $n$. Much work has been done to estimate the smallest possible constant $c$, which is known to be in the interval $[1.33,1.41]$, or $[1.67,1.79]$ for the analogous inequality for real data. See Friedland, Lim, and Zhang (2018) and the references therein.

## Notes

The formula (5) with $\beta = 1$ is due to Rohn (2000), who shows that evaluating it is NP-hard. For general $\beta$, the formula is given by Lewis (2010).

Computing mixed subordinate matrix norms based on $p$-norms is generally NP-hard (Hendrickx and Olshevsky, 2010).

## References

This is a minimal set of references, which contain further useful references within.

## Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

# What’s New in MATLAB R2021b?

In this post I discuss some of the new features in MATLAB R2021b that captured my interest. See the release notes for a detailed list of the many changes in MATLAB and its toolboxes. For my articles about new features in earlier releases, see here.

## High Order Runge–Kutta ODE Solvers

The MATLAB suite of solvers for ordinary differential equations previously contained eight solvers, including three Runge–Kutta solvers: ode23, ode23tb, and ode45. The suite has been augmented with two new high-order Runge–Kutta solvers: ode78 uses 7th and 8th order Runge–Kutta formulas and ode89 uses 8th and 9th order Runge–Kutta formulas.

The documentation says that

• ode78 and ode89 may be more efficient than ode45 on non-stiff problems that are smooth except possibly for a few isolated discontinuities.
• ode89 may be more efficient than ode78 on very smooth problems, when integrating over long time intervals, or when tolerances are tight.
• ode78 and ode89 may not be as fast or as accurate as ode45 in single precision.

## Matrix to Scalar Power

The MATLAB mpower function is called by an exponentiation A^z when A is a matrix and z is a real or complex scalar. For the meaning of such arbitrary matrix powers see What Is a Fractional Matrix Power?

Previously, mpower used a diagonalization of A. For a matrix that is not diagonalizable, or is close to being not diagonalizable, the results could be inaccurate. In R2021b a Schur–Padé algorithm, which employs a Schur decomposition in conjunction with Padé approximation, is used instead; it works for all matrices. The difference in accuracy between the old and new versions of mpower is clearly demonstrated by computing the square root of a 2-by-2 Jordan block (a difficult case because it is defective).

```
% R2021a
>> A = [1 1; 0 1], X = A^(0.5), residual = A - X^2
A =
     1     1
     0     1
X =
     1     0
     0     1
residual =
     0     1
     0     0
```

```
% R2021b
>> A = [1 1; 0 1], X = A^(0.5), residual = A - X^2
A =
     1     1
     0     1
X =
   1.0000e+00   5.0000e-01
            0   1.0000e+00
residual =
     0     0
     0     0
```
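The R2021b result is in fact the exact principal square root of the Jordan block: squaring $X = \left[\begin{smallmatrix}1 & 0.5\\ 0 & 1\end{smallmatrix}\right]$ recovers $A$ exactly. A quick independent check (sketched here in Python):

```python
# Square X = [1 0.5; 0 1] and confirm it reproduces the Jordan
# block A = [1 1; 0 1] with no rounding error.
X = [[1.0, 0.5], [0.0, 1.0]]
X2 = [[sum(X[i][k] * X[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(X2)  # [[1.0, 1.0], [0.0, 1.0]]
```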


## Functions on nD Arrays

The new function pagesvd computes the singular value decomposition (SVD) of pages of nD arrays. A page is a 2D slice (i.e., a matrix) formed by taking all the elements in the first two dimensions and fixing the later subscripts. Related functions include pagemtimes for matrix multiplication and pagetranspose for transposition, both introduced in R2020b. Better performance can be expected from these page functions than from explicit looping over the pages, especially for small-sized pages.

## Improvements to Editor

Several improvements have been introduced to the MATLAB Editor. Ones that caught my eye are the ability to edit rectangular areas of text, automatic completion of block endings and delimiters, wrapping of comments, and changing the case of selected text. I do all my editing in Emacs, using the MATLAB major mode, and these sorts of things are either available or can be added by customization. However, these are great additions for MATLAB Editor users.