The Ten Most-Viewed Posts in 2021

top10.jpg
Image credit: Stuart Miles.

According to the WordPress statistics, this website received over 138,000 visitors and 205,000 views in 2021.

Here are the ten blog posts (published at any time) that received the most views during the year.

  1. Better LaTeX Tables with Booktabs (2019)
  2. What Is a Symmetric Positive Definite Matrix? (2020)
  3. What Is a Condition Number? (2020)
  4. Can We Solve Linear Algebra Problems at Extreme Scale and Low Precisions? (2021)
  5. What Is a Householder Matrix? (2020)
  6. What Is the Cayley–Hamilton Theorem? (2020)
  7. What Is an LU Factorization? (2021)
  8. Five Examples of Proofreading (2017)
  9. What Is Numerical Stability? (2020)
  10. What Is the Log-Sum-Exp Function? (2021)

Seven of the posts are from the What Is series, which passed fifty posts in April. The series is ongoing and I have several posts in preparation for early 2022.

Anymatrix: An Extensible MATLAB Matrix Collection

anymatrix_github_sidebar_clarity-1.jpg

Anymatrix is a MATLAB toolbox written by me and Mantas Mikaitis and released at version 1.0 in October 2021. The motivation for developing Anymatrix was that while MATLAB has many matrices built in (73, depending on what you count as a matrix), this set is necessarily limited in scope, yet at the same time it is hard to search within it for a matrix with a given property. Anymatrix overcomes these limitations in two ways.

First, it provides a large and growing collection of matrices organized into groups of related matrices, and it allows users to add further groups, making it extensible.

Second, it allows matrices to be annotated with properties, so that the whole collection can be searched for matrices with particular sets of properties. It includes groups gallery and matlab that access the matrices built into MATLAB and are annotated with properties.

Anymatrix is described in detail in a paper and a users’ guide. It is available from GitHub and MathWorks File Exchange.

The Groups

The matrices built into Anymatrix are organized in seven groups.

  • contest: the CONTEST toolbox of adjacency matrices from random network models (Taylor and D. J. Higham, 2009).
  • core: miscellaneous matrices.
  • gallery: matrices from the MATLAB gallery.
  • hadamard: a large collection of Hadamard matrices (mostly from a collection of Sloane) and complex Hadamard matrices.
  • matlab: other MATLAB matrices (not in gallery).
  • nessie: matrices from real-life networks (Taylor and D. J. Higham, 2009).
  • regtools: matrices from regularization problems (Hansen, 2007).

Every matrix has a unique identifier group_name/matrix_name, where matrix_name is the name of the function that implements the matrix.

In the rest of this post we introduce the toolbox through a few examples.

Positive Definite Integer Matrices

We first find what symmetric positive definite matrices with integer entries are available.

>> anymatrix('properties','integer and positive definite')
ans =
  7×1 cell array
    {'core/beta'     }
    {'core/wilson'   }
    {'gallery/gcdmat'}
    {'gallery/minij' }
    {'gallery/moler' }
    {'gallery/pei'   }
    {'matlab/pascal' }

Three of the seven groups built into Anymatrix—core, gallery, and matlab—contain such matrices. We check the properties of the core/beta matrix. Here, 'p' is short for 'properties'.

>> anymatrix('core/beta','p')
ans =
  12×1 cell array
    {'built-in'            }
    {'infinitely divisible'}
    {'integer'             }
    {'nonnegative'         }
    {'positive'            }
    {'positive definite'   }
    {'real'                }
    {'scalable'            }
    {'square'              }
    {'symmetric'           }
    {'totally nonnegative' }
    {'totally positive'    }

Infinite divisibility of a symmetric positive semidefinite matrix A is the property that A^{\circ r} is positive semidefinite for all r\ge 0, where A^{\circ r} = (a_{ij}^r) is the Hadamard power. We verify this property for n = 4 and r = 1/10 by checking that the eigenvalues are nonnegative:

>> A = anymatrix('core/beta',4)
A =
     1     2     3     4
     2     6    12    20
     3    12    30    60
     4    20    60   140

>> eig(A.^(1/10))
ans =
   3.9036e-05
   3.1806e-03
   1.4173e-01
   5.0955e+00

Search Specific Groups

The search on properties returns results across all the groups. If we want to restrict to a particular group we can use the contains command (one of the powerful MATLAB string-handling functions) to narrow the results. In the next example we find all the Hankel matrices built into MATLAB, that is, contained in the gallery and matlab groups.

>> m = anymatrix('p','hankel')
m =
  5×1 cell array
    {'core/dembo9'    }
    {'gallery/ipjfact'}
    {'gallery/ris'    }
    {'matlab/hankel'  }
    {'regtools/ursell'}
>> m = m(contains(m,{'gallery','matlab'}))
m =
  3×1 cell array
    {'gallery/ipjfact'}
    {'gallery/ris'    }
    {'matlab/hankel'  }

Run a Test Over All Matrices

Anymatrix makes it possible to run tests over all or a subset of the matrices in the collection. This is not easy to do with the MATLAB gallery. The following code computes the minimum of the ratio \|A\|_2 / (\|A\|_1 \|A\|_{\infty} )^{1/2} over all the matrices, with default input arguments and size 64 if the dimension is variable. This ratio is known to lie between n^{-1/2} and 1. We note several features of the code.

  • By checking the built-in property, we only include matrices from the built-in groups (as opposed to any remote groups that have been downloaded).
  • The input arguments to some of the matrices are of a special form not respected by our general-purpose code, so the try construct handles the errors generated in these cases.
  • Some of the matrices are stored in the sparse format, so we make sure that the matrices are in the full format before taking the 2-norm.

mats = anymatrix('all'); % All matrix IDs.
rng(1), k = 1;
n = 64; % Size of the scalable matrices.
for i = 1:length(mats)
    ID = mats{i};
    props = anymatrix(ID,'p');
    if ~contains(props, {'built-in'}), continue, end 
    try
        if ismember('scalable',props)
            A = anymatrix(ID,n);
        else
            A = anymatrix(ID);
        end
        A = full(A);  % Convert sparse matrices to full.
        [mm,nn] = size(A);
        if max(mm,nn) > 1 && max(mm,nn) <= 1e3
           fprintf('%s: (%g,%g)\n', ID, size(A,1), size(A,2))
           A = A/norm(A,1); % Normalize to avoid overflow.
           r = norm(A)/sqrt(norm(A,1)*norm(A,inf));
           ratio(k) = r; k = k+1;
        end
    catch
        fprintf('Skipping %s\n', ID)
    end    
end    
fprintf('Min(ratio) = %9.2e\n', min(ratio))

The output is, with [...] denoting omitted lines,

contest/erdrey: (64,64)
contest/geo: (64,64)
[...]
regtools/ursell: (64,64)
regtools/wing: (64,64)
Min(ratio) =  1.25e-01

Optimal Matrices

The core group contains some matrices with optimality properties. For example, core/triminsval01 is the unique matrix having the minimal smallest singular value over all nonsingular binary upper triangular matrices.

>> A = anymatrix('core/triminsval01',8)
A =
     1     1     0     1     0     1     0     1
     0     1     1     0     1     0     1     0
     0     0     1     1     0     1     0     1
     0     0     0     1     1     0     1     0
     0     0     0     0     1     1     0     1
     0     0     0     0     0     1     1     0
     0     0     0     0     0     0     1     1
     0     0     0     0     0     0     0     1
>> min(svd(A))
ans =
   4.7385e-02
>> inv(A)
ans =
     1    -1     1    -2     3    -5     8   -13
     0     1    -1     1    -2     3    -5     8
     0     0     1    -1     1    -2     3    -5
     0     0     0     1    -1     1    -2     3
     0     0     0     0     1    -1     1    -2
     0     0     0     0     0     1    -1     1
     0     0     0     0     0     0     1    -1
     0     0     0     0     0     0     0     1

Notice the appearance of Fibonacci numbers in the inverse.

Remote Groups

The following groups of matrices can be added to Anymatrix. We hope that other groups will be made available by users in the future.

To incorporate the matrices in the first of these repositories as a group named corrinv we can use the 'groups' command ('g' for short) as follows.

>> anymatrix('g','corrinv','higham/matrices-correlation-invalid');
Cloning into '[...]/corrinv/private'...
[...]
Anymatrix remote group cloned.

Now we can access matrices in the corrinv group.

>> anymatrix('corrinv/tec03','h')
 tec03    Invalid correlation matrix from stress testing.
    tec03 is a 4-by-4 invalid correlation matrix from stress testing.

>> C = anymatrix('corrinv/tec03')
C =
   1.0000e+00  -5.5000e-01  -1.5000e-01  -1.0000e-01
  -5.5000e-01   1.0000e+00   9.0000e-01   9.0000e-01
  -1.5000e-01   9.0000e-01   1.0000e+00   9.0000e-01
  -1.0000e-01   9.0000e-01   9.0000e-01   1.0000e+00

>> eig(C)
ans =
  -2.7759e-02
   1.0000e-01
   1.0137e+00
   2.9140e+00

Top BibTeX Tips

BibTeX is an essential tool for preparing references for a LaTeX document. It collects entries from a BibTeX .bib database and formats a reference list in a .bbl file. Here are my top BibTeX tips.

Protect Capitals

The title field should be typed in title case, with all words capitalized except for articles (the, a, an), prepositions (to, with, on, etc.), and conjunctions (and, for, or, etc.). BibTeX will convert words to lower case if the style requires it, which it typically does for the article entry type.

Capitalized words that must not be converted to lower case because they are proper nouns must be protected by putting them in braces, as in the example

title = "A {Krylov} Methods Toolbox for {MATLAB}"

Without the braces this title appears with most styles as “A krylov methods toolbox for matlab”.

Failing to protect capitals is the most common error I see in bib entries. Note that bib entries downloaded from publishers' websites usually do not protect capitals (and often have other errors), so you should always carefully check them.

Identify Surnames

When a surname comprises two words but is not hyphenated, BibTeX will interpret the first part of the surname as a given name. This is problematic when the BibTeX style abbreviates given names to initials. For example,

author = "Charles F. Van Loan",

will produce “C. F. V. Loan” instead of “C. F. Van Loan”. The cure is to reverse the order of the given names and surname

author = "Van Loan, Charles F.",

Alternatively, you can protect the surname with braces:

author = "Charles F. {Van Loan}",

Add a DOI Field

A digital object identifier (DOI) provides a persistent link to an object on the web, and almost every published paper has one, as do many books that are available electronically. I recommend including the DOI in a doi field, such as

doi = "10.1137/1.9781611976106",

With an appropriate BibTeX style file either the DOI will be displayed and hyperlinked or the title will be hyperlinked. The doi field above generates the hyperlink

https://doi.org/10.1137/1.9781611976106

which resolves to the object in question. Here is an example of a bibliography entry from a SIAM paper that contains a DOI: doi_ex_wilk61.jpg

Some styles that support the doi field are listed here; see also myplain2-doi.bst (my modification of an existing style file) and the siamplain.bst style file available here.

Get a Bib Entry from a DOI

If you have a paper that has a DOI and wish to obtain a bib entry you can type the DOI into doi2bib or DOI to BibTeX converter. I use doi2bib rather than download bib entries from publisher websites, as I find the results are more reliable.

Include a URL Field

For items that do not have a DOI it is desirable to include a url field that gives the URL of the item on the web, especially for preprints, general articles, and software. Again, this is not a standard field, but some styles support it and print the URL or produce a hyperlink.

Include Date Fields

The last two fields in my bib entries are of the form

created = "2019.04.23",
updated = "2020.03.01"

These are nonstandard fields, so they are ignored by BibTeX. I include them because these two dates can be very useful: they give me clues about when I first came across or read a paper. The "updated" field indicates when I have added a DOI to an old entry, corrected a nontrivial error in an entry, or updated an entry because a preprint has been revised or become a published paper.

Use Strings for Journal Names

If you type the journal field explicitly, as in

journal = "SIAM J. Sci. Comput.",

it will be hard to keep these fields consistent across hundreds of entries. It is better to use an abbreviation

@String{j-SISC                  = "SIAM J. Sci. Comput."}

then type

journal = j-SISC,

These String lines must appear before the string is used. I collect them into a separate bib file, strings.bib, and load it before my other bib files:

\bibliography{strings,njhigham,paper}

For mathematics journals I use abbreviations used by Mathematical Reviews, which can be found in this PDF document.
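
Putting several of these tips together, an entry might look like the following sketch (the citation key, volume, pages, DOI, and dates are placeholders for illustration, not a real reference):

@Article{vanl21,
  author  = "Van Loan, Charles F.",
  title   = "A {Krylov} Methods Toolbox for {MATLAB}",
  journal = j-SISC,
  volume  = "00",
  number  = "0",
  pages   = "1--20",
  year    = "2021",
  doi     = "10.0000/000000",
  created = "2021.06.01",
  updated = "2021.10.15"
}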

Use Backref

For larger documents it is helpful for the reader to know where bibliography items are cited in the text. Including

\usepackage{backref}

in the preamble of the LaTeX document will cause each bibliography entry to be appended with a list of the pages on which it is cited. Here is an example of a bibliography entry annotated by backref:

backref_ex_hbtd20.jpg

Further Reading

More tips on BibTeX can be found in

What Is a Blocked Algorithm?

In numerical linear algebra a blocked algorithm organizes a computation so that it works on contiguous chunks of data. A blocked algorithm and the corresponding unblocked algorithm (with blocks of size 1) are mathematically equivalent, but the blocked algorithm is generally more efficient on modern computers.

A simple example of blocking is in computing an inner product s = x^Ty of vectors x,y\in\mathbb{R}^n. The unblocked algorithm

  1. s = x_1y_1
  2. for i = 2\colon n
  3.    s = s + x_iy_i
  4. end

can be expressed in blocked form, with block size b, as

  1. for i=1\colon n/b
  2.    s_i = \sum_{j=(i-1)b+1}^{ib} x_jy_j % Compute by the unblocked algorithm.
  3. end
  4. s = \sum_{i=1}^{n/b} s_i

The sums of b terms in line 2 of the blocked algorithm are independent and could be evaluated in parallel, whereas the unblocked form is inherently sequential.
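
Here is a minimal MATLAB sketch of the blocked inner product (the block size b is an illustrative choice and n is assumed to be a multiple of b):

n = 1024; b = 64;                  % Illustrative dimension and block size.
x = randn(n,1); y = randn(n,1);
s_block = zeros(n/b,1);
for i = 1:n/b
    idx = (i-1)*b+1 : i*b;         % Indices of the ith block.
    s_block(i) = x(idx)'*y(idx);   % Inner product of the ith block.
end
s = sum(s_block);                  % Sum the partial inner products.
% s agrees with x'*y to within rounding error.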

To see the full benefits of blocking we need to consider an algorithm operating on matrices, of which matrix multiplication is the most important example. Suppose we wish to compute the product C = AB of n\times n matrices A and B. The natural computation is, from the definition of matrix multiplication, the “point algorithm”

  1. C = 0
  2. for i=1\colon n
  3.    for j=1\colon n
  4.      for k=1\colon n
  5.        c_{ij} = c_{ij} + a_{ik}b_{kj}
  6.      end
  7.   end
  8. end

Let A = (A_{ij}), B = (B_{ij}), and C = (C_{ij}) be partitioned into blocks of size b, where r = n/b is assumed to be an integer. The blocked computation is

  1. C = 0
  2. for i=1\colon r
  3.    for j=1\colon r
  4.      for k=1\colon r
  5.        C_{ij} = C_{ij} + A_{ik}B_{kj}
  6.      end
  7.    end
  8. end

On a computer with a hierarchical memory the blocked form can be much more efficient than the point form if the blocks fit into the high speed memory, as much less data transfer is required. Indeed line 5 of the blocked algorithm performs O(b^3) flops on O(b^2) data, whereas the point algorithm performs O(1) flops on O(1) data on line 5, or O(n) flops on O(n) data if we combine lines 4–6 into a vector inner product. It is the O(b) flops-to-data ratio that gives the blocked algorithm its advantage, because it masks the memory access costs.
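
A minimal MATLAB sketch of the blocked algorithm (a sketch only; n is assumed to be a multiple of the block size b):

n = 256; b = 64; r = n/b;
A = randn(n); B = randn(n); C = zeros(n);
for i = 1:r
    I = (i-1)*b+1 : i*b;
    for j = 1:r
        J = (j-1)*b+1 : j*b;
        for k = 1:r
            K = (k-1)*b+1 : k*b;
            C(I,J) = C(I,J) + A(I,K)*B(K,J);   % Block update C_{ij} = C_{ij} + A_{ik} B_{kj}.
        end
    end
end
% norm(C - A*B,1) is of order rounding error.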

LAPACK (first released in 1992) was the first program library to systematically use blocked algorithms for a wide range of linear algebra computations.

An extra advantage that blocked algorithms have over unblocked algorithms is a reduction in the error constant in a rounding error bound by a factor b or more and, typically, a reduction in the actual error (Higham, 2021).

The adjective “block” is sometimes used in place of “blocked”, but we prefer to reserve “block” for the meaning described in the next section.

Block Algorithms

A block algorithm is a generalization of a scalar algorithm in which the basic scalar operations become matrix operations (\alpha+\beta, \alpha\beta, and \alpha/\beta become A+B, AB, and AB^{-1}). It usually computes a block factorization, in which a matrix property based on the nonzero structure becomes the corresponding property blockwise; in particular, the scalars 0 and 1 become the zero matrix and the identity matrix, respectively.

An example of a block factorization is block LU factorization. For a 4\times 4 matrix A an LU factorization can be written in the form

\notag    A = \left[\begin{array}{cc|cc} 1 &   &   &   \\ x & 1 &   &   \\ \hline x & x & 1 &   \\ x & x & x & 1 \end{array}\right] \left[\begin{array}{cc|cc} x & x & x & x \\   & x & x & x \\ \hline   &   & x & x \\   &   &   & x \end{array}\right].

A block LU factorization has the form

\notag    A = \left[\begin{array}{cc|cc} 1 & 0 &   &   \\ 0 & 1 &   &   \\ \hline x & x & 1 & 0 \\ x & x & 0 & 1 \end{array}\right] \left[\begin{array}{cc|cc} x & x & x & x \\ x & x & x & x \\ \hline   &   & x & x \\   &   & x & x \end{array}\right].

Clearly, these are different factorizations. In general, a block LU factorization has the form A = LU with L block lower triangular, with identity matrices on the diagonal, and U block upper triangular. A blocked algorithm for computing the LU factorization and a block algorithm for computing the block LU factorization are readily derived.
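
As an illustration, here is a minimal MATLAB sketch of a block LU factorization with a 2-by-2 block partitioning (it assumes the (1,1) block is nonsingular and uses backslash rather than an explicit inverse):

n = 8; b = 4;                        % b is the dimension of the (1,1) block.
A = randn(n);
A11 = A(1:b,1:b);     A12 = A(1:b,b+1:n);
A21 = A(b+1:n,1:b);   A22 = A(b+1:n,b+1:n);
L21 = A21/A11;                       % Solves L21*A11 = A21.
S = A22 - L21*A12;                   % Schur complement.
L = [eye(b) zeros(b,n-b); L21 eye(n-b)];
U = [A11 A12; zeros(n-b,b) S];
% norm(A - L*U,1) is of order rounding error.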

The adjective “block” also applies to a variety of matrix properties, for which there are often special-purpose block algorithms. For example, the matrix

\notag \begin{bmatrix}       A_{11} & A_{12} &        &           &           \\       A_{21} & A_{22} & A_{23} &           &           \\              & \ddots & \ddots & \ddots    &           \\              &        & \ddots & \ddots    & A_{n-1,n} \\              &        &        & A_{n,n-1} & A_{n,n} \end{bmatrix}

is a block tridiagonal matrix, and a block Toeplitz matrix has constant block diagonals:

\notag \begin{bmatrix}       A_1    & A_2    & \dots  &  \dots    &  A_n      \\       A_{-1} & A_1    & A_{2}  &           &  \vdots   \\       \vdots & A_{-1} & \ddots & \ddots    &  \vdots   \\       \vdots &        & \ddots & \ddots    & A_2 \\       A_{1-n}& \dots  & \dots  & A_{-1}    & A_1 \end{bmatrix}.

One can define block diagonal dominance as a generalization of diagonal dominance. A block Householder matrix generalizes a Householder matrix: it is a perturbation of the identity matrix by a matrix of rank greater than or equal to 1.

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What Is a Matrix Norm?

A matrix norm is a function \|\cdot\| : \mathbb{C}^{m\times n} \to \mathbb{R} satisfying

  • \|A\| \ge 0 with equality if and only if A=0 (nonnegativity),
  • \|\alpha A\| =|\alpha| \|A\| for all \alpha\in\mathbb{C}, A\in\mathbb{C}^{m\times n} (homogeneity),
  • \|A+B\| \le \|A\|+\|B\| for all A,B\in\mathbb{C}^{m\times n} (the triangle inequality).

These are analogues of the defining properties of a vector norm.

An important class of matrix norms is the subordinate matrix norms. Given a vector norm on \mathbb{C}^n the corresponding subordinate matrix norm on \mathbb{C}^{m\times n} is defined by

\notag        \|A\| = \displaystyle\max_{x\ne0} \frac{ \|Ax\| }{ \|x\| },

or, equivalently,

\|A\| = \displaystyle\max_{\|x\| = 1} \|Ax\|.

For the 1-, 2-, and \infty-vector norms it can be shown that

\notag \begin{alignedat}{2}    \|A\|_1 &= \max_{1\le j\le n}\sum_{i=1}^m |a_{ij}|,                  &&\qquad \mbox{``max column sum'',} \\    \|A\|_{\infty} &= \max_{1\le i\le m}\sum_{j=1}^n |a_{ij}|,                  &&\qquad \mbox{``max row sum'',} \\    \|A\|_2 &=  \sigma_{\max}(A),                  &&\qquad \mbox{spectral norm,} \end{alignedat}

where \sigma_{\max}(A) denotes the largest singular value of A. To remember the formulas for the 1– and \infty-norms, note that 1 is a vertical symbol (for columns) and \infty is a horizontal symbol (for rows). For the general Hölder p-norm no explicit formula is known for \|A\|_p for p\ne 1,2,\infty.
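
These formulas are easily checked numerically; a minimal MATLAB sketch:

A = randn(5,3);
[norm(A,1)   max(sum(abs(A),1))]     % Max column sum.
[norm(A,inf) max(sum(abs(A),2))]     % Max row sum.
[norm(A,2)   max(svd(A))]            % Largest singular value.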

An immediate implication of the definition of subordinate matrix norm is that

\notag    \|AB\| \le \|A\| \|B\|

whenever the product is defined. A norm satisfying this condition is called consistent or submultiplicative.

Another commonly used norm is the Frobenius norm,

\notag       \|A\|_F = \biggl( \displaystyle\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2  \biggr)^{1/2}               = \bigl( \mathrm{trace}(A^*\!A) \bigr)^{1/2}.

The Frobenius norm is not subordinate to any vector norm (since \|I_n\|_F = n^{1/2}, whereas \|I_n\| = 1 for any subordinate norm), but it is consistent.

The 2-norm and the Frobenius norm are unitarily invariant: they satisfy \|UAV\| = \|A\| for any unitary matrices U and V. For the Frobenius norm the invariance follows easily from the trace formula.

As for vector norms, all matrix norms are equivalent in that any two norms differ from each other by at most a constant. This table shows the optimal constants \alpha_{pq} such that \|A\|_p \le \alpha_{pq} \|A\|_q for A\in\mathbb{C}^{m\times n} and the norms mentioned above. Each inequality in the table is attained for some A.

For any A\in\mathbb{C}^{n\times n} and any consistent matrix norm,

\notag      \rho(A) \le \|A\|, \qquad\qquad (1)

where \rho is the spectral radius (the largest absolute value of any eigenvalue). To prove this inequality, let \lambda be an eigenvalue of A and x the corresponding eigenvector (necessarily nonzero), and form the matrix X=[x,x,\dots,x] \in \mathbb{C}^{n\times n}. Then AX = \lambda X, so |\lambda| \|X\| = \|AX\| \le \|A\| \, \|X\|, giving |\lambda|\le\|A\| since X \ne 0. For a subordinate norm it suffices to take norms in the equation Ax = \lambda x.

A useful relation is

\notag      \|A\|_2 \le \bigl( \|A\|_1 \|A\|_{\infty}\bigr)^{1/2}, \qquad\qquad (2)

which follows from \|A\|_2^2 = \|A^*A\|_2 = \rho(A^*A) \le \| A^*A \|_{\infty} \le \|A^*\|_{\infty} \|A\|_{\infty} =   \|A\|_1 \|A\|_{\infty}, where the first two equalities are obtained from the singular value decomposition and we have used (1).

Mixed Subordinate Matrix Norms

A more general subordinate matrix norm can be defined by taking different vector norms in the numerator and denominator:

\notag   \|A\|_{\alpha,\beta} = \displaystyle\max_{x\ne 0}          \frac{ \|Ax\|_{\beta} }{ \|x\|_{\alpha} }.

Some authors denote this norm by \|A\|_{\alpha\to\beta}.

A useful characterization of \|A\|_{\alpha,\beta} is given in the next result. Recall that \|\cdot\|^D denotes the dual of the vector norm \|\cdot\|.

Theorem 1. For A\in\mathbb{C}^{m\times n},

\notag    \|A\|_{\alpha,\beta} =           \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m}           \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha}.

Proof. We have

\notag \begin{aligned}   \displaystyle \max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m}       \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha}  &=   \max_{x\in\mathbb{C}^n} \frac{1}{\|x\|_\alpha}       \max_{y\in\mathbb{C}^m} \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D} \\  &=   \max_{x\in\mathbb{C}^n} \frac{\| Ax \|_\beta }{\|x\|_\alpha}  =  \|A\|_{\alpha,\beta}, \end{aligned}

where the second equality follows from the definition of dual vector norm and the fact that the dual of the dual norm is the original norm.

We can now obtain a connection between the norms of A and A^*. Here, \|A^*\|_{\beta^D,\alpha^D} denotes \max_{x\ne 0} \|A^*x\|_{\alpha^D} / \|x\|_{\beta^D}.

Theorem 2. If A\in\mathbb{C}^{m\times n} then \|A\|_{\alpha,\beta} = \|A^*\|_{\beta^D,\alpha^D}.

Proof. Using Theorem 1, we have

\notag   \|A\|_{\alpha,\beta}    = \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m}           \frac{\mathrm{Re}\, y^*Ax}{\|y\|_\beta^D \|x\|_\alpha}    = \displaystyle\max_{x\in\mathbb{C}^n \atop y \in\mathbb{C}^m}      \frac{\mathrm{Re}\, x^*(A^*y)}{\|x\|_\alpha \|y\|_{\beta^D}}    = \|A^*\|_{\beta^D,\alpha^D}. \quad\square

If we take the \alpha– and \beta-norms to be the same p-norm then we have \|A\|_p = \|A^*\|_q, where p^{-1} + q^{-1} = 1 (giving, in particular, \|A\|_2 = \|A^*\|_2 and \|A\|_1 = \|A^*\|_\infty, which are easily obtained directly).

Now we give explicit formulas for the (\alpha,\beta) norm when \alpha or \beta is 1 or \infty and the other is a general norm.

Theorem 3. For A\in\mathbb{C}^{m\times n},

\notag \begin{alignedat}{2}       \|A\|_{1,\beta} &= \max_j \| A(:,j) \|_{\beta},    &\qquad\qquad& (3) \\ \|A\|_{\alpha,\infty} &= \max_i \|A(i,:)^*\|_{\alpha}^D, &\qquad\qquad& (4) \\ \end{alignedat}

For A\in\mathbb{R}^{m\times n},

\notag    \|A\|_{\infty,\beta} = \displaystyle\max_{z\in Z} \|Az\|_{\beta},  \qquad\qquad (5)

where

Z = \{ z\in\mathbb{R}^n: z_i = \pm 1~\mathrm{for~all}~i \,\},

and if A is symmetric positive semidefinite then

\notag    \|A\|_{\infty,1} = \displaystyle\max_{z\in Z} z^T\!Az.

Proof. For (3),

\notag    \|Ax\|_{\beta} = \Big\| \sum_j x_j A(\colon,j) \Bigr\|_{\beta}         \le \displaystyle\max_j \| A(\colon,j) \|_{\beta} \sum_j |x_j|,

with equality for x=e_k, where the maximum is attained for j=k. For (4), using the Hölder inequality,

\|Ax\|_{\infty} = \displaystyle\max_i | A(i,\colon) x |      \le \max_i \bigl( \|A(i,\colon)^*\|_{\alpha}^D \|x\|_{\alpha} \bigr)        =  \|x\|_{\alpha} \max_i \|A(i,\colon)^*\|_{\alpha}^D.

Equality is attained for an x that gives equality in the Hölder inequality involving the kth row of A, where the maximum is attained for i=k.

Turning to (5), we have \|A\|_{\infty,\beta} = \max_{ \|x\|_\infty =1} \|Ax\|_\beta. The unit cube \{\, x\in\mathbb{R}^n: -e \le x \le e\,\}, where e = [1,1,\dots,1]^T, is a convex polyhedron, so any point within it is a convex combination of the vertices, which are the elements of Z. Hence \|x\|_\infty = 1 implies

\notag   x = \displaystyle\sum_{z\in Z} \lambda_z z, \quad \mathrm{where} \quad \lambda_z \ge 0, \quad \sum_{z\in Z} \lambda_z = 1

and then

\notag   \|Ax\|_\beta = \biggl\| \displaystyle\sum_{z\in Z} \lambda_z Az \biggr\|_\beta   \le \max_{z\in Z} \|Az\|_\beta.

Hence \max_{\|x\|_\infty = 1} \|Ax\|_\beta   \le \max_{z\in Z} \|Az\|_\beta, but trivially \max_{z\in Z} \|Az\|_\beta \le \max_{\|x\|_\infty = 1} \|Ax\|_\beta and (5) follows.

Finally, if A is symmetric positive semidefinite let \xi_z = \mathrm{sign}(Az) \in Z. Then, using a Cholesky factorization A = R^T\!R (which exists even if A is singular) and the Cauchy–Schwarz inequality,

\notag \begin{aligned}   \max_{z\in Z} \|Az\|_1 &= \max_{z\in Z} \xi_z^T Az                          = \max_{z\in Z} \xi_z^T R^T\!Rz\\                          &= \max_{z\in Z} (R\xi_z)^T Rz                          \le \max_{z\in Z} \|R\xi_z\|_2 \|Rz\|_2\\                          &\le \max_{z\in Z} \|Rz\|_2^2\\                          &= \max_{z\in Z} z^T\!Az. \end{aligned}

Conversely, for z\in Z we have

\notag    z^T\!Az = |z^T\!Az| \le |z|^T|Az| = e^T |Az| = \|Az\|_1,

so \max_{z\in Z} z^T\!Az \le \max_{z\in Z} \|Az\|_1. Hence \max_{z\in Z} z^T\!Az = \max_{z\in Z} \|Az\|_1 = \|A\|_{\infty,1}, using (5). \square

As special cases of (3) and (4) we have

\notag \begin{aligned}   \|A\|_{1,\infty} &= \max_{i,j} |a_{ij}|, \qquad\qquad\qquad (6) \\   \|A\|_{2,\infty} &= \max_i \|A(i,:)^*\|_2, \\   \|A\|_{1,2}      &= \max_j \|A(:,j)\|_2. \end{aligned}

We also obtain by using Theorem 2 and (5), for A\in\mathbb{R}^{m\times n},

\notag   \|A\|_{2,1} = \displaystyle\max_{z\in Z} \|A^Tz\|_2.

The (2,\infty)-norm has recently found use in statistics (Cape, Tang, and Priebe, 2019), the motivation being that because it satisfies

\notag    \|A\|_{2,\infty} \le \|A\|_2 \le m^{1/2} \|A\|_{2,\infty},

the (2,\infty)-norm can be much smaller than the 2-norm when m \gg n and so can be a better norm to use in bounds. The (2,\infty)– and (\infty,2)-norms are used by Rebrova and Vershynin (2018) in bounding the 2-norm of a random matrix after zeroing a submatrix. They note that the 2-norm of a random n\times n matrix involves maximizing over infinitely many random variables, while the (\infty,2)-norm and (2,\infty)-norm involve only 2^n and n random variables, respectively.

The (\alpha,\beta) norm is not consistent, but for any vector norm \gamma, we have

\notag \begin{aligned}   \|AB\|_{\alpha,\beta}   &= \max_{x\ne0} \frac{ \|ABx\|_\beta }{ \|x\|_\alpha}   = \max_{x\ne0} \frac{ \|A(Bx)\|_\beta }{ \|Bx\|_\gamma}                  \frac{ \|Bx\|_\gamma}{ \|x\|_\alpha}   \le \max_{x\ne0} \frac{ \|A(Bx)\|_\beta }{ \|Bx\|_\gamma}     \max_{x\ne0} \frac{ \|Bx\|_\gamma}{ \|x\|_\alpha}\\   &\le \|A\|_{\gamma,\beta} \|B\|_{\alpha,\gamma}. \end{aligned}

Matrices With Constant p-Norms

Let A be a nonnegative square matrix whose row and column sums are all equal to \mu. This class of matrices includes magic squares and doubly stochastic matrices. We have \|A\|_1 = \|A\|_{\infty} = \mu, so \|A\|_2 \le \mu by (2). But Ae = \mu e for the vector e of 1s, so \mu is an eigenvalue of A and hence \|A\|_2 \ge \mu by (1). Hence \|A\|_1 = \|A\|_2 = \|A\|_{\infty} = \mu. In fact, \|A\|_p = \mu for all p\in[1,\infty] (see Higham (2002, Prob. 6.4) and Stoer and Witzgall (1962)).
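
For example, magic(5) in MATLAB has row and column sums all equal to 65, so all its p-norms are 65; a quick check:

A = magic(5);
[norm(A,1) norm(A,2) norm(A,inf)]    % All three are 65.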

Computing Norms

The problem of computing or estimating \|A\|_{\alpha,\beta} may be difficult for two reasons: we may not have an explicit formula for the norm, or A may be given implicitly, as A = f(B), A = B^{-1}, or A = BC, for example, and we may wish to compute the norm without forming A. Both difficulties are overcome by a generalization of the power method proposed by Tao in 1975 for arbitrary norms and independently by Boyd in 1984 for \alpha and \beta both p-norms (see Higham, 2002, Sec.\,15.2). The power method accesses A only through matrix–vector products with A and A^*, so A does not need to be explicitly available.

Let us focus on the case where \alpha and \beta are the same p-norm. The power method is implemented in the Matrix Computation Toolbox as pnorm and we use it here to estimate the \pi-norm and the 99-norm, which we compare with the 1-, 2-, and \infty-norms.

>> A = gallery('frank',4)
A =
     4     3     2     1
     3     3     2     1
     0     2     2     1
     0     0     1     1
>> [norm(A,1) norm(A,2) pnorm(A,pi), pnorm(A,99), norm(A,inf)]
ans =
   8.0000e+00   7.6237e+00   8.0714e+00   9.8716e+00   1.0000e+01
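
For illustration, here is a minimal sketch of the underlying power method for the p-norm, based on the description in Higham (2002, Sec. 15.2); it assumes real A and 1 < p < \infty, and dualvec is a helper function defined here, not part of MATLAB:

function gamma = pnorm_power(A,p)
%PNORM_POWER Estimate norm(A,p) by the power method (a sketch).
q = p/(p-1);                        % Dual exponent: 1/p + 1/q = 1.
n = size(A,2);
x = ones(n,1); x = x/norm(x,p);     % Starting vector of unit p-norm.
for iter = 1:100                    % Cap the number of iterations.
    y = A*x;
    z = A'*dualvec(y,p);            % z = A^T*(dual vector of y w.r.t. the p-norm).
    if norm(z,q) <= z'*x, break, end    % Convergence test.
    x = dualvec(z,q);               % Next iterate, of unit p-norm.
end
gamma = norm(A*x,p);                % Lower bound for norm(A,p).
end

function z = dualvec(y,p)
% Dual vector of y: z'*y = norm(y,p) and norm(z,p/(p-1)) = 1.
z = sign(y).*(abs(y)/norm(y,p)).^(p-1);
end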

The plot in the top half of the following figure shows the estimated p-norm for the matrix gallery('binomial',8) for p \in[1,25]. This shape of curve is typical, because, as the plot in the lower half indicates, \log(\|A\|_p) is a convex function of 1/p for p\ge 1 (see Higham, 2002, Sec.\,6.3).

pnorm_plot.jpg

The power method for the (1,\infty) norm, which is the max norm in (6), is investigated by Higham and Relton (2016) and extended to estimate the k largest values a_{ij} or |a_{ij}|.

A Norm Related to Grothendieck’s Inequality

A norm on \mathbb{C}^{n\times n} can be defined by

\notag \begin{aligned}    \|A\|^{(t)} = \max \biggl\{ \bigg|\sum_{i,j=1}^n a_{ij} y^*_jx_i \bigg| :                      x_i,y_j \in\mathbb{C}^t,                      \; \|x_i\|_2 \le 1,                      \; \|y_j\|_2 \le 1, \; i,j=1\colon n \, \biggr\}. \end{aligned}

Note that

\notag    \|A\|^{(1)} = \displaystyle\max_{0\ne x,y\in\mathbb{C}^n}                   \frac{|y^*Ax|}{\|y\|_\infty \|x\|_\infty}                = \|A\|_{\infty,1}.

A famous inequality of Grothendieck states that \|A\|^{(n)} \le c \|A\|^{(1)} for all A, for some constant c independent of n. Much work has been done to estimate the smallest possible constant c, which is known to be in the interval [1.33,1.41], or [1.67,1.79] for the analogous inequality for real data. See Friedland, Lim, and Zhang (2018) and the references therein.

Notes

The formula (5) with \beta = 1 is due to Rohn (2000), who shows that evaluating it is NP-hard. For general \beta, the formula is given by Lewis (2010).

Computing mixed subordinate matrix norms based on p-norms is generally NP-hard (Hendrickx and Olshevsky, 2010).

References

This is a minimal set of references, which contain further useful references within.

Related Blog Posts

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

What’s New in MATLAB R2021b?

In this post I discuss some of the new features in MATLAB R2021b that captured my interest. See the release notes for a detailed list of the many changes in MATLAB and its toolboxes. For my articles about new features in earlier releases, see here.

High Order Runge–Kutta ODE Solvers

The MATLAB suite of solvers for ordinary differential equations previously contained eight solvers, including three Runge–Kutta solvers: ode23, ode23tb, and ode45. The suite has been augmented with two new high-order Runge–Kutta solvers: ode78 uses 7th and 8th order Runge–Kutta formulas and ode89 uses 8th and 9th order Runge–Kutta formulas.

The documentation says that

  • ode78 and ode89 may be more efficient than ode45 on non-stiff problems that are smooth except possibly for a few isolated discontinuities.
  • ode89 may be more efficient than ode78 on very smooth problems, when integrating over long time intervals, or when tolerances are tight.
  • ode78 and ode89 may not be as fast or as accurate as ode45 in single precision.
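
A minimal usage sketch (the ODE, interval, and initial value are arbitrary choices for illustration):

f = @(t,y) -2*t*y;                  % y' = -2*t*y, with exact solution exp(-t^2).
[t89,y89] = ode89(f,[0 5],1);
[t45,y45] = ode45(f,[0 5],1);
max(abs(y89 - exp(-t89.^2)))        % Error of ode89 at its steps.
max(abs(y45 - exp(-t45.^2)))        % Error of ode45 at its steps.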

Matrix to Scalar Power

The MATLAB mpower function is called by an exponentiation A^z when A is a matrix and z is a real or complex scalar. For the meaning of such arbitrary matrix powers see What Is a Fractional Matrix Power?

Previously, mpower used a diagonalization of A. For a matrix that is not diagonalizable, or is close to being not diagonalizable, the results could be inaccurate. In R2021b, mpower uses a Schur–Padé algorithm, which employs a Schur decomposition in conjunction with Padé approximation and works for all matrices. The difference in accuracy between the old and new versions of mpower is clearly demonstrated by computing the square root of a 2-by-2 Jordan block (a difficult case because it is defective).

% R2021a
>> A = [1 1; 0 1], X = A^(0.5), residual = A - X^2
A =
     1     1
     0     1
X =
     1     0
     0     1
residual =
     0     1
     0     0

% R2021b
>> A = [1 1; 0 1], X = A^(0.5), residual = A - X^2
A =
     1     1
     0     1
X =
   1.0000e+00   5.0000e-01
            0   1.0000e+00
residual =
     0     0
     0     0

Functions on nD Arrays

The new function pagesvd computes the singular value decomposition (SVD) of pages of nD arrays. A page is a 2D slice (i.e., a matrix) formed by taking all the elements in the first two dimensions and fixing the later subscripts. Related functions include pagemtimes for matrix multiplication and pagetranspose for transposition, both introduced in R2020b. Better performance can be expected from these page functions than from explicit looping over the pages, especially for small-sized pages.
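
A minimal usage sketch:

X = randn(4,3,5);                   % Five 4-by-3 pages.
S = pagesvd(X);                     % Singular values of each page.
[U,S,V] = pagesvd(X);               % Full SVD of each page.
size(U), size(S), size(V)           % 4-by-4-by-5, 4-by-3-by-5, 3-by-3-by-5.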

Improvements to Editor

Several improvements have been introduced to the MATLAB Editor. Ones that caught my eye are the ability to edit rectangular areas of text, automatic completion of block endings and delimiters, wrapping of comments, and changing the case of selected text. I do all my editing in Emacs, using the MATLAB major mode, and these sorts of things are either available or can be added by customization. However, these are great additions for MATLAB Editor users.

What Is a Vector Norm?

A vector norm measures the size, or length, of a vector. For complex n-vectors, a vector norm is a function \|\cdot\| : \mathbb{C}^n \to \mathbb{R} satisfying

  • \|x\| \ge 0 with equality if and only if x=0 (nonnegativity),
  • \|\alpha x\| = |\alpha| \|x\| for all \alpha\in\mathbb{C} and x\in\mathbb{C}^n (homogeneity),
  • \|x+y\| \le \|x\|+\|y\| for all x,y\in\mathbb{C}^n (the triangle inequality).

An important class of norms is the Hölder p-norms

\|x\|_p = \biggl( \displaystyle\sum_{i=1}^n |x_i|^p  \biggr)^{1/p}, \quad p\ge 1.  \qquad\qquad (1)

The \infty-norm is interpreted as \lim_{p\to\infty}\|x\|_p and is given by

\notag    \|x\|_{\infty} = \displaystyle\max_{1\le i\le n} |x_i|.

Other important special cases are

\notag \begin{alignedat}{2}    \|x\|_1 &= \sum_{i=1}^n |x_i|, &\quad&                 \mbox{``Manhattan'' or ``taxi~cab'' norm}, \\    \|x\|_2 &= \biggl( \sum_{i=1}^n |x_i|^2  \biggr)^{1/2} = (x^*x)^{1/2},                   &\quad& \mbox{Euclidean length}. \end{alignedat}

A useful concept is that of the dual norm associated with a given vector norm, which is defined by

\notag    \|y\|^D =  \displaystyle\max_{x\ne0}               \displaystyle\frac{\mathop{\mathrm{Re}}x^* y}{\|x\|}.

The maximum is attained because the definition is equivalent to \|y\|^D = \max\{ \, \mathop{\mathrm{Re}} x^*y: \|x\| = 1\,\}, in which we are maximizing a continuous function over a compact set. Importantly, the dual of the dual norm is the original norm (Horn and Johnson, 2013, Thm.\, 5.5.9(c)).

We can rewrite the definition of dual norm, using the homogeneity of vector norms, as

\notag  \|y\|^D = \displaystyle\max_{|c| = 1} \| cy \|^D          = \max_{|c| = 1} \max_{x\ne 0} \frac{\mathop{\mathrm{Re}} x^*(cy) }{\|x\|}          =  \max_{x\ne 0} \max_{|c| = 1} \frac{\mathop{\mathrm{Re}} c(x^*y) }{\|x\|}          =  \max_{x\ne 0} \frac{ |x^*y| }{\|x\|}.

Hence we have the attainable inequality

\notag    |x^*y| \le \|x\| \, \|y\|^D, \qquad\qquad (2)

which is the generalized Hölder inequality.

The dual of the p-norm is the q-norm, where p^{-1} + q^{-1} = 1, so for the p-norms the inequality (2) becomes the (standard) Hölder inequality,

\notag    |x^*y| \le \|x\|_p \, \|y\|_q, \quad           \displaystyle\frac{1}{p} + \frac{1}{q} = 1.

An important special case is the Cauchy–Schwarz inequality,

\notag    |x^*y| \le \|x\|_2 \, \|y\|_2.
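
These inequalities are easy to check numerically; for example, in MATLAB, with p = 3 and hence q = 3/2 (arbitrary illustrative choices):

x = randn(8,1) + 1i*randn(8,1);
y = randn(8,1) + 1i*randn(8,1);
p = 3; q = p/(p-1);
abs(x'*y) <= norm(x,p)*norm(y,q)    % Returns 1 (true).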

The notation \|x\|_0 is used to denote the number of nonzero entries in x, even though it is not a vector norm and is not obtained from (1) with p = 0. In portfolio optimization, if x_k specifies how much to invest in stock k then the inequality \|x\|_0 \le k says “invest in at most k stocks”.

In numerical linear algebra, vector norms play a crucial role in the definition of a subordinate matrix norm, as we will explain in the next post in this series.

All norms on \mathbb{C}^n are equivalent in the sense that for any two norms \|\cdot\|_\alpha and \|\cdot\|_\beta there exist positive constants \nu_1 and \nu_2 such that

\nu_1\|x\|_\alpha \le \|x\|_\beta \le \nu_2    \|x\|_\alpha \quad \mathrm{for~all}~x\in \mathbb{C}^n.

For example, it is easy to see that

\notag \begin{aligned}          \|x\|_2 &\le \|x\|_1 \le \sqrt{n} \|x\|_2,\\     \|x\|_\infty &\le \|x\|_2 \le \sqrt{n} \|x\|_\infty,\\     \|x\|_\infty &\le \|x\|_1 \le n \|x\|_\infty. \end{aligned}

The 2-norm is invariant under unitary transformations: if Q^*Q = I, then \|Qx\|_2^2 = x^*Q^* Qx = x^*x = \|x\|_2^2.

Care must be taken to avoid overflow and (damaging) underflow when evaluating a vector p-norm in floating-point arithmetic for p\ne 1,\infty. One can simply use the formula \|x\|_p = \| (x/\|x\|_{\infty}) \|_p \|x\|_{\infty}, but this requires two passes over the data (the first to evaluate \|x\|_{\infty}). For more efficient one-pass algorithms for the 2-norm see Higham (2002, Sec. 21.8) and Harayama et al. (2021).
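
Here is a minimal sketch of the two-pass approach (p = 3 and the magnitude of the entries are illustrative choices):

x = 1e200*randn(4,1); p = 3;
naive = sum(abs(x).^p)^(1/p);       % Overflows: abs(x).^3 is of order 1e600.
t = norm(x,inf);                    % First pass: scale factor.
xnrm = t*sum(abs(x/t).^p)^(1/p);    % Second pass: p-norm of the scaled vector.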

References

This is a minimal set of references, which contain further useful references within.

Note: This article was revised on October 12, 2021 to change the definition of dual norm to use \mathrm{Re}.

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Can We Solve Linear Algebra Problems at Extreme Scale and Low Precisions?

dugaku_1.jpg

The Fugaku supercomputer that tops the HPL-AI mixed-precision benchmark in the June 2021 TOP500 list. It solved a linear system of order 10^7 using IEEE half precision arithmetic for most of the computations.

The largest dense linear systems being solved today are of order n = 10^7, and future exascale computer systems will be able to tackle even larger problems. Rounding error analysis shows that the computed solution satisfies a componentwise backward error bound that, under favorable assumptions, is of order nu, where u is the unit roundoff of the floating-point arithmetic: u \approx 10^{-16} for double precision and u \approx 10^{-8} for single precision. This backward error bound cannot guarantee any stability for single precision solution of today’s largest problems and suggests a loss of half the digits in the backward error for double precision.

Half precision floating-point arithmetic is now readily available in hardware, in both the IEEE binary16 format and the bfloat16 format, and it is increasingly being used in machine learning and in scientific computing more generally. For the computation of the inner product of two n-vectors the backward error bound is again of order nu, and this bound exceeds 1 for n \ge 684 for both half precision formats, suggesting a potentially complete loss of numerical stability. Yet inner products with n \ge 684 are successfully used in half precision computations in practice.

The error bounds I have referred to are upper bounds and so bound the worst case over all possible rounding errors. Their main purpose is to reveal potential instabilities rather than to provide realistic error estimates. Yet we do need to know the limits of what we can compute, and for mission-critical applications we need to be able to guarantee a successful computation.

Can we understand the behavior of linear algebra algorithms at extreme scale and in low precision floating-point arithmetics?

To a large extent the answer is yes if we exploit three different features to obtain smaller error bounds.

Blocked Algorithms

Many algorithms are implemented in blocked form. For example, an inner product x^Ty of two n-vectors x and y can be computed as

\notag \begin{aligned} s_i &= x((i-1)b+1:ib)^T y((i-1)b+1:ib), \quad i = 1:k,\\ s   &= s_1 + s_2 + \dots + s_k, \end{aligned}

where n = kb and b \ll n is the block size. The inner product has been broken into k smaller inner products of size b, which are computed independently and then summed. Many linear algebra algorithms are blocked in an analogous way, where the blocking is into submatrices with b rows or b columns (or both). Careful rounding error analysis shows that a blocked algorithm has an error bound about a factor of b smaller than that for the corresponding unblocked algorithm. Practical block sizes for matrix algorithms are typically 128 or 256, so blocking brings a substantial reduction in the error bounds.

ip_rand1.jpg

Backward errors for the inner product of two vectors with elements of the form -0.25 + randn, computed in single precision in MATLAB with block size 256.

In fact, one can do even better than an error bound of order (n/b)u. By computing the sum s= s_1 + s_2 + \dots + s_k with a more accurate summation method the error constant is further reduced to bu + O(u^2) (this is the FABsum method of Blanchard et al. (2020)).
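
A minimal MATLAB sketch in the spirit of FABsum (within-block sums by recursive summation, block sums accumulated by compensated summation; the block size and data are illustrative choices):

x = randn(2^20,1,'single'); b = 256; k = length(x)/b;
s = single(0); c = single(0);       % Running sum and compensation term.
for i = 1:k
    si = sum(x((i-1)*b+1:i*b));     % Recursive summation within the ith block.
    yi = si - c;                    % Compensated (Kahan) summation of block sums.
    t = s + yi;
    c = (t - s) - yi;
    s = t;
end
% Compare with a double precision reference: abs(double(s) - sum(double(x)))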

Architectural Features

Intel x86 processors support an 80-bit extended precision format with a 64-bit significand, which is compatible with that specified in the IEEE standard. When a compiler uses this format with 80-bit registers to accumulate sums and inner products it is effectively working with a unit roundoff of 2^{-64} rather than 2^{-53} for double precision, giving error bounds smaller by a factor up to 2^{11} = 2048.

Some processors have a fused multiply–add (FMA) operation, which computes a combined multiplication and addition x + yz with one rounding error instead of two. This results in a reduction in error bounds by a factor 2.

Mixed precision block FMA operations D = C + AB, with matrices A,B,C,D of fixed size, are available on Google tensor processing units, NVIDIA GPUs, and in the ARMv8-A architecture. For half precision inputs these devices can produce results of single precision quality, which can give a significant boost in accuracy when block FMAs are chained together to form a matrix product of arbitrary dimension.

Probabilistic Bounds

Worst-case rounding error bounds suffer from the problem that they are not attainable for most specific sets of data and are unlikely to be nearly attained. Stewart (1990) noted that

To be realistic, we must prune away the unlikely. What is left is necessarily a probabilistic statement.

Theo Mary and I have recently developed probabilistic rounding error analysis, which makes probabilistic assumptions on the rounding errors and derives bounds that hold with a certain probability. The key feature of the bounds is that they are proportional to \sqrt{n}u when a corresponding worst-case bound is proportional to nu. In the most general form of the analysis (Connolly, Higham, and Mary, 2021), the rounding errors are assumed to be mean independent and of mean zero, where mean independence is a weaker assumption than independence.

Putting the Pieces Together

The different features we have described can be combined to obtain significantly smaller error bounds. If we use a blocked algorithm with block size b \ll n then in an inner product the standard error bound of order nu reduces to a probabilistic bound of order (\sqrt{n/b})u, which is a significant reduction. Block FMAs and extended precision registers provide further reductions.

For example, for a linear system of order 10^7 solved in single precision with a block size of 256, the probabilistic error bound is of order 10^{-5} versus 1 for the standard worst-case bound. If FABsum is used then the bound is further reduced.

Our conclusion is that we can successfully solve linear algebra problems of greater size and at lower precisions than the standard rounding error analysis suggests. A priori bounds will always be pessimistic, though. One should compute a posteriori residuals or backward errors (depending on the problem) in order to assess the quality of a numerical solution.

For full details of the work summarized here, see Higham (2021).

References

Videos from New Directions in Numerical Linear Algebra and High Performance Computing Workshop

In July 2021, Sven Hammarling, Françoise Tisseur and I organized an online workshop New Directions in Numerical Linear Algebra and High Performance Computing. The workshop brought together researchers working in numerical linear algebra and high performance computing to discuss current developments and challenges in the light of evolving computer hardware. It was held to honour Jack Dongarra on the occasion of his 70th birthday. The workshop had been postponed from July 2020 as a result of the pandemic.

Videos of the talks are now available on the Numerical Linear Algebra Group’s YouTube channel and are included below. Slides for the talks are available on the workshop website.

Sven Hammarling (The University of Manchester), “Jack Dongarra”.

Iain Duff (STFC-RAL and CERFACS), “Jack”

James Demmel (University of California, Berkeley), “New Communication-Avoiding Algorithms, and Fixing Old Bugs in the BLAS and LAPACK”

Piotr Luszczek (University of Tennessee), “Numerical Methods and Across Scales, Precisions and Hardware Platforms”

Cleve Moler (MathWorks), “Computers That I Have Known”

Yves Robert (Ecole Normale Supérieure de Lyon), “25+ Years of Scheduling at ICL”

Françoise Tisseur (The University of Manchester), “Mixed Precision Tall and Thin QR Factorization with Applications”

David Keyes (King Abdullah University of Science and Technology), “Adaptive Nonlinear Preconditioning for PDEs with Error Bounds on Output Functionals”

Zhaojun Bai (University of California, Davis), “Many Eigenpair Computation Via Hotelling’s Deflation”

Ilse Ipsen (North Carolina State University), “A Few Observations About Summation Algorithms”

Erich Strohmaier (TOP500), “TOP500 and Accidental Benchmarking”

Nick Higham (The University of Manchester), “Solving Dense Linear Systems: A Brief History and Future Directions ”

Jack Dongarra (University of Tennessee, Oak Ridge Laboratory and The University of Manchester), “Still Having Fun After 50 Years”

SIAM AN21 Minisymposium on Bohemian Matrices and Applications

ViridisDragonEye10.png
Image ViridisDragonEye10 courtesy of Rob Corless.

The two-part minisymposium Bohemian Matrices and Applications, organized by Rob Corless and me, took place at the SIAM Annual Meeting, July 22 and 23, 2021. This page makes available slides from some of the talks.

The minisymposium followed a two-part minisymposium on Bohemian matrices at the 2019 ICIAM meeting in Valencia and a 3-day workshop on Bohemian matrices in Manchester in 2018.

For more on Bohemian matrices see the Bohemian matrices website.

Minisymposium description: Bohemian matrices are matrices with entries drawn from a fixed discrete set of small integers (or some other discrete set). The term is a contraction of BOunded HEight Matrix of Integers. Such matrices arise in many applications, and include (0,1) graph incidence matrices and (-1,1) Bernoulli matrices. The questions of interest range from identifying structures in the spectra of particular classes of Bohemian matrix to searching for most ill conditioned matrices within a class, and applications include stress-testing algorithms and software. This minisymposium will report recent theoretical and computational progress as well as open questions.

Putting Skew-Symmetric Tridiagonal Bohemians on the Calendar. Robert M. Corless, Western University, Canada. Abstract. Rob did not use slides but gave his talk using this paper and this Maple worksheet.

Determinants of Normalized Bohemian Upper Hessenberg Matrices. Massimiliano Fasi, Örebro University, Sweden; Jishe Feng, Longdong University, China; Gian Maria Negri Porzio, University of Manchester, United Kingdom. Abstract. Slides.

Experiments on Upper Hessenberg and Toeplitz Bohemians. Eunice Chan, Western University, Canada. Abstract. Slides.

Eigenvalues of Magic Squares and Related Bohemian Matrices. Hariprasad Manjunath Hegde, Indian Institute of Science, Bengaluru, India. Abstract. Slides.

Calculating the 3D Kings Multiplicity Constant. Nicholas Cohen and Neil Calkin, Clemson University, U.S. Abstract. Slides.

Bohemian Inners Inverses: A First Step Toward Bohemian Generalized Inverses. Laureano Gonzalez-Vega, Universidad de Cantabria, Spain; Juan Rafael Sendra, Universidad Alcalá de Henares, Spain; Juana Sendra Pons, Universidad Politécnica de Madrid, Spain. Abstract. Slides.

Recent Progress in the Rational Factorisation of Integer Matrices. Matthew Lettington, Cardiff University, United Kingdom. Abstract. Slides.

Which Columns are Independent? Why does Row Rank = Column Rank? Gilbert Strang, Massachusetts Institute of Technology, U.S. Abstract. Slides.

Bohemian Matrices: the Symbolic Computation Approach. Juana Sendra, Universidad Autónoma de Madrid, Spain; Laureano González-Vega, Universidad de Estudios Financieros en Madrid, Spain; Juan Rafael Sendra, Universidad Alcalá de Henares, Spain. Abstract. Slides.