What Is a Vector Norm?

A vector norm measures the size, or length, of a vector. For complex n-vectors, a vector norm is a function \|\cdot\| : \mathbb{C}^n \to \mathbb{R} satisfying

  • \|x\| \ge 0 with equality if and only if x=0 (nonnegativity),
  • \|\alpha x\| =|\alpha| \|x\| for all \alpha\in\mathbb{C} x\in\mathbb{C}^n (homogeneity),
  • \|x+y\| \le \|x\|+\|y\| for all x,y\in\mathbb{C}^n (the triangle inequality).

An important class of norms is the Hölder p-norms

\|x\|_p = \biggl( \displaystyle\sum_{i=1}^n |x_i|^p  \biggr)^{1/p}, \quad p\ge 1.  \qquad\qquad (1)

The \infty-norm is interpreted as \lim_{p\to\infty}\|x\|_p and is given by

\notag    \|x\|_{\infty} = \displaystyle\max_{1\le i\le n} |x_i|.

Other important special cases are

\notag \begin{alignedat}{2}    \|x\|_1 &= \sum_{i=1}^n |x_i|, &\quad&                 \mbox{``Manhattan'' or ``taxi~cab'' norm}, \\    \|x\|_2 &= \biggl( \sum_{i=1}^n |x_i|^2  \biggr)^{1/2} = (x^*x)^{1/2},                   &\quad& \mbox{Euclidean length}. \end{alignedat}

A useful concept is that of the dual norm associated with a given vector norm, which is defined by

\notag    \|y\|^D =  \displaystyle\max_{x\ne0}               \displaystyle\frac{\mathop{\mathrm{Re}}x^* y}{\|x\|}.

The maximum is attained because the definition is equivalent to \|y\|^D = \max\{ \, \mathop{\mathrm{Re}} x^*y: \|x\| = 1\,\}, in which we are maximizing a continuous function over a compact set. Importantly, the dual of the dual norm is the original norm (Horn and Johnson, 2013, Thm.\, 5.5.9(c)).

We can rewrite the definition of dual norm, using the homogeneity of vector norms, as

\notag  \|y\|^D = \displaystyle\max_{|c| = 1} \| cy \|^D          = \max_{|c| = 1} \max_{x\ne 0} \frac{\mathop{\mathrm{Re}} x^*(cy) }{\|x\|}          =  \max_{x\ne 0} \max_{|c| = 1} \frac{\mathop{\mathrm{Re}} c(x^*y) }{\|x\|}          =  \max_{x\ne 0} \frac{ |x^*y| }{\|x\|}.

Hence we have the attainable inequality

\notag    |x^*y| \le \|x\| \, \|y\|^D, \qquad\qquad (2)

which is the generalized Hölder inequality.

The dual of the p-norm is the q-norm, where p^{-1} + q^{-1} = 1, so for the p-norms the inequality (2) becomes the (standard) Hölder inequality,

\notag    |x^*y| \le \|x\|_p \, \|y\|_q, \quad           \displaystyle\frac{1}{p} + \frac{1}{q} = 1.

An important special case is the Cauchy–Schwarz inequality,

\notag    |x^*y| \le \|x\|_2 \, \|y\|_2.

The notation \|x\|_0 is used to denote the number of nonzero entries in x, even though it is not a vector norm and is not obtained from (1) with p = 0. In portfolio optimization, if x_k specifies how much to invest in stock k then the inequality \|x\|_0 \le k says “invest in at most k stocks”.

In numerical linear algebra, vector norms play a crucial role in the definition of a subordinate matrix norm, as we will explain in the next post in this series.

All norms on \mathbb{C}^n are equivalent in the sense that for any two norms \|\cdot\|_\alpha and \|\cdot\|_\beta there exist positive constants \nu_1 and \nu_2 such that

\nu_1\|x\|_\alpha \le \|x\|_\beta \le \nu_2    \|x\|_\alpha \quad \mathrm{for~all}~x\in \mathbb{C}^n.

For example, it is easy to see that

\notag \begin{aligned}          \|x\|_2 &\le \|x\|_1 \le \sqrt{n} \|x\|_2,\\     \|x\|_\infty &\le \|x\|_2 \le \sqrt{n} \|x\|_\infty,\\     \|x\|_\infty &\le \|x\|_1 \le n \|x\|_\infty. \end{aligned}

The 2-norm is invariant under unitary transformations: if Q^*Q = I, then \|Qx\|^2 = x^*Q^* Qx = x^*x = \|x\|_2^2.

Care must be taken to avoid overflow and (damaging) underflow when evaluating a vector p-norm in floating-point arithmetic for p\ne 1,\infty. One can simply use the formula \|x\|_p = \| (x/\|x\|_{\infty}) \|_p \|x\|_{\infty}, but this requires two passes over the data (the first to evaluate \|x\|_{\infty}). For more efficient one-pass algorithms for the 2-norm see Higham (2002, Sec. 21.8) and Harayama et al. (2021).

References

This is a minimal set of references, which contain further useful references within.

Note: This article was revised on October 12, 2021 to change the definition of dual norm to use \mathrm{Re}.

This article is part of the “What Is” series, available from https://nhigham.com/category/what-is and in PDF form from the GitHub repository https://github.com/higham/what-is.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s