Matrix Norms and Inequalities with Python

$$\renewcommand{\vec}[1]{\mathbf{#1}}$$

Matrix norms extend vector norms to matrices and provide a measure of distance on the space of matrices. More precisely, a matrix norm is a function $f: \mathbb{R}^{m \times n} \rightarrow \mathbb{R}$ satisfying the properties below; the double bar notation used to denote vector norms is also used for matrix norms (a quick numerical check of these properties follows the list).

  • $\Vert \vec{A} \Vert \geq 0$ for any matrix $\vec{A} \in \mathbb{R}^{m \times n}$
    • $ \Vert \vec{A} \Vert = 0$ if and only if the matrix $\vec{A} = 0$
  • $ \Vert \alpha \vec{A} \Vert = \vert \alpha \vert \Vert \vec{A} \Vert$ for an $m \times n$ matrix $\vec{A}$ and scalar $\alpha$
  • $\Vert \vec{A} + \vec{B} \Vert \leq \Vert \vec{A} \Vert + \Vert \vec{B} \Vert$ for $m \times n$ matrices $\vec{A}$ and $\vec{B}$
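
These properties are easy to check numerically. Here is a quick sanity check using numpy's norm function, which computes the Frobenius norm by default when given a matrix (any matrix norm would work here):

In [ ]:
import numpy as np

A = np.random.randint(-10, 10, size=(3, 2))
B = np.random.randint(-10, 10, size=(3, 2))
alpha = -4.0

# Non-negativity, and zero for the zero matrix
assert np.linalg.norm(A) >= 0
assert np.linalg.norm(np.zeros((3, 2))) == 0

# Absolute homogeneity
assert np.isclose(np.linalg.norm(alpha * A), np.abs(alpha) * np.linalg.norm(A))

# Triangle inequality
assert np.linalg.norm(A + B) <= np.linalg.norm(A) + np.linalg.norm(B)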

The most commonly occurring matrix norms in matrix analysis are the Frobenius, $L_1$, $L_2$ and $L_\infty$ norms. The following sections investigate these norms, along with Python implementations for calculating them. Before getting started, we create a $3 \times 2$ matrix with random integer elements between $-10$ and $9$ (numpy's randint excludes the upper bound) that will be used for the examples.

In [1]:
import numpy as np
a = np.random.randint(-10, 10, size=(3, 2))
a
Out[1]:
array([[-8, -9],
       [-1, -1],
       [ 7,  4]])

The number of rows $m$ and columns $n$ will be required throughout to calculate the matrix norms. To get these values, we take the shape of the matrix we just created.

In [2]:
m, n = a.shape
print(m, n)
3 2

Frobenius Norm

The Frobenius norm, also known as the Hilbert-Schmidt norm or the Schur norm, is one of the most commonly employed matrix norms in linear algebra. The Frobenius norm is defined as:

$$ \large \Vert \vec{A} \Vert_F = \sqrt{\sum_{i=1}^m \sum_{j=1}^n \vert a_{ij} \vert^2} $$

A demonstrative implementation of the Frobenius norm in Python is as follows.

In [3]:
f = 0

for i in np.arange(m):
    for j in np.arange(n):
        f = f + np.abs(a[i, j]) ** 2

np.sqrt(f)
Out[3]:
14.560219778561036

As with vector norms, we can take advantage of numpy's vectorization support to calculate the Frobenius norm with a one-liner.

In [4]:
np.sqrt(np.sum(np.abs(a) ** 2))
Out[4]:
14.560219778561036

We can confirm our result by comparing it to the output of numpy's norm function. The ord parameter is specified as 'fro' to output the Frobenius norm, but this is the default behavior when a matrix is passed to the norm function.

In [5]:
np.linalg.norm(a, 'fro')
Out[5]:
14.560219778561036
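
The Frobenius norm can equivalently be written as $\sqrt{\mathrm{tr}(\vec{A}^T \vec{A})}$, since the diagonal of $\vec{A}^T \vec{A}$ collects the squared entries of each column of $\vec{A}$. A quick check of this identity:

In [ ]:
# The trace of A^T A is the sum of all squared entries of A
np.sqrt(np.trace(a.T @ a))

This reproduces the same value as above.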

$1$-Norm

The matrix 1-norm is defined as the maximum absolute column sum of a matrix.

$$ \large \Vert \vec{A} \Vert_1 = \underset{\Vert \vec{x} \Vert_1 = 1}{\max} \Vert \vec{Ax} \Vert_1 = \underset{1 \leq j \leq n}{\max}\left( \sum^m_{i=1} \vert a_{ij} \vert \right) $$

Therefore, the $1$-norm of a matrix is the largest of its absolute column sums, which we can compute with a short loop.

In [6]:
colsums = []
for j in np.arange(n):
    v = np.sum(np.abs(a[:, j]))
    colsums.append(v)

np.max(colsums)
Out[6]:
16

By setting the axis parameter to 0, numpy's sum function will sum column-wise. This gives us a faster, one-line implementation of the matrix $1$-norm.

In [7]:
np.max(np.sum(np.abs(a), axis=0))
Out[7]:
16

We confirm our result with the norm function in numpy's linalg module.

In [8]:
np.linalg.norm(a, 1)
Out[8]:
16.0
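
In the induced-norm definition above, the maximum over the unit $1$-norm sphere is attained at a standard basis vector $\vec{e}_j$, and $\vec{A}\vec{e}_j$ is simply the $j$-th column of $\vec{A}$. A minimal sketch of this check:

In [ ]:
# ||A e_j||_1 is the absolute sum of column j, so maximizing over
# the standard basis vectors recovers the maximum absolute column sum
basis_images = [np.sum(np.abs(a @ np.eye(n)[:, j])) for j in np.arange(n)]
max(basis_images)

For the example matrix, this again gives 16.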

$\infty$-Norm

The $\infty$-norm of a matrix is defined as the maximum absolute row sum.

$$ \large \Vert \vec{A} \Vert_\infty = \underset{\Vert \vec{x} \Vert_{\infty} = 1}{\max} \Vert \vec{Ax} \Vert_{\infty} = \underset{1 \leq i \leq m}{\max} \left(\sum^n_{j=1} \vert a_{ij} \vert \right) $$

As with the $1$-norm, we first compute the absolute row sums in a loop and take the maximum.

In [9]:
rowsums = []
for i in np.arange(m):
    v = np.sum(np.abs(a[i, :]))
    rowsums.append(v)

np.max(rowsums)
Out[9]:
17

Setting the axis parameter to 1 sums row-wise instead, giving a one-line implementation of the $\infty$-norm.

In [10]:
np.max(np.sum(np.abs(a), axis=1))
Out[10]:
17

As before, we confirm the result with numpy's norm function.

In [11]:
np.linalg.norm(a, np.inf)
Out[11]:
17.0
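
The maximizer in the induced definition of the $\infty$-norm is also easy to exhibit: taking $\vec{x}$ to be the vector of signs of the entries in the row with the largest absolute sum gives $\Vert \vec{Ax} \Vert_\infty = \Vert \vec{A} \Vert_\infty$ (assuming that row has no zero entries, so $\vec{x}$ has unit $\infty$-norm). A minimal sketch:

In [ ]:
# Index of the row with the largest absolute sum
i_star = np.argmax(np.sum(np.abs(a), axis=1))
# Sign vector that attains the maximum
x = np.sign(a[i_star, :])
np.max(np.abs(a @ x))

For the example matrix, this reproduces the value 17 computed above.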

$2$-Norm

Unlike the other norms shown above, the matrix $2$-norm has no simple formula in terms of the entries of the matrix and is more difficult to compute.

$$ \large \Vert \vec{A} \Vert_2 = \underset{\Vert \vec{x} \Vert_2 = 1}{\max} \Vert \vec{Ax} \Vert_2 $$

After some tedious differentiation of the underlying maximization problem, it can be shown that the matrix $2$-norm is equal to (a sketch of the argument follows the formula):

$$ \large \Vert \vec{A} \Vert_2 = \underset{1 \leq i \leq n}{\max} \sqrt{\lambda_i (\vec{A}^T \vec{A})} $$
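
Briefly: maximizing $\Vert \vec{Ax} \Vert_2^2 = \vec{x}^T \vec{A}^T \vec{A} \vec{x}$ subject to $\vec{x}^T \vec{x} = 1$ with a Lagrange multiplier gives the stationarity condition

$$ \large \vec{A}^T \vec{A} \vec{x} = \lambda \vec{x} $$

so the candidates for $\vec{x}$ are the eigenvectors of $\vec{A}^T \vec{A}$, and the maximum of $\vec{x}^T \vec{A}^T \vec{A} \vec{x}$ over them is the largest eigenvalue.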

Thus, the $2$-norm of a matrix is the square root of the largest eigenvalue of the matrix product $\vec{A}^T \vec{A}$. Fortunately, numpy's eigvals function allows us to easily calculate the eigenvalues and find the $2$-norm.

In [12]:
np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
Out[12]:
14.398483819270972

We can confirm our result by comparing it to the output of numpy's norm function.

In [13]:
np.linalg.norm(a, 2)
Out[13]:
14.398483819270972
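
Equivalently, the $2$-norm of a matrix is its largest singular value, since the singular values of $\vec{A}$ are the square roots of the eigenvalues of $\vec{A}^T \vec{A}$. We can therefore also compute it with numpy's svd function:

In [ ]:
# The largest singular value of a equals its 2-norm
np.max(np.linalg.svd(a, compute_uv=False))

This agrees with the value above, up to floating point.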

Inequalities and Some Matrix Properties

Now that the most commonly occurring matrix norms have been introduced, we turn to some interesting inequalities that arise in the analysis of matrix norms. The first is a special case of Hölder's inequality and states:

$$ \large \Vert \vec{A} \Vert_2 \leq \sqrt{\Vert \vec{A} \Vert_1 \Vert \vec{A} \Vert_{\infty}} $$

We can demonstrate this inequality with a straightforward Python function. The following function creates a randomly sized matrix, between $3 \times 3$ and $5 \times 5$ (numpy's randint excludes the upper bound), with random elements between $-10$ and $9$. The $2$-norm, $1$-norm, and $\infty$-norm are then computed and compared. The function is run $100,000$ times with the results appended to a list. Using Python's any function, we then verify that none of the appended results is False, which is what the inequality requires.

In [14]:
def holder_inequality():
    mat_size = np.random.randint(3, 6, 2)
    a = np.random.randint(-10, 10, size=mat_size)
    
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    norm1 = np.max(np.sum(np.abs(a), axis=0))
    inf_norm = np.max(np.sum(np.abs(a), axis=1))
    
    return norm2 <= np.sqrt(norm1 * inf_norm)

res = []
for i in range(0, 100000):
    res.append(holder_inequality())
    
any(x == False for x in res)
Out[14]:
False

Another inequality relates the Frobenius norm and the $2$-norm:

$$ \large \Vert \vec{A} \Vert_2 \leq \Vert \vec{A} \Vert_F \leq \sqrt{n} \Vert \vec{A} \Vert_2 $$

The following function demonstrates the inequality by creating a random matrix as before and then computing both sides, where $n$ is the number of columns of $\vec{A}$. The function is again run $100,000$ times with the results appended to a list. All of the appended results should be True, which we verify with the any function.

In [15]:
def frobenius_2_norm_inequality():
    mat_size = np.random.randint(3, 6, 2)
    a = np.random.randint(-10, 10, size=mat_size)
    
    n = a.shape[1]
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    frob = np.sqrt(np.sum(np.abs(a) ** 2))
    
    return norm2 <= frob <= np.sqrt(n) * norm2

res = []
for i in range(0, 100000):
    res.append(frobenius_2_norm_inequality())
    
any(x == False for x in res)
Out[15]:
False
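
The lower bound in this inequality is attained by rank-one matrices: a rank-one matrix has a single nonzero singular value, so its Frobenius and $2$-norms coincide. A minimal check using a random outer product:

In [ ]:
# Rank-one matrix built as an outer product; its Frobenius and
# 2-norms should coincide (both equal the single nonzero singular value)
u = np.random.randint(-10, 10, 4)
v = np.random.randint(-10, 10, 3)
r1 = np.outer(u, v)
np.isclose(np.linalg.norm(r1, 'fro'), np.linalg.norm(r1, 2))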

The following are some more inequalities that arise in the study of matrix norms. Each inequality is demonstrated similarly to the above.

$$ \large \underset{i,j}{\max} \vert a_{ij} \vert \leq \Vert \vec{A} \Vert_2 \leq \sqrt{mn} \space \underset{i,j}{\max} \vert a_{ij} \vert $$

In [16]:
def inequality2():
    mat_size = np.random.randint(3, 6, 2)
    a = np.random.randint(-10, 10, size=mat_size)
    maximum = np.max(np.abs(a))
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    right = np.sqrt(a.shape[0] * a.shape[1]) * maximum
    
    return maximum <= norm2 <= right

res = []
for i in range(0, 100000):
    res.append(inequality2())
    
any(x == False for x in res)
Out[16]:
False

$$ \large \frac{1}{\sqrt{n}} \Vert \vec{A} \Vert_\infty \leq \Vert \vec{A} \Vert_2 \leq \sqrt{m} \space \Vert \vec{A} \Vert_\infty $$

In [17]:
def inequality3():
    mat_size = np.random.randint(3, 6, 2)
    a = np.random.randint(-10, 10, size=mat_size)
    
    inf_norm = np.max(np.sum(np.abs(a), axis=1))
    
    left = 1 / np.sqrt(a.shape[1]) * inf_norm
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    right = np.sqrt(a.shape[0]) * inf_norm

    return left <= norm2 <= right

res = []
for i in range(0, 100000):
    res.append(inequality3())
    
any(x == False for x in res)
Out[17]:
False

$$ \large \frac{1}{\sqrt{m}} \Vert \vec{A} \Vert_1 \leq \Vert \vec{A} \Vert_2 \leq \sqrt{n} \space \Vert \vec{A} \Vert_1 $$

In [18]:
def inequality4():
    mat_size = np.random.randint(3, 6, 2)
    a = np.random.randint(-10, 10, size=mat_size)
    
    norm1 = np.max(np.sum(np.abs(a), axis=0))
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    right = np.sqrt(a.shape[1]) * norm1
    
    return 1 / np.sqrt(a.shape[0]) * norm1 <= norm2 <= right

res = []
for i in range(0, 100000):
    res.append(inequality4())
    
any(x == False for x in res)
Out[18]:
False
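
Finally, note that every norm discussed here is available directly from numpy's norm function through the ord parameter, which is a convenient way to double-check hand-rolled implementations:

In [ ]:
# 'fro', 1, 2 and np.inf select the Frobenius, 1-, 2- and
# infinity-norms of the matrix created at the start
for order in ['fro', 1, 2, np.inf]:
    print(order, np.linalg.norm(a, order))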
