Matrix norms extend vector norms to matrices and define a measure of distance on the space of matrices. More specifically, a matrix norm is a function $f: \mathbb{R}^{m \times n} \rightarrow \mathbb{R}$. The double-bar notation used to denote vector norms is also used for matrix norms, and the defining properties parallel those of a vector norm; a numerical spot-check of these properties follows the list.
- $\Vert \vec{A} \Vert \geq 0$ for any matrix $\vec{A} \in \mathbb{R}^{m \times n}$
- $\Vert \vec{A} \Vert = 0$ if and only if $\vec{A} = 0$
- $\Vert \alpha \vec{A} \Vert = \vert \alpha \vert \Vert \vec{A} \Vert$ for an $m \times n$ matrix $\vec{A}$ and scalar $\alpha$
- $\Vert \vec{A} + \vec{B} \Vert \leq \Vert \vec{A} \Vert + \Vert \vec{B} \Vert$ for $m \times n$ matrices $\vec{A}$ and $\vec{B}$
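Here is a minimal numerical spot-check of the four properties, using `numpy`'s default matrix norm (the Frobenius norm, introduced below); the names `A`, `B`, and `alpha` are purely illustrative.

import numpy as np

A = np.random.randint(-10, 11, size=(3, 2))
B = np.random.randint(-10, 11, size=(3, 2))
alpha = -3.5

print(np.linalg.norm(A) >= 0)                 # non-negativity
print(np.linalg.norm(np.zeros((3, 2))) == 0)  # the zero matrix has norm 0
print(np.isclose(np.linalg.norm(alpha * A),   # absolute homogeneity
                 np.abs(alpha) * np.linalg.norm(A)))
print(np.linalg.norm(A + B)                   # triangle inequality
      <= np.linalg.norm(A) + np.linalg.norm(B))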
The most commonly occurring matrix norms in matrix analysis are the Frobenius, $L_1$, $L_2$, and $L_\infty$ norms. The following sections investigate these norms, along with Python implementations of their calculation. Before getting started, we create a $3 \times 2$ matrix with random integer elements between $-10$ and $10$ that will be used in the examples.
import numpy as np

# Random 3x2 integer matrix with entries in [-10, 10]
# (randint's upper bound is exclusive, hence 11).
a = np.random.randint(-10, 11, size=(3, 2))
a
The number of matrix rows $m$ and columns $n$ will be required throughout to calculate the matrix norms. To get these values, we take the shape of the created matrix.

m, n = a.shape
print(m, n)
Frobenius Norm
The Frobenius norm, also known as the Hilbert-Schmidt norm or the Schur norm, is one of the most commonly employed matrix norms in linear algebra. The Frobenius norm is defined as:
$$ \large \Vert \vec{A} \Vert_F = \sqrt{\sum_{i=1}^m \sum_{j=1}^n \vert a_{ij} \vert^2} $$

A demonstrative implementation of the Frobenius norm in Python is as follows.
f = 0
for i in np.arange(m):
    for j in np.arange(n):
        f = f + np.abs(a[i, j]) ** 2
np.sqrt(f)
As with vector norms, we can take advantage of `numpy`'s vectorization support to calculate the Frobenius norm with a one-liner.
np.sqrt(np.sum(np.abs(a) ** 2))
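As a brief aside (an identity not used in the original walkthrough), the Frobenius norm can also be computed as $\sqrt{\mathrm{tr}(\vec{A}^T \vec{A})}$, which connects it to the $2$-norm discussion later on.

# Frobenius norm via the trace identity ||A||_F = sqrt(tr(A^T A)).
np.sqrt(np.trace(a.T @ a))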
We can confirm our result by comparing it to the output of `numpy`'s `norm` function. The `ord` parameter is specified as `'fro'` to output the Frobenius norm, though this is also the default behavior when a matrix is passed to `norm`.
np.linalg.norm(a, 'fro')
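Because `'fro'` is the default for a two-dimensional input, omitting the `ord` argument returns the same value.

# Defaults to the Frobenius norm for matrices.
np.linalg.norm(a)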
$1$-Norm
The matrix 1-norm is defined as the maximum absolute column sum of a matrix.
$$ \large \Vert \vec{A} \Vert_1 = \underset{\Vert \vec{x} \Vert_1 = 1}{\max} \Vert \vec{Ax} \Vert_1 = \underset{1 \leq j \leq n}{\max}\left( \sum^m_{i=1} \vert a_{ij} \vert \right) $$

Therefore, the $1$-norm of a matrix is the largest of its absolute column sums.
colsums = []
for j in np.arange(n):
    v = np.sum(np.abs(a[:, j]))
    colsums.append(v)
np.max(colsums)
By setting the `axis` parameter to `0`, `numpy`'s `sum` function will sum column-wise. This gives us a faster, one-line implementation of the matrix $1$-norm.
np.max(np.sum(np.abs(a), axis=0))
We confirm our result with the `norm` function in `numpy`'s `linalg` module.
np.linalg.norm(a, 1)
$\infty$-Norm
The $\infty$-norm of a matrix is defined as the maximum absolute row sum.
$$ \large \Vert \vec{A} \Vert_\infty = \underset{\Vert \vec{x} \Vert_{\infty} = 1}{\max} \Vert \vec{Ax} \Vert_{\infty} = \underset{1 \leq i \leq m}{\max} \left(\sum^n_{j=1} \vert a_{ij} \vert \right) $$

As before, we first compute the norm with an explicit loop, this time over the rows.

rowsums = []
for i in np.arange(m):
    v = np.sum(np.abs(a[i, :]))
    rowsums.append(v)
np.max(rowsums)
Setting `axis` to `1` sums row-wise, giving the one-line equivalent.

np.max(np.sum(np.abs(a), axis=1))
We again confirm the result with `numpy`'s `norm` function, this time with `ord` set to `np.inf`.

np.linalg.norm(a, np.inf)
$2$-Norm
Unlike the other norms shown above, the matrix $2$-norm is more difficult to characterize.
$$ \large \Vert \vec{A} \Vert_2 = \underset{\Vert \vec{x} \Vert_2 = 1}{\max} \Vert \vec{Ax} \Vert_2 $$

After some tedious differentiation of a maximization function, it can be shown that the matrix $2$-norm is:

$$ \large \Vert \vec{A} \Vert_2 = \underset{1 \leq i \leq n}{\max} \sqrt{\lambda_i (\vec{A}^T \vec{A})} $$

Thus, the $2$-norm of a matrix is the square root of the largest eigenvalue of $\vec{A}^T \vec{A}$, which is also the largest singular value of $\vec{A}$. Fortunately, `numpy`'s `eigvals` function allows us to easily calculate the eigenvalues of $\vec{A}^T \vec{A}$ and find the $2$-norm.
# The largest eigenvalue of A^T A is the squared 2-norm.
np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
We can confirm our result by comparing it to the output of `numpy`'s `norm` function.
np.linalg.norm(a, 2)
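Since the eigenvalues of $\vec{A}^T \vec{A}$ are the squares of the singular values of $\vec{A}$, the $2$-norm can equivalently be read off the singular value decomposition; the one-liner below is a cross-check added here, not part of the original derivation.

# The 2-norm is the largest singular value of a.
np.max(np.linalg.svd(a, compute_uv=False))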
Inequalities and Some Matrix Properties
Now that the most commonly occurring matrix norms have been introduced, we turn to some interesting inequalities that arise in the analysis of matrix norms. The first is a special case of Hölder's inequality:

$$ \large \Vert \vec{A} \Vert_2 \leq \sqrt{\Vert \vec{A} \Vert_1 \Vert \vec{A} \Vert_{\infty}} $$

We can demonstrate this inequality with a straightforward Python function. The following function creates a randomly sized matrix, no bigger than $6 \times 6$ and no smaller than $3 \times 3$, with random integer elements between $-10$ and $10$. The $2$-norm, $1$-norm, and $\infty$-norm are then computed and compared. The function is run $100,000$ times with the results appended to a list. Using Python's `any` function, we then verify that none of the appended results are `False`; if the inequality holds throughout, the call returns `False`.
def holder_inequality():
    # Random dimensions between 3 and 6 (randint's upper bound is exclusive).
    mat_size = np.random.randint(3, 7, 2)
    a = np.random.randint(-10, 11, size=mat_size)
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    norm1 = np.max(np.sum(np.abs(a), axis=0))
    inf_norm = np.max(np.sum(np.abs(a), axis=1))
    return norm2 <= np.sqrt(norm1 * inf_norm)
res = []
for i in range(100000):
    res.append(holder_inequality())
any(x == False for x in res)
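The same spot-check can be written more compactly by delegating all three norms to `np.linalg.norm`; the sketch below assumes the same random matrix sizes as above, and `all` returning `True` plays the role of `any(...)` returning `False`.

def holder_inequality_np():
    # Same check as above, with the norms delegated to np.linalg.norm.
    a = np.random.randint(-10, 11, size=np.random.randint(3, 7, 2))
    return np.linalg.norm(a, 2) <= np.sqrt(
        np.linalg.norm(a, 1) * np.linalg.norm(a, np.inf))

all(holder_inequality_np() for _ in range(100000))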
Another inequality, relating the Frobenius norm and the $2$-norm of an $m \times n$ matrix, is:

$$ \large \Vert \vec{A} \Vert_2 \leq \Vert \vec{A} \Vert_F \leq \sqrt{n} \Vert \vec{A} \Vert_2 $$

The following function demonstrates the inequality by creating a random matrix similar to before and then computing both bounds. The function is again run $100,000$ times with the results appended to a list. All of the appended results should be `True`, which we verify with the `any` function as before.
def frobenius_2_norm_inequality():
    mat_size = np.random.randint(3, 7, 2)
    a = np.random.randint(-10, 11, size=mat_size)
    n = a.shape[1]  # number of columns
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    frob = np.sqrt(np.sum(np.abs(a) ** 2))
    return norm2 <= frob <= np.sqrt(n) * norm2
res = []
for i in range(100000):
    res.append(frobenius_2_norm_inequality())
any(x == False for x in res)
The following are some more inequalities that arise in the study of matrix norms. Each inequality is demonstrated similarly to the above.
$$ \large \underset{i,j}{\max} \vert a_{ij} \vert \leq \Vert \vec{A} \Vert_2 \leq \sqrt{mn} \space \underset{i,j}{\max} \vert a_{ij} \vert $$

def inequality2():
    mat_size = np.random.randint(3, 7, 2)
    a = np.random.randint(-10, 11, size=mat_size)
    maximum = np.max(np.abs(a))
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    right = np.sqrt(a.shape[0] * a.shape[1]) * maximum
    return maximum <= norm2 <= right
res = []
for i in range(100000):
    res.append(inequality2())
any(x == False for x in res)
The next function checks the inequality relating the $\infty$-norm and the $2$-norm of an $m \times n$ matrix:

$$ \large \frac{1}{\sqrt{n}} \Vert \vec{A} \Vert_\infty \leq \Vert \vec{A} \Vert_2 \leq \sqrt{m} \Vert \vec{A} \Vert_\infty $$

def inequality3():
    mat_size = np.random.randint(3, 7, 2)
    # Nonnegative entries, so absolute values can be omitted in the row sums.
    a = np.random.randint(0, 10, size=mat_size)
    inf_norm = np.max(np.sum(a, axis=1))
    left = 1 / np.sqrt(a.shape[1]) * inf_norm
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    right = np.sqrt(a.shape[0]) * inf_norm
    return left <= norm2 <= right
res = []
for i in range(100000):
    res.append(inequality3())
any(x == False for x in res)
Similarly, for the $1$-norm:

$$ \large \frac{1}{\sqrt{m}} \Vert \vec{A} \Vert_1 \leq \Vert \vec{A} \Vert_2 \leq \sqrt{n} \Vert \vec{A} \Vert_1 $$

def inequality4():
    mat_size = np.random.randint(3, 7, 2)
    a = np.random.randint(0, 10, size=mat_size)
    # The 1-norm is the maximum absolute column sum, so sum over axis 0.
    norm1 = np.max(np.sum(a, axis=0))
    norm2 = np.sqrt(np.max(np.linalg.eigvals(a.T @ a)))
    right = np.sqrt(a.shape[1]) * norm1
    return 1 / np.sqrt(a.shape[0]) * norm1 <= norm2 <= right
res = []
for i in range(100000):
    res.append(inequality4())
any(x == False for x in res)
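As a closing sanity check (an addition beyond the original walkthrough), the sketch below re-verifies each hand-rolled norm from this section against `np.linalg.norm` over a batch of random matrices; it should evaluate to `True`.

def norms_match(trials=1000):
    # Compare every hand-rolled norm above against np.linalg.norm.
    for _ in range(trials):
        a = np.random.randint(-10, 11, size=np.random.randint(3, 7, 2))
        ok = (
            np.isclose(np.sqrt(np.sum(np.abs(a) ** 2)), np.linalg.norm(a, 'fro'))
            and np.isclose(np.max(np.sum(np.abs(a), axis=0)), np.linalg.norm(a, 1))
            and np.isclose(np.max(np.sum(np.abs(a), axis=1)), np.linalg.norm(a, np.inf))
            and np.isclose(np.sqrt(np.max(np.linalg.eigvals(a.T @ a)).real),
                           np.linalg.norm(a, 2))
        )
        if not ok:
            return False
    return True

norms_match()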