Similar to the distance between two real scalars on the real line, $|a - b|$, vector norms give us a sense of the distance or magnitude of a vector. In fact, the absolute value is itself a norm on the one-dimensional vector space $\mathbb{R}$. Norms are often used in regularization methods and other machine learning procedures, as well as in many matrix and vector operations in linear algebra.
Formally, given a vector space $\mathbb{R}^n$, a vector norm is defined as a function $f: \mathbb{R}^n \rightarrow \mathbb{R}$. Norms are written with double-bar notation; for example, the norm of a vector $\vec{x}$ is denoted $\Vert \vec{x} \Vert$. Vector norms have the following properties (a quick numerical check of these properties follows the list):
- $\Vert \vec{x} \Vert \geq 0$ for every vector $\vec{x} \in \mathbb{R}^n$
- $\Vert \vec{x} \Vert = 0$ if and only if $\vec{x} = 0$
- $\Vert \alpha \vec{x} \Vert = \vert \alpha \vert \Vert \vec{x} \Vert$ for a vector $\vec{x} \in \mathbb{R}^n$ and a scalar $\alpha \in \mathbb{R}$
- $\Vert \vec{x} + \vec{y} \Vert \leq \Vert \vec{x} \Vert + \Vert \vec{y} \Vert$ for vectors $\vec{x},\vec{y} \in \mathbb{R}^n$
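Before moving on, here is a quick numerical sanity check of these properties, a minimal sketch of our own using numpy's norm function (introduced formally later in this post); the vectors and scalar below are arbitrary choices:
import numpy as np
a = np.array([3., -1., 4.])
b = np.array([-2., 5., 1.])
alpha = -2.5
print(np.linalg.norm(a) > 0)  # positivity: a is nonzero, so its norm is positive
print(np.linalg.norm(np.zeros(3)) == 0)  # the zero vector has norm 0
print(np.isclose(np.linalg.norm(alpha * a), np.abs(alpha) * np.linalg.norm(a)))  # homogeneity
print(np.linalg.norm(a + b) <= np.linalg.norm(a) + np.linalg.norm(b))  # triangle inequality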
The most commonly occurring vector norms are the $1$, $2$, and $\infty$ norms, which are a part of the $p$-norm class of vector norms.
Vector Norms
The following explores the most frequently occurring vector norms, the $1$, $2$, and $\infty$ norms, along with some Python implementations. We first create an array using numpy's randint function that will be used as our example vector.
import numpy as np

# Example vector: 10 random integers drawn from [-10, 10)
x = np.random.randint(-10, 10, 10)
x
1-Norm
The $1$-norm of a vector $\vec{x}$ is defined as:
$$ \large \Vert \vec{x} \Vert_1 = \vert x_1 \vert + \vert x_2 \vert + \cdots + \vert x_n \vert $$
The $1$-norm is often denoted $\ell_1$ or $L^1$ and is also referred to as the taxicab norm. The following is an implementation of the $1$-norm calculation.
norm1 = 0
for i in x:
    norm1 += np.abs(i)
print(norm1)
Rather than writing a (comparatively slow) loop, we can take advantage of numpy's vectorization to calculate the $1$-norm as a one-liner.
np.sum(np.abs(x))
We can confirm our result with numpy's norm function.
np.linalg.norm(x, 1)
2-Norm
The $2$-norm of a vector $\vec{x}$ is defined as:
$$ \large \Vert \vec{x} \Vert _2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \sqrt{\vec{x}^T \vec{x}} $$
The $2$-norm is often denoted $\ell_2$ or $L^2$ and is also referred to as the Euclidean norm, since it measures Euclidean length. An implementation of the $2$-norm calculation is quite similar to the $1$-norm computation.
norm2 = 0
for i in x:
    norm2 += i ** 2
print(np.sqrt(norm2))
The $2$-norm can also be computed as a one-liner.
np.sqrt(np.sum(np.power(x, 2)))
Given the definition of the $2$-norm above, we know the above computation is also equal to the square root of the inner product of the vector with itself (for complex vectors, with its conjugate transpose). numpy provides an inner function for computing the inner product.
np.sqrt(np.inner(x, x))
As before, we can verify our calculation of the $2$-norm by using numpy's norm function.
np.linalg.norm(x, 2)
P-Norm
The $p$-norm, a class of vector norms, is defined as:
$$ \large \Vert \vec{x} \Vert_p = \sqrt[p]{\vert x_1 \vert^p + \vert x_2 \vert^p + \cdots + \vert x_n \vert^p} \qquad p \geq 1 $$
By setting $p$ equal to $1$ or $2$, we can find the $1$- and $2$-norms of a vector without the need for separate equations and functions. Below we calculate the $2$-norm of a vector using the $p$-norm equation.
pnorm = 0
p = 2
for i in x:
    pnorm += np.abs(i) ** p
pnorm ** (1. / p)
A simpler, one-line implementation of the $p$-norm class of vector norms.
np.sum(np.abs(x) ** p) ** (1. / p)
The $p$-norm is not limited to $p = 1$ or $p = 2$; however, $p$ must be a real number greater than or equal to $1$. Here, we compute the vector $3$-norm and compare the result with numpy's norm function.
p = 3
print(np.sum(np.abs(x) ** p) ** (1. / p))
print(np.linalg.norm(x, 3))
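The requirement that $p \geq 1$ comes from the triangle inequality: for $p < 1$, the $p$-norm formula fails to satisfy it and is therefore not a valid norm. A quick counterexample of our own, with $p = 0.5$:
p = 0.5
u = np.array([1., 0.])
v = np.array([0., 1.])
print(np.sum(np.abs(u + v) ** p) ** (1. / p))  # 4.0
print(np.sum(np.abs(u) ** p) ** (1. / p) + np.sum(np.abs(v) ** p) ** (1. / p))  # 2.0
# 4.0 > 2.0, so the triangle inequality fails for p < 1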
Inf-Norm
The $\infty$-norm of a vector $\vec{x}$ is defined as:
$$ \large \Vert \vec{x} \Vert_{\infty} = \underset{1 \leq i \leq n}{\max} \vert x_i \vert $$
The $\infty$-norm of a vector is thus the maximum absolute value of its elements. Below is one possible implementation of finding the $\infty$-norm of a vector.
x_abs = np.abs(x)
infnorm = x_abs[0]
for i in x_abs[1:]:
    if infnorm < i:
        infnorm = i
infnorm
A much easier and more straightforward approach is to use numpy's max and abs functions.
np.max(np.abs(x))
As before, we can compare our results with numpy's norm function by setting the function's ord parameter ($p$) to numpy's inf constant.
np.linalg.norm(x, np.inf)
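The name $\infty$-norm reflects the fact that the $\infty$-norm is the limit of the $p$-norm as $p \rightarrow \infty$. We can sketch this numerically with the example vector x from above (an illustration of our own, not part of the original text):
for p in (1, 2, 10, 100):
    print(p, np.linalg.norm(x, p))
print(np.max(np.abs(x)))  # the p-norms approach this value as p grows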
Lastly, we take a look at the $-\infty$-norm, which can be thought of as the opposite of the $\infty$-norm. Strictly speaking, it is not a true norm, since it violates the properties listed earlier, but the notation is convenient.
-Inf-Norm
The $-\infty$-norm of a vector $\vec{x}$ is defined as:
$$ \large \Vert \vec{x} \Vert_{-\infty} = \underset{1 \leq i \leq n}{\min} \vert x_i \vert $$
The $-\infty$-norm of a vector is the minimum absolute value of its elements. The following example implementation is quite similar to the $\infty$-norm example, except here we are looking for the smallest absolute value rather than the largest.
neg_infnorm = x_abs[0]
for i in x_abs[1:]:
    if neg_infnorm > i:
        neg_infnorm = i
neg_infnorm
A much simpler implementation of finding the $-\infty$-norm of a vector uses numpy's min and abs functions.
np.min(np.abs(x))
Again, setting the ord parameter to numpy's negative inf constant, we can confirm our calculation of the $-\infty$-norm above.
np.linalg.norm(x, -np.inf)
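Analogously to the $\infty$-norm, the $-\infty$-norm can be viewed as the limit of the $p$-norm formula as $p \rightarrow -\infty$. A small numerical sketch of our own, using a zero-free vector to avoid division-by-zero warnings:
z = np.array([3., -7., 2., 9.])
for p in (-1, -10, -100):
    print(p, np.linalg.norm(z, p))
print(np.min(np.abs(z)))  # the p-norms approach this value as p decreases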
Inequalities
Now that the most common and important vector norms have been introduced, we turn our attention towards some interesting inequalities that arise from the study of vector norms.
Cauchy-Schwarz Inequality
In linear algebra, the Cauchy-Schwarz inequality, sometimes referred to as the Cauchy–Bunyakovsky–Schwarz inequality, states that for any two vectors $\vec{x}$ and $\vec{y}$, the following is true:
$$ \large \vert \langle \vec{x}, \vec{y} \rangle \vert^2 \leq \langle \vec{x}, \vec{x} \rangle \cdot \langle \vec{y}, \vec{y} \rangle $$
The inequality can be rewritten using vector norms as:
$$ \large \vert \vec{x}^T \vec{y} \vert \leq \Vert \vec{x} \Vert_2 \Vert \vec{y} \Vert_2 $$
This states that for any two vectors $\vec{x}$ and $\vec{y}$, the absolute value of their inner product is always less than or equal to the product of their $2$-norms.
The proof of the Cauchy-Schwarz inequality is beyond the scope of this post; however, we can write a simple function to demonstrate it. The following function creates two vectors of length $10$ with random integer elements between $1$ and $100$ using numpy's randint function. The inner product is computed with numpy's inner function and then compared with the right side of the inequality. The function returns True if the inequality holds and False otherwise; thus, it should always return True.
def cauchy_schwartz_inequality():
    x = np.random.randint(1, 100, 10)
    y = np.random.randint(1, 100, 10)
    inner_prod = np.abs(np.inner(x, y))  # |x^T y|; abs matters for vectors with negative elements
    right = np.sqrt(np.sum(np.power(x, 2))) * np.sqrt(np.sum(np.power(y, 2)))
    return inner_prod <= right
The following tests the inequality by running the cauchy_schwartz_inequality() function created above $100,000$ times and appending the results to a list. Using Python's built-in any function, we can then see whether any of the appended results in the list is False.
res = []
for i in range(0, 100000):
    res.append(cauchy_schwartz_inequality())
any(x == False for x in res)
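A related fact worth a quick check (an addition of ours, not in the original demonstration): the Cauchy-Schwarz inequality holds with equality exactly when one vector is a scalar multiple of the other.
x2 = np.array([1, 2, 3])
y2 = 2 * x2  # y2 is a scalar multiple of x2, so equality should hold
print(np.isclose(np.abs(np.inner(x2, y2)), np.linalg.norm(x2, 2) * np.linalg.norm(y2, 2)))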
Hölder's Inequality
It can be shown that the Cauchy-Schwarz inequality is a special case of Hölder's inequality, which in the linear algebra setting can be stated as:
$$ \large \vert \vec{x}^T \vec{y} \vert \leq \Vert \vec{x} \Vert_p \Vert \vec{y} \Vert_q, \qquad \frac{1}{p} + \frac{1}{q} = 1 $$
The variables $p$ and $q$ are known as Hölder conjugates. The inequality states that the absolute value of the inner product of two vectors $\vec{x}$ and $\vec{y}$ is always less than or equal to the product of the $p$-norm of $\vec{x}$ and the $q$-norm of $\vec{y}$, where $\frac{1}{p} + \frac{1}{q} = 1$. As before, the proof of Hölder's inequality is beyond the scope of this post, but we can write a function to demonstrate it. The following function creates two vectors of length ten with random elements between $1$ and $100$ and a random integer $p$ between $2$ and $10$. The conjugate $q$ is then found, and the inner product and the right side of the inequality are computed. The function returns True if the inner product is less than or equal to the right side and False otherwise. We also return the calculation $1 / p + 1 / q$, rounded to guard against floating-point error, to ensure it is always equal to $1$.
def holder_inequality():
    x = np.random.randint(1, 100, 10)
    y = np.random.randint(1, 100, 10)
    p = np.random.randint(2, 10)
    q = p / (p - 1)  # the Hölder conjugate of p, so that 1/p + 1/q = 1
    inner_prod = np.abs(np.inner(x, y))
    right = np.sum(np.abs(x) ** p) ** (1. / p) * np.sum(np.abs(y) ** q) ** (1. / q)
    # round() rather than int(): 1/p + 1/q may evaluate to 0.999... in floating
    # point, which int() would truncate to 0
    return inner_prod <= right, round(1 / p + 1 / q)
We then test the inequality by running the function $100,000$ times and appending the results to a list. Given the inequality's definition, the result should always be True, and the $1/p + 1/q$ calculation should always equal $1$.
res = []
for i in range(0, 100000):
    res.append(holder_inequality())
print('False values exist in result:', any(x[0] == False for x in res))
print('All conjugate calculations are equal to 1:', all(x[1] == 1 for x in res))
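As noted above, the Cauchy-Schwarz inequality is the special case $p = q = 2$ of Hölder's inequality, since $\frac{1}{2} + \frac{1}{2} = 1$. A quick check of our own using numpy's norm function:
xh = np.random.randint(1, 100, 10)
yh = np.random.randint(1, 100, 10)
print(np.abs(np.inner(xh, yh)) <= np.linalg.norm(xh, 2) * np.linalg.norm(yh, 2))  # Cauchy-Schwarz recovered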
Minkowski's Inequality
The final inequality we shall investigate is Minkowski's inequality for norms, also known as the triangle inequality. For two vectors $\vec{x}$ and $\vec{y}$ of length $n$ and $p \geq 1$, Minkowski's inequality can be written as:
$$ \large \left( \sum^n_{i=1} \vert x_i + y_i \vert^p \right)^{\frac{1}{p}} \leq \left(\sum^n_{i=1} \vert x_i \vert^p \right)^{\frac{1}{p}} + \left(\sum^n_{i=1} \vert y_i \vert^p \right)^{\frac{1}{p}} $$
This can be written more compactly using vector norms as:
$$ \large \Vert \vec{x} + \vec{y} \Vert_p \leq \Vert \vec{x} \Vert_p + \Vert \vec{y} \Vert_p $$
Similar to the other inequalities we examined previously, we can demonstrate Minkowski's inequality by writing a fairly straightforward function. The function minkowski_inequality() creates two vectors of length ten with random integer elements between $1$ and $100$ and a random integer $p$ between $1$ and $5$. The left and right sides of the inequality are then computed, and the function returns True if the left side is less than or equal to the right side and False otherwise.
def minkowski_inequality():
    x = np.random.randint(1, 100, 10)
    y = np.random.randint(1, 100, 10)
    p = np.random.randint(1, 5)
    left = np.sum(np.abs(x + y) ** p) ** (1. / p)
    right = np.sum(np.abs(x) ** p) ** (1. / p) + np.sum(np.abs(y) ** p) ** (1. / p)
    return left <= right
Given the definition of Minkowski's inequality, the function should always return True. We test this by running the function $100,000$ times and appending the results to a list. Then, with Python's any function, we check that the list of results does not contain a False value.
res = []
for i in range(0, 100000):
    res.append(minkowski_inequality())
any(x == False for x in res)
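Since Minkowski's inequality is simply the triangle inequality for $p$-norms, it can also be verified directly with numpy's norm function; here is a compact restatement of the check above with an arbitrary choice of $p$:
xm = np.random.randint(1, 100, 10)
ym = np.random.randint(1, 100, 10)
p = 3  # any p >= 1 works
print(np.linalg.norm(xm + ym, p) <= np.linalg.norm(xm, p) + np.linalg.norm(ym, p))  # True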