Similar to the distance between two real scalars on the real line, $|a - b|$, vector norms give us a sense of the distance or magnitude of a vector. In fact, the absolute value is itself a norm on the one-dimensional vector space $\mathbb{R}$. Norms are often used in regularization methods and other machine learning procedures, as well as in many matrix and vector operations in linear algebra.
Formally, given a vector space $\mathbb{R}^n$, a vector norm is defined as a function $f: \mathbb{R}^n \rightarrow \mathbb{R}$. Norms are written with double-bar notation; for example, the norm of a vector $\vec{x}$ is denoted $\Vert \vec{x} \Vert$. Vector norms have the following properties (a quick numerical check of these properties follows the list):
- $\Vert \vec{x} \Vert \geq 0$ for every vector $\vec{x} \in \mathbb{R}^n$
- $\Vert \vec{x} \Vert = 0$ if and only if $\vec{x} = 0$
- $\Vert \alpha \vec{x} \Vert = \vert \alpha \vert \Vert \vec{x} \Vert$ for a vector $\vec{x} \in \mathbb{R}^n$ and a scalar $\alpha \in \mathbb{R}$
- $\Vert \vec{x} + \vec{y} \Vert \leq \Vert \vec{x} \Vert + \Vert \vec{y} \Vert$ for vectors $\vec{x},\vec{y} \in \mathbb{R}^n$
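Before moving on, here is a quick numerical sanity check of these properties, a minimal sketch of our own using numpy's norm function (introduced formally later in this post); the vectors and scalar below are arbitrary choices:
import numpy as np
a = np.array([3., -1., 4.])
b = np.array([-2., 5., 1.])
alpha = -2.5
print(np.linalg.norm(a) > 0)  # positivity: a is nonzero, so its norm is positive
print(np.linalg.norm(np.zeros(3)) == 0)  # the zero vector has norm 0
print(np.isclose(np.linalg.norm(alpha * a), np.abs(alpha) * np.linalg.norm(a)))  # homogeneity
print(np.linalg.norm(a + b) <= np.linalg.norm(a) + np.linalg.norm(b))  # triangle inequality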
The most commonly occurring vector norms are the $1$, $2$, and $\infty$ norms, which are a part of the $p$-norm class of vector norms.
Vector Norms
The following explores the most frequently occurring vector norms, the $1$, $2$, and $\infty$ norms, along with some Python implementations. We first create an array using numpy's randint function that will be used as our example vector.
import numpy as np

# Example vector: 10 random integers drawn from [-10, 10)
x = np.random.randint(-10, 10, 10)
x
1-Norm
The $1$-norm of a vector $\vec{x}$ is defined as:
$$ \large \Vert \vec{x} \Vert_1 = \vert x_1 \vert + \vert x_2 \vert + \cdots + \vert x_n \vert $$
The $1$-norm is often denoted $\ell_1$ or $L^1$ and is also referred to as the taxicab norm. The following is an implementation of the $1$-norm calculation.
norm1 = 0
for i in x:
    norm1 += np.abs(i)
print(norm1)
Rather than writing a (comparatively slow) loop, we can take advantage of numpy's vectorization to calculate the $1$-norm as a one-liner.
np.sum(np.abs(x))
We can confirm our result with numpy's norm function.
np.linalg.norm(x, 1)
2-Norm
The $2$-norm of a vector $\vec{x}$ is defined as:
$$ \large \Vert \vec{x} \Vert _2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \sqrt{\vec{x}^T \vec{x}} $$
The $2$-norm is often denoted $\ell_2$ or $L^2$ and is also referred to as the Euclidean norm, since it measures Euclidean length. An implementation of the $2$-norm calculation is quite similar to the $1$-norm computation.
norm2 = 0
for i in x:
    norm2 += i ** 2
print(np.sqrt(norm2))
The $2$-norm can also be computed as a one-liner.
np.sqrt(np.sum(np.power(x, 2)))
Given the definition of the $2$-norm above, we know the above computation is also equal to the square root of the inner product of the vector with itself (for complex vectors, with its conjugate transpose). numpy provides an inner function for computing the inner product.
np.sqrt(np.inner(x, x))
As before, we can verify our calculation of the $2$-norm by using numpy's norm function.
np.linalg.norm(x, 2)
P-Norm
The $p$-norm, a class of vector norms, is defined as:
$$ \large \Vert \vec{x} \Vert_p = \sqrt[p]{\vert x_1 \vert^p + \vert x_2 \vert^p + \cdots + \vert x_n \vert^p} \qquad p \geq 1 $$
By setting $p$ equal to $1$ or $2$, we can find the $1$- and $2$-norms of a vector without the need for separate equations and functions. Below we calculate the $2$-norm of a vector using the $p$-norm equation.
pnorm = 0
p = 2
for i in x:
    pnorm += np.abs(i) ** p
pnorm ** (1. / p)
A simpler, one-line implementation of the $p$-norm class of vector norms.
np.sum(np.abs(x) ** p) ** (1. / p)
The $p$-norm is not limited to $p = 1$ or $p = 2$; however, $p$ must be a real number greater than or equal to $1$. Here, we compute the vector $3$-norm and compare the result with numpy's norm function.
p = 3
print(np.sum(np.abs(x) ** p) ** (1. / p))
print(np.linalg.norm(x, 3))
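The requirement that $p \geq 1$ comes from the triangle inequality: for $p < 1$, the $p$-norm formula fails to satisfy it and is therefore not a valid norm. A quick counterexample of our own, with $p = 0.5$:
p = 0.5
u = np.array([1., 0.])
v = np.array([0., 1.])
print(np.sum(np.abs(u + v) ** p) ** (1. / p))  # 4.0
print(np.sum(np.abs(u) ** p) ** (1. / p) + np.sum(np.abs(v) ** p) ** (1. / p))  # 2.0
# 4.0 > 2.0, so the triangle inequality fails for p < 1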
Inf-Norm
The $\infty$-norm of a vector $\vec{x}$ is defined as:
$$ \large \Vert \vec{x} \Vert_{\infty} = \underset{1 \leq i \leq n}{\max} \vert x_i \vert $$
The $\infty$-norm of a vector is thus the maximum absolute value of its elements. Below is one possible implementation of finding the $\infty$-norm of a vector.
x_abs = np.abs(x)
infnorm = x_abs[0]
for i in x_abs[1:]:
    if infnorm < i:
        infnorm = i
infnorm
A much easier and more straightforward approach is to use numpy's max and abs functions.
np.max(np.abs(x))
As before, we can compare our results with numpy's norm function by setting the function's ord parameter ($p$) to numpy's inf constant.
np.linalg.norm(x, np.inf)
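The name $\infty$-norm reflects the fact that the $\infty$-norm is the limit of the $p$-norm as $p \rightarrow \infty$. We can sketch this numerically with the example vector x from above (an illustration of our own, not part of the original text):
for p in (1, 2, 10, 100):
    print(p, np.linalg.norm(x, p))
print(np.max(np.abs(x)))  # the p-norms approach this value as p grows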
Lastly, we take a look at the $-\infty$-norm, which can be thought of as the opposite of the $\infty$-norm. Strictly speaking, it is not a true norm, since it violates the properties listed earlier, but the notation is convenient.
-Inf-Norm
The $-\infty$-norm of a vector $\vec{x}$ is defined as:
$$ \large \Vert \vec{x} \Vert_{-\infty} = \underset{1 \leq i \leq n}{\min} \vert x_i \vert $$
The $-\infty$-norm of a vector is the minimum absolute value of its elements. The following example implementation is quite similar to the $\infty$-norm example, except here we are looking for the smallest absolute value rather than the largest.
neg_infnorm = x_abs[0]
for i in x_abs[1:]:
    if neg_infnorm > i:
        neg_infnorm = i
neg_infnorm
A much simpler implementation of finding the $-\infty$-norm of a vector uses numpy's min and abs functions.
np.min(np.abs(x))
Again, setting the ord parameter to numpy's negative inf constant, we can confirm our calculation of the $-\infty$-norm above.
np.linalg.norm(x, -np.inf)
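Analogously to the $\infty$-norm, the $-\infty$-norm can be viewed as the limit of the $p$-norm formula as $p \rightarrow -\infty$. A small numerical sketch of our own, using a zero-free vector to avoid division-by-zero warnings:
z = np.array([3., -7., 2., 9.])
for p in (-1, -10, -100):
    print(p, np.linalg.norm(z, p))
print(np.min(np.abs(z)))  # the p-norms approach this value as p decreases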
Inequalities
Now that the most common and important vector norms have been introduced, we turn our attention towards some interesting inequalities that arise from the study of vector norms.
Cauchy-Schwarz Inequality
In linear algebra, the Cauchy-Schwarz inequality, sometimes referred to as the Cauchy–Bunyakovsky–Schwarz inequality, states that for any two vectors $\vec{x}$ and $\vec{y}$, the following is true:
$$ \large \vert \langle \vec{x}, \vec{y} \rangle \vert^2 \leq \langle \vec{x}, \vec{x} \rangle \cdot \langle \vec{y}, \vec{y} \rangle $$
The inequality can be rewritten using vector norms as:
$$ \large \vert \vec{x}^T \vec{y} \vert \leq \Vert \vec{x} \Vert_2 \Vert \vec{y} \Vert_2 $$
This states that for any two vectors $\vec{x}$ and $\vec{y}$, the absolute value of their inner product is always less than or equal to the product of their $2$-norms.
The proof of the Cauchy-Schwarz inequality is beyond the scope of this post; however, we can write a simple function to demonstrate it. The following function creates two vectors of length $10$ with random integer elements between $1$ and $100$ using numpy's randint function. The inner product is computed with numpy's inner function and then compared with the right side of the inequality. The function returns True if the inequality holds and False otherwise; thus, it should always return True.
def cauchy_schwartz_inequality():
    x = np.random.randint(1, 100, 10)
    y = np.random.randint(1, 100, 10)
    inner_prod = np.abs(np.inner(x, y))  # |x^T y|; abs matters for vectors with negative elements
    right = np.sqrt(np.sum(np.power(x, 2))) * np.sqrt(np.sum(np.power(y, 2)))
    return inner_prod <= right
The following tests the inequality by running the cauchy_schwartz_inequality() function created above $100,000$ times and appending the results to a list. Using Python's built-in any function, we can then see whether any of the appended results in the list is False.
res = []
for i in range(0, 100000):
    res.append(cauchy_schwartz_inequality())
any(x == False for x in res)
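A related fact worth a quick check (an addition of ours, not in the original demonstration): the Cauchy-Schwarz inequality holds with equality exactly when one vector is a scalar multiple of the other.
x2 = np.array([1, 2, 3])
y2 = 2 * x2  # y2 is a scalar multiple of x2, so equality should hold
print(np.isclose(np.abs(np.inner(x2, y2)), np.linalg.norm(x2, 2) * np.linalg.norm(y2, 2)))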
Hölder's Inequality
It can be shown that the Cauchy-Schwarz inequality is a special case of Hölder's inequality, which in the linear algebra setting can be stated as:
$$ \large \vert \vec{x}^T \vec{y} \vert \leq \Vert \vec{x} \Vert_p \Vert \vec{y} \Vert_q, \qquad \frac{1}{p} + \frac{1}{q} = 1 $$
The variables $p$ and $q$ are known as Hölder conjugates. The inequality states that the absolute value of the inner product of two vectors $\vec{x}$ and $\vec{y}$ is always less than or equal to the product of the $p$-norm of $\vec{x}$ and the $q$-norm of $\vec{y}$, where $\frac{1}{p} + \frac{1}{q} = 1$. As before, the proof of Hölder's inequality is beyond the scope of this post, but we can write a function to demonstrate it. The following function creates two vectors of length ten with random elements between $1$ and $100$ and a random integer $p$ between $2$ and $10$. The conjugate $q$ is then found, and the inner product and the right side of the inequality are computed. The function returns True if the inner product is less than or equal to the right side and False otherwise. We also return the calculation $1 / p + 1 / q$, rounded to guard against floating-point error, to ensure it is always equal to $1$.
def holder_inequality():
    x = np.random.randint(1, 100, 10)
    y = np.random.randint(1, 100, 10)
    p = np.random.randint(2, 10)
    q = p / (p - 1)  # the Hölder conjugate of p, so that 1/p + 1/q = 1
    inner_prod = np.abs(np.inner(x, y))
    right = np.sum(np.abs(x) ** p) ** (1. / p) * np.sum(np.abs(y) ** q) ** (1. / q)
    # round() rather than int(): 1/p + 1/q may evaluate to 0.999... in floating
    # point, which int() would truncate to 0
    return inner_prod <= right, round(1 / p + 1 / q)
We then test the inequality by running the function $100,000$ times and appending the results to a list. Given the inequality's definition, the result should always be True, and the $1/p + 1/q$ calculation should always equal $1$.
res = []
for i in range(0, 100000):
    res.append(holder_inequality())
print('False values exist in result:', any(x[0] == False for x in res))
print('All conjugate calculations are equal to 1:', all(x[1] == 1 for x in res))
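As noted above, the Cauchy-Schwarz inequality is the special case $p = q = 2$ of Hölder's inequality, since $\frac{1}{2} + \frac{1}{2} = 1$. A quick check of our own using numpy's norm function:
xh = np.random.randint(1, 100, 10)
yh = np.random.randint(1, 100, 10)
print(np.abs(np.inner(xh, yh)) <= np.linalg.norm(xh, 2) * np.linalg.norm(yh, 2))  # Cauchy-Schwarz recovered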
Minkowski's Inequality
The final inequality we shall investigate is Minkowski's inequality for norms, also known as the triangle inequality. For two vectors $\vec{x}$ and $\vec{y}$ of length $n$ and $p \geq 1$, Minkowski's inequality can be written as:
$$ \large \left( \sum^n_{i=1} \vert x_i + y_i \vert^p \right)^{\frac{1}{p}} \leq \left(\sum^n_{i=1} \vert x_i \vert^p \right)^{\frac{1}{p}} + \left(\sum^n_{i=1} \vert y_i \vert^p \right)^{\frac{1}{p}} $$
This can be written more compactly using vector norms as:
$$ \large \Vert \vec{x} + \vec{y} \Vert_p \leq \Vert \vec{x} \Vert_p + \Vert \vec{y} \Vert_p $$
Similar to the other inequalities we examined previously, we can demonstrate Minkowski's inequality by writing a fairly straightforward function. The function minkowski_inequality() creates two vectors of length ten with random integer elements between $1$ and $100$ and a random integer $p$ between $1$ and $5$. The left and right sides of the inequality are then computed, and the function returns True if the left side is less than or equal to the right side and False otherwise.
def minkowski_inequality():
    x = np.random.randint(1, 100, 10)
    y = np.random.randint(1, 100, 10)
    p = np.random.randint(1, 5)
    left = np.sum(np.abs(x + y) ** p) ** (1. / p)
    right = np.sum(np.abs(x) ** p) ** (1. / p) + np.sum(np.abs(y) ** p) ** (1. / p)
    return left <= right
Given the definition of Minkowski's inequality, the function should always return True. We test this by running the function $100,000$ times and appending the results to a list. Then, with Python's any function, we check that the list of results does not contain a False value.
res = []
for i in range(0, 100000):
    res.append(minkowski_inequality())
any(x == False for x in res)
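Since Minkowski's inequality is simply the triangle inequality for $p$-norms, it can also be verified directly with numpy's norm function; here is a compact restatement of the check above with an arbitrary choice of $p$:
xm = np.random.randint(1, 100, 10)
ym = np.random.randint(1, 100, 10)
p = 3  # any p >= 1 works
print(np.linalg.norm(xm + ym, p) <= np.linalg.norm(xm, p) + np.linalg.norm(ym, p))  # True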