Proving that the Ky Fan norm (nuclear norm) $\|X\|_*$ is a norm
Andersen Ang
Mathématique et recherche opérationnelle
UMONS, Belgium
[email protected] Homepage: angms.science
First draft : December 28, 2018
Last update : October 16, 2019
Quick recall of Singular Value Decomposition
The SVD of a given matrix $A \in \mathbb{R}^{m \times n}$ is a factorization of the form
$$A = U \Sigma V^\top.$$
$U \in \mathbb{R}^{m \times m}$
- the columns of $U$ are the left-singular vectors of $A$
- these vectors form a set of orthonormal eigenvectors of $A A^\top$
$\Sigma \in \mathbb{R}^{m \times n}$
- $\Sigma$ is diagonal
- the diagonal elements of $\Sigma$, denoted $\sigma_i$, $i \in \{1, 2, \dots, \min\{m, n\}\}$, are the singular values of $A$
- the $\sigma_i$ are all non-negative
- the non-zero $\sigma_i$ are the square roots of the non-zero eigenvalues of both $A^\top A$ and $A A^\top$
- convention: the $\sigma_i$ are sorted as $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_{\min\{m,n\}} \ge 0$
$V \in \mathbb{R}^{n \times n}$
- the columns of $V$ are the right-singular vectors of $A$
- these vectors form a set of orthonormal eigenvectors of $A^\top A$
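The properties recalled above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` is a made-up example and is not taken from the slides.

```python
import numpy as np

# Hypothetical example matrix (not from the slides), with m = 3 rows and n = 2 columns.
A = np.array([[3.0, 0.0],
              [4.0, 5.0],
              [0.0, 1.0]])
m, n = A.shape

# Full SVD: U is m x m, Vt is V transposed (n x n), s holds the min(m, n) singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the singular values in an m x n "diagonal" matrix Sigma.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

reconstruction_ok = np.allclose(U @ Sigma @ Vt, A)   # A = U Sigma V^T
u_orthonormal = np.allclose(U.T @ U, np.eye(m))      # columns of U are orthonormal
v_orthonormal = np.allclose(Vt @ Vt.T, np.eye(n))    # columns of V are orthonormal
sorted_nonneg = bool(np.all(s >= 0) and np.all(np.diff(s) <= 0))

# Squared singular values equal the eigenvalues of A^T A (sorted in decreasing order).
eig_AtA = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
squares_match = np.allclose(s**2, eig_AtA)

print(reconstruction_ok, u_orthonormal, v_orthonormal, sorted_nonneg, squares_match)
```

`np.linalg.svd` returns $V^\top$ rather than $V$, which is why the sketch works with `Vt` throughout.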
Ky Fan norm (a.k.a. the Nuclear norm)
The Ky Fan norm¹ of a matrix $A \in \mathbb{R}^{m \times n}$ is defined as
$$\|A\|_* = \sum_{i=1}^{\min\{m,n\}} |\sigma_i(A)|.$$
Note. As $\sigma_i \ge 0$, we can drop the absolute value sign.
The Ky Fan norm of a matrix is the sum of the singular values of that matrix.
The Ky Fan $k$-norm is defined as the sum of the $k$ largest singular values.
Shorthand notation: let $\sigma_A$ be the vector holding all the singular values of $A$; then the Ky Fan norm is the $\ell_1$ norm of $\sigma_A$,
$$\|A\|_* = \|\sigma_A\|_1.$$
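As an illustration of the definition, here is a small sketch assuming NumPy. The helper name `ky_fan_norm` and the example matrix are hypothetical choices made for this example; `np.linalg.norm(A, 'nuc')` is NumPy's built-in nuclear norm and is used only as a cross-check.

```python
import numpy as np

def ky_fan_norm(A, k=None):
    """Sum of the k largest singular values of A (all of them when k is None)."""
    s = np.linalg.svd(A, compute_uv=False)  # singular values, in decreasing order
    return float(np.sum(s if k is None else s[:k]))

# Hypothetical example: a diagonal matrix whose singular values are 3 and 2.
A = np.array([[3.0, 0.0],
              [0.0, -2.0]])

nuclear = ky_fan_norm(A)        # sigma_1 + sigma_2
top_one = ky_fan_norm(A, k=1)   # sigma_1 only, which is the spectral norm

# The same value written as the l1 norm of the vector of singular values.
sigma_A = np.linalg.svd(A, compute_uv=False)
l1_of_sigma = float(np.linalg.norm(sigma_A, 1))

print(nuclear, top_one, l1_of_sigma, np.linalg.norm(A, 'nuc'))
```

The entry $-2$ contributes $2$ to the norm: singular values are non-negative even when the matrix entries are not.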
¹ Ky Fan. "Maximum properties and inequalities for the eigenvalues of completely continuous operators". Proceedings of the National Academy of Sciences of the United States of America, 37(11): 760-766, 1951.
Properties of Ky Fan norm
It is a norm, and therefore:
- $\|\cdot\|_*$ is a convex function on the set of $m \times n$ matrices
- $\|\cdot\|_*$ satisfies the triangle inequality
It is not differentiable everywhere (for example, it is not differentiable at $X = 0$)
The dual norm of $\|\cdot\|_*$ is the spectral norm $\|\cdot\|_2$, so
$$\langle X, Y \rangle \le \|X\|_* \|Y\|_2$$
It is the special case $p = 1$ of the Schatten $p$-norm
This document proves that the Ky Fan norm is a norm.
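The duality inequality $\langle X, Y \rangle \le \|X\|_* \|Y\|_2$ can be spot-checked on random matrices. This is a sketch assuming NumPy; the sizes, the seed and the helper `inner` are arbitrary choices for the example, and a numerical check like this illustrates the inequality without proving it.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

def inner(X, Y):
    """Trace inner product <X, Y> = Tr(X^T Y)."""
    return float(np.trace(X.T @ Y))

violations = 0
for _ in range(1000):
    X = rng.standard_normal((4, 3))
    Y = rng.standard_normal((4, 3))
    # Right-hand side: nuclear norm of X times spectral norm of Y.
    bound = np.linalg.norm(X, 'nuc') * np.linalg.norm(Y, 2)
    if inner(X, Y) > bound + 1e-10:  # small tolerance for rounding errors
        violations += 1

print(violations)
```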
Proving Ky Fan norm is a norm
Let $V$ denote the space $\mathbb{R}^{m \times n}$, and let $f(X) = \|X\|_*$ be a function on $V$.
To show that $\|X\|_*$ is a norm, we need to show that
1. $f$ is a non-negative real-valued function defined on $V$
2. $f(X) = 0$ if and only if $X = 0$
3. $f(\alpha X) = |\alpha| f(X)$ for all $X \in V$ and all scalars $\alpha \in \mathbb{R}$
4. $f(X + Y) \le f(X) + f(Y)$ for all $X, Y \in V$
Items 1-3 are easy to show:
On 1: by definition, the Ky Fan norm is a sum of non-negative singular values.
On 2: the singular values of the zero matrix $0$ are all zero, so $f(0) = 0$. Conversely, singular values are never negative, so their sum $f(X)$ can be zero only if every $\sigma_i(X)$ is zero; then the diagonal factor $\Sigma$ in the SVD of $X$ is zero, and hence $X = 0$.
On 3: $(\alpha X)^\top (\alpha X) = \alpha^2 X^\top X$, so the singular values of $\alpha X$ are $|\alpha| \sigma_i(X)$; in particular, $X$ and $-X$ have the same singular values.
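Items 2 and 3 can be illustrated numerically. The sketch below assumes NumPy; the matrix size, the seed and the scalar `alpha` are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily
X = rng.standard_normal((3, 5))
alpha = -2.5                    # hypothetical scalar, negative on purpose

s_X = np.linalg.svd(X, compute_uv=False)
s_aX = np.linalg.svd(alpha * X, compute_uv=False)

# Item 3: the singular values of alpha * X are |alpha| times those of X ...
scaled_match = np.allclose(s_aX, abs(alpha) * s_X)
# ... so the Ky Fan norm scales by |alpha|.
homogeneous = np.isclose(np.linalg.norm(alpha * X, 'nuc'),
                         abs(alpha) * np.linalg.norm(X, 'nuc'))

# Item 2 (easy direction): the zero matrix has Ky Fan norm 0.
zero_norm = float(np.linalg.norm(np.zeros((3, 5)), 'nuc'))

print(scaled_match, homogeneous, zero_norm)
```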
Proving Ky Fan norm is a norm
To show that $\|X\|_*$ is a norm, the hard part is to show that the function $f(X) = \|X\|_*$ satisfies the triangle inequality on $V$:
$$f(X + Y) \le f(X) + f(Y), \quad \forall X, Y \in V.$$
To prove this we need the following equality, which expresses the Ky Fan norm as a supremum:
$$\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle = \|A\|_*.$$
The next 3 slides prove this by
- showing $\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle \ge \|A\|_*$
- showing $\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle \le \|A\|_*$
- combining the two inequalities $\ge$ and $\le$ to obtain $=$
Note: this is exactly the reply made by David Speyer to a question on stackexchange. For a reference covering the whole proof in the next 4 slides, see Michael Grant's reply on this stackexchange thread.
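Showing $\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle \ge \|A\|_*$
Let $A = U \Sigma V^\top$, and consider the matrix $Q$ constructed as
$$Q = U \Sigma_0 V^\top = U [I \;\; 0] V^\top,$$
where $\Sigma_0 = [I \;\; 0]$ is the matrix in $\mathbb{R}^{m \times n}$ with all diagonal elements equal to 1. We abbreviate this $Q$ as $U V^\top$; it is literally $U V^\top$ when $U$ and $V$ are taken from the thin SVD. Note that the largest singular value of $Q$ is 1. The inner product between $Q$ and $A$ is
$$\langle Q, A \rangle \big|_{Q = U V^\top} = \mathrm{Tr}(Q^\top A) = \mathrm{Tr}(V \Sigma_0^\top U^\top U \Sigma V^\top) \overset{(1)}{=} \mathrm{Tr}(V \Sigma_0^\top \Sigma V^\top) \overset{(2)}{=} \mathrm{Tr}(\Sigma_0^\top \Sigma) = \sum_i \sigma_i = \|A\|_*,$$
where (1) holds because $U$ is orthogonal, $U^\top U = I_m$, and (2) follows from the cyclic property $\mathrm{Tr}(ABC) = \mathrm{Tr}(CAB)$ together with $V^\top V = I_n$.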
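The construction of $Q$ can be reproduced numerically. This sketch assumes NumPy and uses the thin SVD (`full_matrices=False`), so that the product `U @ Vt` has the same shape as $A$; the random matrix and the seed are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(2)    # fixed seed, chosen arbitrarily
A = rng.standard_normal((4, 3))   # hypothetical 4 x 3 example matrix

# Thin SVD: U is 4 x 3 with orthonormal columns, Vt is 3 x 3 orthogonal.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Q = U @ Vt

spectral_norm_Q = float(np.linalg.norm(Q, 2))   # largest singular value of Q
inner_QA = float(np.trace(Q.T @ A))             # <Q, A> = Tr(Q^T A)
nuclear_A = float(s.sum())                      # Ky Fan norm of A

print(spectral_norm_Q, inner_QA, nuclear_A)
```

With this choice of $Q$ the inner product reproduces $\|A\|_*$ up to rounding error, while $Q$ stays inside the constraint set $\sigma_1(Q) \le 1$.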
Showing $\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle \ge \|A\|_*$
Note that it is universally true that, for any function $f(x)$ and any set $C$, we have the inequality
$$\sup_{x \in C} f(x) \ge f(x_0), \quad \forall x_0 \in C.$$
Now treat $f(Q) = \langle Q, A \rangle$ as a function of $Q$ (in this part of the proof, $f$ no longer denotes the norm). The expression $\langle Q, A \rangle \big|_{Q = U V^\top} = \|A\|_*$ is this function evaluated at the specific point $Q_0 = U V^\top$. Hence, we have
$$\sup_{\sigma_1(Q) \le 1} f(Q) \ge f(Q_0) = f(U V^\top) = \|A\|_*.$$
In words: the point $Q_0 = U V^\top$ satisfies $\sigma_1(Q_0) \le 1$ (its spectral norm is at most 1), so the value $f(Q_0)$ is a lower bound on the supremum of $f$ over all $Q$ with $\sigma_1(Q) \le 1$.
That is, we now have
$$\sup_{\sigma_1(Q) \le 1} f(Q) \ge \|A\|_*. \tag{1}$$
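Showing $\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle \le \|A\|_*$
With $f(Q) = \langle Q, A \rangle$, and writing $u_i$, $v_i$ for the $i$-th columns of $U$ and $V$:
$$
\begin{aligned}
\sup_{\sigma_1(Q) \le 1} f(Q)
&= \sup_{\sigma_1(Q) \le 1} \mathrm{Tr}(Q^\top A) && \text{definition of inner product} \\
&= \sup_{\sigma_1(Q) \le 1} \mathrm{Tr}(Q^\top U \Sigma V^\top) && \text{SVD of } A \\
&= \sup_{\sigma_1(Q) \le 1} \mathrm{Tr}(V^\top Q^\top U \Sigma) && \text{property of trace of product} \\
&= \sup_{\sigma_1(Q) \le 1} \mathrm{Tr}\big((U^\top Q V)^\top \Sigma\big) \\
&= \sup_{\sigma_1(Q) \le 1} \langle U^\top Q V, \Sigma \rangle && \text{definition of inner product} \\
&= \sup_{\sigma_1(Q) \le 1} \sum_i (U^\top Q V)_{ii} \, \sigma_i && \Sigma \text{ is diagonal} \\
&= \sup_{\sigma_1(Q) \le 1} \sum_i \sigma_i \, u_i^\top Q v_i \\
&\le \sup_{\sigma_1(Q) \le 1} \sum_i \sigma_i \, \sigma_1(Q) && u_i^\top Q v_i \le \|u_i\|_2 \|Q v_i\|_2 \le \sigma_1(Q) \\
&= \sum_i \sigma_i = \|A\|_*.
\end{aligned}
$$
The bound $u_i^\top Q v_i \le \sigma_1(Q)$ holds because $u_i$ and $v_i$ are unit vectors.
So we now have $\sup_{\sigma_1(Q) \le 1} f(Q) \le \|A\|_*$; together with (1), we have shown
$$\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle = \|A\|_*.$$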
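The two key steps of this derivation, the identity $\mathrm{Tr}(Q^\top A) = \sum_i \sigma_i u_i^\top Q v_i$ and the bound $u_i^\top Q v_i \le \sigma_1(Q)$, can be checked for one random $Q$. This is a sketch assuming NumPy; the matrices and the seed are arbitrary, and the thin SVD is used so that only the first $\min\{m,n\}$ singular vectors appear.

```python
import numpy as np

rng = np.random.default_rng(3)     # fixed seed, chosen arbitrarily
A = rng.standard_normal((4, 3))    # hypothetical example matrix
Q = rng.standard_normal((4, 3))
Q /= np.linalg.norm(Q, 2)          # rescale so that sigma_1(Q) = 1

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Bilinear terms u_i^T Q v_i, i.e. the diagonal entries of U^T Q V.
# Row i of Vt is the right-singular vector v_i.
terms = np.array([U[:, i] @ Q @ Vt[i, :] for i in range(len(s))])

lhs = float(np.trace(Q.T @ A))                        # <Q, A>
identity_holds = np.isclose(lhs, np.sum(s * terms))   # Tr(Q^T A) = sum_i sigma_i u_i^T Q v_i
termwise_bound = bool(np.all(terms <= np.linalg.norm(Q, 2) + 1e-12))
upper_bound = lhs <= s.sum() + 1e-12                  # <Q, A> <= ||A||_*

print(identity_holds, termwise_bound, upper_bound)
```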
Ky Fan norm satisfies the triangle inequality
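Now that we have
$$\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle = \|A\|_*,$$
we can prove that the Ky Fan norm satisfies the triangle inequality.
Consider two matrices $A, B \in \mathbb{R}^{m \times n}$ and apply the equality we just proved with $A$ replaced by $A + B$:
$$\|A + B\|_* = \sup_{\sigma_1(Q) \le 1} \langle Q, A + B \rangle.$$
The inner product is linear, $\langle Q, A + B \rangle = \langle Q, A \rangle + \langle Q, B \rangle$, and the supremum of a sum is at most the sum of the suprema, thus
$$\sup_{\sigma_1(Q) \le 1} \langle Q, A + B \rangle \le \sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle + \sup_{\sigma_1(Q) \le 1} \langle Q, B \rangle.$$
Therefore,
$$\|A + B\|_* \le \sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle + \sup_{\sigma_1(Q) \le 1} \langle Q, B \rangle = \|A\|_* + \|B\|_*.$$
That is, the Ky Fan norm satisfies the triangle inequality.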
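A quick numerical sanity check of the triangle inequality, as a sketch assuming NumPy. The helper `nuc`, the sizes and the seed are arbitrary choices for the example; the random trials illustrate the result and do not replace the proof.

```python
import numpy as np

rng = np.random.default_rng(4)  # fixed seed, chosen arbitrarily

def nuc(M):
    """Ky Fan (nuclear) norm: the sum of the singular values of M."""
    return float(np.linalg.norm(M, 'nuc'))

# Smallest observed value of ||A||_* + ||B||_* - ||A + B||_* over random pairs.
worst_gap = np.inf
for _ in range(1000):
    A = rng.standard_normal((4, 6))
    B = rng.standard_normal((4, 6))
    worst_gap = min(worst_gap, nuc(A) + nuc(B) - nuc(A + B))

# Equality case: B is a non-negative multiple of A.
A = rng.standard_normal((4, 6))
equality_case = np.isclose(nuc(A + 2.0 * A), nuc(A) + nuc(2.0 * A))

print(worst_gap, equality_case)
```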
Last page - summary
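The Ky Fan norm of a matrix $A \in \mathbb{R}^{m \times n}$ is
$$\|A\|_* = \sum_{i=1}^{\min\{m,n\}} \sigma_i.$$
Proving that $\|\cdot\|_*$ is a norm:
- It satisfies $\|A\|_* = 0$ only if $A = 0$, and $\|tA\|_* = |t| \, \|A\|_*$
- It satisfies $\|A + B\|_* \le \|A\|_* + \|B\|_*$. The proof is based on the equality $\sup_{\sigma_1(Q) \le 1} \langle Q, A \rangle = \|A\|_*$
Hence $\|\cdot\|_*$ is a convex function on matrices
What's next: showing that the sub-differential of the Ky Fan norm is the set
$$\partial \|X\|_* = \left\{ U V^\top + W \;\middle|\; W \in \mathbb{R}^{m \times n},\ U^\top W = 0,\ W V = 0,\ \|W\|_2 \le 1 \right\}.$$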
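As a preview of that claim, the sketch below builds one element $G = U V^\top + W$ of the set and checks the subgradient inequality $\|Y\|_* \ge \|X\|_* + \langle G, Y - X \rangle$ on random $Y$. It assumes NumPy and assumes that $U$, $V$ hold only the singular vectors of the non-zero singular values of $X$ (the slides do not state which SVD is meant); the sizes, the rank and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)   # fixed seed, chosen arbitrarily
m, n, r = 5, 4, 2                # hypothetical sizes and rank

# Hypothetical rank-r matrix X, built from random factors.
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
U_thin, s, Vt_thin = np.linalg.svd(X, full_matrices=False)
U, Vt = U_thin[:, :r], Vt_thin[:r, :]   # vectors of the r non-zero singular values

# Build W with U^T W = 0 and W V = 0 by projecting a random matrix onto the
# orthogonal complements, then rescale so that its spectral norm is 1.
Z = rng.standard_normal((m, n))
W = (np.eye(m) - U @ U.T) @ Z @ (np.eye(n) - Vt.T @ Vt)
W /= np.linalg.norm(W, 2)

G = U @ Vt + W                   # candidate subgradient at X

def nuc(M):
    return float(np.linalg.norm(M, 'nuc'))

# Smallest slack in ||Y||_* - ||X||_* - <G, Y - X> over random Y.
min_slack = np.inf
for _ in range(1000):
    Y = rng.standard_normal((m, n))
    slack = nuc(Y) - nuc(X) - float(np.trace(G.T @ (Y - X)))
    min_slack = min(min_slack, slack)

print(min_slack)
```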
End of document