Lecture 11 Singular value decomposition
Mathématiques appliquées (MATH0504-1)
B. Dewals, Ch. Geuzaine
23/12/2022
Singular value decomposition (SVD) at a glance …
Motivation: the image of the unit sphere S under
any m × n matrix transformation is a hyperellipse.
[Figure: a vector x on the unit sphere S is mapped by A to Ax ∈ AS]
Through the SVD, we will infer important
properties of matrix A from the shape of AS!
Singular value decomposition (SVD) at a glance …
The singular value decomposition (SVD)
is a particular matrix factorization.
Why is the singular value decomposition
of particular importance?
The reasons for looking at SVD are twofold:
1. The computation of the SVD is used as an
intermediate step in many algorithms of
practical interest.
2. From a conceptual point of view, the SVD also
enables a deeper understanding of many
problems in linear algebra.
Learning objectives & outline
Become familiar with the SVD and its geometric
interpretation, and become aware of its significance.
1. Reminder of some fundamentals in linear algebra
2. Geometric interpretation
3. From “reduced SVD” to “full SVD”, and formal definition
4. Existence and uniqueness
1 - Reminder: fundamentals in linear algebra
In this section, we briefly review the concepts of adjoint matrix,
matrix rank, unitary matrix as well as matrix norms (Chapters 2 and 3
in Trefethen & Bau, 1997).
Adjoint of a matrix
The adjoint (or Hermitian conjugate)
of an m × n matrix A, written A*, is the n × m matrix
• whose i, j entry
• is the complex conjugate of the j, i entry of A.
If A = A*, A is Hermitian (or self-adjoint).
For a real matrix A,
• the adjoint is the transpose: A* = AT;
• if the matrix is Hermitian, that is A = AT,
then it is symmetric.
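These definitions are easy to check numerically; a minimal NumPy sketch (the example matrices are chosen arbitrarily for illustration):

```python
import numpy as np

# A small complex matrix (values chosen arbitrarily for illustration)
A = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 2 + 0j]])

# The adjoint (Hermitian conjugate) A* is the conjugate transpose
A_star = A.conj().T

# Entry (i, j) of A* is the complex conjugate of entry (j, i) of A
assert A_star[0, 1] == np.conj(A[1, 0])

# A Hermitian matrix satisfies A = A*
H = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
print(np.allclose(H, H.conj().T))  # True: H is Hermitian
```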
Matrix rank
The rank of a matrix is the number of linearly
independent columns (or rows) of a matrix.
The numbers of linearly independent columns
and rows of a matrix are equal.
An m × n matrix of full rank is one that has the
maximal possible rank (the lesser of m and n).
If m ≥ n, such a matrix is characterized by the
property that it maps no two distinct vectors to the
same vector.
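Both properties, the rank and the injectivity of full-rank maps, can be illustrated with NumPy (example matrices chosen for illustration):

```python
import numpy as np

# Full-rank 2x2 matrix: rank 2
A_full = np.array([[2.0, 0.0],
                   [0.0, 1.0]])
# Rank-deficient matrix: its rows (and columns) are linearly dependent
A_def = np.array([[2.0, 1.0],
                  [4.0, 2.0]])

print(np.linalg.matrix_rank(A_full))  # 2
print(np.linalg.matrix_rank(A_def))   # 1

# A rank-deficient matrix maps distinct vectors to the same vector:
x1 = np.array([1.0, 0.0])
x2 = np.array([0.0, 2.0])
print(np.allclose(A_def @ x1, A_def @ x2))  # True: both map to [2, 4]
```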
Matrix rank
[Figure: image of the unit sphere S by a full-rank
matrix (no distinct vectors are mapped to the same
vector) and by a rank-deficient matrix (distinct
vectors are mapped to the same vector).]
Unitary matrix
A square matrix Q ∈ ℂm×m is unitary
(or orthogonal, in the real case) if
  Q* = Q−1,
i.e.
  Q* Q = I.
The columns qi of a unitary matrix form
an orthonormal basis of ℂm: (qi)* qj = δij,
with δij the Kronecker delta.
A rotation matrix
is a typical example of a unitary matrix
A rotation matrix R may be written:

  R = [ cos θ  −sin θ ]
      [ sin θ   cos θ ]

The image of a vector is the same vector,
rotated counterclockwise by an angle θ.
Matrix R
• is orthogonal,
• and R* R = RT R = I.
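The orthogonality of a rotation matrix, and the fact that it preserves lengths, can be verified directly (arbitrary angle chosen for illustration):

```python
import numpy as np

theta = 0.7  # arbitrary rotation angle (radians)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# R is orthogonal (real unitary): R^T R = I
print(np.allclose(R.T @ R, np.eye(2)))  # True

# Rotation preserves the 2-norm of any vector
x = np.array([3.0, 4.0])
print(np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x)))  # True
```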
(Induced) matrix norms are defined
from the action of the matrix on vectors
For a matrix A ∈ ℂm×n, and given vector norms
• ‖ ‧ ‖(n) on the domain of A,
• ‖ ‧ ‖(m) on the range of A,
the induced matrix norm ‖A‖(m,n) is the smallest
number C for which the following inequality holds
for all x ∈ ℂn:

  ‖Ax‖(m) ≤ C ‖x‖(n)

It is the maximum factor by which A can “stretch” a
vector x.
The matrix norm can be defined equivalently in
terms of the images of the unit vectors under A:

  ‖A‖(m,n) = max_{x ≠ 0} ‖Ax‖(m) / ‖x‖(n) = max_{‖x‖(n) = 1} ‖Ax‖(m)

This form is convenient for visualizing induced
matrix norms, as in this example:

  A = [ 1  2 ],   ‖A‖2 = max_{‖x‖2 = 1} ‖Ax‖2 ≈ 2.9208
      [ 0  2 ]
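The induced 2-norm of this example matrix can be computed directly, or approximated by maximizing ‖Ax‖2 over sampled unit vectors, as in the geometric definition:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 2.0]])

# Induced 2-norm computed directly by NumPy
print(np.linalg.norm(A, 2))  # ≈ 2.9208

# Equivalently: maximize ||Ax||_2 over unit vectors x on the circle
angles = np.linspace(0.0, 2.0 * np.pi, 100000)
xs = np.stack([np.cos(angles), np.sin(angles)])   # unit vectors as columns
print(np.max(np.linalg.norm(A @ xs, axis=0)))      # ≈ 2.9208
```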
2 – Geometric interpretation
In this section, we introduce conceptually the SVD, by means of a
simple geometric interpretation (Chapter 4 in Trefethen & Bau, 1997).
Geometric interpretation
Let S be the unit sphere in ℝn:

  S = { x ∈ ℝn : ‖x‖2 = 1 },  with ‖x‖2 = ( Σi=1..n xi² )^1/2

Consider any matrix A ∈ ℝm×n, with m ≥ n.
Assume for the moment that A has full rank n.
Geometric interpretation
The image AS is a hyperellipse in ℝm.
This fact is not obvious; but let us assume for now
that it is true. It will be proved later.
[Figure: the unit sphere S is mapped by A to the hyperellipse AS]
A “hyperellipse” is the m-dimensional
generalization of an ellipse in 2D
In ℝm, a hyperellipse is a surface obtained by
• stretching the unit sphere in ℝm
• by some factors s1, …, sm (possibly zero)
• in some orthogonal directions u1, …, um ∈ ℝm.
For convenience, let us take the ui to be unit
vectors, i.e. ‖ui‖2 = 1. The vectors {si ui} are the
principal semiaxes of the hyperellipse.
[Figure: hyperellipse AS with principal semiaxes s1u1 and s2u2]
If A has rank r,
exactly r of the lengths si will be nonzero.
In particular, if m ≥ n,
at most n of them will be nonzero.
Singular values
We stated at the beginning that the SVD enables us
to characterize properties of matrix A from the
shape of AS. Here are three definitions …
We define the n singular values of matrix A as the
lengths of the n principal semiaxes of AS, noted
s1, …, sn.
It is conventional to number the singular values in
descending order: s1 ≥ s2 ≥ … ≥ sn.
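This descending-order convention is also what numerical libraries follow; for instance, NumPy returns the singular values already sorted (random matrix for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# np.linalg.svd returns the singular values in descending order
s = np.linalg.svd(A, compute_uv=False)
print(np.all(s[:-1] >= s[1:]))  # True: s1 >= s2 >= s3
```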
Left singular vectors
We also define the n left singular vectors of
matrix A as
• the unit vectors {u1, …, un}
• oriented in the directions of
the principal semiaxes of AS,
• numbered to correspond
with the singular values.
Thus, the vector si ui is the ith largest principal
semiaxis.
Right singular vectors
We also define the n right singular vectors of
matrix A as the unit vectors {v1, …, vn} S that
are the preimages of the principal semiaxes of AS,
numbered so that A vj = sj uj.
[Figure: right singular vectors v1, v2 on S are mapped by A to the semiaxes s1u1, s2u2 of AS]
Important remarks
The terms “left” and “right” singular vectors
will be understood later as we move forward
with a more formal description of the SVD.
In the geometric interpretation presented so far,
we assumed that matrix A is real and m = n = 2.
Actually, the SVD applies
• to both real and complex matrices,
• whatever the dimensions m and n.
3 – From reduced to full SVD, and formal definition
In this section, we distinguish between the so-called “reduced SVD”,
often used in practice, and the “full SVD”. We also introduce the
formal definition of SVD (Chapter 4 in Trefethen & Bau, 1997).
The equations relating right and left singular
vectors can be expressed in matrix form
We just mentioned that the equations relating right
singular vectors {vj} and left singular vectors {uj}
can be written
  A vj = sj uj,   1 ≤ j ≤ n

This collection of vector equations can be
expressed as a matrix equation:

  A [ v1 v2 … vn ] = [ u1 u2 … un ] diag(s1, s2, …, sn)
The equations relating right and left singular
vectors can be expressed in matrix form
This matrix equation can be written in a more
compact form:

  A V = Û Ŝ

with
• Ŝ an n × n diagonal matrix with positive real
entries (as A was assumed to have full rank n),
• Û an m × n matrix with orthonormal columns,
• V an n × n matrix with orthonormal columns.
The hats on Û and Ŝ distinguish them from
U and S in the “full SVD”.
Thus, V is unitary (i.e. V* = V−1), and we obtain:

  A = Û Ŝ V*
Reduced SVD
The factorization of matrix A in the form
  A = Û Ŝ V*

is called a reduced singular value decomposition,
or reduced SVD, of matrix A.
Schematically, it looks like this (m ≥ n):

  A (m × n) = Û (m × n) Ŝ (n × n) V* (n × n)
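In NumPy, the reduced SVD corresponds to full_matrices=False; a minimal sketch with a random matrix, checking the shapes and the orthonormality of the columns of Û:

```python
import numpy as np

m, n = 5, 3
rng = np.random.default_rng(1)
A = rng.standard_normal((m, n))

# full_matrices=False gives the reduced SVD: U_hat (m x n), s (n,), Vt (n x n)
U_hat, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U_hat.shape, s.shape, Vt.shape)  # (5, 3) (3,) (3, 3)

# U_hat has orthonormal columns (but is not unitary unless m = n)
print(np.allclose(U_hat.T @ U_hat, np.eye(n)))  # True

# A = U_hat diag(s) Vt
print(np.allclose(U_hat @ np.diag(s) @ Vt, A))  # True
```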
From reduced SVD to … full SVD
The columns of Û are n orthonormal vectors in the
m-dimensional space ℂm.
Unless m = n, they do not form a basis of ℂm,
nor is Û a unitary matrix.
However, we may “upgrade” Û to a unitary matrix!
From reduced SVD to … full SVD
Let us adjoin m − n additional orthonormal
columns to matrix Û, so that it becomes unitary.
The m − n additional orthonormal columns are
chosen arbitrarily, and the result is noted U.
However, Ŝ must change too …
From reduced SVD to … full SVD
For the product to remain unaltered, the last m − n
columns of U should be multiplied by zero.
Accordingly, let S be the m × n matrix consisting of
• Ŝ in the upper n × n block,
• together with m − n rows of zeros below.
The last m − n columns of U are thus “silent”: they
do not contribute to the product.
From reduced SVD to … full SVD
We get a new factorization of A, called full SVD:
A = U S V*
• U is an m × m unitary matrix,
• V is an n × n unitary matrix,
• S is an m × n diagonal matrix with real entries.
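The full SVD corresponds to full_matrices=True in NumPy; a minimal sketch showing that U is now unitary, and rebuilding the m × n matrix S from Ŝ and the rows of zeros:

```python
import numpy as np

m, n = 5, 3
rng = np.random.default_rng(2)
A = rng.standard_normal((m, n))

# full_matrices=True gives the full SVD: U (m x m), Vt (n x n)
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(U.shape, Vt.shape)  # (5, 5) (3, 3)

# U is now unitary (the extra m - n columns were adjoined)
print(np.allclose(U @ U.T, np.eye(m)))  # True

# Build the m x n matrix S: diag(s) on top, m - n rows of zeros below
S = np.zeros((m, n))
S[:n, :n] = np.diag(s)
print(np.allclose(U @ S @ Vt, A))  # True
```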
Generalization to the case of a matrix A
which does not have full rank
If matrix A is rank-deficient (i.e. of rank r < n),
only r (instead of n) of the left singular vectors are
determined by the geometry of the hyperellipse,
BUT the full SVD still applies:
• m − r (instead of m − n) additional arbitrary
orthonormal columns are introduced to construct
the unitary matrix U;
• the matrix V also needs n − r arbitrary
orthonormal columns to extend the r columns
determined from the hyperellipse geometry;
• matrix S has only r non-zero diagonal entries.
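The effect of rank deficiency on the singular values can be observed numerically; a minimal sketch with a rank-1 matrix built as an outer product (example values chosen for illustration):

```python
import numpy as np

# A 4x3 rank-1 matrix (outer product), so r = 1 < n = 3
A = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 2.0])

s = np.linalg.svd(A, compute_uv=False)
# Only r = 1 singular value is (numerically) nonzero
print(np.sum(s > 1e-10))  # 1
```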
Formal definition of the SVD
Let m and n be arbitrary (we do not require m ≥ n).
Given A ∈ ℂm×n, not necessarily of full rank,
a singular value decomposition of A is a factorization

  A = U S V*

where
• U ∈ ℂm×m is square and unitary,
• V ∈ ℂn×n is square and unitary,
• S ∈ ℝm×n is diagonal, with nonnegative entries
s1 ≥ … ≥ sp (p = min(m, n)) in nonincreasing order.
Consequently, the image of the unit sphere in ℝn
under a map A = U S V* is a hyperellipse in ℝm
1. The unitary map V* preserves the sphere
2. The diagonal matrix S stretches the sphere
into a hyperellipse
3. The final unitary map U rotates, or reflects,
the hyperellipse without changing its shape.
Thus,
• if we can prove that every matrix A ∈ ℂm×n has
an SVD,
• we will have proved that the image of the unit
sphere under any linear map is indeed a
hyperellipse.
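The three-step geometric reading of A = U S V* can be traced numerically, applying each factor in turn to a point on the unit circle (example matrix chosen for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 2.0]])
U, s, Vt = np.linalg.svd(A)

x = np.array([0.6, 0.8])           # a point on the unit circle

y1 = Vt @ x                        # step 1: V* preserves the sphere
print(np.isclose(np.linalg.norm(y1), 1.0))  # True

y2 = np.diag(s) @ y1               # step 2: S stretches the sphere into a hyperellipse
y3 = U @ y2                        # step 3: U rotates/reflects, preserving lengths
print(np.isclose(np.linalg.norm(y3), np.linalg.norm(y2)))  # True

print(np.allclose(y3, A @ x))      # True: the composition is exactly Ax
```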
4 – Existence and uniqueness
In this section, we demonstrate the existence of the SVD, the
uniqueness of the singular values, as well as under some specific
conditions, the uniqueness of the singular vectors (Chapter 4 in
Trefethen & Bau, 1997).
Every matrix A ∈ ℂm×n
has a singular value decomposition A = U S V*
To prove the existence of the SVD,
• we first isolate the direction of the largest
action of A,
• then we proceed by induction on the
dimension of A.
The proof takes 5 steps.
Set s1 = ‖A‖2 = max_{‖x‖2 = 1} ‖Ax‖2.
From the definition of the matrix norm (the
maximum is attained, since the unit sphere is
compact), there must be a vector v1 ∈ ℂn
with ‖v1‖2 = 1 and ‖Av1‖2 = s1.
We note:

  u1 = A v1 / s1
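This first step of the proof can be checked numerically: the first right singular vector is a unit vector attaining the maximum, and u1 = Av1/s1 is a unit vector (random matrix for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))

s1 = np.linalg.norm(A, 2)          # s1 = ||A||_2

# The first right singular vector v1 attains the maximum of ||Av||_2
U, s, Vt = np.linalg.svd(A)
v1 = Vt[0, :]
print(np.isclose(np.linalg.norm(A @ v1), s1))  # True

u1 = A @ v1 / s1                   # the corresponding unit vector u1
print(np.isclose(np.linalg.norm(u1), 1.0))     # True
```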
Consider any extensions
• of v1 to an orthonormal basis {vj} of ℂn,
• and of u1 to an orthonormal basis {uj} of ℂm.
Let U1 and V1 denote the unitary matrices with
columns uj and vj, respectively.
Then we have

  U1* A V1 = S = [ s1  w* ]
                 [ 0   B  ]

where 0 is a column vector of dimension m − 1,
w* is a row vector of dimension n − 1, and
B has dimensions (m − 1) × (n − 1).
Furthermore,

  ‖ S (s1, w)T ‖2 = ‖ (s1² + w*w, Bw)T ‖2 ≥ s1² + w*w = (s1² + w*w)^1/2 ‖ (s1, w)T ‖2

implying (from the definition of matrix norms)

  ‖S‖2 ≥ (s1² + w*w)^1/2.

BUT, since U1 and V1 are unitary, we know that

  ‖S‖2 = ‖U1* A V1‖2 = ‖A‖2 = s1.

This implies w = 0.
To sum up, this is what we know at this stage:

  U1* A V1 = S = [ s1  0 ]
                 [ 0   B ]

Hence,

  A = U1 S V1*
If n = 1 or m = 1, we are done!
Otherwise, the submatrix B describes the action
of A on the subspace orthogonal to v1.
By the induction hypothesis, B has an SVD
B = U2 S2 V2*.
Now it is easily verified that

  A = U1 [ 1  0  ] [ s1  0  ] [ 1  0  ]* V1*
         [ 0  U2 ] [ 0   S2 ] [ 0  V2 ]

is an SVD of A, completing the proof of existence.
Written out in full:

  U1 [ 1  0  ] [ s1  0  ] [ 1  0  ]* V1*
     [ 0  U2 ] [ 0   S2 ] [ 0  V2 ]

  = U1 [ s1  0     ] [ 1  0  ]* V1*
       [ 0   U2 S2 ] [ 0  V2 ]

  = U1 [ s1  0         ] V1*
       [ 0   U2 S2 V2* ]

  = U1 [ s1  0 ] V1* = U1 S V1* = A
       [ 0   B ]

The factors U1 [1 0; 0 U2] and [1 0; 0 V2]* V1* are
unitary, since the product of two unitary matrices
is another unitary matrix.
Uniqueness
The singular values {sj} are uniquely determined.
If A is square and the sj are distinct, the left and
right singular vectors {uj} and {vj} are uniquely
determined up to complex signs (i.e. complex
scalar factors of absolute value 1).
Uniqueness
Geometrically, the proof is straightforward:
• if the semiaxis lengths of a hyperellipse are
distinct,
• then the semiaxes themselves are determined
by the geometry, up to signs.
Take-home messages
The SVD is an important factorization method, which
applies to all rectangular, real or complex matrices.
It decomposes the matrix into three factors
• a unitary matrix
• a real diagonal matrix, with nonnegative entries
• another unitary matrix
It has a broad range of implications and applications!
What’s next?
Every matrix is diagonal if only one uses the proper
bases for the domain and range spaces.
SVD vs. eigenvalue decomposition
• existence
• rectangular vs. square matrices
• orthonormal bases in the SVD, not eigenvectors
Link with matrix rank, range, null space, norm …
Low-rank approximations
Low-rank approximations of a matrix
[Figure: low-rank approximations of an image, retaining ~1 %, ~4 %, and ~14 % of the data, compared with the original (100 %)]