
Lecture Week 13a (Supplementary)

• Constrained Optimisation

• Singular value decomposition

Constrained Optimization
In many applications it is necessary to find the maximum or minimum values of a quadratic form Q(x) over the set of x given by ‖x‖ = 1.

Example: Find the maximum and minimum values of Q(x) = 9x1^2 + 4x2^2 + 2x3^2 subject to the condition x^T x = ‖x‖^2 = 1.

Solution:

Q(x) = 9x1^2 + 4x2^2 + 2x3^2
     ≤ 9x1^2 + 9x2^2 + 9x3^2
     = 9(x1^2 + x2^2 + x3^2)
     = 9.

So the maximum value of Q(x) is 9, attained when x = (1, 0, 0). The matrix of this quadratic form is

A = [ 9  0  0 ]
    [ 0  4  0 ]
    [ 0  0  2 ]

The largest eigenvalue is 9 and the corresponding eigenvector is v1 = x = (1, 0, 0).

Thus, subject to the constraint ‖x‖ = 1, the quadratic form is maximal along the eigenvector with the largest eigenvalue.
Constrained Optimization

To find the minimum we consider

Q(x) = 9x1^2 + 4x2^2 + 2x3^2
     ≥ 2x1^2 + 2x2^2 + 2x3^2
     = 2(x1^2 + x2^2 + x3^2)
     = 2,

so the minimum value of Q(x) is 2. This value is achieved when x = (0, 0, 1). Recall:

A = [ 9  0  0 ]
    [ 0  4  0 ]
    [ 0  0  2 ]

The smallest eigenvalue is 2 and the corresponding eigenvector is v3 = x = (0, 0, 1).

Thus, subject to the constraint ‖x‖ = 1, the quadratic form is minimal along the eigenvector with the smallest eigenvalue.

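As a quick numerical check (a NumPy sketch, not part of the original notes), sampling unit vectors confirms that the extreme values of Q(x) on the unit sphere are exactly the extreme eigenvalues of A:

```python
import numpy as np

A = np.diag([9.0, 4.0, 2.0])   # matrix of the quadratic form above

# Evaluate Q(x) = x^T A x at many random unit vectors
rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # project onto the unit sphere
Q = np.einsum('ij,jk,ik->i', X, A, X)

print(Q.min(), Q.max())        # close to 2 and 9
print(np.linalg.eigvalsh(A))   # [2. 4. 9.]
```
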
Constrained Optimization

[Figure: values of a quadratic form Q(x) over the unit sphere ‖x‖ = 1.]

For the quadratic form shown in the figure, the values of Q(x) satisfy 3 ≤ Q(x) ≤ 7.

The set of all possible values of x^T A x for ‖x‖ = 1 is a closed set.

Constrained Optimization

Theorem 66: Let A be an n × n symmetric matrix and let

m = min{ x^T A x : ‖x‖ = 1 },
M = max{ x^T A x : ‖x‖ = 1 }.

Then M is the greatest eigenvalue λ1 of A and m is the smallest eigenvalue λn of A.

The value of x^T A x is M = λ1 when x is oriented along the corresponding eigenvector u1.

The value of x^T A x is m = λn when x is oriented along the corresponding eigenvector un.

Theorem 67: Let A, λ1 and u1 be as in Theorem 66. The maximum value of x^T A x subject to the constraints

x^T x = 1, x^T u1 = 0

is the second greatest eigenvalue λ2, and this maximum is attained when x is an eigenvector u2 corresponding to λ2.
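
The following NumPy sketch (illustrative only, using a randomly generated symmetric A) shows what Theorem 67 asserts: over unit vectors orthogonal to u1, the values of x^T A x stay below, and approach, the second greatest eigenvalue λ2:

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(4, 4))
A = (S + S.T) / 2                 # a random symmetric matrix

w, V = np.linalg.eigh(A)          # eigenvalues in ascending order
u1 = V[:, -1]                     # eigenvector for the greatest eigenvalue

# Sample unit vectors orthogonal to u1 and evaluate x^T A x
X = rng.normal(size=(200_000, 4))
X -= np.outer(X @ u1, u1)         # remove the u1 component
X /= np.linalg.norm(X, axis=1, keepdims=True)
vals = np.einsum('ij,jk,ik->i', X, A, X)

print(vals.max())   # just below ...
print(w[-2])        # ... the second greatest eigenvalue, lambda_2
```
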
Constrained Optimization

Example: Find the maximum value of

9x1^2 + 4x2^2 + 3x3^2

subject to the constraints x^T x = 1 and x^T u1 = 0, where u1 = (1, 0, 0).

Solution: If x = (x1, x2, x3) then the constraint x^T u1 = 0 means x1 = 0.

For such unit vectors we have x2^2 + x3^2 = 1 and

9x1^2 + 4x2^2 + 3x3^2 = 4x2^2 + 3x3^2
                      ≤ 4x2^2 + 4x3^2
                      = 4(x2^2 + x3^2)
                      = 4.

Therefore the constrained maximum does not exceed 4.

This value is attained for x = (0, 1, 0), which is the eigenvector for the second greatest eigenvalue of the matrix of the quadratic form.
The Singular Value Decomposition
Diagonalization plays an important role in applications, BUT not all matrices can be factored in the form A = P D P^{-1} with D diagonal.

However, a factorization (the singular value decomposition) A = Q D P^{-1} (with D diagonal) is possible for any m × n matrix A.

For a symmetric matrix A, |λ| measures the amount by which A stretches or shrinks eigenvectors: if Ax = λx and ‖x‖ = 1, then

‖Ax‖ = ‖λx‖ = |λ|‖x‖ = |λ|.

If λ1 is the eigenvalue with the greatest magnitude, then a corresponding unit eigenvector v1 identifies the direction of greatest stretching, and the length of Ax is maximized when x = v1.

This description has an analogue for rectangular matrices, which leads to the singular value decomposition.
The Singular Value Decomposition

Example: If

A = [ 4  11  14 ]
    [ 8   7  -2 ]

then the transformation x ↦ Ax maps the unit sphere {x : ‖x‖ = 1} in R^3 onto an ellipse in R^2.

Find a unit vector x at which the length ‖Ax‖ is maximized and compute its length.

The Singular Value Decomposition

Solution: The quantity ‖Ax‖^2 is maximized at the same x that maximizes ‖Ax‖.

Also,

‖Ax‖^2 = (Ax)^T (Ax) = x^T A^T A x = x^T (A^T A) x,

and we note that the matrix A^T A is symmetric, since

(A^T A)^T = A^T (A^T)^T = A^T A.

So the problem is now familiar: maximize the quadratic form x^T (A^T A) x subject to the constraint ‖x‖ = 1.

By Theorem 66 the maximum value is the greatest eigenvalue λ1 of A^T A, and it is attained at the unit eigenvector of A^T A corresponding to λ1.

The Singular Value Decomposition

We calculate A^T A:

A^T A = [ 80   100   40 ]
        [ 100  170  140 ]
        [ 40   140  200 ]

The eigenvalues of this matrix are λ1 = 360, λ2 = 90, λ3 = 0. The corresponding unit eigenvectors are:

v1 = (1/3, 2/3, 2/3),  v2 = (-2/3, -1/3, 2/3),  v3 = (2/3, -2/3, 1/3).

The maximum value of ‖Ax‖^2 is 360, attained when x is the unit vector v1.

The vector Av1 is the point on the ellipse furthest from the origin:

Av1 = [ 4  11  14 ] [ 1/3 ]   [ 18 ]
      [ 8   7  -2 ] [ 2/3 ] = [  6 ]
                    [ 2/3 ]

The maximum value of ‖Ax‖ is √360 = 6√10.

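The computation above can be reproduced with a short NumPy sketch (not part of the original notes; note that eigh may return v1 with the opposite sign):

```python
import numpy as np

A = np.array([[4., 11., 14.],
              [8.,  7., -2.]])

w, V = np.linalg.eigh(A.T @ A)   # eigenvalues in ascending order
print(w)                         # [  0.  90. 360.]

v1 = V[:, -1]                    # unit eigenvector for lambda_1 = 360
print(A @ v1)                    # (+/-)[18.  6.]
print(np.linalg.norm(A @ v1))    # 6*sqrt(10) = 18.97...
```
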
The Singular Values of Matrix A

Notes: Let A be an m × n matrix. Then A^T A is symmetric and can be orthogonally diagonalized.

Let {v1, . . . , vn} be an orthonormal basis for R^n consisting of unit eigenvectors of A^T A, and let λ1, . . . , λn be the associated eigenvalues of A^T A. Then

‖Avi‖^2 = (Avi)^T (Avi) = vi^T A^T A vi = vi^T (λi vi) = λi,

so the eigenvalues of A^T A are all nonnegative.

We can always arrange them in descending order, so that

λ1 ≥ λ2 ≥ . . . ≥ λn ≥ 0.

Definition: The singular values of A are the square roots of the eigenvalues λ1, . . . , λn of A^T A, denoted by σ1, . . . , σn and arranged in descending order. So

σi = √λi, for 1 ≤ i ≤ n.

With the notation above, the singular values of A are the lengths of the vectors Av1, . . . , Avn.
The Singular Values of Matrix A

Example: Let A be as in the previous example:

A = [ 4  11  14 ]
    [ 8   7  -2 ]

The eigenvalues of A^T A are λ1 = 360, λ2 = 90, λ3 = 0, therefore the singular values of A are

σ1 = √360 = 6√10,  σ2 = √90 = 3√10,  σ3 = 0.

The first singular value of A is the maximum of ‖Ax‖ subject to ‖x‖ = 1, attained at x = v1.

The second singular value of A is the maximum of ‖Ax‖ over all unit vectors orthogonal to v1, and it is attained at x = v2:

Av2 = [ 4  11  14 ] [ -2/3 ]   [  3 ]
      [ 8   7  -2 ] [ -1/3 ] = [ -9 ]
                    [  2/3 ]
This point is on the minor axis of the ellipse.
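
A NumPy sketch (illustrative) confirming that the singular values are the square roots of the eigenvalues of A^T A; note that np.linalg.svd returns only min(m, n) = 2 values for this 2 × 3 matrix:

```python
import numpy as np

A = np.array([[4., 11., 14.],
              [8.,  7., -2.]])

lam = np.linalg.eigvalsh(A.T @ A)[::-1]    # descending: [360, 90, 0]
print(np.sqrt(np.clip(lam, 0.0, None)))    # [18.97...  9.48...  0.]
print(np.linalg.svd(A, compute_uv=False))  # [18.97...  9.48...]
```
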
The Singular Values of Matrix A

Theorem 68: Suppose {v1, . . . , vn} is an orthonormal basis of R^n consisting of eigenvectors of A^T A, arranged so that the corresponding eigenvalues of A^T A satisfy λ1 ≥ . . . ≥ λn, and suppose A has r nonzero singular values. Then {Av1, . . . , Avr} is an orthogonal basis for Col A and rank A = r.

Proof: We have:

(Avi)^T (Avj) = vi^T A^T A vj = vi^T (λj vj) = λj (vi^T vj)
              = 0 if i ≠ j, and λj if i = j.

Therefore {Av1, . . . , Avr} is an orthogonal set.

The lengths of the vectors Av1, . . . , Avn are the singular values of A, of which the first r are strictly positive, and hence Av1, . . . , Avr are non-zero vectors. Thus {Av1, . . . , Avr} is a linearly independent set of orthogonal vectors in the column space Col A. We must show it spans Col A.
The Singular Values of Matrix A

Proof (cont'd): For any y in Col A we have y = Ax, where

x = c1 v1 + . . . + cn vn,

and so

y = Ax = c1 Av1 + . . . + cr Avr + c_{r+1} Av_{r+1} + . . . + cn Avn
       = c1 Av1 + . . . + cr Avr + c_{r+1}·0 + . . . + cn·0.

Therefore y is in Span{Av1, . . . , Avr}, which means that the set {Av1, . . . , Avr} is an orthogonal basis for Col A.

Hence rank A = dim Col A = r.

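A NumPy sketch (illustrative) of Theorem 68 for the running 2 × 3 example: the vectors Avi are mutually orthogonal, and the rank equals the number of nonzero singular values:

```python
import numpy as np

A = np.array([[4., 11., 14.],
              [8.,  7., -2.]])

w, V = np.linalg.eigh(A.T @ A)
V = V[:, ::-1]                    # order columns by descending eigenvalue
AV = A @ V                        # columns Av1, Av2, Av3

print(np.round(AV.T @ AV, 6))     # diagonal matrix: the Avi are orthogonal
print(np.linalg.matrix_rank(A))   # 2 = number of nonzero singular values
```
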
The Singular Value Decomposition

Notes: The decomposition of A involves an m × n "diagonal" matrix Σ of the form

Σ = [ D  0 ]
    [ 0  0 ]

where D is an r × r diagonal matrix for some r not exceeding the smaller of m and n. The bottom block of Σ contains m − r rows of zeros, and the right-hand block contains n − r columns of zeros.

Theorem 69 (Singular Value Decomposition): Let A be an m × n matrix with rank r. Then there exists an m × n matrix Σ of the form above, for which the diagonal entries of D are the first r singular values of A, σ1 ≥ σ2 ≥ . . . ≥ σr > 0, and there exist an m × m orthogonal matrix U and an n × n orthogonal matrix V such that

A = U Σ V^T.

This factorisation is called the singular value decomposition of A.
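
In NumPy, np.linalg.svd with full_matrices=True returns precisely the factors of Theorem 69 (a sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[4., 11., 14.],
              [8.,  7., -2.]])

U, s, Vt = np.linalg.svd(A, full_matrices=True)   # U: m x m, Vt: n x n

Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)   # embed D in the m x n matrix Sigma

print(np.allclose(A, U @ Sigma @ Vt))  # True
```
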
The Singular Value Decomposition
Notes: The matrices U and V in A = U Σ V^T are not uniquely determined by A, but the diagonal entries of Σ are uniquely determined by the singular values of A.

The columns of U are called the left singular vectors of A and the columns of V are called the right singular vectors of A.

Proof: Let λi be the eigenvalues and vi the corresponding eigenvectors of A^T A. Then {Av1, . . . , Avr} is an orthogonal basis for Col A (Theorem 68).

We normalize each Avi to obtain an orthonormal basis {u1, . . . , ur} for Col A:

ui = (1/‖Avi‖) Avi = (1/σi) Avi,

from which we obtain

Avi = σi ui, 1 ≤ i ≤ r.

The set {u1, . . . , ur} can be extended, if necessary, to an orthonormal basis {u1, . . . , um} for R^m by adding suitable vectors from the orthogonal complement of {u1, . . . , ur}.
The Singular Value Decomposition

Now let

U = [u1 . . . um] and V = [v1 . . . vn].

By construction, U and V are orthogonal matrices. Now

AV = [Av1 . . . Avr 0 . . . 0] = [σ1 u1 . . . σr ur 0 . . . 0]

and

U Σ = [σ1 u1 . . . σr ur 0 . . . 0] = AV,

that is, AV = U Σ. But V is an orthogonal matrix, so V V^T = I, and

A V V^T = U Σ V^T,

and so

A = U Σ V^T.

The Singular Value Decomposition
Example: Construct a singular value decomposition of A:

A = [ 4  11  14 ]
    [ 8   7  -2 ]

Solution: Step 1: Construct an orthogonal diagonalization of A^T A.

We need the eigenvalues and corresponding eigenvectors of A^T A.

We have already calculated them in previous examples. The eigenvalues are (in descending order) 360, 90, 0, with corresponding eigenvectors

v1 = (1/3, 2/3, 2/3),  v2 = (-2/3, -1/3, 2/3),  v3 = (2/3, -2/3, 1/3).

Thus

V = [v1 v2 v3] = [ 1/3  -2/3   2/3 ]
                 [ 2/3  -1/3  -2/3 ]
                 [ 2/3   2/3   1/3 ]

The Singular Value Decomposition
Step 2: The singular values of A are:

σ1 = √360 = 6√10,  σ2 = √90 = 3√10,  σ3 = 0.

The nonzero values are the diagonal entries of D:

D = [ 6√10    0   ]
    [   0   3√10  ]

Σ = [D 0] = [ 6√10    0    0 ]
            [   0   3√10   0 ]

Step 3: Construct U. When A has rank r, the first r columns of U are the normalized vectors obtained from Av1, . . . , Avr. A has two nonzero singular values, so rank A = 2 and

‖Av1‖ = σ1,  ‖Av2‖ = σ2.

Thus the columns of U are

u1 = (1/σ1) Av1 = (1/(6√10)) (18, 6)^T = (3/√10, 1/√10)^T,
u2 = (1/σ2) Av2 = (1/(3√10)) (3, -9)^T = (1/√10, -3/√10)^T.

The set {u1, u2} is already a basis for R^2. No additional vectors are needed for U, and U = [u1 u2].
The Singular Value Decomposition

Thus the singular value decomposition is

A = U Σ V^T

  = [ 3/√10   1/√10 ] [ 6√10    0    0 ] [  1/3   2/3   2/3 ]
    [ 1/√10  -3/√10 ] [   0   3√10   0 ] [ -2/3  -1/3   2/3 ]
                                         [  2/3  -2/3   1/3 ]

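A NumPy sketch (illustrative) assembling the factors exactly as in Steps 1-3 and checking the product:

```python
import numpy as np

A = np.array([[4., 11., 14.],
              [8.,  7., -2.]])
V = np.array([[1., -2.,  2.],
              [2., -1., -2.],
              [2.,  2.,  1.]]) / 3          # columns v1, v2, v3
s1, s2 = 6 * np.sqrt(10), 3 * np.sqrt(10)   # nonzero singular values

u1 = A @ V[:, 0] / s1   # Step 3: u_i = (1/sigma_i) A v_i
u2 = A @ V[:, 1] / s2
U = np.column_stack([u1, u2])

Sigma = np.array([[s1, 0., 0.],
                  [0., s2, 0.]])
print(np.allclose(A, U @ Sigma @ V.T))      # True
```
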
The Singular Value Decomposition

Example: Find a singular value decomposition for A given by

A = [  1  -1 ]
    [ -2   2 ]
    [  2  -2 ]

Solution: A = U Σ V^T. First we calculate A^T A:

A^T A = [  9  -9 ]
        [ -9   9 ]

Eigenvalues: λ1 = 18 and λ2 = 0, with corresponding eigenvectors:

v1 = (1/√2, -1/√2),  v2 = (1/√2, 1/√2).

These unit vectors form the columns of V:

V = [v1 v2] = [  1/√2  1/√2 ]
              [ -1/√2  1/√2 ]

The Singular Value Decomposition

Singular values of A:

σ1 = √18 = 3√2,  σ2 = 0.

Since there is only one nonzero singular value, the matrix D is of order 1 × 1:

D = (3√2).

The matrix Σ is the same size as A, with D in the top-left corner:

Σ = [ 3√2  0 ]
    [  0   0 ]
    [  0   0 ]

To construct U we first calculate Av1 and Av2:

Av1 = (2/√2, -4/√2, 4/√2)^T,  Av2 = (0, 0, 0)^T.

The Singular Value Decomposition
The only column found for U so far is therefore

u1 = (1/(3√2)) Av1 = (1/3, -2/3, 2/3)^T.

The other columns must be found by extending {u1} to an orthonormal basis for R^3.

We need two orthonormal vectors orthogonal to u1. Each such vector must satisfy u1^T x = 0, which is equivalent to

x1 - 2x2 + 2x3 = 0.

Two independent solutions are

w1 = (2, 1, 0)^T,  w2 = (-2, 0, 1)^T.

The Singular Value Decomposition

We apply the Gram-Schmidt process to {w1, w2}. The result is

u2 = (2/√5, 1/√5, 0)^T,  u3 = (-2/√45, 4/√45, 5/√45)^T.

Therefore U = [u1 u2 u3] and the singular value decomposition takes the form

A = [  1/3  2/√5  -2/√45 ] [ 3√2  0 ] [ 1/√2  -1/√2 ]
    [ -2/3  1/√5   4/√45 ] [  0   0 ] [ 1/√2   1/√2 ]
    [  2/3   0     5/√45 ] [  0   0 ]

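The Gram-Schmidt step can be sketched in NumPy (illustrative; w1 and w2 are the solutions found above, both already orthogonal to u1):

```python
import numpy as np

u1 = np.array([1., -2., 2.]) / 3
w1 = np.array([2., 1., 0.])
w2 = np.array([-2., 0., 1.])

# Gram-Schmidt on {w1, w2}
u2 = w1 / np.linalg.norm(w1)       # [2, 1, 0] / sqrt(5)
w2p = w2 - (w2 @ u2) * u2          # subtract the component along u2
u3 = w2p / np.linalg.norm(w2p)     # [-2, 4, 5] / sqrt(45)

U = np.column_stack([u1, u2, u3])
print(np.allclose(U.T @ U, np.eye(3)))   # True: U is orthogonal
```
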
Bases for Fundamental Subspaces
Let A be an m × n matrix;
u1, . . . , um be the left singular vectors;
v1, . . . , vn be the right singular vectors;
σ1, . . . , σr be the singular values;
r be the rank of A.

Then

{u1, . . . , ur}

is an orthonormal basis for Col A (Theorem 68).

Also recall that (Col A)⊥ = Nul A^T, and so

{u_{r+1}, . . . , um}

is an orthonormal basis for Nul A^T.

Since ‖Avi‖ = σi = 0 for i > r, the vectors v_{r+1}, . . . , vn lie in Nul A; as dim Nul A = n − r, the set

{v_{r+1}, . . . , vn}

is an orthonormal basis for Nul A.

Finally, since (Nul A)⊥ = Col A^T = Row A,

{v1, . . . , vr}

is an orthonormal basis for Row A.
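
A NumPy sketch (illustrative) reading off all four subspace bases from a full SVD of the 3 × 2 example above:

```python
import numpy as np

A = np.array([[ 1., -1.],
              [-2.,  2.],
              [ 2., -2.]])

U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 1e-10))   # numerical rank (here r = 1)

col_A  = U[:, :r]      # orthonormal basis for Col A
nul_At = U[:, r:]      # orthonormal basis for Nul A^T
row_A  = Vt[:r, :].T   # orthonormal basis for Row A
nul_A  = Vt[r:, :].T   # orthonormal basis for Nul A
```
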
Bases for Fundamental Subspaces

   √   √ 
1/3 2/
√ 5 −2/√ 45
 −2/3  , u2 =  1 5/  , u3 =  0/√45 
u1 =      
2/3 0 5/ 45
" √ # " √ #
1/ √2 1/√2
v1 = , v2 = .
−1/ 2 1/ 2

The Singular Value Decomposition
The singular value decomposition of A is A = U Σ V^T. We can write this as

A = [u1 . . . um] [ σ1  . . .  0   0 ] [ v1^T ]
                  [ 0   . . .  0   0 ] [  ..  ]
                  [ 0     0   σr   0 ] [ vn^T ]
                  [ 0     0    0   0 ]

  = σ1 u1 v1^T + σ2 u2 v2^T + . . . + σr ur vr^T.

Notes:

The original matrix requires m × n floating point numbers to be stored, while this expansion requires only

m × r + n × r + r = r(m + n + 1).

Usually some of the singular values are very small, therefore

A ≈ Ak = σ1 u1 v1^T + σ2 u2 v2^T + . . . + σk uk vk^T,

where k < r.

This is why SVD-based image compression works: k is the rank of the approximation.


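A NumPy sketch (illustrative) of the truncated expansion; by the Eckart-Young theorem, the spectral-norm error of the best rank-k approximation equals σ_{k+1}:

```python
import numpy as np

def rank_k_approx(A, k):
    """Return A_k = sigma_1 u_1 v_1^T + ... + sigma_k u_k v_k^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 40))
s = np.linalg.svd(A, compute_uv=False)

for k in (1, 5, 20):
    err = np.linalg.norm(A - rank_k_approx(A, k), 2)   # spectral norm
    print(k, err, s[k])   # err matches sigma_{k+1} (0-based index k)
```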
