
KASHIF JAVED
EED, UET, Lahore

Lecture 8
Eigenvectors and the Anisotropic Multivariate Normal Distribution

Readings:
▪ https://people.eecs.berkeley.edu/~jrs/189/
Eigenvectors
• Given a square matrix 𝐴, if 𝐴𝑣 = 𝜆𝑣 for some vector 𝑣 ≠ 0 and scalar 𝜆, then 𝑣 is an eigenvector of 𝐴 and 𝜆 is the eigenvalue of 𝐴 associated with 𝑣.

• It means that 𝑣 is a magical vector that, after being multiplied by 𝐴, still points in the same direction, or in exactly the opposite direction.

• The eigenvector 𝑣 is said to be normalized if ‖𝑣‖ = 1

• For a normalized eigenvector, 𝑣ᵀ𝐴𝑣 = 𝜆𝑣ᵀ𝑣 = 𝜆
Eigenvectors
• 𝐴 = [3/4  5/4; 5/4  3/4]

• Find the eigenvalues and eigenvectors.

Eigenvectors
• 𝐴 = [3/4  5/4; 5/4  3/4]

• Solve (𝐴 − 𝜆𝐼)𝑣 = 0
• Find the roots of det(𝐴 − 𝜆𝐼) = 0

Eigenvectors
• 𝐴 = [3/4  5/4; 5/4  3/4]

• 𝜆² − 1.5𝜆 − 1 = 0

• 𝜆1 = 2
• 𝜆2 = −1/2

Eigenvectors
• 𝐴 = [3/4  5/4; 5/4  3/4]

• 𝜆1 = 2 and its corresponding eigenvector: 𝑣 = [1/√2, 1/√2]ᵀ

• 𝜆2 = −1/2 and its corresponding eigenvector: 𝑤 = [−1/√2, 1/√2]ᵀ
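As a quick sanity check, here is a minimal numpy sketch (an added illustration, not part of the original slides) that recovers these eigenvalues and eigenvectors; numpy may order the eigenvalues differently and flip the eigenvector signs:

import numpy as np

A = np.array([[3/4, 5/4],
              [5/4, 3/4]])
# eigh is the right routine for a symmetric matrix: it returns the
# eigenvalues in ascending order and orthonormal eigenvectors as columns
lam, V = np.linalg.eigh(A)
print(lam)   # [-0.5  2. ]
print(V)     # columns are (up to sign) [-1/sqrt(2), 1/sqrt(2)] and [1/sqrt(2), 1/sqrt(2)]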
Theorem
• Theorem: if 𝑣 is an eigenvector of 𝐴 with eigenvalue 𝜆, then 𝑣 is an eigenvector of 𝐴ᵏ with eigenvalue 𝜆ᵏ, where 𝑘 is a positive integer

• Proof: 𝐴²𝑣 = 𝐴(𝜆𝑣) = 𝜆(𝐴𝑣) = 𝜆²𝑣, etc.

Eigenvectors

For most matrices, most vectors don't have this property. So, the ones that do are special, and we call them eigenvectors.
Eigenvectors

Clearly, when you scale an eigenvector, it's still an eigenvector. Only the direction matters, not the length.
Theorem
• Theorem: if 𝐴 is invertible and 𝐴𝑣 = 𝜆𝑣, then 𝑣 is an eigenvector of 𝐴⁻¹ with eigenvalue 1/𝜆

• Proof: 𝐴⁻¹𝑣 = (1/𝜆) 𝐴⁻¹(𝐴𝑣) = (1/𝜆) 𝑣

Eigenvectors

Look at the figures, but go from right to left.
Eigenvectors
• When you invert a matrix, the eigenvectors don’t change, but the
eigenvalues get inverted

• When you square a matrix, the eigenvectors don’t change, but the
eigenvalues get squared
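Both facts are easy to confirm numerically; a small numpy sketch (an added illustration) using the example matrix 𝐴 from earlier:

import numpy as np

A = np.array([[3/4, 5/4],
              [5/4, 3/4]])
lam, _ = np.linalg.eigh(A)                      # eigenvalues of A: [-0.5, 2.0]
lam_sq, _ = np.linalg.eigh(A @ A)               # eigenvalues of A^2
lam_inv, _ = np.linalg.eigh(np.linalg.inv(A))   # eigenvalues of A^-1
print(np.allclose(np.sort(lam**2), lam_sq))     # True: the eigenvalues got squared
print(np.allclose(np.sort(1/lam), lam_inv))     # True: the eigenvalues got inverted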

Spectral Theorem
• Every real, symmetric 𝑛 × 𝑛 matrix has real eigenvalues and 𝑛 eigenvectors that are mutually orthogonal, i.e., 𝑣ᵢᵀ𝑣ⱼ = 0 for all 𝑖 ≠ 𝑗

• We can use them as a basis for ℝ𝑛 .
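For instance, the following numpy sketch (an added illustration) builds a random symmetric matrix and checks that its eigenvalues are real and its eigenvectors form an orthonormal basis:

import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
M = (B + B.T) / 2                        # a real, symmetric matrix
lam, V = np.linalg.eigh(M)
print(lam.dtype)                         # float64: the eigenvalues are real
print(np.allclose(V.T @ V, np.eye(5)))   # True: orthonormal eigenvector basis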

Building a Matrix with Specified Eigenvectors
• There are a lot of applications where you’re given a matrix, and you want
to extract the eigenvectors and eigenvalues.

• But when you’re learning the math, it’s more intuitive to go in the opposite
direction

• Suppose you know what eigenvectors and eigenvalues you want, and you
want to create the matrix that has those eigenvectors and eigenvalues
Building a Matrix with Specified Eigenvectors
• Choose 𝑛 mutually orthogonal unit 𝑛-vectors 𝑣1, . . . , 𝑣𝑛; they specify an orthonormal coordinate system
• Let 𝑉 = [𝑣1 . . . 𝑣𝑛] ⇐ 𝑛 × 𝑛 matrix
• Observe: 𝑉ᵀ𝑉 = 𝐼 (off-diagonal 0's because the vectors are orthogonal; diagonal 1's because they're unit vectors)
• ⇒ 𝑉ᵀ = 𝑉⁻¹ ⇒ 𝑉𝑉ᵀ = 𝐼
• 𝑉 is an orthonormal matrix: it acts like a rotation (or a reflection)
Building a Matrix with Specified Eigenvectors
• Choose some radii 𝜆ᵢ:
• Let Λ = diag(𝜆1, . . . , 𝜆𝑛), i.e., the 𝑛 × 𝑛 matrix with 𝜆1, . . . , 𝜆𝑛 on the diagonal and zeros elsewhere
• A diagonal matrix of eigenvalues

Building a Matrix with Specified Eigenvectors
• Definition of eigenvector: 𝐴𝑉 = 𝑉Λ

• This is the same definition of eigenvector that was given at the start of the lecture, 𝐴𝑣 = 𝜆𝑣, but this version covers all 𝑛 eigenvectors in one statement
• How do we find the 𝐴 that satisfies this equation? Multiply both sides by 𝑉ᵀ on the right:

⇒ 𝐴𝑉𝑉ᵀ = 𝑉Λ𝑉ᵀ, and since 𝑉𝑉ᵀ = 𝐼, this leads us to . . .

Theorem
𝐴 = 𝑉Λ𝑉ᵀ = ∑ᵢ₌₁ⁿ 𝜆ᵢ 𝑣ᵢ𝑣ᵢᵀ has the chosen eigenvectors/eigenvalues
(each outer product 𝑣ᵢ𝑣ᵢᵀ is an 𝑛 × 𝑛 matrix of rank 1)

• This is a matrix factorization called the eigendecomposition
• Λ is the diagonalized version of 𝐴
• Every real, symmetric matrix has one
Building a Matrix with Specified Eigenvectors
• Example: using the eigenvectors and eigenvalues from the start of the lecture,

𝐴 = [1/√2  −1/√2; 1/√2  1/√2] [2  0; 0  −1/2] [1/√2  1/√2; −1/√2  1/√2] = [3/4  5/4; 5/4  3/4]

• This completes our task of finding a symmetric matrix with specified orthonormal eigenvectors and eigenvalues
• It is more common in practice that you need to compute the eigenvectors and eigenvalues of a symmetric matrix, such as a sample covariance matrix
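The construction is easy to reproduce numerically; a minimal numpy sketch (an added illustration) assembling 𝐴 = 𝑉Λ𝑉ᵀ from the chosen eigenvectors and eigenvalues:

import numpy as np

s = 1 / np.sqrt(2)
V = np.array([[s, -s],
              [s,  s]])        # columns are the chosen eigenvectors v and w
Lam = np.diag([2.0, -0.5])     # the chosen eigenvalues
A = V @ Lam @ V.T
print(A)                       # [[0.75 1.25]
                               #  [1.25 0.75]]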
Building a Matrix with Specified Eigenvectors
• Observe: 𝐴² = 𝑉Λ𝑉ᵀ𝑉Λ𝑉ᵀ = 𝑉Λ²𝑉ᵀ and 𝐴⁻² = 𝑉Λ⁻²𝑉ᵀ

• This is another way to see that squaring a matrix squares its eigenvalues
without changing its eigenvectors.

• It also suggests a way to define a matrix square root.

Building a Matrix with Specified Eigenvectors
• Given a symmetric PSD matrix Σ, we can find a symmetric square root matrix 𝐴 = Σ^(1/2):

▪ compute the eigenvectors/eigenvalues of Σ
▪ take the square roots of Σ's eigenvalues (a square root of a diagonal matrix is just the square roots of the diagonal entries)
▪ reassemble the matrix 𝐴 = 𝑉Λ^(1/2)𝑉ᵀ, with the same eigenvectors as Σ but changed eigenvalues
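A minimal numpy sketch of this recipe (an added illustration, assuming Σ is symmetric and PSD):

import numpy as np

def sym_sqrt(Sigma):
    # eigendecompose, take square roots of the eigenvalues,
    # and reassemble with the same eigenvectors
    lam, V = np.linalg.eigh(Sigma)
    return V @ np.diag(np.sqrt(lam)) @ V.T

Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
A = sym_sqrt(Sigma)
print(np.allclose(A @ A, Sigma))   # True: A is a square root of Sigma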

Visualizing Quadratic Forms
• To visualize a symmetric matrix, we can graph something called the
quadratic form, which shows how applying the matrix affects the length of
a vector

• The quadratic form of 𝑀 is 𝑥ᵀ𝑀𝑥

Visualizing Quadratic Forms
• Let's compare two different functions:

‖𝑧‖² = 𝑧ᵀ𝑧 ⇐ quadratic; isotropic; isosurfaces are spheres

‖𝐴⁻¹𝑥‖² = 𝑥ᵀ𝐴⁻²𝑥 ⇐ quadratic form of 𝐴⁻² (𝐴 is symmetric); anisotropic; isosurfaces are ellipsoids

• We are going to use eigenvectors as a way to transform the shape of the first function into the shape of the second function
[Figures: the map 𝑥 = 𝐴𝑧 carries unit vectors on the circle in the (𝑧1, 𝑧2)-plane to an ellipse in the (𝑥1, 𝑥2)-plane. The ellipse's axes are the eigenvector directions 𝑣1 (𝜆1 = 2) and 𝑣2 (𝜆2 = −1/2): the matrix is mapping circles onto ellipses.]
Visualizing Quadratic Forms
• The isocontours of the quadratic form 𝑥ᵀ𝐴⁻²𝑥 are ellipsoids determined by the eigenvectors/eigenvalues of 𝐴

• {𝑥 : ‖𝐴⁻¹𝑥‖² = 1} is an ellipsoid with axes 𝑣1, 𝑣2, . . . , 𝑣𝑛 and radii 𝜆1, 𝜆2, . . . , 𝜆𝑛, because if 𝐴⁻¹𝑥 = 𝑣ᵢ has length 1 (𝑣ᵢ lies on the unit circle), then 𝑥 = 𝐴𝑣ᵢ has length 𝜆ᵢ (𝐴𝑣ᵢ lies on the ellipsoid)

• Special case: 𝐴 is diagonal ⇔ eigenvectors are the coordinate axes ⇔ ellipsoids are axis-aligned
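To see this numerically, the following numpy sketch (an added illustration) pushes each unit eigenvector through 𝐴 and checks that the image's length is the corresponding radius:

import numpy as np

A = np.array([[3/4, 5/4],
              [5/4, 3/4]])
lam, V = np.linalg.eigh(A)       # lam = [-0.5, 2.0]
for i in range(2):
    v = V[:, i]                  # unit eigenvector, on the unit circle
    x = A @ v                    # its image, on the ellipsoid {x : ||inv(A) x|| = 1}
    print(np.linalg.norm(x), abs(lam[i]))   # the lengths agree: |lambda_i|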
Visualizing Quadratic Forms
• A symmetric matrix 𝑀 is
• positive definite if 𝑤ᵀ𝑀𝑤 > 0 for all 𝑤 ≠ 0 ⇔ all eigenvalues are positive

• positive semidefinite if 𝑤ᵀ𝑀𝑤 ≥ 0 for all 𝑤 ⇔ all eigenvalues are nonnegative

• indefinite if it has at least one positive eigenvalue and at least one negative eigenvalue

• invertible if it has no zero eigenvalue
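These conditions are straightforward to test numerically; a small sketch (an added illustration) that classifies a symmetric matrix by its eigenvalues:

import numpy as np

def classify(M, tol=1e-12):
    lam = np.linalg.eigvalsh(M)          # eigenvalues of a symmetric matrix
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam >= -tol):
        return "positive semidefinite"
    if lam.min() < -tol and lam.max() > tol:
        return "indefinite"
    return "negative (semi)definite"

print(classify(np.array([[2.0, 0.0], [0.0, 1.0]])))   # positive definite
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))   # positive semidefinite
print(classify(np.array([[3/4, 5/4], [5/4, 3/4]])))   # indefinite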
Visualizing Quadratic Forms
[Figure: the quadratic form of a positive semidefinite matrix with an eigenvalue 𝜆 = 0 has a flat trough, a whole line of minima.]

Positive eigenvalues correspond to axes where the curvature goes up; negative eigenvalues correspond to axes where the curvature goes down.
Visualizing Quadratic Forms
• What does this tell us about 𝑥ᵀ𝐴⁻²𝑥?

• Every squared matrix is positive semidefinite, including 𝐴⁻²: the eigenvalues of 𝐴⁻² are squared, so they cannot be negative
• If 𝐴⁻² exists, it is positive definite, because an invertible matrix has no zero eigenvalues

Anisotropic Gaussians
• A multivariate normal distribution (Gaussian):

𝑋 ~ 𝒩(𝜇, Σ):  𝑓(𝑥) = (1 / √((2𝜋)^𝑑 |Σ|)) exp(−(1/2) (𝑥 − 𝜇)ᵀ Σ⁻¹ (𝑥 − 𝜇))

• 𝑋 and 𝜇 are 𝑑-vectors; 𝑋 is a random variable with mean 𝜇
• |Σ| is the determinant of Σ, which is a 𝑑 × 𝑑 SPD covariance matrix
• Σ⁻¹ is a 𝑑 × 𝑑 SPD precision matrix
Anisotropic Gaussians
• A multivariate normal distribution (Gaussian):

𝑋 ~ 𝒩(𝜇, Σ):  𝑓(𝑥) = (1 / √((2𝜋)^𝑑 |Σ|)) exp(−(1/2) (𝑥 − 𝜇)ᵀ Σ⁻¹ (𝑥 − 𝜇))

• Write 𝑓(𝑥) = 𝑛(𝑞(𝑥)), where 𝑛(𝑞) = (1 / √((2𝜋)^𝑑 |Σ|)) exp(−𝑞/2) and 𝑛 takes a scalar argument
• where 𝑞(𝑥) = (𝑥 − 𝜇)ᵀ Σ⁻¹ (𝑥 − 𝜇)
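A direct numpy transcription of 𝑓(𝑥) = 𝑛(𝑞(𝑥)) (an added sketch; the function name is illustrative):

import numpy as np

def gaussian_pdf(x, mu, Sigma):
    d = len(mu)
    q = (x - mu) @ np.linalg.inv(Sigma) @ (x - mu)   # q(x), the quadratic form
    # n(q): an exponential of the negation of half its argument
    return np.exp(-q / 2) / np.sqrt((2*np.pi)**d * np.linalg.det(Sigma))

mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
print(gaussian_pdf(np.array([1.0, 1.0]), mu, Sigma))  # about 0.0658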


Anisotropic Gaussians
• Write 𝑓(𝑥) = 𝑛(𝑞(𝑥)), where 𝑞(𝑥) = (𝑥 − 𝜇)ᵀ Σ⁻¹ (𝑥 − 𝜇)
  (𝑛 : ℝ → ℝ is exponential; 𝑞 : ℝ^𝑑 → ℝ is quadratic)

• Now 𝑞(𝑥) is a function we understand: it's just a quadratic bowl centered at 𝜇, the quadratic form of the precision matrix Σ⁻¹.
• The other function 𝑛(·) is a simple, monotonic function: an exponential of the negation of half its argument.
• This mapping 𝑛(·) does not change the isosurfaces.
Anisotropic Gaussians
• Principle: given a monotonic 𝑛 : ℝ → ℝ, the isosurfaces of 𝑛(𝑞(𝑥)) are the same as those of 𝑞(𝑥) (with different isovalues)

[Figure: a paraboloid (left) becomes a bivariate Gaussian (right) after you compose it with a scalar function (center).]
Anisotropic Gaussians
• One of the main ideas is that if you understand the isosurfaces of a quadratic function, then you understand the isosurfaces of a Gaussian, because they're the same.

• The differences are in the isovalues: in particular, the Gaussian achieves its maximum at the mean, and decreases to zero as you move infinitely far away from the mean.

Anisotropic Gaussians
• The isocontours of (𝑥 − 𝜇)ᵀ Σ⁻¹ (𝑥 − 𝜇) are determined by the eigenvectors/eigenvalues of Σ^(1/2).

• Next lecture, we'll consider the implications of this a bit more.

Covariance
• Let 𝑅, 𝑆 be random variables, column vectors or scalars

• 𝐶𝑜𝑣(𝑅, 𝑆) = 𝐸[(𝑅 − 𝐸[𝑅])(𝑆 − 𝐸[𝑆])ᵀ] = 𝐸[𝑅𝑆ᵀ] − 𝜇𝑅 𝜇𝑆ᵀ
• 𝑉𝑎𝑟(𝑅) = 𝐶𝑜𝑣(𝑅, 𝑅)
• If 𝑅 is a vector, the covariance matrix for 𝑅 is

𝑉𝑎𝑟(𝑅) = [𝑉𝑎𝑟(𝑅1) ⋯ 𝐶𝑜𝑣(𝑅1, 𝑅𝑑); ⋮ ⋱ ⋮; 𝐶𝑜𝑣(𝑅𝑑, 𝑅1) ⋯ 𝑉𝑎𝑟(𝑅𝑑)]
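For sample data, the covariance matrix can be estimated with numpy (an added sketch; note that np.cov treats rows as features unless rowvar=False):

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))           # 100 samples, 3 features
Sigma_hat = np.cov(X, rowvar=False)         # 3 x 3 sample covariance matrix
print(Sigma_hat.shape)                      # (3, 3)
print(np.allclose(Sigma_hat, Sigma_hat.T))  # True: covariance matrices are symmetric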
Iris Flower: Pairwise Scatter Plot

[Figures: pairwise scatter plots of the Iris flower features.]
Covariance
• For a Gaussian 𝑅 ~ 𝒩(𝜇, Σ), one can show that 𝑉𝑎𝑟(𝑅) = Σ

• 𝑅ᵢ, 𝑅ⱼ independent ⇒ 𝐶𝑜𝑣(𝑅ᵢ, 𝑅ⱼ) = 0

• the reverse implication is not generally true, but . . .

• 𝐶𝑜𝑣(𝑅ᵢ, 𝑅ⱼ) = 0 AND multivariate normal dist. ⇒ 𝑅ᵢ, 𝑅ⱼ independent

Covariance
• all features pairwise independent ⇒ 𝑉𝑎𝑟(𝑅) is diagonal

• the reverse is not generally true, but . . .

• 𝑉𝑎𝑟(𝑅) is diagonal AND joint normal

⇔ axis-aligned Gaussian; squared radii on the diagonal of Σ = 𝑉𝑎𝑟(𝑅)
⇔ 𝑓(𝑥) = 𝑓1(𝑥1) 𝑓2(𝑥2) ⋯ 𝑓𝑑(𝑥𝑑), i.e., the multivariate Gaussian PDF factors into a product of univariate Gaussian PDFs
Covariance
• When the features are independent, you can write the multivariate Gaussian PDF as a product of univariate Gaussian PDFs

• When they aren't, you can do a change of coordinates to the eigenvector coordinate system, and write it as a product of univariate Gaussian PDFs in eigenvector coordinates
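The first claim is easy to verify numerically; a small sketch (an added illustration) comparing a diagonal-covariance Gaussian PDF with the product of its univariate marginals:

import numpy as np

def univariate_pdf(x, mu, var):
    return np.exp(-(x - mu)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

mu = np.array([1.0, -2.0])
Sigma = np.diag([4.0, 0.25])    # diagonal covariance: independent features
x = np.array([0.5, -1.0])

d = len(mu)
q = (x - mu) @ np.linalg.inv(Sigma) @ (x - mu)
f_joint = np.exp(-q / 2) / np.sqrt((2*np.pi)**d * np.linalg.det(Sigma))
f_prod = univariate_pdf(x[0], mu[0], Sigma[0, 0]) * univariate_pdf(x[1], mu[1], Sigma[1, 1])
print(np.allclose(f_joint, f_prod))   # True: the multivariate PDF factorizes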

Bivariate Gaussian Distribution

[Figure: a bivariate Gaussian density plotted over the (𝑥1, 𝑥2)-plane.]
Bivariate Gaussian Distribution

[Figure] When the variables are independent, the major axes of the density are parallel to the input axes; the density becomes an ellipse if the variances are different.
Bivariate Gaussian Distribution

[Figure] The density rotates depending on the sign of the covariance.
