
10-606 Mathematical Foundations for Machine Learning

Machine Learning Department


School of Computer Science
Carnegie Mellon University

Deriving Principal
Component Analysis
(PCA)
Matt Gormley
Lecture 11
Oct. 3, 2018

1
Reminders
• Quiz 1: Linear Algebra (today)
• Homework 3: Matrix Calculus + Probability
– Out: Wed, Oct. 3
– Due: Wed, Oct. 10 at 11:59pm
• Quiz 2: Matrix Calculus + Probability
– In-class, Wed, Oct. 10

3
Q&A
4
DIMENSIONALITY REDUCTION

6
PCA Outline
• Dimensionality Reduction
– High-dimensional data
– Learning (low dimensional) representations
• Principal Component Analysis (PCA)
– Examples: 2D and 3D
– Data for PCA
– PCA Definition
– Objective functions for PCA
– PCA, Eigenvectors, and Eigenvalues
– Algorithms for finding Eigenvectors /
Eigenvalues
• PCA Examples
– Face Recognition
– Image Compression

7
High Dimension Data
Examples of high dimensional data:
– High resolution images (millions of pixels)

8
High Dimension Data
Examples of high dimensional data:
– Multilingual News Stories
(vocabulary of hundreds of thousands of words)

9
High Dimension Data
Examples of high dimensional data:
– Brain Imaging Data (100s of MBs per scan)

Image from (Wehbe et al., 2014)


10
Image from https://pixabay.com/en/brain-mrt-magnetic-resonance-imaging-1728449/
High Dimension Data
Examples of high dimensional data:
– Customer Purchase Data

11
Learning Representations
PCA, Kernel PCA, ICA: Powerful unsupervised learning techniques
for extracting hidden (potentially lower dimensional) structure
from high dimensional datasets.
Useful for:
• Visualization
• More efficient use of resources
(e.g., time, memory, communication)

• Statistical: fewer dimensions → better generalization


• Noise removal (improving data quality)
• Further processing by machine learning algorithms
Slide from Nina Balcan
PRINCIPAL COMPONENT
ANALYSIS (PCA)

16
PCA Outline
• Dimensionality Reduction
– High-dimensional data
– Learning (low dimensional) representations
• Principal Component Analysis (PCA)
– Examples: 2D and 3D
– Data for PCA
– PCA Definition
– Objective functions for PCA
– PCA, Eigenvectors, and Eigenvalues
– Algorithms for finding Eigenvectors / Eigenvalues
• PCA Examples
– Face Recognition
– Image Compression
17
Principal Component Analysis (PCA)

In the case where data lies on or near a low d-dimensional linear subspace, the axes of this subspace are an effective representation of the data.

Identifying these axes is known as Principal Components Analysis, and they can be obtained using classic matrix computation tools (Eigen or Singular Value Decomposition).

Slide from Nina Balcan


2D Gaussian dataset

Slide from Barnabas Poczos


1st PCA axis

Slide from Barnabas Poczos


2nd PCA axis

Slide from Barnabas Poczos


Principal Component Analysis (PCA)
Whiteboard
– Data for PCA
– PCA Definition
– Objective functions for PCA

22
Data for PCA

$$\mathcal{D} = \{\mathbf{x}^{(i)}\}_{i=1}^{N}, \qquad
\mathbf{X} = \begin{bmatrix} (\mathbf{x}^{(1)})^T \\ (\mathbf{x}^{(2)})^T \\ \vdots \\ (\mathbf{x}^{(N)})^T \end{bmatrix}$$

We assume the data is centered, and that each axis has sample variance equal to one:

$$\boldsymbol{\mu} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{x}^{(i)} = \mathbf{0}, \qquad
\sigma_j^2 = \frac{1}{N} \sum_{i=1}^{N} \big(x_j^{(i)}\big)^2 = 1$$

23
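These centering and unit-variance assumptions correspond to a standard preprocessing step applied before running PCA. A minimal NumPy sketch of that preprocessing (the raw data here is synthetic, only to show the check; it is not from the lecture):

import numpy as np

# Hypothetical raw data matrix: N examples (rows) by M features (columns).
X_raw = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=(100, 3))

X = X_raw - X_raw.mean(axis=0)   # center: per-feature sample mean becomes 0
X = X / X.std(axis=0)            # scale: per-feature sample variance becomes 1

print(X.mean(axis=0))            # approximately 0 for every feature
print(X.var(axis=0))             # approximately 1 for every feature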
Sample Covariance Matrix
The sample covariance matrix is given by:

$$\Sigma_{jk} = \frac{1}{N} \sum_{i=1}^{N} \big(x_j^{(i)} - \mu_j\big)\big(x_k^{(i)} - \mu_k\big)$$

Since the data matrix is centered, we can rewrite this as:

$$\Sigma = \frac{1}{N} \mathbf{X}^T \mathbf{X}$$
24
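A quick numerical check of the centered-data shortcut Σ = XᵀX / N against NumPy's own covariance routine (a sketch, not from the slides):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
X = X - X.mean(axis=0)               # center so the simplified formula applies

N = X.shape[0]
Sigma = (X.T @ X) / N                # sample covariance of the centered data

# np.cov treats rows as variables and uses ddof=1 by default; match 1/N instead.
Sigma_np = np.cov(X, rowvar=False, bias=True)
print(np.allclose(Sigma, Sigma_np))  # True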
Maximizing the Variance
Quiz: Consider the two projections below
1. Which maximizes the variance?
2. Which minimizes the reconstruction error?

Option A Option B

25
PCA
Equivalence of Maximizing Variance and Minimizing Reconstruction Error

26
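The slide above only states the equivalence; the standard argument (added here for completeness) decomposes each centered point $\mathbf{x}^{(i)}$ along a unit-length direction $\mathbf{v}$:

$$\|\mathbf{x}^{(i)}\|^2 = \underbrace{(\mathbf{v}^T \mathbf{x}^{(i)})^2}_{\text{squared projection}} + \underbrace{\|\mathbf{x}^{(i)} - (\mathbf{v}^T \mathbf{x}^{(i)})\,\mathbf{v}\|^2}_{\text{reconstruction error}}$$

Summing over $i$, the left-hand side does not depend on $\mathbf{v}$, so choosing $\mathbf{v}$ to maximize the first term (the variance of the projections, since the data is centered) is exactly the same as choosing $\mathbf{v}$ to minimize the second term (the total reconstruction error).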
Principal Component Analysis (PCA)
Whiteboard
– PCA, Eigenvectors, and Eigenvalues
– Algorithms for finding Eigenvectors /
Eigenvalues
– SVD: Relation of Singular Vectors to
Eigenvectors

27
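The "Algorithms for finding Eigenvectors / Eigenvalues" item above is worked on the whiteboard and not captured in this transcript. As one illustrative sketch (mine, not necessarily the algorithm derived in lecture), power iteration finds the leading eigenvector of the sample covariance matrix:

import numpy as np

def power_iteration(Sigma, num_iters=1000):
    """Return the leading eigenvector/eigenvalue of a symmetric PSD matrix."""
    rng = np.random.default_rng(0)
    v = rng.normal(size=Sigma.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = Sigma @ v                 # repeatedly apply the matrix
        v /= np.linalg.norm(v)        # renormalize each step
    eigenvalue = v @ Sigma @ v        # Rayleigh quotient
    return v, eigenvalue

Sigma = np.cov(np.random.default_rng(1).normal(size=(200, 3)), rowvar=False)
v, lam = power_iteration(Sigma)
print(lam, np.linalg.eigvalsh(Sigma)[-1])   # the two values agree

Subsequent eigenvectors can be obtained by deflation (subtract λ v vᵀ from Σ and repeat), though in practice one typically calls an off-the-shelf eigensolver or uses the SVD, as the next slides do.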
SVD for PCA

28
SVD for PCA

29
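The two "SVD for PCA" slides above are derivations/figures not reproduced in this transcript. The relationship they rest on: for a centered N×M data matrix X with SVD X = U S Vᵀ, the right singular vectors V are the eigenvectors of XᵀX, and the squared singular values divided by N are the corresponding sample variances. A small NumPy check of this claim, as a sketch:

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
X = X - X.mean(axis=0)                       # centered data, N x M

# Route 1: eigendecomposition of the sample covariance.
Sigma = (X.T @ X) / X.shape[0]
eigvals, eigvecs = np.linalg.eigh(Sigma)     # eigenvalues in ascending order

# Route 2: SVD of the data matrix itself.
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Same principal directions (up to sign) and same variances.
print(np.allclose(np.sort(S**2 / X.shape[0]), eigvals))    # True
print(np.allclose(np.abs(Vt[0]), np.abs(eigvecs[:, -1])))  # True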
Principal Component Analysis (PCA)
$\mathbf{X}\mathbf{X}^T \mathbf{v} = \lambda \mathbf{v}$, so $\mathbf{v}$ (the first PC) is an eigenvector of the
sample correlation/covariance matrix $\mathbf{X}\mathbf{X}^T$ (on this slide $\mathbf{X}$ holds one example per
column, so $\mathbf{X}\mathbf{X}^T$ is proportional to the sample covariance).

Sample variance of the projection: $\mathbf{v}^T \mathbf{X}\mathbf{X}^T \mathbf{v} = \lambda\, \mathbf{v}^T \mathbf{v} = \lambda$

Thus, the eigenvalue $\lambda$ denotes the amount of variability captured along that
dimension (aka the amount of energy along that dimension).

Eigenvalues $\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \cdots$

• The 1st PC $\mathbf{v}_1$ is the eigenvector of the sample covariance matrix $\mathbf{X}\mathbf{X}^T$
  associated with the largest eigenvalue
• The 2nd PC $\mathbf{v}_2$ is the eigenvector of the sample covariance matrix $\mathbf{X}\mathbf{X}^T$
  associated with the second largest eigenvalue
• And so on …
Slide from Nina Balcan
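A quick numerical sanity check of the claim that the sample variance of the projection onto an eigenvector equals its eigenvalue (a sketch assuming NumPy, with examples stored as rows so the covariance is XᵀX/N as on the earlier "Sample Covariance Matrix" slide):

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 3)) @ np.diag([3.0, 1.0, 0.3])
X = X - X.mean(axis=0)                # centered data

Sigma = (X.T @ X) / X.shape[0]
eigvals, eigvecs = np.linalg.eigh(Sigma)
v1 = eigvecs[:, -1]                   # eigenvector with the largest eigenvalue

proj = X @ v1                         # scalar projection of each example onto v1
print(proj.var(), eigvals[-1])        # the two numbers agree (up to rounding)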
How Many PCs?
• For M original dimensions, the sample covariance matrix is M×M, and has up to M eigenvectors. So M PCs.
• Where does dimensionality reduction come from?
Can ignore the components of lesser significance.

[Bar chart: Variance (%) explained by PC1 through PC10, decreasing from roughly 25% for PC1 down toward 0% for PC10]

• You do lose some information, but if the eigenvalues are small, you don’t lose
much
– M dimensions in original data
– calculate M eigenvectors and eigenvalues
– choose only the first D eigenvectors, based on their eigenvalues (see the cumulative-variance sketch after this slide)
– final data set has only D dimensions

© Eric Xing @ CMU, 2006-2011 32
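A common way to operationalize "choose only the first D eigenvectors, based on their eigenvalues" is a cumulative explained-variance threshold; a sketch in NumPy (the 95% cutoff and the example eigenvalues are illustrative choices, not from the slides):

import numpy as np

def choose_num_components(eigvals, threshold=0.95):
    """Smallest D whose top eigenvalues capture `threshold` of the total variance."""
    vals = np.sort(eigvals)[::-1]               # largest eigenvalue first
    cumulative = np.cumsum(vals) / vals.sum()   # fraction of variance captured by top d
    return int(np.searchsorted(cumulative, threshold) + 1)

print(choose_num_components(np.array([25., 20., 15., 10., 5., 3., 2., 1.])))  # 6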


Slides from Barnabas Poczos

Original sources include:


• Karl Booksh Research group
• Tom Mitchell
• Ron Parr

PCA EXAMPLES

33
Face recognition

Slide from Barnabas Poczos


Challenge: Facial Recognition
• Want to identify specific person, based on facial image
• Robust to glasses, lighting,…
⇒ Can’t just use the given 256 × 256 pixels

Slide from Barnabas Poczos


Applying PCA: Eigenfaces
Method: Build one PCA database for the whole dataset and
then classify based on the weights.

• Example data set: Images of faces
  – Famous Eigenface approach
    [Turk & Pentland], [Sirovich & Kirby]
• Each face x is …
  – 256 × 256 real values (luminance at each location)
  – x ∈ ℝ^{256×256} (view as a 64K-dimensional vector)
• Data matrix X collects the m faces x1, …, xm

Slide from Barnabas Poczos
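A minimal NumPy sketch of the eigenface pipeline just described: build one PCA basis from the training faces, represent every face by its projection weights, and classify a new face by nearest neighbor in weight space. The arrays below are random stand-ins for a real face dataset:

import numpy as np

# Hypothetical data: m face images, each flattened from 256x256 to a 65536-dim row.
rng = np.random.default_rng(4)
faces_train = rng.normal(size=(20, 65536))
labels_train = np.arange(20)
face_test = faces_train[7] + 0.01 * rng.normal(size=65536)

mean_face = faces_train.mean(axis=0)
A = faces_train - mean_face                      # centered faces

# The right singular vectors of A are the eigenfaces; keep the top k.
k = 10
_, _, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt[:k]                              # (k, 65536)

# Represent every face by its k PCA weights and classify by nearest neighbor.
train_weights = A @ eigenfaces.T                 # (m, k)
test_weights = (face_test - mean_face) @ eigenfaces.T
nearest = np.argmin(np.linalg.norm(train_weights - test_weights, axis=1))
print("predicted identity:", labels_train[nearest])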


Principal Components

Slide from Barnabas Poczos


Reconstructing…

• … faster if we train with…


– only people w/out glasses
– same lighting conditions
Slide from Barnabas Poczos
Shortcomings
• Requires carefully controlled data:
– All faces centered in frame
– Same size
– Some sensitivity to angle
• Alternative:
– “Learn” one set of PCA vectors for each angle
– Use the one with lowest error

• Method is completely knowledge free


– (sometimes this is good!)
– Doesn’t know that faces are wrapped around 3D objects
(heads)
– Makes no effort to preserve class distinctions

Slide from Barnabas Poczos


Image Compression

Slide from Barnabas Poczos


Original Image

• Divide the original 372×492 image into patches:
• Each patch is an instance that contains 12×12 pixels on a grid
• View each as a 144-D vector

Slide from Barnabas Poczos
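A sketch of the compression experiment described above, assuming NumPy; the 372×492 image here is a random stand-in, and 60 is one of the target dimensions used on the following slides:

import numpy as np

# Stand-in for the 372x492 grayscale image from the slides.
rng = np.random.default_rng(5)
image = rng.random((372, 492))

# Cut the image into non-overlapping 12x12 patches and flatten each to 144-D.
patches = np.array([image[r:r+12, c:c+12].ravel()
                    for r in range(0, 372, 12)
                    for c in range(0, 492, 12)])          # (31*41, 144)

mean_patch = patches.mean(axis=0)
A = patches - mean_patch
_, _, Vt = np.linalg.svd(A, full_matrices=False)

k = 60                                                    # 144-D -> 60-D
codes = A @ Vt[:k].T                                      # compressed patch weights
reconstructed = codes @ Vt[:k] + mean_patch               # back to 144-D
print("mean squared reconstruction error:",
      np.mean((patches - reconstructed) ** 2))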


L2 reconstruction error vs. PCA dimension

Slide from Barnabas Poczos


PCA compression: 144D → 60D

Slide from Barnabas Poczos


PCA compression: 144D → 16D

Slide from Barnabas Poczos


16 most important eigenvectors
[Figure: the 16 leading eigenvectors, each displayed as a 12×12 image patch, arranged in a 4×4 grid]

Slide from Barnabas Poczos


PCA compression: 144D → 6D

Slide from Barnabas Poczos


6 most important eigenvectors

[Figure: the 6 leading eigenvectors, each displayed as a 12×12 image patch, arranged in a 2×3 grid]

Slide from Barnabas Poczos


PCA compression: 144D → 3D

Slide from Barnabas Poczos


3 most important eigenvectors
[Figure: the 3 leading eigenvectors, each displayed as a 12×12 image patch]

Slide from Barnabas Poczos


PCA compression: 144D → 1D

Slide from Barnabas Poczos


60 most important eigenvectors

Looks like the discrete cosine basis functions of JPEG!...


Slide from Barnabas Poczos
2D Discrete Cosine Basis

http://en.wikipedia.org/wiki/Discrete_cosine_transform

Slide from Barnabas Poczos
