Machine Learning and Network Analysis (MA4207)
Principal Component Analysis
◼ Curse of Dimensionality
◼ Dimensionality Reduction
◼ PCA
Curse of Dimensionality
[Figure: illustration of the curse of dimensionality for feature spaces of dimension N = 1, N = 2, and N = 3]
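Numerically, the difficulty is that the number of cells needed to cover the feature space at a fixed resolution grows exponentially with the dimension N, so a fixed-size sample becomes ever sparser. A minimal sketch in Python (the 10 bins per axis and the sample size of 1,000 are illustrative assumptions, not values from the slides):

bins_per_axis = 10        # assumed resolution per feature (illustrative)
n_samples = 1000          # assumed fixed sample size (illustrative)

for N in (1, 2, 3, 5, 10):
    n_cells = bins_per_axis ** N             # cells needed to cover [0, 1]^N
    samples_per_cell = n_samples / n_cells   # average occupancy of each cell
    print(f"N={N:2d}: {n_cells:>12,d} cells, "
          f"{samples_per_cell:.6f} samples per cell on average")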
Dimensionality Reduction
◼ Feature Selection
◼ Feature Extraction (e.g., linear feature extraction)
Feature Extraction
Given a feature space $x_i \in \mathbb{R}^N$, find a mapping $y = f(x): \mathbb{R}^N \to \mathbb{R}^M$ with $M < N$ such that the transformed feature vector $y \in \mathbb{R}^M$ preserves (most of) the information or structure in $\mathbb{R}^N$.
The mapping $y = f(x)$ should be chosen so that P(error) does not increase: the Bayes classification rate remains the same or improves when moving from the high-dimensional data space to the low-dimensional feature space.
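A minimal sketch of a linear feature extraction of this kind, under illustrative assumptions (the mapping is $y = W^\top x$ with a random orthonormal $W$ here; PCA, introduced next, chooses $W$ in a principled way):

import numpy as np

rng = np.random.default_rng(0)
N, M = 10, 3                      # original and reduced dimensions, M < N
X = rng.normal(size=(100, N))     # 100 feature vectors x_i in R^N

# W has orthonormal columns spanning an M-dimensional subspace.
# Here it is random; PCA instead picks the directions of maximum variance.
W, _ = np.linalg.qr(rng.normal(size=(N, M)))

Y = X @ W                         # transformed feature vectors y_i in R^M
print(X.shape, "->", Y.shape)     # (100, 10) -> (100, 3)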
Principal Component Analysis (PCA)
◼ a way to describe multivariate data by maximizing variances
◼ a useful tool for data compression and information extraction
◼ finds combinations of variables (factors) that describe major trends in the data
◼ an eigenvalue decomposition of the covariance (correlation) matrix of the variables (see the sketch below)
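A minimal numerical sketch of this eigenvalue-decomposition view, using synthetic data (the data sizes and random seed are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))   # correlated synthetic data

Xc = X - X.mean(axis=0)                  # center the variables
S = np.cov(Xc, rowvar=False)             # covariance matrix of the variables

# eigenvalue decomposition of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)     # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # reorder by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs                    # principal component scores
print("variance captured by each component:", eigvals)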
PCA
◼ PCA finds an orthogonal basis that best represents the given data set.
[Figure: 2-D point cloud with original axes x, y and rotated principal axes x', y']
◼ The sum of squared distances from the x' axis is minimized.
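A small check of this minimum-distance property on synthetic 2-D data (illustrative, not from the slides): the sum of squared perpendicular distances to the first principal axis is no larger than for any other direction tried.

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2)) @ np.array([[3.0, 1.0], [0.0, 1.0]])  # elongated cloud
Xc = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
w_pca = eigvecs[:, -1]                    # first principal axis (the x' axis)

def sum_sq_dist(w):
    """Sum of squared perpendicular distances of the points to the axis along w."""
    proj = np.outer(Xc @ w, w)            # component of each point along w
    return np.sum((Xc - proj) ** 2)

for angle in np.linspace(0.0, np.pi, 7):
    w = np.array([np.cos(angle), np.sin(angle)])
    print(f"direction at {angle:4.2f} rad: {sum_sq_dist(w):10.2f}")
print(f"first principal axis:  {sum_sq_dist(w_pca):10.2f}  (no direction does better)")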
PCA Example
• 3-D Gaussian distribution with
$\mu = [0\ \, 5\ \, 2]^T$ and $\Sigma = \begin{bmatrix} 25 & -1 & 7 \\ -1 & 4 & -4 \\ 7 & -4 & 10 \end{bmatrix}$
• PCA decorrelates the axes
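A sketch that reproduces this example numerically; the eigendecomposition uses the $\Sigma$ above, while the sampling step is only added here to illustrate the decorrelation claim:

import numpy as np

mu = np.array([0.0, 5.0, 2.0])
Sigma = np.array([[25.0, -1.0,  7.0],
                  [-1.0,  4.0, -4.0],
                  [ 7.0, -4.0, 10.0]])

# principal axes = eigenvectors of Sigma, ordered by decreasing eigenvalue
eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
print("principal axes (columns):\n", np.round(eigvecs, 3))
print("variances along them:", np.round(eigvals, 3))

# sample from the Gaussian and rotate into the principal axes:
# the rotated coordinates are (nearly) uncorrelated
rng = np.random.default_rng(3)
X = rng.multivariate_normal(mu, Sigma, size=5000)
Y = (X - X.mean(axis=0)) @ eigvecs
print("sample covariance after PCA (≈ diagonal):\n", np.round(np.cov(Y, rowvar=False), 2))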
PCA example
PCA
Look for the projection vector $w$ that maximizes the variance of the projected data items $w^\top X$:
$\max_{\|w\|=1} \operatorname{Var}(w^\top x) = \max_{\|w\|=1} w^\top \Sigma_x w$
Solution: the maximizing directions $w$ are the dominant eigenvectors of the covariance matrix $\Sigma_x$.
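A quick numerical check of this solution on synthetic data (illustrative sizes and seed): the variance of the projection onto the dominant eigenvector equals the largest eigenvalue of $\Sigma_x$, and no unit direction can do better.

import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 5))   # correlated synthetic data
Xc = X - X.mean(axis=0)
Sx = np.cov(Xc, rowvar=False)

eigvals, eigvecs = np.linalg.eigh(Sx)
w = eigvecs[:, -1]                            # dominant eigenvector of Sigma_x

print("largest eigenvalue:               ", eigvals[-1])
print("variance of projection w^T x:     ", np.var(Xc @ w, ddof=1))

# no unit direction can exceed it, since w^T Sigma_x w <= lambda_max
ws = rng.normal(size=(500, 5))
ws /= np.linalg.norm(ws, axis=1, keepdims=True)
print("best of 500 random unit directions:", np.var(Xc @ ws.T, axis=0, ddof=1).max())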
Notes on PCA