Chapter 2 - Machine Learning 2. Principal Component Analysis

This document provides a simplified explanation of Principal Component Analysis (PCA), a dimensionality-reduction method used to transform large data sets into smaller ones while preserving information. It outlines the step-by-step process of PCA, including standardization, covariance matrix computation, eigenvector and eigenvalue calculation, feature vector formation, and data recasting along principal components axes. The goal of PCA is to reduce the number of variables in a data set, making it easier to analyze and visualize without losing significant information.

Uploaded by

bharathidevarapu



A STEP-BY-STEP EXPLANATION OF
PRINCIPAL COMPONENT ANALYSIS
Zakaria Jaadi September 4, 2019 Updated: December 5, 2020


The purpose of this post is to provide a complete and simplified explanation of
Principal Component Analysis, and especially to answer how it works step by step,
so that everyone can understand it and make use of it, without necessarily having a
strong mathematical background.

PCA is actually a widely covered method on the web, and there are some great articles
about it, but only a few of them go straight to the point and explain how it works without
diving too much into the technicalities and the ‘why’ of things. That’s the reason why I
decided to make my own post to present it in a simplified way.

Before getting to the explanation, note that this post provides logical explanations of
what PCA is doing in each step and simplifies the mathematical concepts behind it, such as
standardization, covariance, eigenvectors and eigenvalues, without focusing on how to
compute them.

WHAT IS PRINCIPAL COMPONENT ANALYSIS?

Principal Component Analysis, or PCA, is a dimensionality-reduction method that is often
used to reduce the dimensionality of large data sets, by transforming a large set of
variables into a smaller one that still contains most of the information in the large set.

Reducing the number of variables of a data set naturally comes at the expense of
accuracy, but the trick in dimensionality reduction is to trade a little accuracy for
simplicity. Smaller data sets are easier to explore and visualize, and they make
analyzing data much easier and faster for machine learning algorithms, which no longer
have extraneous variables to process.

So to sum up, the idea of PCA is simple — reduce the number of variables of a data set,
while preserving as much information as possible.

STEP BY STEP EXPLANATION OF PCA

STEP 1: STANDARDIZATION

The aim of this step is to standardize the range of the continuous initial variables so that
each one of them contributes equally to the analysis.

More specifically, the reason why it is critical to perform standardization prior to PCA is
that the latter is quite sensitive to the variances of the initial variables. That is, if
there are large differences between the ranges of the initial variables, those variables
with larger ranges will dominate over those with small ranges (for example, a variable that
ranges between 0 and 100 will dominate over a variable that ranges between 0 and 1),
which will lead to biased results. Transforming the data to comparable scales prevents
this problem.
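To make the dominance effect concrete, here is a minimal sketch in Python with NumPy. The data are synthetic and the helper `first_component` is a hypothetical name introduced only for this illustration; it is one way to observe the effect, not a prescribed procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: two independent variables on very different scales.
wide = rng.uniform(0, 100, size=n)    # ranges between 0 and 100
narrow = rng.uniform(0, 1, size=n)    # ranges between 0 and 1
X = np.column_stack([wide, narrow])


def first_component(data):
    """Unit eigenvector of the covariance matrix with the largest eigenvalue."""
    cov = np.cov(data, rowvar=False)           # columns are variables
    _, eigenvectors = np.linalg.eigh(cov)      # eigenvalues come back ascending
    return eigenvectors[:, -1]


raw_pc1 = first_component(X)

# Standardize each column, then recompute the first component.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
std_pc1 = first_component(Z)

print("PC1 weights on raw data:         ", np.round(np.abs(raw_pc1), 3))
print("PC1 weights on standardized data:", np.round(np.abs(std_pc1), 3))
```

On the raw data the first component should point almost entirely along the wide-range variable, while after standardization both variables receive weights of equal size.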

Mathematically, this can be done by subtracting the mean and dividing by the standard
deviation for each value of each variable:

z = (value - mean) / standard deviation

Once the standardization is done, all the variables will be transformed to the same scale.
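The standardization step can be sketched in a few lines of NumPy. The array `X` below is made-up data, and the choice of the population standard deviation (`ddof=0`) is an assumption; the sample version (`ddof=1`) is equally common.

```python
import numpy as np

# Hypothetical data: 5 observations (rows) of 2 variables (columns).
X = np.array([
    [10.0, 0.1],
    [20.0, 0.4],
    [30.0, 0.2],
    [40.0, 0.9],
    [50.0, 0.5],
])

mean = X.mean(axis=0)      # per-variable mean
std = X.std(axis=0)        # per-variable standard deviation (ddof=0)
Z = (X - mean) / std       # z = (value - mean) / standard deviation

print("column means after standardization:", np.round(Z.mean(axis=0), 10))
print("column std devs after standardization:", np.round(Z.std(axis=0), 10))
```

After this transformation every column has mean 0 and standard deviation 1, so no variable dominates simply because of its units.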

STEP 2: COVARIANCE MATRIX COMPUTATION

The aim of this step is to understand how the variables of the input data set are varying
from the mean with respect to each other, or in other words, to see if there is any
relationship between them. Sometimes variables are highly correlated in such a
way that they contain redundant information. So, in order to identify these correlations,
we compute the covariance matrix.

The covariance matrix is a p × p symmetric matrix (where p is the number of dimensions)
that has as entries the covariances associated with all possible pairs of the initial
variables. For example, for a 3-dimensional data set with 3 variables x, y, and z, the
covariance matrix is a 3×3 matrix of this form:

Cov(x,x)  Cov(x,y)  Cov(x,z)
Cov(y,x)  Cov(y,y)  Cov(y,z)
Cov(z,x)  Cov(z,y)  Cov(z,z)

Covariance Matrix for 3-Dimensional Data

Since the covariance of a variable with itself is its variance (Cov(a,a)=Var(a)), in the
main diagonal (top left to bottom right) we actually have the variances of each initial
variable. And since the covariance is commutative (Cov(a,b)=Cov(b,a)), the entries of the
covariance matrix are symmetric with respect to the main diagonal, which means that the
upper and the lower triangular portions are equal.
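The two properties just described (variances on the diagonal, symmetry about it) can be checked numerically. This is a small sketch on synthetic data; the variables `x`, `y`, `z` are generated only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-variable data set (columns x, y, z), 200 observations.
x = rng.normal(size=200)
y = 0.8 * x + rng.normal(scale=0.5, size=200)   # y is built to depend on x
z = rng.normal(size=200)
data = np.column_stack([x, y, z])

# rowvar=False tells NumPy that columns are variables and rows are observations.
cov = np.cov(data, rowvar=False)

print("shape:", cov.shape)
print("symmetric:", np.allclose(cov, cov.T))
# np.cov uses the sample covariance (ddof=1) by default, so compare with ddof=1 variances.
print("diagonal equals variances:", np.allclose(np.diag(cov), data.var(axis=0, ddof=1)))
```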

What do the covariances that we have as entries of the matrix tell us about the
correlations between the variables?

It’s actually the sign of the covariance that matters:

If positive, the two variables increase or decrease together (correlated).

If negative, one increases when the other decreases (inversely correlated).

Now that we know that the covariance matrix is nothing more than a table that summarizes
the correlations between all the possible pairs of variables, let’s move to the next step.
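As a quick illustration of how the sign behaves, the sketch below computes two covariances on invented numbers. The variable names (`hours_studied`, `exam_score`, `errors_made`) are hypothetical and chosen only to make the direction of each relationship obvious.

```python
import numpy as np

# Hypothetical measurements for five students.
hours_studied = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
exam_score = np.array([52.0, 58.0, 65.0, 70.0, 80.0])   # rises with hours
errors_made = np.array([9.0, 7.0, 6.0, 4.0, 1.0])       # falls with hours

# np.cov on two 1-D arrays returns a 2x2 matrix; the off-diagonal entry
# is the covariance between the two variables.
cov_pos = np.cov(hours_studied, exam_score)[0, 1]
cov_neg = np.cov(hours_studied, errors_made)[0, 1]

print("hours vs. score: ", cov_pos)   # positive: they move together
print("hours vs. errors:", cov_neg)   # negative: they move in opposite directions
```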

STEP 3: COMPUTE THE EIGENVECTORS AND EIGENVALUES OF THE COVARIANCE
MATRIX TO IDENTIFY THE PRINCIPAL COMPONENTS

Eigenvectors and eigenvalues are the linear algebra concepts that we need to compute
from the covariance matrix in order to determine the principal components of the data.
Before getting to the explanation of these concepts, let’s first understand what we
mean by principal components.

Principal components are new variables that are constructed as linear combinations or
mixtures of the initial variables. These combinations are done in such a way that the new
variables (i.e., principal components) are uncorrelated and most of the information within
the initial variables is squeezed or compressed into the first components. So, the idea is
that 10-dimensional data gives you 10 principal components, but PCA tries to put the
maximum possible information in the first component, then the maximum remaining
information in the second, and so on, until you have something like what is shown in the
scree plot below.

Figure: Percentage of Variance (Information) for each PC

Organizing information in principal components this way will allow you to reduce
dimensionality without losing much information, by discarding the components with low
information and considering the remaining components as your new variables.

An important thing to realize here is that the principal components are less interpretable
than the initial variables and don’t have any direct real-world meaning, since they are
constructed as linear combinations of the initial variables.

Geometrically speaking, principal components represent the directions of the data that
explain a maximal amount of variance, that is to say, the lines that capture most of the
information in the data. The relationship between variance and information here is that
the larger the variance carried by a line, the larger the dispersion of the data points along
it, and the larger the dispersion along a line, the more information it has. To put all
this simply, just think of principal components as new axes that provide the best angle to
see and evaluate the data, so that the differences between the observations are more
visible.


HOW PCA CONSTRUCTS THE PRINCIPAL COMPONENTS

There are as many principal components as there are variables in the data, and they are
constructed in such a manner that the first principal component accounts for the largest
possible variance in the data set. For example, let’s assume that the scatter plot of our
data set is as shown below. Can we guess the first principal component? Yes, it’s
approximately the line that matches the purple marks, because it goes through the origin
and it’s the line on which the projection of the points (red dots) is the most spread out.
Or mathematically speaking, it’s the line that maximizes the variance
(the average of the squared distances from the projected points (red dots) to the origin).
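The "most spread out projection" idea can be tested by brute force: try many lines through the origin, measure the average squared distance of the projected points from the origin, and keep the best line. The sketch below does this on synthetic 2-D data and compares the result with the top eigenvector of the covariance matrix. The data and the 0.1-degree grid are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical correlated 2-D data, centered so that lines through the origin make sense.
x = rng.normal(size=300)
y = 0.6 * x + rng.normal(scale=0.4, size=300)
data = np.column_stack([x, y])
data -= data.mean(axis=0)

# Candidate directions: unit vectors at angles from 0 to 180 degrees, 0.1 degree apart.
angles = np.linspace(0.0, np.pi, 1801)
directions = np.column_stack([np.cos(angles), np.sin(angles)])

# For each direction, the average squared distance of the projected points to the origin.
projected_variance = ((data @ directions.T) ** 2).mean(axis=0)
best = directions[np.argmax(projected_variance)]

# The top eigenvector of the covariance matrix (eigh returns ascending eigenvalues).
_, eigenvectors = np.linalg.eigh(np.cov(data, rowvar=False))
pc1 = eigenvectors[:, -1]

print("best scanned direction:", np.round(best, 3))
print("top eigenvector:       ", np.round(pc1, 3))
print("|cosine| between them: ", round(float(abs(best @ pc1)), 4))
```

The two directions should agree up to sign and grid resolution, which is why the comparison uses the absolute value of the cosine.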

The second principal component is calculated in the same way, with the condition that it
is uncorrelated with (i.e., perpendicular to) the first principal component and that it
accounts for the next highest variance.
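Both conditions on the second component (perpendicular direction, uncorrelated scores) can be verified numerically. This is a sketch on synthetic data; `scores` is a name introduced here for the data expressed along the two components.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical correlated 2-D data, centered.
x = rng.normal(size=400)
y = 0.7 * x + rng.normal(scale=0.5, size=400)
data = np.column_stack([x, y])
data -= data.mean(axis=0)

# eigh sorts eigenvalues in ascending order, so the last column is PC1.
_, eigenvectors = np.linalg.eigh(np.cov(data, rowvar=False))
pc2, pc1 = eigenvectors[:, 0], eigenvectors[:, 1]

# Express the data along the two components and look at their covariance.
scores = data @ np.column_stack([pc1, pc2])
score_cov = np.cov(scores, rowvar=False)

print("pc1 . pc2 =", round(float(pc1 @ pc2), 10))                         # perpendicular
print("covariance between scores =", round(float(score_cov[0, 1]), 10))   # uncorrelated
print("variance along PC1, PC2 =", np.round(np.diag(score_cov), 4))
```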

This continues until a total of p principal components have been calculated, equal to the
original number of variables.

Now that we understand what we mean by principal components, let’s go back to
eigenvectors and eigenvalues. The first thing you need to know about them is that they
always come in pairs, so that every eigenvector has an eigenvalue. And their number is
equal to the number of dimensions of the data. For example, for a 3-dimensional data set,
there are 3 variables, therefore there are 3 eigenvectors with 3 corresponding
eigenvalues.

Without further ado, it is eigenvectors and eigenvalues that are behind all the magic
explained above, because the eigenvectors of the covariance matrix are
actually the directions of the axes where there is the most variance (most information),
which we call principal components. And eigenvalues are simply the coefficients attached
to eigenvectors, which give the amount of variance carried in each principal component.

By ranking your eigenvectors in order of their eigenvalues, highest to lowest, you get the
principal components in order of significance.
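The ranking step can be sketched with NumPy as follows. The covariance matrix `cov` is a made-up example, and `np.linalg.eigh` is used on the assumption that the input is symmetric, which a covariance matrix always is.

```python
import numpy as np

# Hypothetical 3x3 covariance matrix (symmetric).
cov = np.array([
    [2.0, 0.8, 0.3],
    [0.8, 1.0, 0.2],
    [0.3, 0.2, 0.5],
])

# eigh returns eigenvalues in ascending order; column i of the second result
# is the eigenvector paired with eigenvalue i.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Re-order both so the pair with the highest eigenvalue comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

for i, value in enumerate(eigenvalues, start=1):
    print(f"PC{i}: eigenvalue = {value:.4f}, eigenvector = {np.round(eigenvectors[:, i - 1], 3)}")
```

Keeping the eigenvalues and the eigenvector columns re-ordered with the same index array is what preserves the pairing between them.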

Example:

Let’s suppose that our data set is 2-dimensional with 2 variables x,y and that the
eigenvectors and eigenvalues of the covariance matrix are as follows:

If we rank the eigenvalues in descending order, we get λ1>λ2, which means that the
eigenvector that corresponds to the first principal component (PC1) is v1 and the one that
corresponds to the second component (PC2) is v2.

After having the principal components, to compute the percentage of variance
(information) accounted for by each component, we divide the eigenvalue of each
component by the sum of eigenvalues. If we apply this on the example above, we find that
PC1 and PC2 carry respectively 96% and 4% of the variance of the data.
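The division described above is a one-liner. The eigenvalues below are illustrative stand-ins, not the values from the article's example; they were picked only so that the split lands near 96% and 4%.

```python
import numpy as np

# Hypothetical eigenvalues, already sorted in descending order.
eigenvalues = np.array([1.28, 0.05])

# Share of the total variance carried by each principal component.
explained_ratio = eigenvalues / eigenvalues.sum()

for i, ratio in enumerate(explained_ratio, start=1):
    print(f"PC{i} carries {ratio:.1%} of the variance")
```

Because each eigenvalue is divided by the same total, the shares always add up to 1.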


STEP 4: FEATURE VECTOR

As we saw in the previous step, computing the eigenvectors and ordering them by their
eigenvalues in descending order allows us to find the principal components in order of
significance. In this step, we choose whether to keep all these components
or discard those of lesser significance (of low eigenvalues), and form with the remaining
ones a matrix of vectors that we call the feature vector.

So, the feature vector is simply a matrix that has as columns the eigenvectors of the
components that we decide to keep. This makes it the first step towards dimensionality
reduction, because if we choose to keep only k eigenvectors (components) out of the p
available, the final data set will have only k dimensions.

Example:

Continuing with the example from the previous step, we can either form a feature vector
with both of the eigenvectors v1 and v2:

Or discard the eigenvector v2, which is the one of lesser significance, and form a feature
vector with v1 only:

Discarding the eigenvector v2 will reduce dimensionality by 1, and will consequently
cause a loss of information in the final data set. But given that v2 was carrying only 4% of
the information, the loss will not be important, and we will still have the 96% of the
information that is carried by v1.
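Forming the feature vector amounts to slicing columns out of the sorted eigenvector matrix. In the sketch below the eigenpairs are hypothetical placeholders (not the article's v1 and v2), and `k` is the number of components kept.

```python
import numpy as np

# Hypothetical eigenpairs, already sorted by descending eigenvalue.
eigenvalues = np.array([1.28, 0.05])
eigenvectors = np.array([
    [0.6, -0.8],
    [0.8,  0.6],
])  # column 0 plays the role of v1, column 1 the role of v2

k = 1                                   # keep only the first component
feature_vector = eigenvectors[:, :k]    # matrix whose columns are the kept eigenvectors

retained = eigenvalues[:k].sum() / eigenvalues.sum()

print("feature vector shape:", feature_vector.shape)
print(f"variance retained: {retained:.1%}")
```

Setting `k = 2` would keep both columns, which corresponds to the first option in the example above: no dimensionality reduction, only a change of axes.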

So, as we saw in the example, it’s up to you to choose whether to keep all the components
or discard the ones of lesser significance, depending on what you are looking for. If you
just want to describe your data in terms of new variables (principal components)
that are uncorrelated, without seeking to reduce dimensionality, leaving out the less
significant components is not needed.

LAST STEP: RECAST THE DATA ALONG THE PRINCIPAL COMPONENTS AXES

In the previous steps, apart from standardization, you do not make any changes to the
data; you just select the principal components and form the feature vector, but the input
data set always remains in terms of the original axes (i.e., in terms of the initial variables).

In this step, which is the last one, the aim is to use the feature vector formed using the
eigenvectors of the covariance matrix, to reorient the data from the original axes to the
ones represented by the principal components (hence the name Principal Components
Analysis). This can be done by multiplying the transpose of the feature vector by the
transpose of the standardized original data set:

FinalDataSet = FeatureVector^T × StandardizedOriginalDataSet^T
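Putting the steps together, the recasting can be sketched as below. It assumes the data set stores observations as rows and the feature vector stores eigenvectors as columns, so the product is taken as the transposed feature vector times the transposed standardized data. The data are synthetic and all variable names are chosen for this illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical 2-variable data set with 200 observations (rows).
x = rng.normal(size=200)
y = 0.9 * x + rng.normal(scale=0.3, size=200)
X = np.column_stack([x, y])

# Step 1: standardization.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Steps 2 and 3: covariance matrix, then eigenpairs sorted by descending eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(Z, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Step 4: feature vector with the first component only.
feature_vector = eigenvectors[:, :1]

# Last step: transposed feature vector times transposed standardized data.
final = feature_vector.T @ Z.T        # shape (1, 200): one row per kept component
final_rows = Z @ feature_vector       # same numbers, one row per observation

print("final shape:", final.shape)
print("variance along PC1:", round(float(final.var(ddof=1)), 4))
print("largest eigenvalue:", round(float(eigenvalues[0]), 4))
```

The variance of the recast data along PC1 matches the largest eigenvalue, which ties this step back to the interpretation of eigenvalues given earlier. The `final_rows` form is simply the transpose of `final` and is often more convenient when observations are kept as rows.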

***

Zakaria Jaadi is a data scientist and machine learning engineer. Check out more of his
content on Data Science topics on Medium.

