Pca Implementation Notebook

This document demonstrates how to perform principal component analysis (PCA) on economic data with 6 variables and 16 observations. It loads and explores the data, applies PCA to reduce the data to 3 principal components, visualizes the first two components, and calculates that the first component explains 75.6% of the variance in the data.

Uploaded by

Walid Sassi


13/09/2023, 21:06 principal-component-analysis

Import all the libraries :

In [22]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

Loading Data :

In [23]:

df = pd.read_csv('/kaggle/input/principal-component-analysis/Longley (1).csv')

In [24]:

Out[24]:

    GNP.deflator      GNP  Unemployed  Armed.Forces  Population  Employed
 0          83.0  234.289       235.6         159.0     107.608    60.323
 1          88.5  259.426       232.5         145.6     108.632    61.122
 2          88.2  258.054       368.2         161.6     109.773    60.171
 3          89.5  284.599       335.1         165.0     110.929    61.187
 4          96.2  328.975       209.9         309.9     112.075    63.221
 5          98.1  346.999       193.2         359.4     113.270    63.639
 6          99.0  365.385       187.0         354.7     115.094    64.989
 7         100.0  363.112       357.8         335.0     116.219    63.761
 8         101.2  397.469       290.4         304.8     117.388    66.019
 9         104.6  419.180       282.2         285.7     118.734    67.857
10         108.4  442.769       293.6         279.8     120.445    68.169
11         110.8  444.546       468.1         263.7     121.950    66.513
12         112.6  482.704       381.3         255.2     123.366    68.655
13         114.2  502.601       393.1         251.4     125.368    69.564
14         115.7  518.173       480.6         257.2     127.852    69.331
15         116.9  554.894       400.7         282.7     130.081    70.551

https://htmtopdf.herokuapp.com/ipynbviewer/temp/4518ed9eb5f5a3af5f67858dbb1814e4/principal-component-analysis.html?t=1694619339010 1/4

In [25]:

df.dtypes

Out[25]:

GNP.deflator float64
GNP float64
Unemployed float64
Armed.Forces float64
Population float64
Employed float64
dtype: object

In [26]:

X = df.drop('Employed', axis=1)
Y = df['Employed']
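The split above separates the predictors from the target, although the later cells scale and decompose the full `df` (target included). If the components were meant to serve as inputs for predicting `Employed`, one option would be to fit the scaler and PCA on `X` only. The following is a minimal sketch of that variant, not the notebook's own method; `X_demo` is a hypothetical random stand-in with the same shape as `X` (16 rows, 5 columns), and `pipeline` is an illustrative name.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for X: 16 observations, 5 predictor columns.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(16, 5))

# Chain scaling and PCA so both steps are fitted on the predictors only.
pipeline = make_pipeline(StandardScaler(), PCA(n_components=3))
scores = pipeline.fit_transform(X_demo)

print(scores.shape)
```

With the real data, `X_demo` would simply be replaced by `X`.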

In [27]:

correlation = df.corr()
correlation

Out[27]:

              GNP.deflator       GNP  Unemployed  Armed.Forces  Population  Employed
GNP.deflator      1.000000  0.991589    0.620633      0.464744    0.979163  0.970899
GNP               0.991589  1.000000    0.604261      0.446437    0.991090  0.983552
Unemployed        0.620633  0.604261    1.000000     -0.177421    0.686552  0.502498
Armed.Forces      0.464744  0.446437   -0.177421      1.000000    0.364416  0.457307
Population        0.979163  0.991090    0.686552      0.364416    1.000000  0.960391
Employed          0.970899  0.983552    0.502498      0.457307    0.960391  1.000000
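Several entries in this matrix are above 0.96, which is why PCA can compress the data so well. As a rough sketch, the strongly correlated pairs can be listed programmatically. The rows printed in Out[24] are embedded as text so the example runs without the Kaggle CSV; the 0.9 threshold and the names `csv_text`, `pairs` and `strong_pairs` are illustrative assumptions.

```python
from io import StringIO

import numpy as np
import pandas as pd

# The 16 rows shown in Out[24], embedded so no external file is needed.
csv_text = """GNP.deflator,GNP,Unemployed,Armed.Forces,Population,Employed
83.0,234.289,235.6,159.0,107.608,60.323
88.5,259.426,232.5,145.6,108.632,61.122
88.2,258.054,368.2,161.6,109.773,60.171
89.5,284.599,335.1,165.0,110.929,61.187
96.2,328.975,209.9,309.9,112.075,63.221
98.1,346.999,193.2,359.4,113.270,63.639
99.0,365.385,187.0,354.7,115.094,64.989
100.0,363.112,357.8,335.0,116.219,63.761
101.2,397.469,290.4,304.8,117.388,66.019
104.6,419.180,282.2,285.7,118.734,67.857
108.4,442.769,293.6,279.8,120.445,68.169
110.8,444.546,468.1,263.7,121.950,66.513
112.6,482.704,381.3,255.2,123.366,68.655
114.2,502.601,393.1,251.4,125.368,69.564
115.7,518.173,480.6,257.2,127.852,69.331
116.9,554.894,400.7,282.7,130.081,70.551
"""
df = pd.read_csv(StringIO(csv_text))
corr = df.corr()

# Keep only the upper triangle (above the diagonal) so each pair appears once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna().sort_values(ascending=False)

# Illustrative threshold: pairs with |r| > 0.9.
strong_pairs = pairs[pairs.abs() > 0.9]
print(strong_pairs)
```

The pairs that pass the threshold all involve GNP.deflator, GNP, Population and Employed, which move together over the period covered.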

Apply PCA :

In [28]:

from sklearn.preprocessing import StandardScaler

In [29]:

# Scale data before applying PCA

scaling=StandardScaler()


In [30]:

# Fit the scaler on all six columns of df (Employed included),
# then transform them to zero mean and unit variance

scaling.fit(df)
Scaled_data = scaling.transform(df)
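To make the scaling step concrete: `StandardScaler` subtracts each column's mean and divides by its population standard deviation (`ddof=0`). The sketch below checks this against a manual computation. It uses hypothetical random data (`data`) with column scales loosely similar to the table above, not the Longley values themselves.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 16 rows, 6 columns on very different scales.
rng = np.random.default_rng(0)
data = rng.normal(
    loc=[100, 400, 300, 250, 115, 65],
    scale=[10, 100, 90, 70, 7, 3.5],
    size=(16, 6),
)

scaled = StandardScaler().fit_transform(data)

# Manual z-score using the population standard deviation (NumPy's default ddof=0).
manual = (data - data.mean(axis=0)) / data.std(axis=0)

print(np.allclose(scaled, manual))
```

Without this step, the columns with the largest raw variance (here GNP and Unemployed) would dominate the principal components.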

In [31]:

from sklearn.decomposition import PCA

In [32]:

# Set the n_components=3

principal=PCA(n_components=3)
principal.fit(Scaled_data)
x=principal.transform(Scaled_data)

In [33]:

# Check the dimensions of data after PCA

print(x.shape)

(16, 3)
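For intuition about what `fit` and `transform` compute, the projection can be reproduced with a singular value decomposition of the centred data. This is a sketch under the assumption of hypothetical random data (`Z`), not the notebook's dataset; the signs of individual components may differ between the two routes, so the comparison is made on absolute values.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical standardized data with the same shape as Scaled_data.
rng = np.random.default_rng(0)
Z = rng.normal(size=(16, 6))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)

# scikit-learn route: fit and project in one call.
pca = PCA(n_components=3)
scores = pca.fit_transform(Z)

# Manual route: rows of Vt are the principal directions of the centred data.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
manual_scores = Z @ Vt[:3].T

# Each component is only defined up to sign, so compare magnitudes.
same_up_to_sign = np.allclose(np.abs(scores), np.abs(manual_scores))
print(scores.shape, same_up_to_sign)
```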

Check Components :

In [34]:

# Check the values of the eigenvectors
# produced by the principal components
principal.components_

Out[34]:

array([[-0.46695493, -0.46748987, -0.30646472, -0.21200613, -0.46560556,
        -0.45579661],
       [ 0.02628724,  0.02306569, -0.62227098,  0.77353962, -0.07624745,
         0.08589854],
       [-0.04906877, -0.16405382,  0.67228378,  0.58400807, -0.09179226,
        -0.41136586]])
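Each row of this array is one principal direction, and the columns follow the column order of `df`. As a quick sanity check, the rows should be orthonormal, and the largest absolute loading in each row hints at which variable drives that component. The sketch below copies the printed values; `columns`, `gram` and `dominant` are illustrative names.

```python
import numpy as np

# Values copied from Out[34]; columns follow the order of df.
columns = ["GNP.deflator", "GNP", "Unemployed",
           "Armed.Forces", "Population", "Employed"]
components = np.array([
    [-0.46695493, -0.46748987, -0.30646472, -0.21200613, -0.46560556, -0.45579661],
    [ 0.02628724,  0.02306569, -0.62227098,  0.77353962, -0.07624745,  0.08589854],
    [-0.04906877, -0.16405382,  0.67228378,  0.58400807, -0.09179226, -0.41136586],
])

# Orthonormal rows give a Gram matrix close to the 3x3 identity.
gram = components @ components.T
print(np.round(gram, 3))

# Variable with the largest absolute loading on each component.
dominant = [columns[i] for i in np.abs(components).argmax(axis=1)]
print(dominant)
```

The first component loads almost evenly on the four highly correlated variables, while the second and third are dominated by Armed.Forces and Unemployed.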

Plot the components (Visualization) :


In [35]:

# plt.figure(figsize=(10,10))
plt.scatter(x[:,0],x[:,1],c=df['Employed'],cmap='plasma')
plt.xlabel('pc1')
plt.ylabel('pc2')

Out[35]:

Text(0, 0.5, 'pc2')

Calculate variance ratio :

In [36]:

# check how much variance is explained by each principal component

print(principal.explained_variance_ratio_)

[0.75584735 0.19778211 0.0419845 ]
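The ratios are easier to interpret cumulatively. A short sketch, using the three values printed above, shows how much variance the first k components retain and the smallest k that reaches a target share; the 95% target is an illustrative assumption.

```python
import numpy as np

# Values copied from the output above.
ratios = np.array([0.75584735, 0.19778211, 0.0419845])

# Running total of explained variance: first 1, 2, then 3 components.
cumulative = np.cumsum(ratios)
print(cumulative)

# Smallest number of components whose cumulative share reaches the target.
target = 0.95  # illustrative threshold
k = int(np.searchsorted(cumulative, target) + 1)
print(k)
```

The first two components together retain about 95.4% of the variance, and all three about 99.6%.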
