Yes, there's more to explore with Principal Component Analysis, especially its practical implications and advanced variations. Here's a further breakdown:
Principal Component Analysis (PCA): Further Insights
8. When is PCA Not Suitable? (Limitations Beyond the Basics)
While powerful, PCA isn't a one-size-fits-all solution. There are specific scenarios where its
application might be problematic or sub-optimal:
● Non-linear Relationships: PCA is inherently a linear dimensionality reduction technique.
If the underlying structure of your data is non-linear (e.g., data points forming a spiral or a
sphere in higher dimensions), PCA will fail to capture this intrinsic structure, leading to
distorted or uninformative principal components.
● Emphasis on Variance, Not Class Separation (for Supervised Tasks): PCA is
unsupervised; it doesn't consider any class labels or target variables. In a classification
problem, it's possible that the directions of highest variance are not the directions that
best separate your classes. Components with low variance might actually contain crucial
discriminatory information that PCA would discard.
● Interpretability is Paramount: If understanding the exact meaning and contribution of
each original feature is critical for your problem, PCA's transformed, abstract components
can be a major drawback. While loading plots can help, they don't fully restore the original
interpretability.
● Outlier Sensitivity: As mentioned, PCA is sensitive to outliers. Extreme data points can heavily influence the calculation of the covariance matrix and, consequently, the principal component directions, leading to skewed results (the sketch after this list illustrates the effect).
● Categorical Data: PCA is designed for numerical data. Applying it directly to one-hot
encoded or other forms of categorical data can be problematic, as the concept of
"variance" might not translate meaningfully for discrete categories.
● When Noise is Important: In some niche applications, small variations (which PCA might
see as low-variance components) could actually be the signal of interest (e.g., detecting
subtle anomalies). Blindly removing low-variance components could remove the very
information you need.
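To make the outlier point concrete, here is a minimal sketch (assuming NumPy and scikit-learn are available; the 2-D data and the single extreme point are synthetic, chosen purely for illustration) of how one outlier can redirect the leading principal component:

```python
# Minimal sketch: a single outlier can rotate the first principal component.
# Assumes NumPy and scikit-learn; the data below are synthetic and illustrative.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Correlated 2-D cloud whose main axis of variance lies roughly along y = x.
X = rng.normal(size=(200, 2)) @ np.array([[1.0, 0.8], [0.0, 0.6]])

pca_clean = PCA(n_components=2).fit(X)

# Add one extreme point far off the main axis.
X_outlier = np.vstack([X, [[25.0, -25.0]]])
pca_outlier = PCA(n_components=2).fit(X_outlier)

print("First PC without outlier:", pca_clean.components_[0])
print("First PC with outlier:   ", pca_outlier.components_[0])
# The leading direction rotates toward the outlier, because the covariance
# matrix (and hence its top eigenvector) is dominated by that single point.
```

Robust PCA, discussed further below, is designed to mitigate exactly this kind of distortion.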
9. Alternatives and Extensions to PCA
Recognizing PCA's limitations has led to the development of several alternative and extended
dimensionality reduction techniques:
● Non-linear Dimensionality Reduction Methods (Manifold Learning): These methods
aim to uncover non-linear structures (manifolds) in high-dimensional data.
○ Kernel PCA (KPCA): An extension of PCA that uses the "kernel trick." It implicitly maps the data into a higher-dimensional feature space in which linear PCA can be applied effectively, allowing KPCA to capture non-linear relationships that standard PCA would miss (see the sketch after this list).
○ t-Distributed Stochastic Neighbor Embedding (t-SNE): Excellent for visualizing
high-dimensional data in 2D or 3D, preserving local neighborhood structures. It's
often used for clustering visualization.
○ Uniform Manifold Approximation and Projection (UMAP): Similar to t-SNE but
often faster and better at preserving global data structure.
○ Isomap, Locally Linear Embedding (LLE): Other manifold learning techniques; Isomap aims to preserve geodesic distances, while LLE preserves local linear neighborhood structure.
● Supervised Dimensionality Reduction: Unlike PCA, these methods consider the target
variable (labels) during dimensionality reduction.
○ Linear Discriminant Analysis (LDA): A supervised technique that finds
projections that maximize class separability rather than total variance. It's often
used for classification problems.
● Other Dimensionality Reduction Techniques:
○ Independent Component Analysis (ICA): Aims to separate a multivariate signal
into additive subcomponents that are statistically independent of each other (e.g.,
separating mixed audio sources).
○ Non-Negative Matrix Factorization (NMF): Decomposes a non-negative matrix
into two non-negative matrices. Useful for data where features are inherently
additive (e.g., text analysis, image processing).
○ Autoencoders: Neural networks trained to reconstruct their input. The bottleneck
layer in an autoencoder can learn a lower-dimensional representation of the data.
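To illustrate the Kernel PCA point above, here is a minimal sketch (assuming scikit-learn; the two-concentric-circles toy data, the RBF kernel, and gamma=10 are illustrative choices, not tuned values) comparing standard PCA with Kernel PCA on data whose intrinsic structure is non-linear:

```python
# Minimal sketch: standard PCA vs. Kernel PCA on a non-linear toy dataset.
# Assumes scikit-learn; kernel and gamma are illustrative, untuned choices.
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric circles: the classes are not linearly separable in 2-D.
X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=0)

# Standard PCA can only rotate/project linearly; the circular structure remains.
X_pca = PCA(n_components=2).fit_transform(X)

# Kernel PCA with an RBF kernel implicitly works in a higher-dimensional
# feature space where the two rings can be pulled apart.
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0).fit_transform(X)

for name, Z in [("PCA", X_pca), ("Kernel PCA", X_kpca)]:
    inner, outer = Z[y == 1, 0], Z[y == 0, 0]  # first component per class
    print(f"{name}: inner-circle PC1 range [{inner.min():.2f}, {inner.max():.2f}], "
          f"outer-circle PC1 range [{outer.min():.2f}, {outer.max():.2f}]")
```

With a suitable kernel and bandwidth, the two classes that are entangled in the original space become far easier to separate along the leading kernel components, which a purely linear projection cannot achieve here.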
10. Advanced PCA Concepts and Variations
Beyond the standard PCA, there are specialized versions designed for specific challenges:
● Incremental PCA (IPCA): Designed for very large datasets that cannot fit into memory. IPCA processes data in small batches, updating the principal components incrementally, which is crucial for handling big data (see the sketch after this list).
● Probabilistic PCA (PPCA): Provides a probabilistic framework for PCA. It assumes that
the observed data is generated from a lower-dimensional latent space with added
Gaussian noise. This formulation allows for handling missing values and offers a more
robust estimation of principal components.
● Sparse PCA: Encourages the principal components to have many zero loadings. This
results in components that are more interpretable, as they depend on a smaller subset of
the original features. Useful when interpretability is a key concern and you want to identify
specific contributing features.
● Robust PCA: Designed to handle outliers and noisy data more effectively than standard
PCA. It often decomposes the data matrix into a low-rank component (representing the
clean data) and a sparse component (representing outliers or noise).
● Weighted PCA: Assigns different weights to observations or features, allowing you to
emphasize certain aspects of the data during the PCA process.
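As a brief illustration of the Incremental PCA idea, here is a minimal sketch (assuming scikit-learn and NumPy; the in-memory array split into chunks merely simulates data arriving in batches, where in practice each batch would be read from disk or a stream):

```python
# Minimal sketch: fitting PCA incrementally on data that arrives in batches.
# Assumes scikit-learn and NumPy; the loop simulates streaming a large dataset.
import numpy as np
from sklearn.decomposition import IncrementalPCA, PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 50))  # stand-in for data too large to fit in memory

ipca = IncrementalPCA(n_components=10)

# Feed the data in chunks; each call updates the components in place.
for batch in np.array_split(X, 20):
    ipca.partial_fit(batch)

# For comparison: a full in-memory fit (only feasible when the data fits in RAM).
pca = PCA(n_components=10).fit(X)

print("Incremental explained variance ratio:", ipca.explained_variance_ratio_.sum().round(3))
print("Full-batch explained variance ratio: ", pca.explained_variance_ratio_.sum().round(3))
```

Note that each batch passed to partial_fit must contain at least as many samples as the requested number of components.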
11. Practical Considerations
● Feature Scaling is Non-Negotiable: Re-emphasizing this point, standardization (or
normalization) is almost always required before PCA to prevent variables with larger
scales from dominating the principal components.
● Choosing the Number of Components: This is a crucial decision.
○ Scree Plot: Visually identifying the "elbow" where the explained variance plateaus.
○ Cumulative Explained Variance: Selecting enough components to reach a chosen threshold of explained variance (e.g., 80%, 90%, or 95%); the sketch at the end of this section shows this approach in code.
○ Cross-validation: For supervised tasks, you can use cross-validation to find the
number of components that optimizes your model's performance.
○ Domain Knowledge: Expert knowledge can guide the selection if certain
components are known to be physically or logically important.
● Interpretation Challenges: While loading plots (which show the correlation between
original features and principal components) can help, fully interpreting the meaning of a
principal component (a linear combination of many variables) can still be challenging.
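Putting the scaling and component-selection points together, here is a minimal sketch (assuming scikit-learn and NumPy; the synthetic data, the mismatched feature scales, and the 95% threshold are illustrative choices):

```python
# Minimal sketch: standardize first, then choose the number of components
# that reaches a cumulative explained-variance threshold (95% here).
# Assumes scikit-learn and NumPy; data and threshold are illustrative.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data with features on very different scales.
X = rng.normal(size=(300, 8)) * np.array([1, 10, 100, 1, 5, 50, 2, 0.1])

# 1. Standardize so no single feature dominates the covariance matrix.
X_std = StandardScaler().fit_transform(X)

# 2. Fit PCA with all components and inspect the cumulative explained variance.
pca = PCA().fit(X_std)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# 3. Smallest number of components whose cumulative explained variance >= 95%.
n_components = int(np.argmax(cumulative >= 0.95)) + 1
print("Cumulative explained variance:", cumulative.round(3))
print("Components needed for 95%:", n_components)
```

Plotting pca.explained_variance_ratio_ per component (rather than the cumulative sum) against the component index gives the scree plot mentioned above; the "elbow" and the threshold rule are two ways of reading the same information.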