Chapter 1.2. Overview of ML

Uploaded by Sơn Trịnh

Humanity – Service – Liberation

Introduction to Dimensionality
Reduction for Machine Learning
Machine Learning
Problems
• Data often have many features.
• Training becomes extremely slow.
• It is harder to find a good solution.
• This problem is often referred to as the curse of dimensionality.
• It is often possible to reduce the number of features considerably.

Techniques for Dimensionality Reduction

• Feature Selection Methods
• Matrix Factorization
• Manifold Learning
• Autoencoder Methods

What is Dimensionality Reduction?

• Dimensionality reduction means reducing the number of features.
• It is a way of converting a higher-dimensional dataset into a lower-dimensional one while ensuring that it still provides similar information.
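As a minimal sketch of this idea, assuming scikit-learn and NumPy are available, PCA can project a toy 10-feature dataset (random data invented here purely for illustration) down to 3 dimensions:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical dataset: 100 samples, 10 features.
rng = np.random.RandomState(0)
X = rng.rand(100, 10)

# Convert the 10-dimensional dataset into a 3-dimensional one.
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)
```

The reduced dataset keeps the directions of largest variance, so it still carries much of the information in the original features.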

Dimensionality Reduction Methods and Approaches

The Curse of Dimensionality
• Handling high-dimensional data is very difficult in practice; this problem is commonly known as the curse of dimensionality.
• As the dimensionality of the input dataset increases, any machine learning algorithm and model becomes more complex.

Why Dimensionality Reduction is Important
• Fewer features mean less complexity
• Less storage space because there are fewer data
• Fewer features require less computation time
• Model accuracy can improve because there is less misleading data
• Algorithms train faster
• Reducing the dataset's feature dimensions makes the data easier to visualize
• It removes noise and redundant features

Disadvantages of Dimensionality Reduction

• Some information may be lost due to dimensionality reduction.

• In the PCA dimensionality reduction technique, the number of principal components to keep is sometimes unknown in advance.
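One common way around not knowing the number of components in advance, sketched here with scikit-learn on made-up random data, is to pass a variance fraction instead of a component count, so PCA keeps just enough components to explain, say, 95% of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical dataset: 200 samples, 20 features.
rng = np.random.RandomState(0)
X = rng.rand(200, 20)

# An n_components value between 0 and 1 is read as the fraction of
# variance to preserve; PCA then picks the component count itself.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print("components kept:", pca.n_components_)
print("variance explained:", pca.explained_variance_ratio_.sum())
```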

Approaches to Dimensionality Reduction

• Feature Selection
• Feature Extraction

Feature Selection
• Selecting the subset of relevant features and leaving out the irrelevant ones.

• It is a way of selecting the optimal features from the input dataset.

• The goal is to build a model of high accuracy.

Methods used for feature selection
• Filters Methods
• Correlation
• Chi-Square Test
• ANOVA
• Wrappers Methods
• Forward Selection
• Backward Selection
• Both-directional
• Embedded Methods:
• LASSO
• Elastic Net
• Ridge Regression
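As an illustration of a filter method, this sketch (assuming scikit-learn; the Iris dataset is used only as an example) scores each feature with the ANOVA F-test and keeps the two best:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Filter method: rank features by their ANOVA F-score against the
# class labels, then keep the k highest-scoring ones.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(X.shape, "->", X_selected.shape)
print("kept feature indices:", selector.get_support(indices=True))
```

Filter methods like this score features independently of any model, which makes them fast; wrapper methods instead evaluate feature subsets by repeatedly training the model itself.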
Feature Extraction
• Feature extraction is the process of transforming the space containing
many dimensions into space with fewer dimensions.
• Feature extraction techniques:
• Principal Component Analysis
• Linear Discriminant Analysis
• Kernel PCA
• Quadratic Discriminant Analysis
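As a sketch of one of these techniques, Linear Discriminant Analysis in scikit-learn (shown here on the Iris dataset purely as an example) extracts at most n_classes - 1 new features that best separate the classes:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# LDA is supervised: it uses the class labels to find projection
# directions that maximize class separation (at most n_classes - 1).
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)

print(X.shape, "->", X_lda.shape)
```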

Principal Component Analysis (PCA)
• Principal Component Analysis is a statistical process that converts
observations of correlated features into a set of linearly uncorrelated
features.

• PCA works by considering the variance of each attribute, because a
high-variance attribute shows a good split between the classes; this is
how it reduces the dimensionality.
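Both points above can be checked directly in a small sketch (assuming scikit-learn and NumPy; the correlated data are invented for illustration): the principal components come out uncorrelated, and the first one captures most of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(42)
x1 = rng.rand(300)
# Feature 2 is strongly correlated with feature 1; feature 3 is independent.
X = np.column_stack([x1, 2 * x1 + 0.05 * rng.rand(300), rng.rand(300)])

pca = PCA(n_components=2)
X_new = pca.fit_transform(X)

# The principal components are linearly uncorrelated by construction.
corr = np.corrcoef(X_new.T)
print("explained variance ratio:", pca.explained_variance_ratio_)
print("|corr(PC1, PC2)| =", abs(corr[0, 1]))
```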

Backward Feature Elimination
• The backward feature elimination technique is mainly used while
developing a Linear Regression or Logistic Regression model.
• All n variables of the given dataset are taken to train the model.
• The performance of the model is checked.
• Remove one feature at a time and train the model on the remaining
n-1 features, n times, computing the performance of the model each time.
• Find the variable whose removal made the smallest (or no) change in
the performance of the model, and then drop that variable.
• Repeat the complete process until no feature can be dropped.
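The steps above can be sketched with scikit-learn's SequentialFeatureSelector, one possible implementation of this greedy procedure (note it measures performance with internal cross-validation rather than a single check; the Iris dataset and LogisticRegression are chosen only for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Start from all n features and repeatedly drop the feature whose
# removal hurts the cross-validated score the least.
model = LogisticRegression(max_iter=1000)
sfs = SequentialFeatureSelector(model, n_features_to_select=2,
                                direction="backward")
X_selected = sfs.fit_transform(X, y)

print("kept feature indices:", sfs.get_support(indices=True))
print(X.shape, "->", X_selected.shape)
```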

Forward Feature Selection
• Forward feature selection follows the inverse of the backward
elimination process.
• Find the best features that produce the highest increase in the
performance of the model.
• Start with a single feature only, and progressively add one feature
at a time.
• Train the model on each feature separately.
• The feature with the best performance is selected.
• The process is repeated until adding features no longer gives a
significant increase in the performance of the model.
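The same scikit-learn selector can run this forward procedure by flipping its direction (again only a sketch; the Iris dataset and LogisticRegression are stand-ins for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Start from an empty set and repeatedly add the feature that gives
# the biggest improvement in the cross-validated score.
model = LogisticRegression(max_iter=1000)
sfs = SequentialFeatureSelector(model, n_features_to_select=2,
                                direction="forward")
X_selected = sfs.fit_transform(X, y)

print("chosen feature indices:", sfs.get_support(indices=True))
print(X.shape, "->", X_selected.shape)
```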

• Example and Exercises:
https://github.com/ageron/handsonml2/blob/master/08_dimensionality_reduction.ipynb

• Demo