Lec 02

The document provides an overview of machine learning (ML) applications in core engineering disciplines, detailing various types of ML algorithms including supervised, unsupervised, and reinforcement learning. It explains the distinctions between regression and classification in supervised learning, as well as clustering and dimensionality reduction in unsupervised learning. Additionally, it covers the importance of training and test sets, loss functions, and cross-validation in developing effective ML models.


Machine Learning for Core Engineering Disciplines

[Figure: ML workflow: Data Generation → Feature Engineering → Model Selection → Training → Prediction]

Prof. Ananth Govind Rajan


Department of Chemical Engineering
Indian Institute of Science, Bengaluru
Website: https://agrgroup.org
Email: [email protected]
Broad categorization of ML algorithms

[Diagram: ML is divided into Supervised Learning, Unsupervised Learning, and Reinforcement Learning]

• Supervised Learning: training happens based on labels assigned to a target variable; applies to the prediction of both continuous and discrete variables.

• Unsupervised Learning: target variables aren't specified, and labels aren't required; the model "learns" the underlying structure of the data based on some similarity metric.

• Reinforcement Learning: a virtual agent receives rewards based on favourable outcomes; maximization of the cumulative reward leads to an optimal policy for the problem.
Broad categorization of ML algorithms

• Supervised learning
• Seeks to map each input feature vector to an output value
• Training based on specified values (labels) of the target variable

• Unsupervised learning
• Seeks to learn the underlying structure and inherent patterns of the data in the feature space
• Dimensionality reduction falls into this class of ML
[Figure: datapoints scattered in a 2D feature space (Feature 1 vs. Feature 2)]

• Reinforcement learning (RL)
[Figure: agent-environment loop: the agent takes an action, and the environment returns a reward or penalty]
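The agent-environment loop in the figure can be written down directly. Below is a minimal sketch of that loop, assuming a hypothetical Environment class with reset() and step() methods and a random policy; the names and the reactor-temperature setting are illustrative and not part of the lecture or any specific RL library.

```python
import random

# Hypothetical environment: a reactor whose temperature drifts each step.
# The agent chooses heating/cooling actions and is rewarded for staying
# close to a setpoint. All names here are illustrative, not a real API.
class Environment:
    def reset(self):
        self.temperature = 350.0          # initial state (K)
        return self.temperature

    def step(self, action):
        self.temperature += action + random.uniform(-1.0, 1.0)
        reward = -abs(self.temperature - 360.0)   # penalty grows with distance from the setpoint
        return self.temperature, reward

env = Environment()
state = env.reset()
cumulative_reward = 0.0

for t in range(100):
    action = random.choice([-1.0, 0.0, +1.0])    # random policy; RL training would improve this
    state, reward = env.step(action)
    cumulative_reward += reward                  # RL seeks to maximize this cumulative reward

print("cumulative reward:", cumulative_reward)
```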
Types of supervised learning

Regression
• Used to predict a continuous variable
• Example: Given plant data, what is the temperature of the reactor? (T = ?)

Classification
• Used to classify data into a fixed number of categories
• Can be “binary” or “multiclass”
• Example: Given plant data, has the reactor overheated or not?

Given a microscopy image:
(i) Is the material brittle?
(ii) What is the hardness of the material?
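To make the two settings concrete, here is a minimal sketch using scikit-learn (the library choice and the toy plant data are assumptions, not part of the lecture): a regression model predicts the continuous reactor temperature, while a classifier predicts the binary overheated/not overheated label from the same features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Toy "plant data": two features per datapoint (e.g., feed rate, coolant flow)
X = np.array([[1.0, 0.9], [1.2, 0.8], [0.8, 1.1], [1.5, 0.6], [0.7, 1.2]])

# Regression: continuous target (reactor temperature in K)
T = np.array([355.0, 362.0, 348.0, 371.0, 344.0])
reg = LinearRegression().fit(X, T)
print("Predicted temperature:", reg.predict([[1.1, 0.85]]))

# Classification: discrete target (overheated = 1, not overheated = 0)
overheated = np.array([0, 1, 0, 1, 0])
clf = LogisticRegression().fit(X, overheated)
print("Overheated?", clf.predict([[1.1, 0.85]]))
```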
Broad types of unsupervised learning

• Clustering
• Groups datapoints to uncover the similarities and differences between them
• Examples: k-means clustering, density-based clustering

• Dimensionality reduction
• Reduces the number of features in the dataset without losing important relationships
• Examples: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP)

• Generative algorithms
• Learn the distribution underlying the data and thus generate new samples which are
similar to the input data
• Examples: generative adversarial networks (GANs), variational autoencoders (VAEs),
restricted Boltzmann machines (RBMs)
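A minimal sketch of the first two classes using scikit-learn (an assumed tooling choice) on synthetic data: k-means groups the datapoints into clusters, and PCA reduces the feature space to two dimensions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic dataset: 200 datapoints with 5 features, drawn around two centres
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 5)),
               rng.normal(4.0, 1.0, size=(100, 5))])

# Clustering: group datapoints by similarity (Euclidean distance here)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Dimensionality reduction: project the 5 features onto 2 principal components
X_2d = PCA(n_components=2).fit_transform(X)

print("cluster sizes:", np.bincount(labels))
print("reduced shape:", X_2d.shape)   # (200, 2)
```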
Examples of supervised learning

• Is the molecule soluble in water?
• What is the voltage provided by a battery?
• Is the person suffering from chest congestion or not?

In each case, you would have to train the computer with some labeled data.
Various algorithms: linear/nonlinear regression, decision trees, random forests, support vector machines, neural networks & deep learning, …
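To illustrate the "train with labeled data" step with one of the listed algorithms, here is a minimal sketch of a random forest on an invented solubility dataset (the descriptor values and labels are made up for illustration; a real study would use computed molecular descriptors or fingerprints).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Invented descriptors for 6 molecules: [molecular weight, logP]
X = np.array([[18.0, -1.4], [46.1, -0.3], [78.1, 2.1],
              [32.0, -0.8], [128.2, 3.3], [60.1, -0.2]])
# Labels supplied by experiment: 1 = soluble in water, 0 = not soluble
y = np.array([1, 1, 0, 1, 0, 1])

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("Soluble?", model.predict([[92.1, 2.7]]))   # query a new, unlabeled molecule
```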
Examples of supervised learning

Example Problem | Type | Input Features | Target Label
Molecule soluble in water? | Binary Classification | Molecular descriptors/fingerprints | Soluble / not soluble
Voltage provided by battery? | Regression | Battery type, temperature, chemistry, age, load conditions | Voltage (continuous value)
Person suffering from chest congestion or not? | Binary Classification | Chest X-ray and/or symptoms | Yes / no
Examples of unsupervised learning

• Categorizing elements to find similarities between them based on their features
• Generating molecules similar to those in the training dataset

In each case, you would like the computer to figure out the structure of the data itself based on the underlying features.
Various algorithms: k-means clustering, PCA, GANs, VAEs, etc.


Supervised ML as a function approximator

$y = f(x_1, x_2, \ldots, x_n;\; \beta_1, \beta_2, \ldots, \beta_p;\; \alpha_1, \alpha_2, \ldots, \alpha_h)$

where $y$ is the target variable, $x_1, \ldots, x_n$ are the features, $\beta_1, \ldots, \beta_p$ are the parameters, and $\alpha_1, \ldots, \alpha_h$ are the hyperparameters.

Target variable: the variable of interest that one desires to predict using a supervised ML model; it could be a continuous or a discrete variable.

Features: independent variables that characterize each datapoint in the dataset.

Parameters: coefficients (weights) that the model learns during training via multivariate optimization.

Hyperparameters: model configuration variables, specified before training, which determine the model architecture and the training process.
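The parameter/hyperparameter distinction can be seen in a simple ridge regression, shown in the sketch below (scikit-learn is an assumed tooling choice, not part of the lecture): the regularization strength alpha is a hyperparameter fixed before training, while the coefficients are the parameters found by optimization during fitting.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                                   # features x1, x2, x3
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 + rng.normal(0.0, 0.1, size=50)

model = Ridge(alpha=1.0)   # alpha: a hyperparameter, chosen before training
model.fit(X, y)            # fitting finds the parameters by multivariate optimization
print("learned parameters (weights):", model.coef_, "intercept:", model.intercept_)
```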
Training and test sets in ML models

Features | Data
$(x_{11}, x_{12}, \ldots, x_{1n})$ | $y_1$
$(x_{21}, x_{22}, \ldots, x_{2n})$ | $y_2$
⋮ | ⋮
$(x_{d1}, x_{d2}, \ldots, x_{dn})$ | $y_d$

(In the figure, the model's predicted data are compared against the actual data.)

Training set: collection of datapoints, i.e., the set of feature values and corresponding target variable values, used to train the ML model.

Test set: collection of unseen datapoints used to independently verify the performance of the final model after training is completed.
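A minimal sketch of the training/test split with scikit-learn (an assumed tooling choice, on synthetic data): the test set is held out and only used once training is completed.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                                  # d = 100 datapoints, n = 4 features
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(0.0, 0.1, size=100)

# Hold out 20% of the datapoints as an unseen test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)       # training set: used to learn the parameters
print("test-set R^2:", model.score(X_test, y_test))    # test set: independent performance check
```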
How do we enable ML models to learn?

[Figure: data plotted against a feature, with a linear model in 1D fitted to the data]
Loss function: metric used to quantify the performance of an ML model in terms of the errors in
the predicted values of the target variable versus its actual values

Cross validation: a method to determine the generalizability of the model across various realizations
of training data and unseen (validation) data; allows one to rationally choose hyperparameters

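A minimal sketch tying the two definitions together with scikit-learn (assumed tooling, on synthetic data): mean squared error serves as the loss function for a 1D linear model, and k-fold cross-validation scores the model across several train/validation splits.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=(80, 1))                  # single feature (1D)
y = 3.0 * x[:, 0] + 2.0 + rng.normal(0.0, 1.0, size=80)

model = LinearRegression().fit(x, y)

# Loss function: quantifies errors between predicted and actual target values
mse = mean_squared_error(y, model.predict(x))
print("training MSE:", mse)

# Cross-validation: evaluate generalizability over 5 train/validation splits
scores = cross_val_score(LinearRegression(), x, y, cv=5,
                         scoring="neg_mean_squared_error")
print("validation MSE per fold:", -scores)
```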
