Lecture 4 - Basics of ML
Prof. Ankit Gangwal
Assistant Professor, IIIT-H, India
Email: [email protected] | Web: CiaoAnkit.github.io
Introduction Prof. Ankit Gangwal 93
Announcements
Table of Contents
● What is ML?
● Why ML?
● Basic Workflow of ML
● Basic Terminology in ML
● Tasks involved in Supervised Learning
○ Regression
○ Classification
● Perceptron
● Activation Functions
● Multi-layer perceptron
Table of Contents
● Learning parameters of a neural network
● Loss Functions in neural networks
○ Mean square error
○ Cross entropy
● Backpropagation, mini-batch training, SGD
● Evaluating ML models
● Overfitting vs Underfitting
● Early Stopping
● Convolutional neural nets
● Vision CNN models for Image classification
● Miscellaneous ML topics
Basics of Machine Learning
What is Artificial Intelligence?
Basics of Machine Learning
What is Machine Learning?
Basics of Machine Learning
What is Machine Learning?
Machine learning (ML) is a field of study in artificial intelligence
concerned with the development and study of statistical algorithms
that can learn from data and generalize to unseen data, and thus
perform tasks without explicit instructions.[1]
Basics of Machine Learning
Deep Learning??
Basics of Machine Learning
AGI??
ANGI??
Why do we study ML?
● Because hard-coding is not always feasible
● We do not have a known algorithm for every problem
● Because, when intelligently designed, ML systems can scale
● Adaptability (e.g., Reinforcement Learning)
● Can be applied in almost every area!
○ Healthcare: biology, microbiology
○ Finance
○ Robotics (driving, drones)
○ NLP (ChatGPT, Claude)
○ Computer Vision
○ Graphs (2D and 3D)
Basic Workflow in ML
What is an ML model?
An ML model can be thought of as a function F(x).
Every function has an input and an output.
Take the example of recommending movies on Netflix. In this
scenario, the input is the history of all the movies you have
previously watched, and the output is a recommendation of a
movie you will like. So, ML is basically trying to find the best
possible estimate of F(x).
Basic Workflow in ML
How do we find the best estimate of F(x)?
Ans: We learn!
How do we learn?
Ans: By Training
Sounds familiar?
This is exactly how we humans learn anything.
Basic Workflow in ML
Let’s take the example of humans learning how to drive..
Day 1: Terrible
Day 2: Decent
Day 3: Good
Day 4: Expert
This is exactly how ML models learn!!
Basic Workflow in ML
Now let’s try to find the most important aspects of ‘learning how to drive’
Task/Goal: Learn to drive → Task
Medium to learn: Driving instructor → Algorithm
Experience: the longer you learn, the better you get → Data
This is literally how you train an ML model as well!
Basic Workflow in ML
Now, let's try to build a model which outputs 1 if the phrase
'Chubby cat' appears in a given paragraph…
Now, let's try to build a model which outputs 1 if a
'Chubby cat' appears in a given image…
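The contrast between the two tasks can be made concrete. The text version needs no ML at all: a single hard-coded rule solves it (a minimal sketch; the function name is illustrative). No comparable rule exists for raw image pixels, which is exactly where ML comes in.

```python
def contains_chubby_cat(paragraph: str) -> int:
    """Return 1 if the phrase 'chubby cat' appears in the paragraph, else 0."""
    return 1 if "chubby cat" in paragraph.lower() else 0

print(contains_chubby_cat("I adopted a chubby cat yesterday."))  # 1
print(contains_chubby_cat("I adopted a dog yesterday."))         # 0
```

For an image, there is no substring to search for: the same cat can appear at any position, scale, or lighting, so the mapping from pixels to the label must be learned from data.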
Some Terminology
Now, let’s define some terminology before we explore more
interesting topics…
Taxonomy of Data
Taxonomy of ML Tasks
Regression
Predicts continuous numerical values.
Ex: Stock market prediction
Housing price prediction
Blood sugar estimation
Estimated delivery time in e-commerce
Classification
Categorizes data into classes or groups.
Ex: Image classification
Video classification
Hate speech detection
Link prediction in social networks
Recommendation systems
Building a Neural network
● Now, our goal is to build a mathematical function F(x) such that
it is learnable and can process input.
● So, we basically want to mimic the human brain.
● We start with the smallest unit of the human brain.
Neuron
3 main parts:
● Dendrites
● Soma
● Axon terminal
This is similar to a function.
● Input
● Processing
● Output
Neuron
We can start with the simplest mathematical function which can take an
input, process it and produce an output.
This is obviously not complex enough to learn complicated functions.
Let's try to list some of the problems:
1. Multiple input features
2. Linearity
Now, let's try to fix these problems one by one.
Neuron
Before we fix the problems of the neuron, let's try to visualize what it does.
Multiple Input features
This is an easy fix. We can simply assign weights to every single feature.
We can simplify this representation by using matrices.
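As a minimal sketch (using NumPy; the weights and inputs are illustrative), the per-feature weighted sum collapses into a single dot product:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # input features
w = np.array([0.5, -0.2, 0.1])  # one weight per feature
b = 0.3                         # bias term

# Weighted sum over all features, written as one matrix (dot) product
z = np.dot(w, x) + b
print(z)  # 0.5*1.0 - 0.2*2.0 + 0.1*3.0 + 0.3 ≈ 0.7
```

With many neurons, `w` becomes a matrix and the same one-line expression computes every neuron's weighted sum at once.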
Multiple input features - Visualization
Linearity
The capabilities of linear functions are limited.
To mitigate this issue, we introduce activation functions.
An activation function is a mathematical function applied to the output of a
neuron. It introduces non-linearity into the model, allowing the network to
learn and represent complex patterns in the data.
Examples include:
● Linear
● Sigmoid
● Tanh
● ReLU
● GeLU
● Softmax etc..
Activation Functions
Linear Activation Function
Activation Functions
Sigmoid Activation Function
Characterized by its S shape.
Output is restricted to the range (0, 1).
Activation Functions
Tanh Activation Function
A shifted version of the sigmoid, stretched along the y-axis.
Output is between -1 and 1.
Known for its zero-centered output.
Activation Functions
ReLU Activation Function
Output range is [0, +inf): negative inputs are clamped to 0.
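The activation functions discussed above are each a one-line NumPy expression (a minimal sketch; the sample inputs are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # S-shaped, squashes output into (0, 1)

def tanh(x):
    return np.tanh(x)                # zero-centered, range (-1, 1)

def relu(x):
    return np.maximum(0.0, x)        # clamps negatives to 0, range [0, inf)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # ≈ [0.119 0.5 0.881]
print(tanh(x))     # ≈ [-0.964 0. 0.964]
print(relu(x))     # [0. 0. 2.]
```

Note how each function bends a straight line: applying any of them after a weighted sum is what lets the network represent non-linear patterns.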
Neural Network
Now, let's try to piece all of them together to create the most basic form of a
neural network that can model non-linear functions.
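Piecing the weighted sum and the activation function together gives one hidden layer feeding one output neuron. A minimal sketch (NumPy; the layer sizes and random weights are illustrative):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# Illustrative network: 3 inputs, 4 hidden units, 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden-layer parameters
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # output-layer parameters

x = np.array([1.0, 2.0, 3.0])
h = relu(W1 @ x + b1)  # hidden layer: weighted sum + non-linearity
y = W2 @ h + b2        # output layer: weighted sum
print(y.shape)         # (1,)
```

Without `relu`, the two matrix products would collapse into a single linear map; the activation in between is what makes the composition genuinely non-linear.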
Deep Neural Network
Deep Neural Network (Multilayer Perceptron)
Why do we need deeper neural networks?
Universal approximation theorem
In simple words, the universal approximation theorem says that neural
networks (with at least one hidden layer and enough hidden units) can
approximate any continuous function.
Deep Neural Network
Now, let’s try to build a simple feedforward neural network with
multiple hidden layers.
Forward Propagation
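The forward pass walked through on these slides is a loop over layers: an affine transform followed by an activation, repeated until the output layer. A minimal NumPy sketch (the layer sizes and weights are illustrative):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, weights, biases):
    """Propagate input x through the network, layer by layer."""
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = W @ a + b  # affine transform: weighted sum + bias
        # non-linearity on hidden layers; keep the final output linear
        a = relu(z) if i < len(weights) - 1 else z
    return a

rng = np.random.default_rng(1)
sizes = [3, 5, 4, 1]  # input, two hidden layers, output
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

y = forward(np.array([1.0, -2.0, 0.5]), weights, biases)
print(y.shape)  # (1,)
```

Each iteration corresponds to one slide of the walkthrough: the activations of one layer become the inputs of the next.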
Designing Neural Networks
How do you design an MLP for Regression task?
How do you design an MLP for Classification task?
Training a Neural Network
Now, we have built a function capable of acting as an approximator for any
continuous function once appropriate weights are assigned to it.
How do we find these weights?
Revisiting how humans learn a task…
Whenever we make a mistake, we try to learn from it and adjust ourselves
accordingly.
For MLPs too, we need to define a metric that shows how far off we are from
the actual output.
Loss Functions
Regression
● Mean Square Error
● Mean Absolute Error
● Relative Mean Square Error
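The first two regression losses above are one-liners in NumPy (a minimal sketch; the sample targets and predictions are illustrative):

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)   # Mean Square Error

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))  # Mean Absolute Error

y_true = np.array([3.0, 5.0, 2.0])
y_pred = np.array([2.5, 5.0, 4.0])
print(mse(y_true, y_pred))  # (0.25 + 0 + 4) / 3 ≈ 1.417
print(mae(y_true, y_pred))  # (0.5 + 0 + 2) / 3 ≈ 0.833
```

Note the design trade-off: squaring in MSE punishes large errors much harder than MAE does, so MAE is the more robust choice when the data contains outliers.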
Loss Functions
Classification
● Binary Cross entropy loss
● Cross entropy loss
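Both classification losses compare predicted probabilities against the true labels. A minimal NumPy sketch (the labels and probabilities are illustrative):

```python
import numpy as np

def binary_cross_entropy(y_true, p):
    # y_true in {0, 1}; p = predicted probability of class 1
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def cross_entropy(y_true_onehot, probs):
    # probs: one predicted class-probability distribution per sample (rows)
    return -np.mean(np.sum(y_true_onehot * np.log(probs), axis=1))

y = np.array([1, 0, 1])
p = np.array([0.9, 0.2, 0.7])
print(binary_cross_entropy(y, p))  # ≈ 0.228
```

The loss is small when the model assigns high probability to the correct class and grows without bound as the model becomes confidently wrong, which is exactly the training signal we want.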
Loss Functions
● Now, we have defined the metric for error.
● Next, we have to come up with an algorithm such that weights can be adjusted
in order to decrease this error.
● We can now rewrite this as an optimization problem.
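Once the problem is framed as optimization, the standard algorithm is gradient descent: repeatedly nudge the weights against the gradient of the loss. A minimal one-weight sketch (NumPy; the data and learning rate are illustrative):

```python
import numpy as np

# Fit a single weight w so that w * x approximates y, via gradient descent
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x                  # ground truth generated with w = 2

w = 0.0                      # initial guess
lr = 0.05                    # learning rate
for _ in range(200):
    y_pred = w * x
    loss = np.mean((y_pred - y) ** 2)     # MSE, our error metric
    grad = np.mean(2 * (y_pred - y) * x)  # dLoss/dw
    w -= lr * grad                        # step against the gradient
print(round(w, 3))  # converges to 2.0
```

For a full network the same update is applied to every weight at once, with the gradients supplied by backpropagation, which is covered next.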
Thank you!