ML Module I
➢ BY:
DR. ARUNDHATI DAS
Module I: Introduction to Machine Learning
• 1.1 Introduction to Machine Learning, Issues in Machine Learning,
Application of Machine Learning, Steps of developing a Machine
Learning Application.
• 1.2 Supervised and Unsupervised Learning: Concepts of
Classification, Clustering and prediction, Training, Testing and
validation dataset, cross validation, overfitting and underfitting of
model
• 1.3 Performance Measures: Measuring Quality of model- Confusion
Matrix, Accuracy, Recall, Precision, Specificity, F1 Score, RMSE
Module I: Introduction to Machine Learning
• 1.1 Introduction to Machine Learning, Issues in Machine Learning,
Application of Machine Learning, Steps of developing a Machine
Learning Application.
Module No. 1: Introduction
➢ Brief history of machine learning
➢ 1950s
➢ Samuel’s checkers-playing program
➢ Samuel coined the term "machine learning"
➢ 1960s
➢ Neural network: Rosenblatt’s perceptron
➢ The perceptron is a simple neural network unit. It loosely resembles how a biological neuron works
➢ Widrow and Hoff’s Delta Rule (Least Mean Squares Rule)
➢ The delta rule is used for training a perceptron-like unit
➢ It can be seen as a special case of backpropagation applied to a single-layer network
➢ Together, these developments gave rise to a good linear classifier
➢ Minsky and Papert pointed out some limitations of the perceptron
➢ Following this, neural network research went on a pause until the 1980s
➢ 1970s
➢ Symbolic concept induction
➢ J. R. Quinlan developed the ID3 decision tree learning algorithm (his widely cited paper on it was published in 1986)
➢ Subsequently, improvements to ID3 and alternatives such as CART and regression trees were developed
➢ 1980s
➢ Advanced decision tree and rule learning were developed
➢ Learning, planning, and problem solving were developed
➢ Revival of neural networks
➢ Multilayer perceptron in 1981
➢ The backpropagation algorithm for training neural networks was developed
➢ A lot of research was done on the multilayer perceptron after that
➢ A theoretical framework for machine learning was developed
➢ Valiant’s PAC (Probably Approximately Correct) learning
➢ Focus shifted to experimental methodologies
➢ 1990s
➢ Machine learning (ML) and statistics
➢ SVM proposed by Vapnik and Cortes
➢ SVM provided both theoretical and experimental/empirical grounding
➢ Adaptive agents, web applications
➢ Text learning
➢ Reinforcement learning
➢ Ensembles and boosting, AdaBoost
➢ 1994: first self-driving car road test
➢ 2000s
➢ Kernel SVM
➢ Random Forest
➢ Bayesian network learning
➢ Why is machine learning popular today?
➢ New software/algorithms
➢ Neural networks
➢ Deep learning
➢ New hardware
➢ GPUs
➢ Cloud-enabled computing
➢ Big data is available
Definition of ML
➢ Formal definition: Machine learning is about building computer systems that automatically improve with experience.
➢ Definition by Tom M. Mitchell: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance on tasks in T, as measured by P, improves with experience E.
➢ Measure of improvement in performance P: e.g., you may want to improve the accuracy of prediction, give the agent new skills that it did not previously possess, or improve the efficiency of problem solving.
DL vs ML vs AI (figure; courtesy: Internet)
DL vs ML vs AI vs DS (figure; courtesy: Internet)
DL vs ML vs AI vs GenAI vs LLM (figure; courtesy: Internet)
➢ Applications of machine learning
➢ Medicine
➢ Diagnose a disease
➢ i/p → symptoms, lab measurements, test results, DNA tests etc
➢ o/p→ one from the possible set of diseases or none of the above
➢ Background knowledge (examples)→ learn from past medical records
➢ Computer vision
➢ What/ where objects appear in an image
➢ Recognize hand-written digits or text in an image and convert them to characters (OCR: optical character recognition)
➢ Robot control
➢ Designing autonomous robots
➢ Financial
➢ Predict if a stock will rise or fall
➢ Predict if a user will click on an ad or not
➢ Improve customer’s experience in online shopping
➢ Miscellaneous applications
➢ Identify the price sensitivity of a consumer product and the optimum price point that maximizes profit
➢ Optimize product placement in a supermarket or retail outlet
➢ Credit card fraud detection
Steps of developing a Machine Learning Application
Data preparation or data pre-processing
• The process of preparing raw data so that it is suitable for further
processing and analysis.
• Data preparation or data pre-processing includes collecting, reshaping, filtering, merging, and cleaning raw data, removing outliers, handling missing values, normalizing features, and labelling, so that the data is in a form suitable for machine learning (ML) algorithms. It also includes exploring and visualizing the data.
Data preparation steps
• Data preparation steps: 1. Gather data, 2. Discover and assess the data, 3. Cleanse the data,
4. Transform and enrich the data, 5. Store the data.
• 1. Gather data
• The data preparation process begins with finding the right data. It can come from an existing data catalog, or data sources can be created for a particular application.
• 2. Discover and assess data
• After collecting the data, it is important to explore each dataset. This step is about getting to know the data and understanding what has to be done before the data becomes useful in a particular context.
• 3. Cleanse data
• Cleaning up the data is traditionally the most time-consuming part of the data
preparation process, but it’s crucial for removing faulty data and filling in gaps.
Important tasks here include:
• Removing extraneous data and outliers
• Filling in missing values
• Conforming data to a standardized pattern mostly by normalizing
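The three cleansing tasks above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: the height values are made up, missing entries are filled with the median, outliers are dropped with the 1.5 x IQR rule, and normalization is min-max scaling. The slides do not prescribe any of these particular choices.

```python
from statistics import median, quantiles

def fill_missing(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def remove_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (an assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def min_max_normalize(values):
    """Rescale values linearly so that the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical raw measurements: one missing value and one obvious outlier.
raw_heights_cm = [150, 160, None, 170, 165, 155, 1000]

filled = fill_missing(raw_heights_cm)      # missing value filled in
cleaned = remove_outliers(filled)          # the entry 1000 is dropped
normalized = min_max_normalize(cleaned)    # values rescaled to [0, 1]
print(normalized)
```

The median is used for filling because, unlike the mean, it is not pulled towards the outlier that is still present at that stage.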
• 4. Transform and enrich data
• Data transformation is the process of updating the format or value entries in
order to reach a well-defined outcome, or to make the data more easily
understood by a wider audience. Enriching data refers to adding and
connecting data with other related information to provide deeper insights.
• 5. Store data
• Once prepared, the data can be stored safely.
Module I: Introduction to Machine Learning
• 1.2 Supervised and Unsupervised Learning: Concepts of
Classification, Clustering and prediction, Training, Testing and
validation dataset, cross validation, overfitting and underfitting of
model
➢How to create a learning system
➢ Choose the training experience/data
➢ Choose the target function that is to be learnt
➢ Choose how we want to represent the model
➢ Choose a learning algorithm to infer the target function
➢ Different types of machine learning
➢ Supervised (inductive) learning (e.g., classification)
Training data includes desired outputs
➢ X, y
➢ Given a new observation x, what is the best label y?
➢ Unsupervised learning (e.g., clustering)
Training data does not include desired outputs
➢ X
➢ Given a set of x’s, cluster, group, or summarize them
➢ Semi-supervised learning
Training data includes desired outputs for only some examples (a few labelled, the rest unlabelled)
➢ Reinforcement learning
Rewards from a sequence of actions (reward for a correct action, penalty for an incorrect action)
➢ Determine what to do based on rewards and penalties
➢ Supervised (inductive) learning
➢ A set of training examples: values for the input features and the target features are given for each example
➢ Test data: a new example for which only the values of the input features are given
➢ Predict the values of the target features for the new example
➢ Classification when y is discrete
➢ Regression when y is continuous
➢ Feature:
➢ Categorical: eg. Color red, yellow, blue, green; Blood group A, B, AB, O
➢ Ordinal: eg. Small, medium, large
➢ Integer valued: eg. Class 1, 2,3,…
➢ Real valued (continuous): eg. height
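The difference between classification (discrete y) and regression (continuous y) can be illustrated with one simple learner used in both modes. The sketch below uses k-nearest neighbours, which is chosen here only for illustration and is not a method named in the slides. The feature values and targets are hypothetical.

```python
from collections import Counter

def k_nearest(train_X, x, k):
    """Indices of the k training examples closest to x (squared Euclidean distance)."""
    dists = [(sum((a - b) ** 2 for a, b in zip(row, x)), i)
             for i, row in enumerate(train_X)]
    return [i for _, i in sorted(dists)[:k]]

def knn_classify(train_X, train_y, x, k=3):
    """Classification: y is discrete, so predict the majority label of the neighbours."""
    neighbours = k_nearest(train_X, x, k)
    return Counter(train_y[i] for i in neighbours).most_common(1)[0][0]

def knn_regress(train_X, train_y, x, k=3):
    """Regression: y is continuous, so predict the mean target of the neighbours."""
    neighbours = k_nearest(train_X, x, k)
    return sum(train_y[i] for i in neighbours) / k

# Hypothetical training examples with two real-valued input features each.
X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]]
y_label = ["small", "small", "small", "large", "large", "large"]   # discrete target
y_value = [2.0, 2.0, 2.0, 10.0, 10.0, 10.0]                        # continuous target

new_x = [1.1, 1.0]   # a new example: only the input features are given
print(knn_classify(X, y_label, new_x))
print(knn_regress(X, y_value, new_x))
```

The same training inputs X are reused in both cases. Only the type of the target feature, and therefore how the neighbours' targets are combined, changes.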
Supervised Learning
➢ Model or hypothesis
➢ Representation
Fig. (2) Linear function (3) Multivariate linear function (Courtesy: S. Haykin book)
➢ Hypothesis
➢ A machine learning hypothesis is a candidate model that approximates a target function for mapping inputs
to outputs.
➢ Target function
➢ The target function is the true but unknown mapping from inputs to outputs that the learning algorithm tries to approximate from the training data.
➢ Hypothesis space
➢ Supervised learning algorithm/machine can be considered as a device that explores a hypothesis space
➢ Each setting of the parameters in the machine is a different hypothesis about the function that maps
input vectors to output vectors
➢ Features: Distinct traits that can be used to describe each item in a quantitative manner.
➢ Feature vector: n-dimensional vector of numerical features that represent some object/class.
➢ Feature space: the n-dimensional space in which the feature vectors lie, with one dimension per feature
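To make the idea of exploring a hypothesis space concrete, here is a minimal sketch that assumes a linear representation h(x) = w*x + b. Each (w, b) pair is one hypothesis, and the learner picks the pair with the lowest squared error on the training examples. The small parameter grid and the training examples are hypothetical.

```python
def make_hypothesis(w, b):
    """One hypothesis: a particular setting of the parameters (w, b)."""
    return lambda x: w * x + b

def squared_error(params, examples):
    """Sum of squared differences between the hypothesis output and the target."""
    h = make_hypothesis(*params)
    return sum((h(x) - y) ** 2 for x, y in examples)

# Hypothetical training examples generated by the (unknown to the learner)
# target function y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# A tiny, discretized hypothesis space: every combination of w and b below.
hypothesis_space = [(w, b) for w in (0.0, 1.0, 2.0, 3.0) for b in (0.0, 1.0, 2.0)]

# "Learning" here is an exhaustive search for the best-fitting hypothesis.
best = min(hypothesis_space, key=lambda p: squared_error(p, examples))
print(best, squared_error(best, examples))
```

Practical learning algorithms search much larger (usually continuous) hypothesis spaces and therefore use smarter strategies than exhaustive search, but the role of the hypothesis space is the same.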
➢Some Terminologies
➢ Inductive learning (or prediction or concept learning): process where learner discovers rules by observing
examples or on the basis of past experience, formulating a generalized concept.
Figure: Hypothesis and hypothesis space
➢ IRIS data set
➢ Feature values, class labels (target feature)
Outline:
Model Validation in Classification: Cross Validation -
Holdout Method, K-Fold, Stratified K-Fold, Leave-One-Out
Cross Validation. Bias-Variance tradeoff, Regularization,
Overfitting, Underfitting.
Model Validation in Classification
• What is model validation?
• Model validation means checking whether the designed machine learning model classifies correctly (performs well) on unseen data.
• Validating also means checking that correct classifications are not merely the result of chance.
• Why is model validation required?
• To see how the model will perform on unseen data.
• To confirm whether the model will perform equally well outside the laboratory datasets.
• How is model validation done?
• There are various ways of validating a model, among which the most popular techniques are Train/Test split and Cross Validation.
➢ Experimental evaluation of machine learning algorithms
➢ Importance of evaluation
➢ We need to evaluate how well the trained model predicts the class labels of data points
➢ Sampling methods
➢ Training and test sets: disjoint sets are preferred
➢ K-fold cross validation: the data is split into k folds, and each fold is held out once while the model is trained on the rest; it is also used to tune the parameters of the algorithm
How is model validation done?
1. i. Train/Test split
• The original dataset is split into a Training set and a Test set.
• The dataset can be divided 70-30, 60-40, 75-25, 80-20, or even 50-50, depending on the application at hand. As a rule, the training portion is usually the larger one.
• The machine learning model is built on the Training set.
• The built ML model can be tested on both the Training set (data already seen by the model) and the Test set (data unseen by the model).
• The model will give one accuracy value on the Training set and another accuracy value on the Test set.
• If the accuracies on the Training set and the Test set are similar or close, the model is said to be a well-trained model.
• If the accuracy on the Test set is substantially lower than the accuracy on the Training set, the model is not a well-trained model. This situation is called overfitting.
(Figure: Data Set divided into a Training Set and a Test Set)
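The train/test procedure described above can be sketched as follows. This is an illustrative sketch only: the data is synthetic (two overlapping one-dimensional classes), the split is 80-20, and the model is a simple nearest-centroid classifier chosen for brevity. None of these choices come from the slides.

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and split it into disjoint train and test lists."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_centroids(train):
    """Build the model on the training set: the mean feature value of each class."""
    sums, counts = {}, {}
    for x, label in train:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def accuracy(centroids, data):
    """Fraction of examples in data whose predicted label equals the true label."""
    correct = sum(1 for x, label in data if predict(centroids, x) == label)
    return correct / len(data)

# Synthetic data (an assumption): class A centred at 0, class B centred at 3.
rng = random.Random(42)
data = ([(rng.gauss(0.0, 1.0), "A") for _ in range(50)]
        + [(rng.gauss(3.0, 1.0), "B") for _ in range(50)])

train, test = train_test_split(data, test_fraction=0.2)
model = fit_centroids(train)                  # the model only ever sees the train set
print("train accuracy:", accuracy(model, train))
print("test accuracy: ", accuracy(model, test))
```

Comparing the two printed values is the check described above: close values suggest the model generalizes, while a test accuracy far below the train accuracy points to overfitting.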
1. ii. Train/Validation/Test split
2. i. K-fold Cross Validation
• Cross-validation is another model validation technique popularly used in machine learning. The steps are as follows:
1. The dataset is randomly split into ‘n’ sets (groups) of equal size.
2. One of the sets is used as the test set and the rest are used as the training set.
3. The model is trained on the training set and tested on the test set.
4. The process is repeated for ‘n’ iterations until each set has been used once as the test set.
(Figure: Data Set divided into Group no. 1, Group no. 2, ..., Group no. n. Courtesy: Internet)
2. ii. Stratified K-Fold Cross-Validation
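The index bookkeeping behind k-fold cross-validation, and its stratified variant, can be sketched in plain Python. In this sketch the stratified version deals the examples of each class round-robin across the folds so that every fold keeps roughly the class proportions of the whole dataset. The function names and the imbalanced label list are hypothetical.

```python
import random
from collections import defaultdict

def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition the indices 0..n_samples-1 into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def stratified_k_fold_indices(labels, k, seed=0):
    """Partition indices into k folds while preserving the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for i in members:                      # deal this class out round-robin
            folds[position % k].append(i)
            position += 1
    return folds

def train_test_rounds(folds):
    """Yield (train_indices, test_indices), using each fold exactly once as the test set."""
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# Hypothetical imbalanced dataset: 4 positive and 16 negative examples.
labels = ["pos"] * 4 + ["neg"] * 16
folds = stratified_k_fold_indices(labels, k=4)
for train_idx, test_idx in train_test_rounds(folds):
    n_pos = sum(labels[i] == "pos" for i in test_idx)
    print(len(train_idx), len(test_idx), n_pos)
```

With plain k-fold on such imbalanced data, a fold could end up with no positive examples at all. The stratified variant avoids that by construction.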
➢ Underfitting:
➢ Model is too simple to represent all the
relevant class characteristics
➢ High training error, high test error
➢ Low variance, high bias
(Figure: Underfitting vs. Overfitting)
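The contrast between a model that is too simple and one that is too flexible can be seen on a small synthetic regression problem. The sketch below is only an illustration under stated assumptions: the data is noisy samples of y = 2x, the "too simple" model always predicts the mean, the "too flexible" model memorizes the training points (1-nearest neighbour), and a least-squares line is included for comparison.

```python
import random

rng = random.Random(1)

def sample(n):
    """Noisy samples of y = 2x with x in [0, 1] (synthetic data, an assumption)."""
    points = []
    for _ in range(n):
        x = rng.random()
        points.append((x, 2 * x + rng.gauss(0, 0.3)))
    return points

train, test = sample(30), sample(30)

def constant_model(data):
    """Too simple: always predicts the mean training target (high bias)."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def linear_model(data):
    """Least-squares line y = w*x + b, which matches the structure of the data."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in data)
         / sum((x - mean_x) ** 2 for x, _ in data))
    return lambda x: w * x + (mean_y - w * mean_x)

def memorizing_model(data):
    """Too flexible: returns the target of the nearest training point (high variance)."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    """Mean squared error of the model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

for name, build in [("constant", constant_model),
                    ("linear", linear_model),
                    ("memorizing", memorizing_model)]:
    model = build(train)
    print(f"{name:>10}: train MSE = {mse(model, train):.3f}, "
          f"test MSE = {mse(model, test):.3f}")
```

The constant model has a high error on both sets (underfitting), while the memorizing model has zero training error because it reproduces the noise in the training targets, which is the typical signature of overfitting.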
Solution for overfitting
• How do we deal with this?
1) Reduce number of features
• Manually select which features to keep
• But, in reducing the number of features we lose some information
• Ideally, select those features which minimize information loss, but even so, some information is lost
2) Regularization
• Keep all features, but reduce magnitude of parameters θ
• Works well when we have a lot of features, each of which contributes a bit to
predicting y
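As a rough illustration of how regularization reduces the magnitude of the parameters θ, the sketch below fits a linear model without an intercept by gradient descent, once without and once with an L2 penalty. The L2 form of the penalty, the learning rate, the value of lam, and the tiny dataset are all assumptions made for this example.

```python
import math

def fit_linear(X, y, lam=0.0, lr=0.05, steps=5000):
    """Gradient descent on J = (1/2m) * sum((theta.x - y)^2) + (lam/2m) * sum(theta_j^2)."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(steps):
        errors = [sum(t * xj for t, xj in zip(theta, row)) - target
                  for row, target in zip(X, y)]
        grad = [(sum(e * row[j] for e, row in zip(errors, X)) + lam * theta[j]) / m
                for j in range(n)]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

# Hypothetical data generated exactly by y = 1*x1 + 2*x2.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y = [5.0, 4.0, 11.0, 10.0]

theta_plain = fit_linear(X, y, lam=0.0)    # no regularization
theta_reg = fit_linear(X, y, lam=10.0)     # L2-regularized

print("without regularization:", theta_plain, math.hypot(*theta_plain))
print("with regularization:   ", theta_reg, math.hypot(*theta_reg))
```

All features are kept in both fits. The penalty only shrinks the parameter vector, trading a slightly worse fit on the training data for a simpler model.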
Module I: Introduction to Machine Learning
• 1.3 Performance Measures: Measuring Quality of model- Confusion
Matrix, Accuracy, Recall, Precision, Specificity, F1 Score, RMSE
Classification metrics for Performance measure
➢ Classification metrics are used to evaluate the performance of data mining/machine learning algorithms
$\text{Specificity} = \dfrac{TN}{TN + FP}$
Definitions
• Confusion matrix: A Confusion matrix is an N x N matrix used for evaluating
the performance of a classification model, where N is the number of target
classes. The matrix compares the actual target values with those predicted by the
machine learning model.
• Accuracy: Accuracy simply measures how often the classifier makes the correct
prediction. It’s the ratio between the number of correct predictions and the total
number of predictions.
• Precision: Precision measures how many of the positive predictions are correct, i.e. out of all observations predicted as positive, how many are actually positive.
• Recall: Recall measures how many of the actual positives are found, i.e. out of all observations of the positive class, how many are predicted as positive. It is also known as Sensitivity or TPR (True Positive Rate).
• Specificity: Specificity is the measure of a test's ability to correctly identify true
negatives (i.e., correctly excluding cases where the condition is absent). Also
known as TNR (True Negative Rate).
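The definitions above, together with the F1 score and RMSE listed in the module outline, can be computed directly from the confusion-matrix counts. The following is a minimal sketch for a two-class problem. The dog vs. cat labels and the predictions are made-up values for illustration, with "dog" treated as the positive class, and the F1 and RMSE formulas used are the standard ones rather than formulas taken from the slides.

```python
import math

def confusion_counts(actual, predicted, positive):
    """Return (TP, FP, TN, FN) for a binary problem with the given positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive and a != positive:
            fp += 1
        elif p != positive and a != positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def classification_metrics(tp, fp, tn, fn):
    """Standard metrics; assumes none of the denominators is zero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # sensitivity, TPR
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),           # TNR
        "f1": 2 * precision * recall / (precision + recall),
    }

def rmse(actual, predicted):
    """Root mean squared error, used for regression outputs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical predictions for 5 dogs and 5 cats.
actual = ["dog"] * 5 + ["cat"] * 5
predicted = ["dog", "dog", "dog", "dog", "cat", "cat", "cat", "cat", "dog", "dog"]

counts = confusion_counts(actual, predicted, positive="dog")
metrics = classification_metrics(*counts)
print(counts)
print(metrics)
print(rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
```

Here 4 dogs are correctly predicted (TP), 1 dog is missed (FN), 3 cats are correctly rejected (TN), and 2 cats are wrongly predicted as dogs (FP).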
(Figure: Confusion matrix in the case of a 3-class problem)
Numerical on example dataset
• An example on a sample dataset (dog vs. cat) was discussed in class.
• The numerical worked out on the board is important; practice it.
Q and A
THANK YOU