
Machine Learning Notes

Module 1
Applications of Machine Learning:
 Virtual Personal Assistants
 Speech recognition
 Email spam and malware filtering
 Bioinformatics
 Natural language processing
 Online Transportation
 Social Media Services
 Product Recommendation
 Online Fraud Detection
 Traffic prediction
Advantages of ML:
 Fast, accurate, and efficient.
 Automation of most applications.
 Wide range of real-life applications.
 Enhanced cyber security and spam detection.
 No human intervention is needed.
 Handles multi-dimensional data.
Disadvantages of ML:
 It is very difficult to identify and rectify errors.
 Data acquisition is a challenge.
 Interpretation of results requires more time and space.
Artificial Intelligence is the concept of creating intelligent machines that simulate human behavior, whereas Machine Learning is a subset of Artificial Intelligence that allows machines to learn from data without being explicitly programmed.
Disadvantages of Supervised Learning: not suitable for handling complex tasks; cannot predict the correct output if the test data differ from the training dataset; training requires a lot of computation time; and we need sufficient knowledge about the classes of objects.
Advantages of Unsupervised Learning: can be used for more complex tasks, since we don't need labeled input data; preferable because unlabeled data are easier to obtain than labeled data.
Disadvantages of Unsupervised Learning: intrinsically more difficult than supervised learning, as there is no corresponding output. The result may be less accurate because the input data are not labeled and the algorithm does not know the exact output in advance.

In machine learning projects, we generally divide the original dataset into a training dataset and a testing dataset. We train our model on a subset of the original dataset, i.e., the training dataset (>= 60%), and then evaluate whether it generalizes well to the new or unseen dataset, i.e., the test set (20-25%).
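A minimal sketch of such a split, assuming scikit-learn is available; the data and the 75/25 ratio are illustrative:

```python
from sklearn.model_selection import train_test_split
import numpy as np

X = np.arange(20).reshape(10, 2)    # 10 samples, 2 features (illustrative)
y = np.arange(10)                   # 10 target values

# Hold out 25% of the data as the unseen test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)
print(X_train.shape, X_test.shape)  # (7, 2) (3, 2)
```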

If the accuracy of the model on the training data is much greater than on the testing data, the model is said to be overfitted. On the other hand, the model is said to be underfitted when it is not able to capture the underlying trend of the data, meaning it shows poor performance even on the training dataset. In most cases, underfitting occurs when the model is not suitable for the problem we are trying to solve. To avoid the underfitting issue, we can either increase the training time of the model or increase the number of features in the dataset.
Basic steps of cross-validations are:
 Reserve a subset of the dataset as a validation set.
 Provide the training to the model using the training dataset.
 Now, evaluate model performance using the validation set. If the model performs well on the validation set, proceed to the next steps; otherwise, check for issues.
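A minimal k-fold sketch of these steps, assuming scikit-learn; the model and data are illustrative:

```python
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
import numpy as np

X = np.random.rand(100, 3)           # illustrative feature matrix
y = X @ np.array([1.0, 2.0, 3.0])    # illustrative targets

kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, val_idx in kf.split(X):
    model = LinearRegression()
    # Train on the training folds.
    model.fit(X[train_idx], y[train_idx])
    # Evaluate on the reserved validation fold.
    scores.append(model.score(X[val_idx], y[val_idx]))
print(sum(scores) / len(scores))     # mean validation R^2 across folds
```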
Hypothesis space (H): Defined as the set of all possible legal hypotheses; hence it is also known as a hypothesis set. It is searched by supervised machine learning algorithms to determine the best possible hypothesis to describe the target function, i.e., the one that best maps inputs to outputs.
Hypothesis (h): Defined as the approximate function that best describes the target in supervised machine learning algorithms. It is based primarily on the data, as well as on the bias and restrictions applied to the data.
Steps to perform hypothesis test are as follows:
1) Formulate a Hypothesis
2) Determine the significance level
3) Determine the type of test
4) Calculate the test statistic and the p-value. The p-value is a probability (between 0 and 1), computed under the assumption that the null hypothesis is true, of obtaining results at least as extreme as those observed.
5) Make Decision
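A minimal sketch of these steps using a one-sample t-test, assuming SciPy; the sample values and the hypothesized mean of 50 are illustrative:

```python
from scipy import stats
import numpy as np

sample = np.array([51.2, 49.8, 50.5, 52.1, 48.9, 51.7, 50.3, 49.5])

alpha = 0.05                                      # 2) significance level
# 3)-4) one-sample t-test of H0: population mean == 50
t_stat, p_value = stats.ttest_1samp(sample, popmean=50)
# 5) decision
if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")
```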
Accuracy = Number of correct predictions / Total number of predictions
Confusion Matrix:
              Predicted: NO         Predicted: YES
Actual: NO    True Negative (TN)    False Positive (FP)
Actual: YES   False Negative (FN)   True Positive (TP)
Precision (P) = TP / (TP + FP)
Recall (R) = TP / (TP + FN)
F1 score = (2 × P × R) / (P + R)
If we maximize precision, it will minimize the FP errors, and if we maximize recall, it will
minimize the FN error.
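A minimal sketch computing these metrics directly from the confusion-matrix counts; the counts are illustrative:

```python
# Illustrative confusion-matrix counts.
TP, FP, FN, TN = 40, 10, 5, 45

accuracy  = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall    = TP / (TP + FN)
f1        = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)  # 0.85 0.8 0.888... 0.842...
```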
AUC-ROC curve: Probability curve that plots TPR against FPR at various threshold values
and separates signal from noise.
TPR (Sensitivity) = TP / (TP + FN)
FPR = FP / (FP + TN)
Specificity = TN / (TN + FP)
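A minimal sketch of the curve's ingredients, assuming scikit-learn; the labels and predicted scores are illustrative:

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]                    # actual classes
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3]   # predicted probabilities

# FPR and TPR at each threshold; the AUC summarizes the whole curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(roc_auc_score(y_true, y_score))
```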
Performance metrics for regression:
Mean Absolute Error (MAE) = (1/N) Σ |Y − Y′|
Mean Squared Error (MSE) = (1/N) Σ (Y − Y′)²
R² (coefficient of determination) = 1 − MSE(model) / MSE(baseline) = 1 − SS_res / SS_tot
SS_res = Σᵢ (yᵢ − ŷᵢ)²
SS_tot = Σᵢ (yᵢ − ȳ)²
where yᵢ = observed values, ŷᵢ = predicted values, ȳ = mean of the observed values.
 R² = 1: perfect fit (explains all variance)
 R² = 0: explains none of the variance
 R² < 0: performs worse than simply predicting the mean
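A minimal NumPy sketch of these three metrics; the arrays are illustrative:

```python
import numpy as np

y      = np.array([3.0, 5.0, 7.0, 9.0])   # observed values
y_pred = np.array([2.8, 5.3, 6.6, 9.4])   # predicted values

mae = np.mean(np.abs(y - y_pred))          # mean absolute error
mse = np.mean((y - y_pred) ** 2)           # mean squared error
ss_res = np.sum((y - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
r2 = 1 - ss_res / ss_tot                   # coefficient of determination
print(mae, mse, r2)
```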

Module 2
Simple linear regression is used when you want to predict values of one variable given values of another variable. It is the same as a bivariate correlation between the independent and dependent variables.
The purpose of regression analysis is to come up with the equation of a line that fits through the cluster of points with the minimal amount of deviation from the line.
Each data point's deviation dᵢ is the difference between the observed y-value and the predicted y-value for a given x-value on the line. These differences are called residuals.
The regression line (line of best fit) is the line for which the sum of the squares of the residuals is minimal.
ŷ = mx + b
RMSE = √MSE
R² = 1 − Σ(yᵢ − ŷᵢ)² / Σ(yᵢ − ȳ)²
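A minimal sketch of fitting the regression line with the closed-form least-squares formulas; the data are illustrative:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.2, 5.9, 8.1, 9.8])

# Slope and intercept that minimize the sum of squared residuals.
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - m * x.mean()

y_hat = m * x + b                          # predictions on the fitted line
rmse = np.sqrt(np.mean((y - y_hat) ** 2))  # RMSE = sqrt(MSE)
print(m, b, rmse)
```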
Multiple regression equation: ŷ = b + m1x1 + m2x2 + m3x3 + … + mkxk, where the xi are the independent variables.
Logistic regression equation: p = 1 / (1 + e^−(b + m1x1 + … + mkxk))
Logistic regression can't use MSE as its loss because it produces a wavy (non-convex) cost graph; a logarithmic function is used instead.
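A minimal sketch of the sigmoid and the logarithmic (log-loss) cost it is paired with; the labels and scores are illustrative:

```python
import numpy as np

def sigmoid(z):
    # Maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p):
    # Logarithmic loss; convex in the parameters, unlike MSE here.
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

y = np.array([1, 0, 1, 1])                     # true labels
p = sigmoid(np.array([2.0, -1.5, 0.8, 3.0]))   # predicted probabilities
print(log_loss(y, p))
```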

Gradient descent by differentiating:
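Differentiating the MSE loss of ŷ = mx + b gives ∂MSE/∂m = −(2/N) Σ x(y − ŷ) and ∂MSE/∂b = −(2/N) Σ (y − ŷ). A minimal sketch of the resulting update loop; the learning rate, iteration count, and data are illustrative:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])   # true relation: y = 2x + 1

m, b, lr = 0.0, 0.0, 0.01            # initial parameters, learning rate
for _ in range(5000):
    y_hat = m * x + b
    # Gradients of MSE with respect to m and b.
    dm = -2 * np.mean(x * (y - y_hat))
    db = -2 * np.mean(y - y_hat)
    m -= lr * dm                     # step opposite the gradient
    b -= lr * db
print(m, b)                          # approaches 2 and 1
```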

A loss function measures the performance of a model as the difference between the expected output and the actual output produced by the model.
The optimizer helps improve the model by adjusting its parameters to minimize the loss
function value.
General Optimization Algorithm Structure:
 Initialize variables or population.
 Evaluate the objective function for current solutions.
 Update parameters using search strategy (gradient step, mutation, exploration, etc.).
 Check stopping criteria (max iterations, convergence, tolerance).
 Return the best solution.
Steepest Descent: In this method, the search starts from an initial trial point X1, and
iteratively moves along the steepest descent directions until the optimum point is found.
Newton’s method: Based on a second-order Taylor series expansion of the objective function; each update uses both the gradient and the Hessian: xₖ₊₁ = xₖ − H⁻¹ ∇f(xₖ).
Derivative-free optimization algorithms are often used when it is difficult to find function derivatives, or when finding such derivatives is time-consuming.
Random Search: This method generates trial solutions for the optimization model using random number generators for the decision variables. Random search methods include the random jump method, the random walk method, and the random walk method with direction exploitation.
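A minimal random jump sketch that instantiates the general optimization structure above (initialize, evaluate, update the best, stop at a max iteration count); the objective and bounds are illustrative:

```python
import numpy as np

def f(x):
    # Illustrative objective: minimum 0 at x = (1, 1).
    return (x[0] - 1) ** 2 + (x[1] - 1) ** 2

rng = np.random.default_rng(0)
best_x, best_f = None, np.inf
for _ in range(10000):                 # stopping criterion: max iterations
    x = rng.uniform(-5, 5, size=2)     # random trial solution
    fx = f(x)                          # evaluate the objective
    if fx < best_f:                    # keep the best solution so far
        best_x, best_f = x, fx
print(best_x, best_f)
```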
Simplex Method: A conventional direct search algorithm in which the best solution lies at the vertices of a geometric figure in N-dimensional space made of a set of N+1 points.
The Nelder–Mead method (also downhill simplex method, amoeba method, or polytope method) is a numerical method used to find the minimum or maximum of an objective function in a multidimensional space. The simplex is a triangle in 2D, a tetrahedron in 3D, and a pentachoron in 4D.
Steps: Sort → Reflect → Expand → Contract → Shrink → Check convergence
Example:
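A minimal sketch using SciPy's Nelder–Mead implementation; the objective (the classic Rosenbrock test function) and the starting point are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Rosenbrock function: minimum 0 at (1, 1).
    return 100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2

result = minimize(f, x0=np.array([-1.2, 1.0]), method="Nelder-Mead")
print(result.x, result.fun)   # typically converges near [1, 1]
```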
