Artificial Intelligence
(CSE3007)
Unit – 04 (Part-I)
Machine Learning Algorithms
Dr. Susant Kumar Panigrahi
Assistant Professor
School of Electrical & Electronics Engineering
What is Regression?
• The main goal of regression is to build an efficient model that predicts the dependent attribute from a set of attribute variables. A regression problem is one where the output variable is a real or continuous value, e.g. salary, weight, or area.
• We can also define regression as a statistical method, used in applications like housing and investing, to estimate the relationship between a dependent variable and a set of independent variables.
Applications of Regression
1. Evaluating Trends and Sales Estimates
• Linear regressions can be used in business to
evaluate trends and make estimates or forecasts.
• For example, if a company’s sales have increased steadily every month for the past few years, conducting a linear analysis on the sales data, with monthly sales on the y-axis and time on the x-axis, would produce a line that depicts the upward trend in sales. After creating the trend line, the company could use the slope of the line to forecast sales in future months.
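The trend-line idea above can be sketched in a few lines of Python. The monthly sales figures here are invented for illustration; `np.polyfit` with degree 1 fits the least-squares trend line, whose slope is then used to forecast a future month.

```python
import numpy as np

months = np.arange(1, 13)  # time on the x-axis (months 1..12)
# Hypothetical monthly sales on the y-axis, rising steadily
sales = np.array([102, 108, 113, 121, 124, 131, 137, 141, 149, 153, 160, 164])

# Degree-1 polynomial fit = the least-squares trend line
slope, intercept = np.polyfit(months, sales, 1)

# Use the slope of the trend line to forecast sales in month 15
forecast = slope * 15 + intercept
print(slope, forecast)
```

Here the positive slope quantifies the monthly growth, and extending the line past the observed months gives the forecast.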
2. Analyzing the Impact of Price Changes
• Linear regression can also be used to analyze the
effect of pricing on consumer behavior.
• For example, if a company changes the price of a certain product several times, it can record the quantity it sells at each price level and then perform a linear regression with quantity sold as the dependent variable and price as the explanatory variable. The result would be a line depicting the extent to which consumers reduce their consumption of the product as prices increase, which could help guide future pricing decisions.
3. Assessing Risk
Simple Linear Regression
• One of the most interesting and common regression techniques is simple linear regression. In this, we predict the outcome of a dependent variable based on the independent variables, and the relationship between the variables is linear; hence the name linear regression.
• Simple linear regression is a regression technique in which the independent variable has a linear relationship with the dependent variable. The straight line in the diagram is the best-fit line.
• The main goal of simple linear regression is to take the given data points and plot the best-fit line through them.
The Main Idea of Least Square and Linear Regression
[Figure: data points of some observations, with the dependent variable on the y-axis and the independent variable on the x-axis, and several candidate lines. But which among these lines best fits the data for future prediction?]
The Main Idea of Least Square and Linear Regression
Let’s measure how well this line fits the data, starting with a worst-case scenario: a horizontal line. For each data point, measure the vertical distance (residual) between the point and the line.
Finally, to make the cost positive and more mathematically meaningful, each difference term is squared and the squares are added together to measure the fit:

sum of squared residuals = 24.62

This measure indicates how well the line fits the data.
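A minimal sketch of this fit measure, on invented data points (the slide's own values such as 24.62 come from its data set, which is not reproduced here):

```python
def sum_squared_residuals(points, m, c):
    """Sum of squared vertical distances between each point and the line y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in points)

data = [(1, 2.0), (2, 2.4), (3, 3.5), (4, 3.8), (5, 5.1)]  # invented points

worst = sum_squared_residuals(data, 0.0, 3.36)    # horizontal line through the mean of y
better = sum_squared_residuals(data, 0.76, 1.08)  # a rotated candidate line
print(worst, better)
```

The smaller the sum, the better the candidate line fits the data, exactly as in the rotation experiment on the slides.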
Rotate the line a little bit and check how well it fits: sum of squared residuals = 18.72.
Rotate the line a little bit more: sum of squared residuals = 14.05.
Rotate the line a whole lot: sum of squared residuals = 31.71.
There is a sweet spot between the horizontal line and the last, heavily rotated line, at which we get the optimal value of the fit.

The generic line equation for the above linear regression is:

y = m·x + c

where m is the slope and c is the y-intercept. We need to find the optimal values of m and c that minimize the sum of squared residuals:

Sum of squared residuals = Σᵢ (yᵢ − (m·xᵢ + c))²

Because we look for the values of m and c that give the smallest sum of squared residuals, the method is called “Least Squares”.
How do we find the optimal rotation: “We take the derivative of this function.”
The derivative tells us the slope of the function at every point…
Notice: the slope at the best point (the “Least Squares” solution) is zero.
Different rotations are the different values of slope m and y-intercept c.
The big concepts…!!!!!
• We want to minimize the squares of the distances between the observed values and the line.
• We do this by taking the derivative and finding the values of slope and y-intercept where it equals zero.
• The final line minimizes the sum of squares (“least squares”) between it and the real data.
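These concepts can be checked numerically: the closed-form least-squares slope and intercept (obtained by setting the derivatives to zero) give a smaller sum of squared residuals than any nearby line. The data points below are hypothetical.

```python
def ssr(points, m, c):
    """Sum of squared residuals for the line y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in points)

data = [(1, 2.0), (2, 2.4), (3, 3.5), (4, 3.8), (5, 5.1)]  # invented points

xs = [x for x, _ in data]
ys = [y for _, y in data]
xbar = sum(xs) / len(xs)
ybar = sum(ys) / len(ys)

# Setting d(SSR)/dm = 0 and d(SSR)/dc = 0 yields the closed-form solution
m = sum((x - xbar) * (y - ybar) for x, y in data) / sum((x - xbar) ** 2 for x in xs)
c = ybar - m * xbar

best = ssr(data, m, c)
# Nudging slope or intercept in any direction can only increase the SSR
assert all(ssr(data, m + dm, c + dc) >= best
           for dm in (-0.1, 0.0, 0.1) for dc in (-0.1, 0.0, 0.1))
print(m, c)
```

The inner assertion is the “slope of the cost function is zero at the best point” idea made concrete: the fitted (m, c) sits at the bottom of the bowl.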
Understanding Linear Regression Algorithm
x̄ = mean of x
ȳ = mean of y
Centroid: (x̄, ȳ)
The best-fit regression line must pass through the centroid.
So we need to find the equation of the line that passes through the centroid point, using the least-squares approach.
Finding the equation of line …..
The generic line equation for the above linear regression is y = m·x + c, with

m = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² = 4/10 = 0.4
c = ȳ − m·x̄ = 3.6 − 0.4 × 3 = 2.4
The Predicted Line…..
Goodness of fit…. – R2
WHAT IS R-SQUARED?
R-squared is a statistical measure of how close the data are to the fitted
regression line.
It is also known as the coefficient of determination, or the coefficient of
multiple determination for multiple regression.
The definition of R-squared is fairly straightforward; it is the percentage of the response-variable variation that is explained by a linear model.
R-squared = Explained variation / Total variation
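A short sketch of the R² formula above, using invented actual/predicted values:

```python
# Hypothetical observed values and the predictions of a fitted line
actual = [2.0, 2.4, 3.5, 3.8, 5.1]
predicted = [1.84, 2.60, 3.36, 4.12, 4.88]

mean_y = sum(actual) / len(actual)
total_variation = sum((y - mean_y) ** 2 for y in actual)
unexplained_variation = sum((y - p) ** 2 for y, p in zip(actual, predicted))

# R-squared = explained variation / total variation
r_squared = 1 - unexplained_variation / total_variation
print(r_squared)
```

A value near 1 means most of the variation in the actual values is explained by the line; a value near 0 means almost none is.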
Calculation of R²

R² ≈ 0.3
Interpretation of values of R2
R² = 1: the regression line is a perfect fit to the actual values.
R² = 0: there is a large distance between the actual and predicted values; the line explains none of the variation.
Advantages And Disadvantages
Advantages:
• Linear regression performs exceptionally well for linearly separable data
• It is easy to implement and interpret, and efficient to train
• It handles overfitting fairly well using dimensionality-reduction techniques, regularization, and cross-validation
• It allows extrapolation beyond a specific data set

Disadvantages:
• It assumes linearity between the dependent and independent variables
• It is often quite prone to noise and overfitting
• It is quite sensitive to outliers
• It is prone to multicollinearity
Solve it
• Use least-squares regression to fit a straight line to
• Also find the goodness of fit. Analyze the result.
Logistic Regression
What is Regression?
• Regression analysis is a powerful statistical analysis technique. The values of independent variables in a data set are used to predict a dependent variable of interest.
• We come across regression in an intuitive way all the time. Like predicting the
weather using the data-set of the weather conditions in the past.
• It uses many techniques to analyze and predict the outcome, but the emphasis is mainly on the relationship between a dependent variable and one or more independent variables.
• Logistic regression analysis predicts the outcome in a binary variable which
has only two possible outcomes.
What Is Logistic Regression?
• Logistic regression is a classification algorithm, used when the
value of the target variable is categorical in nature. Logistic
regression is most commonly used when the data in question
has binary output, so when it belongs to one class or another, or
is either a 0 or 1.
• Remember that classification tasks have discrete categories, unlike regression tasks.
• Logistic Regression is a Machine Learning algorithm used for classification problems; it is a predictive-analysis algorithm based on the concept of probability.
Logistic Regression
• It is a technique to analyze a data-set which has a dependent
variable and one or more independent variables to predict
the outcome in a binary variable, meaning it will have only
two outcomes.
• The dependent variable is categorical in nature. The dependent variable is also referred to as the target variable, and the independent variables are called the predictors.
• Logistic regression can be viewed as an extension of linear regression in which we predict the outcome as a categorical variable. It predicts the probability of the event by modeling the log-odds as a linear function of the predictors.
• We use the Sigmoid function/curve to predict the
categorical value. The threshold value decides the
outcome(win/lose).
• We can call Logistic Regression a Linear Regression model, but Logistic Regression uses a more complex cost function. This function can be defined as the ‘Sigmoid function’, also known as the ‘logistic function’, instead of a linear function.
• The hypothesis of logistic regression requires the output to lie between 0 and 1. Linear functions fail to represent this, as they can take values greater than 1 or less than 0, which is not possible as per the hypothesis of logistic regression.
What is the Sigmoid Function?
• In order to map predicted values to probabilities, we use the
Sigmoid function. The function maps any real value into
another value between 0 and 1. In machine learning, we use
sigmoid to map predictions to probabilities.
• The sigmoid function/logistic function is a function that
resembles an “S” shaped curve when plotted on a graph. It
takes values between 0 and 1 and “squishes” them towards
the margins at the top and bottom, labeling them as 0 or 1.
• The equation for the Sigmoid function is:

S(y) = 1 / (1 + e^(−y))
• What is the variable e in this instance? It is the exponential constant (Euler's number), with a value of approximately 2.71828.
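The sigmoid can be written directly from its equation; a quick check confirms it squishes any real input into (0, 1):

```python
import math

def sigmoid(y):
    """Logistic function: maps any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))

print(sigmoid(0))    # midpoint of the S-curve: 0.5
print(sigmoid(6))    # large inputs squish towards 1
print(sigmoid(-6))   # very negative inputs squish towards 0
```

Applying a threshold (commonly 0.5) to the sigmoid output turns the probability into a 0/1 class label.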
Example:
People in different age groups who either bought insurance or did not.
Have_insurance = 1 (bought insurance)
Have_insurance = 0 (no insurance)
Applying Linear Regression
Applying Linear Regression Thresholding
[Figure: values above the threshold line are labeled “likely to buy insurance”.]
Applying Linear Regression Thresholding
[Let’s assume we have another extreme value. The fitted line tilts, points near the threshold are now labeled “unlikely to buy insurance”, and the new predictions are more erroneous.]
Sigmoid or Logit Function
S(y) = 1 / (1 + e^(−y))
Linear Regression and Logistic Regression Relationship

1. Definition: linear regression predicts a continuous dependent variable from the values of the independent variables; logistic regression predicts a categorical dependent variable from the values of the independent variables.
2. Variable type: continuous dependent variable vs. categorical dependent variable.
3. Estimation method: least-squares estimation vs. maximum-likelihood estimation.
4. Equation: y = a₀ + a₁x vs. log(p/(1−p)) = a₀ + a₁x₁ + a₂x₂ + … + aₙxₙ.
5. Best-fit line: straight line vs. curve.
6. Relationship between dependent and independent variables: linear vs. non-linear.
7. Output: predicted continuous value vs. predicted binary value (0/1).
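As a sketch of the table's contrast, the age-vs-insurance idea from earlier can be fit with logistic regression. The ages, labels, learning rate, and iteration count below are all invented; stochastic gradient ascent on the log-likelihood stands in for the maximum-likelihood estimation the table mentions.

```python
import math

# Hypothetical data: ages and whether each person bought insurance (0/1)
ages   = [22, 25, 28, 30, 35, 40, 45, 50, 55, 60]
bought = [ 0,  0,  0,  0,  1,  0,  1,  1,  1,  1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

a0, a1 = 0.0, 0.0   # intercept and slope of the log-odds line
lr = 0.001          # learning rate (arbitrary choice)
for _ in range(20000):
    for x, y in zip(ages, bought):
        p = sigmoid(a0 + a1 * x)   # predicted probability of buying
        a0 += lr * (y - p)         # gradient of the log-likelihood w.r.t. a0
        a1 += lr * (y - p) * x     # gradient of the log-likelihood w.r.t. a1

def predict(age):
    return sigmoid(a0 + a1 * age)

print(predict(25), predict(55))   # low probability young, high probability old
```

Note how the output is an S-shaped probability curve rather than a straight line: thresholding it at 0.5 yields the binary 0/1 prediction from row 7 of the table.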
Types Of Logistic Regression
• Binary logistic regression – It has
only two possible outcomes.
Example- yes or no
• Multinomial logistic regression – It
has three or more nominal
categories. Example- cat, dog,
elephant.
• Ordinal logistic regression – It has three or more ordinal categories, ordinal meaning that the categories are ordered. Example- user ratings (1–5).