MODULE 2
LEARNING WITH
REGRESSION AND TREES
2.1 Learning with Regression: Linear Regression,
Multivariate Linear Regression, Logistic Regression.
Darakhshan Khan
LOGISTIC REGRESSION
• Logistic regression is a statistical model (also known as the logit model) that is often used for classification
• Logistic regression estimates the probability of an event occurring, such as voted or
didn’t vote, based on a given dataset of independent variables.
• Since the outcome is a probability, the dependent variable is bounded between 0 and
1.
• Consider y variable (binary classification)
• 0: negative class
• 1: positive class
• Examples
• Email: spam / not spam
• Online transactions: fraudulent / not fraudulent
• Tumor: malignant / not malignant
LOGISTIC REGRESSION (1)
• Issue 1 of Linear Regression
• Using linear regression and then thresholding the classifier output (i.e. anything over some value is yes, else no)
• In our example, linear regression with thresholding seems to work, i.e. it does a reasonable job of stratifying the data points into one of two classes
• But what if we had a single Yes with a very large tumour size?
• This outlier would tilt the fitted line and shift the threshold, so existing yeses would now be classified as nos
LOGISTIC REGRESSION (2)
Issue 2 of Linear Regression
• We know y is 0 or 1
• A linear model can give values larger than 1 or less than 0
• So, logistic regression generates a value that is always between 0 and 1
• Despite its name, logistic regression is a classification algorithm - don't be confused
LOGISTIC REGRESSION – MODEL REPRESENTATION
• What function is used to represent the model in classification?
• The aim of this classifier is to output values between 0 and 1
• Using linear regression, hθ(x) = θT x
• For the classification hypothesis representation we use hθ(x) = g(θT x)
• Where g(z) is applied to z = θT x, a real number
• g(z) = 1 / (1 + e^(-z))
• This is the sigmoid function, or the logistic function
• If we combine these equations we can write out the hypothesis as hθ(x) = 1 / (1 + e^(-θT x)) (sketched in code below)
• What does the sigmoid function look like?
• Crosses 0.5 at z = 0, then flattens out
• Asymptotes at 0 and 1
• Given this we need to fit θ to our data
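As an illustration, a minimal Python sketch of the sigmoid and the hypothesis (the function names `sigmoid` and `hypothesis`, and the example values, are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)); the output always lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): estimated probability that y = 1 for input x."""
    return sigmoid(np.dot(theta, x))

# Example parameters and a feature vector x with x0 = 1 (bias term)
theta = np.array([-3.0, 1.0, 1.0])
x = np.array([1.0, 2.0, 2.5])
print(hypothesis(theta, x))  # a value between 0 and 1
```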
LOGISTIC REGRESSION – MODEL REPRESENTATION (1)
Interpreting output
• When hθ(x) outputs a number, treat that value as the estimated probability that y = 1 on input x
• Example: if x is a feature vector with x0 = 1 (as always) and x1 = tumourSize (some value)
• hθ(x) = 0.7 tells the patient they have a 70% chance of the tumor being malignant
• More formal notation, hθ(x) = P(y=1|x ; θ)
• Probability that y=1, given x, parameterized by θ
• Since this is a binary classification task we know y = 0 or 1
• So the following must be true:
• P(y=1|x ; θ) + P(y=0|x ; θ) = 1
• P(y=0|x ; θ) = 1 - P(y=1|x ; θ)
LOGISTIC REGRESSION – DECISION BOUNDARY
• To better understand what the hypothesis function (model) looks like
• One way of using the sigmoid function is:
• When the probability of y being 1 is greater than 0.5 then we can predict y = 1
• Else we predict y = 0
• Looking at the sigmoid function, g(z) is greater than or equal to 0.5 when z is greater than or equal to 0
• So if z is positive, g(z) is greater than 0.5, where z = θT x
• So when θT x >= 0, then hθ(x) >= 0.5
• So what we've shown is that the hypothesis predicts y = 1 when θT x >= 0
• The corollary is that when θT x < 0 the hypothesis predicts y = 0
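A small sketch of this decision rule, assuming NumPy vectors (names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    """Predict y = 1 when theta^T x >= 0 (equivalently, when g(theta^T x) >= 0.5)."""
    return 1 if np.dot(theta, x) >= 0 else 0
```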
LOGISTIC REGRESSION – DECISION BOUNDARY (1)
• Example: hθ(x) = g(θ0 + θ1x1 + θ2x2)
• Assume θ0 = -3, θ1 = 1, θ2 = 1
• So our parameter vector is a column vector with the above values, i.e. θT is a row vector = [-3, 1, 1]
• What does this mean? The z here becomes θT x
• We predict "y = 1" if -3x0 + 1x1 + 1x2 >= 0
• -3 + 1x1 + 1x2 >= 0
• We can also re-write this as If (x1 + x2 >= 3) then we predict y = 1
• If we plot x1 + x2 = 3, we graphically plot our decision boundary
• Means we have these two regions on the graph
• Blue = false, Magenta = true
• Line = decision boundary
• Concretely, the straight line is the set of points where hθ(x) = 0.5 exactly
• The decision boundary is a property of the hypothesis
• Means we can create the boundary with the hypothesis (function) and parameters, without any data
• Later, we use the data to determine the parameter values
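A quick numerical check of the worked example above (θ values as on the slide; the test points are illustrative):

```python
import numpy as np

theta = np.array([-3.0, 1.0, 1.0])  # theta0 = -3, theta1 = 1, theta2 = 1

def predict(theta, x1, x2):
    # z = theta^T x with x = [1, x1, x2]; predict y = 1 when z >= 0,
    # i.e. when x1 + x2 >= 3
    z = theta @ np.array([1.0, x1, x2])
    return 1 if z >= 0 else 0

print(predict(theta, 1, 1))  # 0: below the boundary line x1 + x2 = 3
print(predict(theta, 2, 2))  # 1: on or above the boundary line x1 + x2 = 3
```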
LOGISTIC REGRESSION – NON-LINEAR DECISION BOUNDARY
• Get logistic regression to fit a complex non-linear data set, i.e. add higher-order terms to the hypothesis function
• hθ(x) = g(θ0 + θ1x1 + θ2x2 + θ3x1² + θ4x2²)
• We take the transpose of the θ vector times the input vector, e.g. θT = [-1, 0, 0, 1, 1]
• So, predict "y = 1" if -1 + x1² + x2² >= 0, or equivalently x1² + x2² >= 1
• If we plot x1² + x2² = 1, this gives a circle with a radius of 1 around the origin (checked numerically below)
• This indicates more complex decision boundaries can be built by fitting complex parameters to this (relatively) simple hypothesis
• More complex decision boundaries?
• By using higher order polynomial terms, we can get even more complex
decision boundaries
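A brief sketch of the circular boundary example above (θ values as on the slide; the test points are illustrative):

```python
import numpy as np

theta = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])  # matches theta^T = [-1, 0, 0, 1, 1]

def predict(theta, x1, x2):
    # Features include the squared terms x1^2 and x2^2
    features = np.array([1.0, x1, x2, x1**2, x2**2])
    # Predict y = 1 when theta^T x >= 0, i.e. when x1^2 + x2^2 >= 1
    return 1 if theta @ features >= 0 else 0

print(predict(theta, 0.5, 0.5))  # 0: inside the unit circle
print(predict(theta, 1.0, 1.0))  # 1: outside the unit circle
```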
LOGISTIC REGRESSION – COST FUNCTION (ERROR/LOSS)
• Fit θ parameters
• Define the optimization objective for the cost function and fit the parameters
LOGISTIC REGRESSION – COST FUNCTION (ERROR/LOSS)
• Linear regression uses the following function to determine θ: J(θ) = (1/2m) Σ (hθ(x(i)) - y(i))², summed over the m training examples
• Instead of writing the squared error term, we can write it as "cost()":
• cost(hθ(x(i)), y(i)) = (1/2)(hθ(x(i)) - y(i))²
• Which evaluates to the cost for an individual example, using the same measure as used in linear regression
• Redefine J(θ) as J(θ) = (1/m) Σ cost(hθ(x(i)), y(i)), summed over i = 1..m
• To further simplify it we can get rid of the superscripts
• If we use this function for logistic regression, it is a non-convex function for parameter optimization
• We have some function - J(θ) - for determining the parameters
• Our hypothesis function has a non-linearity (the sigmoid function inside hθ(x))
• This is a complicated non-linear function
• If you take hθ(x) and plug it into the cost() function, then plug the cost() function into J(θ) and plot J(θ), we find many local optima -> a non-convex function
• Lots of local minima mean gradient descent may not find the global optimum - it may get stuck in a local minimum
LOGISTIC REGRESSION – COST FUNCTION (ERROR/LOSS) (1)
• We need a different, convex logistic regression cost, where we can apply gradient descent
• This is our logistic regression cost function, i.e. the penalty the algorithm pays:
• cost(hθ(x), y) = -log(hθ(x)) if y = 1, and -log(1 - hθ(x)) if y = 0
• Plot the function: for y = 1, the cost evaluates as -log(hθ(x))
• So when hθ(x) = 1 (correct), the cost is 0; otherwise the cost increases as we become "more" wrong, i.e. as hθ(x) approaches 0
• X axis is what we predict
• Y axis is the cost associated with that prediction
• This cost function has some interesting properties
• If y = 1 and hθ(x) = 1
• If the hypothesis predicts exactly 1 and that's exactly correct, then the cost is 0 (exactly, not nearly 0)
• As hθ(x) goes to 0
• Cost goes to infinity. This captures the intuition that if hθ(x) = 0 (predict P(y=1|x; θ) = 0) but y = 1, this penalizes the learning algorithm with a massive cost
• What about if y = 0? Then the cost is evaluated as -log(1 - hθ(x))
• Just the inverse of the other function (both cases are sketched below)
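A minimal sketch of this per-example cost (the clipping with a small `eps` is an added numerical safeguard, not part of the slide's formula):

```python
import numpy as np

def cost_per_example(h, y, eps=1e-12):
    """-log(h) when y = 1, -log(1 - h) when y = 0; h is the model output h_theta(x)."""
    h = np.clip(h, eps, 1 - eps)  # avoid log(0) when h is exactly 0 or 1
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

print(cost_per_example(0.99, 1))  # near 0: confident and correct
print(cost_per_example(0.01, 1))  # large: confident and wrong
```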
LOGISTIC REGRESSION – SIMPLIFIED COST FUNCTION
• Define a simpler way to write the cost function and apply gradient descent to the logistic regression
• Rather than writing the cost function as two lines/two cases, we can compress them into one equation - more efficient
• cost(hθ(x), y) = -y log(hθ(x)) - (1-y) log(1 - hθ(x))
• If y = 1,then -log(hθ(x)) - (0)log(1 - hθ(x)) = -log(hθ(x))
• Which is what we had before when y = 1
• If y = 0, then -(0)log(hθ(x)) - (1)log(1 - hθ(x)) = -log(1- hθ(x))
• Which is what we had before when y = 0
• The cost function for the θ parameters can be defined as J(θ) = -(1/m) Σ [ y(i) log(hθ(x(i))) + (1 - y(i)) log(1 - hθ(x(i))) ], summed over i = 1..m (computed in the sketch below)
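A vectorized sketch of J(θ), assuming X is an m x n design matrix whose first column is all ones and y is a length-m vector of 0/1 labels (names and the `eps` safeguard are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_J(theta, X, y, eps=1e-12):
    """J(theta) = -(1/m) * sum( y*log(h) + (1-y)*log(1-h) )."""
    m = len(y)
    h = sigmoid(X @ theta)
    h = np.clip(h, eps, 1 - eps)  # numerical safety for log
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
```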
LOGISTIC REGRESSION – SIMPLIFIED COST FUNCTION (1)
• To fit parameters θ:
• Find parameters θ which minimize J(θ)
• This means we have a set of parameters to use in our model for future
predictions
• Then, if we're given some new example with a set of features x, we can take the θ which we generated, and output our prediction using hθ(x) = 1 / (1 + e^(-θT x))
• Which is P(y=1 | x ; θ), the probability that y = 1, given x, parameterized by θ
LOGISTIC REGRESSION – GRADIENT DESCENT
How to minimize the logistic regression cost function J(θ)
• Use gradient descent as before
• Repeatedly update each parameter using a learning rate α: θj := θj - α (1/m) Σ (hθ(x(i)) - y(i)) xj(i), summed over i = 1..m (for the derivation, refer to the notes; see the sketch below)
• It looks identical, but the hypothesis for Logistic Regression is different from Linear Regression
• Ensuring gradient descent is running correctly: plot J(θ) against the number of iterations and check that it decreases
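A hedged sketch of batch gradient descent for this cost; the learning rate and iteration count are illustrative choices, not values prescribed by the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    """Repeatedly update theta_j := theta_j - alpha * (1/m) * sum((h - y) * x_j)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        h = sigmoid(X @ theta)           # current predictions
        grad = (X.T @ (h - y)) / m       # vectorized gradient of J(theta)
        theta -= alpha * grad
    return theta
```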
LOGISTIC REGRESSION – MULTICLASS CLASSIFICATION PROBLEMS
• Similar terms: One-vs-all or One-vs-rest
• Examples
• Email folders or tags (4 classes): Work, Friends, Family, Hobby
• Medical Diagnosis (3 classes): Not ill, Cold, Flu
• Weather (4 classes): Sunny, Cloudy, Rainy, Snow
• Binary vs Multi-class
LOGISTIC REGRESSION – MULTICLASS CLASSIFICATION PROBLEMS
One-vs-all (One-vs-rest)
• Split the data into 3 distinct groups; for each class, train a classifier that compares that class against the rest
• If you have k classes, you need to train k logistic regression classifiers
• To classify a new example, run all k classifiers and predict the class whose classifier outputs the highest probability (see the sketch below)
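A compact sketch of one-vs-all training and prediction, reusing the gradient-descent update from above (class labels assumed to be 0..k-1; names illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_one_vs_all(X, y, num_classes, alpha=0.1, num_iters=1000):
    """Train one logistic regression classifier per class (class c vs the rest)."""
    m, n = X.shape
    all_theta = np.zeros((num_classes, n))
    for c in range(num_classes):
        y_c = (y == c).astype(float)      # relabel: 1 for class c, 0 for all other classes
        theta = np.zeros(n)
        for _ in range(num_iters):
            h = sigmoid(X @ theta)
            theta -= alpha * (X.T @ (h - y_c)) / m
        all_theta[c] = theta
    return all_theta

def predict_one_vs_all(all_theta, x):
    """Pick the class whose classifier reports the highest probability."""
    return int(np.argmax(sigmoid(all_theta @ x)))
```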