Cost Function of Logistic Regression
A cost function is a mathematical function that measures the difference
between the actual target values (ground truth) and the values predicted by
the model. Such a function, which assesses a machine learning model’s
performance, is also referred to as a loss function or objective function.
The objective of a machine learning algorithm is usually to minimize the
output of the cost function, i.e. the error.
Log Loss and Cost Function for Logistic Regression
One of the popular metrics for evaluating classification models that output
probabilities is log loss.
Log Loss = −(1/M) Σ(i=1 to M) [ yi log(hθ(xi)) + (1 − yi) log(1 − hθ(xi)) ]
For comparison, the squared-error cost function used in linear regression can be written as:
F(θ) = (1/n) Σ(i=1 to n) (1/2) [hθ(xi) − yi]²
For logistic regression,
hθ(x) = g(θᵀx)
where g is the sigmoid function, g(z) = 1/(1 + e^(−z)). Substituting this
hypothesis into the squared-error cost above leads to a non-convex function.
The cost function used for logistic regression is therefore log loss, and it is
summarized below.
cost(hθ(x), y) = -log(hθ(x)) , when y=1
and
cost(hθ(x), y) = -log(1 - hθ(x)) , when y=0
where
y is the actual value of the target variable,
hθ(x) is the predicted probability that y = 1 given x, parameterized by θ, and
yi is the actual label for the i-th training example.
This cost function penalizes the model with a higher loss when its prediction
diverges from the actual label. Specifically, it imposes a large penalty when
the model confidently predicts the wrong class (i.e., high probability for the
incorrect class).
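As a concrete illustration, here is a minimal NumPy sketch of this cost; the function name log_loss and the tiny example arrays are made up for demonstration.

```python
import numpy as np

def log_loss(y_true, y_pred_proba, eps=1e-15):
    # Average log loss over M examples:
    # -(1/M) * sum( y*log(h) + (1-y)*log(1-h) )
    # Clipping avoids log(0) for predictions of exactly 0 or 1.
    p = np.clip(y_pred_proba, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Toy batch: actual labels and predicted probabilities that y = 1
y = np.array([1, 0, 1, 0])
h = np.array([0.9, 0.1, 0.6, 0.8])   # the last prediction is confidently wrong

print(log_loss(y, h))                            # moderate average loss (~0.58)
print(log_loss(np.array([0]), np.array([0.99]))) # one very confident mistake (~4.6)
```

Note how sharply the penalty grows as a wrong prediction becomes more confident, which is exactly the behaviour described above.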
Why is Mean Squared Error suitable for Linear Regression?
Because in linear regression the MSE cost is a convex function of the model
parameters: there is a single point of minimum error, the global minimum,
and gradient descent can reliably reach it.
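To make this concrete, here is a small numerical check (a sketch using a made-up one-feature dataset): the MSE cost of a linear model, evaluated over a grid of slope values, never bends downward, so it forms a single bowl with one global minimum.

```python
import numpy as np

# Made-up 1-D data that roughly follows y = 2x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 2.1, 3.9, 6.2, 7.8])

# MSE cost of the linear model h(x) = theta * x, over a grid of slope values
thetas = np.linspace(-4.0, 8.0, 601)
cost = np.array([0.5 * np.mean((t * x - y) ** 2) for t in thetas])

# Nonnegative second differences everywhere => the curve is convex along this slice
second_diff = np.diff(cost, 2)
print(second_diff.min() >= 0)   # True: no downward bends, hence no extra local minima
print(thetas[np.argmin(cost)])  # the single minimum, near the true slope of about 2
```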
Why is Mean Squared Error not suitable for Logistic Regression?
Let’s consider the Mean Squared Error (MSE) as a cost function for logistic
regression. It is not suitable because of the nonlinearity introduced by the
sigmoid function.
MSE = (1/2m) Σ(i=1 to m) (hθ(xi) − yi)²
In logistic regression, if we substitute the sigmoid function into the above
MSE equation, we get
MSE = (1/2m) Σ(i=1 to m) (1/(1 + e^(−θᵀxi)) − yi)²
The sigmoid 1/(1 + e^(−z)) is a nonlinear transformation, and evaluating this
term within the Mean Squared Error formula results in a non-convex cost function.
A non-convex function can have multiple local minima, which makes it difficult
to optimize with traditional gradient descent algorithms, as illustrated below.
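A quick numerical check can illustrate the difference (a sketch with made-up single-feature data): along a one-dimensional slice of the parameter, the second differences of the sigmoid-plus-MSE cost change sign (the curve bends both ways, so it is not convex), while those of the log-loss cost stay nonnegative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up single-feature data; theta is a single scalar weight
x = np.array([1.0, 2.0, 0.5, 1.5])
y = np.array([1, 1, 0, 0])

thetas = np.linspace(-8.0, 8.0, 1601)
mse_cost, log_cost = [], []
for t in thetas:
    h = sigmoid(t * x)
    mse_cost.append(0.5 * np.mean((h - y) ** 2))
    log_cost.append(-np.mean(y * np.log(h) + (1 - y) * np.log(1 - h)))

# A convex curve has nonnegative second differences everywhere.
mse_dd = np.diff(np.array(mse_cost), 2)
log_dd = np.diff(np.array(log_cost), 2)
print("sigmoid + MSE bends both ways:", mse_dd.min() < 0 < mse_dd.max())  # non-convex
print("log loss never bends downward:", log_dd.min() >= -1e-12)           # convex slice
```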
Imagine you have a function that looks like a series of hills and valleys, with
multiple peaks and troughs scattered throughout. This type of function is
called non-convex because it doesn't have a single, well-defined minimum
point; instead, it has multiple local minima (valleys) and potentially even
some local maxima (peaks).
When you're trying to optimize such a function, the goal is to find the lowest
point, which corresponds to the global minimum. However, because of the
presence of multiple local minima, traditional gradient descent algorithms
can encounter difficulties.
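To see this behaviour in isolation, here is a small sketch on a made-up one-dimensional non-convex function (unrelated to logistic regression): plain gradient descent started from two different points ends up in two different valleys.

```python
import numpy as np

# A made-up non-convex function with two valleys:
# f(x) = x^4 - 3x^2 + x has a deep minimum near x = -1.30 and a shallow one near x = +1.13
f = lambda x: x**4 - 3 * x**2 + x
grad = lambda x: 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # move against the gradient (steepest descent)
    return x

x_left = gradient_descent(-2.0)
x_right = gradient_descent(+2.0)
print(x_left, f(x_left))    # about -1.30, f ≈ -3.51 (the global minimum)
print(x_right, f(x_right))  # about +1.13, f ≈ -1.07 (stuck in a worse local minimum)
```

Which valley the algorithm lands in depends only on the starting point, which is exactly the difficulty with non-convex cost functions.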
Why is it challenging?
1. Getting Stuck in Local Minima: Gradient descent algorithms, like the
one used in logistic regression, work by iteratively moving in the direction of
the steepest descent of the function. However, if they start from an initial
point that is not the global minimum and there are multiple local minima,
they might get trapped in one of the local minima instead of reaching the
global minimum. Once stuck in a local minimum, the algorithm cannot
escape it to find the true minimum.
2. Plateaus and Saddle Points: In addition to local minima, non-convex
functions may have plateaus (flat regions) and saddle points (points where
the gradient is zero but not a minimum or maximum). These features can
slow down or stall the convergence of gradient descent algorithms, making
optimization even more challenging. The log-loss cost used by logistic
regression is convex and therefore free of spurious local minima and saddle
points; a minimal gradient-descent sketch on it follows this list.
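Because the log-loss cost is convex in θ, plain batch gradient descent reaches essentially the same solution no matter where it starts. Below is a minimal sketch with made-up toy data and hypothetical names (train_logistic is illustrative, not a reference implementation); the update uses the standard gradient of the average log loss, (1/m)·Xᵀ(hθ(X) − y).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, theta0, lr=0.1, steps=20000):
    # Batch gradient descent on the (convex) log-loss cost.
    theta = np.array(theta0, dtype=float)
    m = len(y)
    for _ in range(steps):
        h = sigmoid(X @ theta)              # predicted probabilities that y = 1
        theta -= lr * (X.T @ (h - y)) / m   # gradient of the average log loss
    return theta

# Toy data: an intercept column plus one feature; the two classes overlap slightly
X = np.array([[1, 0.5], [1, 1.0], [1, 1.5], [1, 3.0], [1, 3.5], [1, 4.0]])
y = np.array([0, 0, 1, 0, 1, 1])

# Two very different starting points converge to essentially the same parameters,
# because the log-loss surface has a single (global) minimum.
print(train_logistic(X, y, [10.0, -10.0]))
print(train_logistic(X, y, [-10.0, 10.0]))
```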