ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN
VNU-University of Engineering and Technology
INT3405 - Machine Learning
Lecture 3: Linear Regression
Duc-Trong Le & Viet-Cuong Ta
Hanoi, 09/2023
Outline
● Supervised Learning
● Linear Regression with One Variable
○ Model Representation
○ Cost Functions
○ Gradient Descent
● Linear Regression with Multiple Variables
○ Learning rate
○ Normal Equation
FIT-CS INT3405 - Machine Learning 2
Recap: Random Variables
Supervised Learning
●Supervised (Inductive) Learning
●Formalization
○ Input: x ∈ X (features)
○ Output: y ∈ Y (labels / targets)
○ Target function: f : X → Y (unknown)
○ Training Data: D = {(x⁽¹⁾, y⁽¹⁾), …, (x⁽ᵐ⁾, y⁽ᵐ⁾)}
○ Hypothesis: h : X → Y, chosen so that h ≈ f
○ Hypothesis space: H, the set of candidate hypotheses
A Learning Problem
[diagram: an unknown function f maps each input x to an output y; learning infers f from input–output examples]
The Statistical Learning Framework
[figures illustrating the statistical learning framework]
Hypothesis Spaces
●Linear models, e.g. h(x) = ax + b
○ Infinitely many possible hypotheses!
○ Any choice of coefficients a and b yields a possible hypothesis
● Polynomial models
● Arbitrary nonlinear models
Two Views of Learning
●Learning is the removal of our remaining uncertainty.
○ If we know that x and y are linearly related, then we can use
the training data to infer the linear function
●Learning requires guessing a good, small hypothesis class.
○ We could start with a very small / simple class, and enlarge it until it
contains a hypothesis that fits the data
●But we could be wrong
○ Our prior knowledge might be wrong
○ Our guess of the hypothesis class could be wrong
■ The smaller the hypothesis class, the more likely we are wrong
Two Strategies for Machine Learning
●Develop Languages for Expressing Prior Knowledge
○ Rule grammars and stochastic models
●Develop Flexible Hypothesis Spaces
○ Nested collections of hypotheses, rules, linear models, decision trees,
neural networks, etc.
●In either case, the key is to
○ Develop efficient algorithms for finding a hypothesis that best
approximates the target function on the training data
Key Issues in Machine Learning
● What are good hypothesis spaces?
○ Which spaces have been useful in practical applications and why?
● What algorithms can work with these spaces?
○ Are there general design principles for machine learning algorithms?
● How can we find the best hypothesis in an efficient way?
○ How to find the optimal solution efficiently (“optimization” question)
● How can we optimize accuracy on future data?
○ Known as the “overfitting” problem (i.e., “generalization” theory)
● How can we have confidence in the results?
○ How much training data is required to find an accurate hypothesis? (“statistical” question)
● Are some learning problems computationally intractable? (“computational” question)
● How can we formulate application problems as machine learning problems? (“engineering”
question)
Regression with One Variable (1)
Housing Prices (Portland, OR)
[figure: scatter plot of price (in 1000s of dollars) vs. size (feet²)]
Supervised Learning: the “right answer” is given for each example in the data.
Regression Problem: predict a real-valued output.
Regression with One Variable (2)
Training set of housing prices (Portland, OR):

  Size in feet² (x)    Price ($) in 1000's (y)
  2104                 460
  1416                 232
  1534                 315
   852                 178
  …                    …
Notation:
m = Number of training examples
x’s = “input” variable / features
y’s = “output” variable / “target” variable
Model Representation
Training Set → Learning Algorithm → Hypothesis h
h maps the size of a house (x) to an estimated price (y).

How do we represent h? Linear regression with one variable
(“univariate linear regression”):
  h_θ(x) = θ₀ + θ₁x
How do we choose the parameters θ₀, θ₁?
Formulation: Cost Function (1)
Hypothesis: h_θ(x) = θ₀ + θ₁x
Parameters: θ₀, θ₁
Cost Function: mean squared error (MSE)
  J(θ₀, θ₁) = (1/2m) Σᵢ₌₁ᵐ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²
Goal: minimize J(θ₀, θ₁) over θ₀, θ₁
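As a sketch (my code, not from the slides), the MSE cost can be evaluated directly with NumPy on the toy housing data:

```python
import numpy as np

# Toy housing data from the slides: size (feet^2) and price (in $1000s)
x = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])

def cost(theta0, theta1, x, y):
    """J(theta0, theta1) = (1/2m) * sum over i of (h(x_i) - y_i)^2."""
    m = len(x)
    h = theta0 + theta1 * x          # hypothesis h_theta(x) for every example
    return np.sum((h - y) ** 2) / (2 * m)

print(cost(0.0, 0.2, x, y))   # cost of the line h(x) = 0.2 * x
```

Different lines give different costs; minimizing J over (θ₀, θ₁) picks the best-fitting line.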
Formulation: Cost Function (2)
Simplified (set θ₀ = 0)
Hypothesis: h_θ(x) = θ₁x
Parameters: θ₁
Cost Function: J(θ₁) = (1/2m) Σᵢ₌₁ᵐ (θ₁x⁽ⁱ⁾ − y⁽ⁱ⁾)²
Goal: minimize J(θ₁) over θ₁
Cost Function: Examples (1)–(3)
[figures: for each fixed value of θ₁, h_θ(x) is a function of x (left); J(θ₁) is a function of the parameter θ₁ (right)]
Cost Function (1)
Hypothesis: h_θ(x) = θ₀ + θ₁x
Parameters: θ₀, θ₁
Cost Function: J(θ₀, θ₁) = (1/2m) Σᵢ₌₁ᵐ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²
Goal: minimize J(θ₀, θ₁) over θ₀, θ₁
Cost Function (2)–(5)
[figures: for fixed θ₀, θ₁, h_θ(x) is a function of x, plotted over the data (price ($) in 1000's vs. size in feet² (x)); J(θ₀, θ₁) is a function of the parameters, shown as a 3-D surface and as contour plots, where each contour collects parameter settings with equal cost]
Gradient Descent for Optimization (1)
Given some objective function J(θ₀, θ₁)
Want to optimize: min over θ₀, θ₁ of J(θ₀, θ₁)
Outline:
• Start with some initial θ₀, θ₁ (e.g., θ₀ = 0, θ₁ = 0)
• Keep changing θ₀, θ₁ to reduce J(θ₀, θ₁),
until we hopefully end up at a minimum
Gradient Descent for Optimization (2)–(3)
[figures: gradient descent steps on a cost surface from different starting points]
Gradient Descent Algorithm
Gradient descent algorithm:
  repeat until convergence {
    θⱼ := θⱼ − α · ∂/∂θⱼ J(θ₀, θ₁)   (for j = 0 and j = 1)
  }
α is the learning rate parameter (rule of thumb: 0.1)
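A minimal sketch of the update rule on a one-variable objective of my own choosing (J(θ) = (θ − 3)², gradient 2(θ − 3)), to show the repeat loop:

```python
def gradient_descent_1d(grad, theta=0.0, alpha=0.1, n_iters=100):
    """Repeat the update theta := theta - alpha * grad(theta)."""
    for _ in range(n_iters):
        theta = theta - alpha * grad(theta)
    return theta

# Minimize J(theta) = (theta - 3)^2; its gradient is 2 * (theta - 3)
theta = gradient_descent_1d(lambda t: 2 * (t - 3))
print(round(theta, 4))   # converges to the minimizer theta = 3
```

With α = 0.1 each step shrinks the error by a constant factor; for this objective, any α > 1 makes the iterates diverge, previewing the learning-rate discussion below.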
Gradient Descent for Linear Regression (1)
Gradient descent algorithm:
  repeat until convergence {
    θⱼ := θⱼ − α · ∂/∂θⱼ J(θ₀, θ₁)   (j = 0, 1)
  }
Linear Regression Model:
  h_θ(x) = θ₀ + θ₁x
  J(θ₀, θ₁) = (1/2m) Σᵢ₌₁ᵐ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²
Gradient Descent for Linear Regression (2)
Gradient descent algorithm:
  repeat until convergence {
    θ₀ := θ₀ − α (1/m) Σᵢ₌₁ᵐ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)
    θ₁ := θ₁ − α (1/m) Σᵢ₌₁ᵐ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾) · x⁽ⁱ⁾
  }
Update θ₀ and θ₁ simultaneously.
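The two updates can be sketched with explicit temporaries so both parameters are computed from the current values before either is overwritten (toy data of my own, chosen so the exact fit is θ₀ = 0, θ₁ = 1):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])        # perfectly linear data: y = x

theta0, theta1, alpha = 0.0, 0.0, 0.1
for _ in range(1000):
    h = theta0 + theta1 * x
    # compute both updates from the CURRENT parameters ...
    temp0 = theta0 - alpha * np.mean(h - y)
    temp1 = theta1 - alpha * np.mean((h - y) * x)
    theta0, theta1 = temp0, temp1    # ... then assign simultaneously

print(round(theta0, 3), round(theta1, 3))   # approaches 0.0 and 1.0
```

Updating θ₀ first and then reusing it inside the θ₁ update would mix old and new values, which is not the algorithm the slide specifies.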
Gradient Descent Examples (1)–(9)
[figures: left, the hypothesis h_θ(x) over the data at successive iterations (for the current fixed θ₀, θ₁, a function of x); right, the contour plot of J(θ₀, θ₁) as a function of the parameters, with the descent path moving toward the minimum]
Batch Gradient Descent
“Batch”: Each step of gradient descent uses all the
training examples.
Multivariate Linear Regression (1)
Multiple features (variables).
  Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
  2104           5                    1                  45                    460
  1416           3                    2                  40                    232
  1534           3                    2                  30                    315
   852           2                    1                  36                    178
  …              …                    …                  …                     …
Notation:
n = number of features
x⁽ⁱ⁾ = input (features) of the iᵗʰ training example
xⱼ⁽ⁱ⁾ = value of feature j in the iᵗʰ training example
Multivariate Linear Regression (2)
Hypothesis: h_θ(x) = θ₀ + θ₁x₁ + θ₂x₂ + … + θₙxₙ
Previously: h_θ(x) = θ₀ + θ₁x
For convenience of notation, define x₀ = 1. Then
  h_θ(x) = θᵀx, with θ, x ∈ ℝⁿ⁺¹
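In code, the x₀ = 1 convention is just a column of ones prepended to the feature matrix (a sketch; the variable names are my own):

```python
import numpy as np

# Raw features per example: size, bedrooms, floors, age
X_raw = np.array([[2104, 5, 1, 45],
                  [1416, 3, 2, 40],
                  [1534, 3, 2, 30],
                  [ 852, 2, 1, 36]], dtype=float)

m = X_raw.shape[0]
X = np.hstack([np.ones((m, 1)), X_raw])   # x0 = 1 for every example; shape (m, n+1)

theta = np.zeros(X.shape[1])              # theta in R^{n+1}
h = X @ theta                             # h_theta(x) = theta^T x, all examples at once
print(X.shape, h.shape)                   # (4, 5) (4,)
```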
Gradient Descent for Multivariate LR
Hypothesis: h_θ(x) = θᵀx = θ₀x₀ + θ₁x₁ + … + θₙxₙ
Parameters: θ = (θ₀, θ₁, …, θₙ)
Cost function: J(θ) = (1/2m) Σᵢ₌₁ᵐ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²
Gradient descent:
  Repeat { θⱼ := θⱼ − α (1/m) Σᵢ₌₁ᵐ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾) xⱼ⁽ⁱ⁾ }
  (simultaneously update θⱼ for every j = 0, …, n)
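The per-coordinate rule above collapses into one vectorized step, θ := θ − (α/m)·Xᵀ(Xθ − y). A sketch on synthetic data of my own (the true parameters are [1, 2, −1]):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, n_iters=500):
    """Batch gradient descent for linear regression; X's first column is all ones."""
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = X.T @ (X @ theta - y) / m    # all partial derivatives at once
        theta = theta - alpha * grad        # simultaneous update of every theta_j
    return theta

rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=50), rng.normal(size=50)
X = np.column_stack([np.ones(50), x1, x2])
y = 1 + 2 * x1 - x2                         # noiseless target
theta = gradient_descent(X, y)
print(np.round(theta, 3))                   # close to [1, 2, -1]
```

The vectorized form makes the simultaneous update automatic: the whole gradient is computed before any component of θ changes.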
Univariate LR vs Multivariate LR
Gradient Descent
Previously (n = 1):
  Repeat {
    θ₀ := θ₀ − α (1/m) Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)
    θ₁ := θ₁ − α (1/m) Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾) x⁽ⁱ⁾
  }
New algorithm (n ≥ 1):
  Repeat {
    θⱼ := θⱼ − α (1/m) Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾) xⱼ⁽ⁱ⁾
  }
  (simultaneously update θⱼ for j = 0, …, n)
Convergence and Learning Rate
Example automatic convergence test:
declare convergence if J(θ) decreases by less than some small threshold ε in one iteration.
[figure: J(θ) vs. number of iterations]
For sufficiently small α, J(θ) should decrease on every iteration.
But if α is too small, gradient descent can be slow to converge.
If α is too large, J(θ) may not decrease on every iteration and may not converge at all.
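The automatic convergence test can be sketched as a stopping rule (the value of ε and the toy data are my choices, not from the slides):

```python
import numpy as np

def gd_until_converged(X, y, alpha=0.1, eps=1e-9, max_iters=10000):
    """Run gradient descent until J(theta) decreases by less than eps in one iteration."""
    m = X.shape[0]
    theta = np.zeros(X.shape[1])

    def cost(t):
        return np.sum((X @ t - y) ** 2) / (2 * m)

    prev = cost(theta)
    for i in range(1, max_iters + 1):
        theta = theta - alpha * X.T @ (X @ theta - y) / m
        cur = cost(theta)
        if prev - cur < eps:        # declare convergence
            return theta, i
        prev = cur
    return theta, max_iters

X = np.column_stack([np.ones(4), np.array([1.0, 2.0, 3.0, 4.0])])
y = np.array([3.0, 5.0, 7.0, 9.0])  # exactly y = 1 + 2x
theta, iters = gd_until_converged(X, y)
print(np.round(theta, 2), iters)    # theta near [1, 2], well before max_iters
```

A looser ε stops earlier but farther from the minimum, so ε trades runtime against accuracy.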
Learning Rate
[figure: J(θ) vs. iterations for different learning-rate schedules: too small (slow convergence), constant, too large (divergence), and gradually decreased]
Normal Equation (1)
Gradient Descent
• Iterative approach
Normal Equation
• Analytical method: solve for the parameters in closed form
Intuition (1D example): if J(w) is a quadratic function of a single
parameter w, solve the equation dJ/dw = 0 to find w.
Normal Equation (2)–(3)
●Matrix-vector formulation
  X ∈ ℝᵐˣ⁽ⁿ⁺¹⁾ (design matrix, one row per training example), y ∈ ℝᵐ
  J(θ) = (1/2m) ‖Xθ − y‖²
●Analytical solution (set the gradient ∇J(θ) to zero)
  θ = (XᵀX)⁻¹ Xᵀ y
The Pseudo-inverse
If XᵀX is not invertible (e.g., redundant features, or fewer examples than features), replace (XᵀX)⁻¹Xᵀ with the Moore–Penrose pseudo-inverse X⁺, giving θ = X⁺y.
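A sketch of both formulas on the example data below (m = 4 examples but n + 1 = 5 columns, so XᵀX is singular here and the pseudo-inverse route is the one that works; `np.linalg.pinv` computes the Moore–Penrose pseudo-inverse):

```python
import numpy as np

# Design matrix with x0 = 1 (first column), from the example table
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([460.0, 232.0, 315.0, 178.0])

# theta = X^+ y  (equals (X^T X)^{-1} X^T y whenever that inverse exists)
theta = np.linalg.pinv(X) @ y
print(np.round(X @ theta, 3))   # reproduces the training prices
```

With more examples than features and XᵀX well conditioned, `np.linalg.solve(X.T @ X, X.T @ y)` implements the slide's formula directly.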
Normal Equation: Example
Example:
  x₀   Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
  1    2104           5                    1                  45                    460
  1    1416           3                    2                  40                    232
  1    1534           3                    2                  30                    315
  1     852           2                    1                  36                    178
θ = (XᵀX)⁻¹Xᵀy, where (XᵀX)⁻¹ is the inverse of the matrix XᵀX.
Gradient Descent vs Normal Equation
m training examples, n features.

Gradient Descent:
• Need to choose α.
• Needs many iterations.
• Works well even when n is large.

Normal Equation:
• No need to choose α.
• No need to iterate.
• Need to compute (XᵀX)⁻¹.
• Slow if n is very large.
Summary
● Supervised Learning
● Linear Regression with One Variable
○ Model Representation
○ Cost Functions
○ Gradient Descent
● Linear Regression with Multiple Variables
○ Learning rate
○ Normal Equation
Duc-Trong Le