UET
Since 2004
ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN
VNU-University of Engineering and Technology
INT3405 - Machine Learning
Lecture 5: Classification (P3) - SVM
Duc-Trong Le & Hoang Van Xiem
Hanoi, 03/2024
Outline
● Problem and Intuition
● Formulation of Linear SVM
○ Hard Margin SVM
○ Soft Margin SVM
○ Primal/dual Problems
● Nonlinear SVM with Kernel
○ Kernel Tricks
○ SVM with Kernel
● Multi-class classification
Recap: Bayes Theorem & Decision Boundary
$$\underbrace{P(y \mid \mathbf{x})}_{\text{Posterior}} = \frac{\overbrace{P(\mathbf{x} \mid y)}^{\text{Likelihood}}\ \overbrace{P(y)}^{\text{Prior}}}{P(\mathbf{x})}$$
The decision boundary is the set of points where the two classes have equal posterior probability.
History
● SVMs were introduced in COLT-92 by Boser, Guyon & Vapnik, and have become rather popular since
● Theoretically well-motivated algorithm: developed from Statistical Learning Theory (Vapnik & Chervonenkis) since the 60s
● Empirically good performance: successful applications in many fields
(bioinformatics, text, image recognition, . . . )
● Centralized website: www.kernel-machines.org
Problem Setting
● Problem Setting
○ Training data $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, $\mathbf{x}_i \in \mathbb{R}^d$
○ For two-class (binary) classification, $y_i \in \{-1, +1\}$
● Goal
○ To find an optimal linear hyperplane (decision boundary) $\mathbf{w}^\top\mathbf{x} + b = 0$ that separates all the data
Intuition
● One possible solution
● Another possible solution
● Too many other possible solutions
● Which one is better than the others? How do we define “better”?
Intuition: Maximum Margin
● Intuition of “Margin”
○ The margin of a linear classifier is the width by which the boundary could be increased before hitting a data point.
● Idea of SVM
○ Find the separating hyperplane that maximizes the margin
Support Vector Machines (SVM)
[Figure: the maximum-margin hyperplane; the training points lying on the margin are the support vectors.]
SVM: Optimization Formulation (1)
● From Margin to Norm
○ Margin: distance between the two hyperplanes $\mathbf{w}^\top\mathbf{x} + b = 1$ and $\mathbf{w}^\top\mathbf{x} + b = -1$, i.e., $\frac{2}{\|\mathbf{w}\|}$
○ Maximizing the margin is equivalent to minimizing $\frac{1}{2}\|\mathbf{w}\|^2$
● Constraints
○ Separation with margin, i.e., $\mathbf{w}^\top\mathbf{x}_i + b \ge +1$ if $y_i = +1$ and $\mathbf{w}^\top\mathbf{x}_i + b \le -1$ if $y_i = -1$
○ Simplified as the equivalent constraint $y_i(\mathbf{w}^\top\mathbf{x}_i + b) \ge 1$ for all $i$
SVM: Optimization Formulation (2)
● SVM as a Quadratic Programming (QP) problem (a solver sketch follows)
$$\min_{\mathbf{w},\, b}\ \frac{1}{2}\|\mathbf{w}\|^2 \quad \text{s.t.} \quad y_i(\mathbf{w}^\top\mathbf{x}_i + b) \ge 1,\ i = 1, \dots, N$$
○ Convex problem with a unique minimum
○ Quadratic objective function
○ Linear inequality constraints
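A minimal sketch of solving this QP in practice, assuming scikit-learn (a library choice not made by the slides); a very large C approximates the hard-margin problem:
```python
# Sketch: approximating the hard-margin QP with scikit-learn (library
# choice is an assumption; any QP solver would do). A very large C
# effectively forbids margin violations on linearly separable data.
import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable dataset in R^2 (made up for illustration)
X = np.array([[1.0, 1.0], [2.0, 2.5], [3.0, 3.0],
              [6.0, 5.0], [7.0, 7.5], [8.0, 6.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e10)  # C -> infinity ~ hard margin
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("w =", w, "b =", b)
print("margin 2/||w|| =", 2 / np.linalg.norm(w))
print("support vectors:\n", clf.support_vectors_)
```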
Linearly Non-separable Cases
● What if the data are not linearly separable?
● In such a case, the hard-margin SVM cannot be applied directly
Soft Margin SVM
● Standard Linear SVM (soft margin)
○ Introduce slack variables $\xi_i \ge 0$
○ Relax the constraints to $y_i(\mathbf{w}^\top\mathbf{x}_i + b) \ge 1 - \xi_i$
○ Penalize the relaxation in the objective
Primal Problem:
$$\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}}\ \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{N}\xi_i \quad \text{s.t.} \quad y_i(\mathbf{w}^\top\mathbf{x}_i + b) \ge 1 - \xi_i,\ \xi_i \ge 0$$
C is a regularization parameter: the soft-margin SVM trades off maximizing the margin against minimizing the misclassification error (a sketch of this trade-off follows).
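A short sketch of the trade-off controlled by C, again assuming scikit-learn; the dataset and parameter values are arbitrary illustrations:
```python
# Sketch (assumes scikit-learn): varying C in a soft-margin linear SVM.
# Small C tolerates violations (wider margin, more support vectors);
# large C penalizes them heavily (narrower margin).
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_blobs

# Two overlapping blobs -> not perfectly separable
X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 2 / np.linalg.norm(clf.coef_[0])
    print(f"C={C:6}: margin={margin:.3f}, #support vectors={len(clf.support_)}")
```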
Linearly Non-separable Case
● Re-written as an unconstrained optimization (a minimal solver sketch follows):
$$\min_{\mathbf{w},\, b}\ \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{N}\max\big(0,\ 1 - y_i(\mathbf{w}^\top\mathbf{x}_i + b)\big)$$
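A minimal NumPy sketch minimizing this unconstrained objective by subgradient descent (illustrative only; the step size, epoch count, and toy data are arbitrary choices, and production solvers such as SMO work differently):
```python
# Sketch: subgradient descent on
#   (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w^T x_i + b))
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                    # points violating the margin
        # Subgradient of the objective w.r.t. w and b
        grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy usage on two Gaussian clouds
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])
w, b = train_linear_svm(X, y)
print("training accuracy:", np.mean(np.sign(X @ w + b) == y))
```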
Linearly Non-separable Case
Both models minimize (Training Error + Model Complexity):
● Support Vector Machine: $C\sum_{i}\max\big(0,\ 1 - y_i(\mathbf{w}^\top\mathbf{x}_i + b)\big) + \frac{1}{2}\|\mathbf{w}\|^2$ (hinge loss + margin term)
● Regularized logistic regression: $\sum_{i}\log\big(1 + e^{-y_i(\mathbf{w}^\top\mathbf{x}_i + b)}\big) + \lambda\|\mathbf{w}\|^2$ (log loss + $\ell_2$ penalty)
Choice of Parameter C:
● Large C: lower bias, higher variance
● Small C: higher bias, lower variance
Dual Form of SVM
● Lagrangian dual of the soft-margin primal:
$$\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{N}\alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j\, \mathbf{x}_i^\top\mathbf{x}_j \quad \text{s.t.} \quad 0 \le \alpha_i \le C,\ \sum_{i=1}^{N}\alpha_i y_i = 0$$
● Optimal solution: $\mathbf{w}^* = \sum_{i=1}^{N}\alpha_i y_i \mathbf{x}_i$; the points with $\alpha_i > 0$ are the support vectors
https://www.quora.com/What-is-primal-and-dual-formulation-in-SVM
Suppose we’re in 1 dimension
● What would SVMs do with this data? Not a big surprise: the maximum-margin separator is simply the point midway between the two classes.
Harder 1-dimensional Dataset
● [Figures: a 1-D dataset with one class nested inside the other is not linearly separable; mapping each point $x_k$ to $(x_k, x_k^2)$ makes the two classes separable by a line.]
SVM: Nonlinear Case
● Limitation of linear SVM
○ Linear SVM classifiers are too restricted for complex classification tasks where the data are not linearly separable in the input space
● Basic Idea of Nonlinear SVM
○ Map the data into a richer feature space that includes nonlinear features, then construct a linear hyperplane in that space (in the same way as before; a sketch follows)
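A tiny NumPy/scikit-learn sketch of this idea on the 1-D example from the earlier slides, using the explicit map $x \mapsto (x, x^2)$ (the data values are made up for illustration):
```python
# Sketch: the classes -1 | +1 | -1 are not separable on the line,
# but mapping x -> (x, x^2) makes them separable by a linear boundary.
import numpy as np
from sklearn.svm import SVC

x = np.array([-3.0, -2.5, -1.0, 0.0, 1.0, 2.5, 3.0])
y = np.array([-1, -1, 1, 1, 1, -1, -1])    # inner points vs outer points

Z = np.column_stack([x, x ** 2])           # explicit feature map phi(x)
clf = SVC(kernel="linear", C=1e6).fit(Z, y)
print("training accuracy in feature space:", clf.score(Z, y))  # expect 1.0
```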
SVM: Nonlinear Case
● First, define a feature mapping $\phi: \mathbf{x} \mapsto \phi(\mathbf{x})$
● Then learn a hyperplane $\mathbf{w}^\top\phi(\mathbf{x}) + b = 0$ in the feature space
● Almost the same primal form as linear SVM:
$$\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}}\ \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{N}\xi_i \quad \text{s.t.} \quad y_i(\mathbf{w}^\top\phi(\mathbf{x}_i) + b) \ge 1 - \xi_i,\ \xi_i \ge 0$$
SVM: Nonlinear Case
● The dual problem
$$\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{N}\alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j\, \phi(\mathbf{x}_i)^\top\phi(\mathbf{x}_j) \quad \text{s.t.} \quad 0 \le \alpha_i \le C,\ \sum_{i=1}^{N}\alpha_i y_i = 0$$
● The optimal solution: $\mathbf{w}^* = \sum_{i=1}^{N}\alpha_i y_i\, \phi(\mathbf{x}_i)$
How to choose the feature mapping?
• Polynomial mapping: include all monomials of the input features up to degree d
• Example: for $\mathbf{x} = (x_1, x_2) \in \mathbb{R}^2$ and $d = 2$, $\phi(\mathbf{x}) = \big(1,\ \sqrt{2}x_1,\ \sqrt{2}x_2,\ x_1^2,\ x_2^2,\ \sqrt{2}x_1x_2\big)$
• Problem of using an explicit feature mapping:
• The dimensionality of $\phi(\mathbf{x})$ can be very large (it grows as $O(n^d)$ for $n$ input features), making $\mathbf{w}$ hard to represent explicitly in memory, and hard for the QP to solve
Kernel Tricks
• Idea: Replace the dot product with a kernel function $k(\mathbf{x}, \mathbf{x}') = \phi(\mathbf{x})^\top\phi(\mathbf{x}')$
• Not all functions are kernel functions
• A function can be a kernel if it is (a numeric check follows)
○ Symmetric: $k(\mathbf{x}, \mathbf{x}') = k(\mathbf{x}', \mathbf{x})$
○ Positive semi-definite (PSD): the “Gram matrix” $K$ defined by $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$ is PSD (i.e., $\mathbf{z}^\top K \mathbf{z} \ge 0$ for all $\mathbf{z}$)
• Benefits
○ Efficiency: computing the kernel is often cheaper than computing $\phi(\mathbf{x})$, $\phi(\mathbf{x}')$ and their dot product
○ Flexibility: various kernel functions can be chosen, as long as the existence of $\phi$ is guaranteed (Mercer’s condition)
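A quick NumPy check of the two conditions for the RBF kernel (random data; purely illustrative):
```python
# Sketch: numerically checking symmetry and positive semi-definiteness
# of the Gram matrix K_ij = k(x_i, x_j) for the RBF kernel.
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
K = rbf_kernel(X, X)

print("symmetric:", np.allclose(K, K.T))
print("min eigenvalue:", np.linalg.eigvalsh(K).min())  # >= 0 up to round-off
```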
Kernel Functions
• Linear Kernel: $k(\mathbf{x}, \mathbf{x}') = \mathbf{x}^\top\mathbf{x}'$
• Polynomial Kernel (degree d): $k(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^\top\mathbf{x}' + 1)^d$
• Gaussian / RBF Kernel: $k(\mathbf{x}, \mathbf{x}') = \exp\big(-\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{2\sigma^2}\big)$
Kernel Functions
● Example: for $\mathbf{x}, \mathbf{z} \in \mathbb{R}^2$, the degree-2 polynomial kernel $k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^\top\mathbf{z})^2$ expands as
$$(x_1 z_1 + x_2 z_2)^2 = x_1^2 z_1^2 + 2 x_1 x_2 z_1 z_2 + x_2^2 z_2^2 = \phi(\mathbf{x})^\top\phi(\mathbf{z}), \quad \phi(\mathbf{x}) = \big(x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2\big)$$
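A numeric check of this identity in NumPy (the concrete vectors are arbitrary):
```python
# Check: (x^T z)^2 equals phi(x)^T phi(z) with
# phi(x) = (x1^2, sqrt(2) x1 x2, x2^2).
import numpy as np

def phi(v):
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

print((x @ z) ** 2)        # kernel trick: one dot product, then square
print(phi(x) @ phi(z))     # explicit mapping: same value
```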
Gaussian/RBF Kernel
● The RBF kernel is an inner product in an infinite-dimensional feature space. Assume $x \in \mathbb{R}$ and $\sigma = 1$; by the Taylor expansion of $e^{xz}$,
$$k(x, z) = e^{-\frac{(x - z)^2}{2}} = e^{-\frac{x^2}{2}} e^{-\frac{z^2}{2}} \sum_{k=0}^{\infty}\frac{(xz)^k}{k!} = \phi(x)^\top\phi(z), \quad \phi(x) = e^{-\frac{x^2}{2}}\Big(1,\ x,\ \tfrac{x^2}{\sqrt{2!}},\ \tfrac{x^3}{\sqrt{3!}},\ \dots\Big)$$
Nonlinear SVM with Kernel (1)
● Introduces nonlinearity into the model while staying computationally efficient
● The dual form
$$\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{N}\alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j\, k(\mathbf{x}_i, \mathbf{x}_j) \quad \text{s.t.} \quad 0 \le \alpha_i \le C,\ \sum_{i=1}^{N}\alpha_i y_i = 0$$
● The decision function (reproduced in the sketch below)
$$f(\mathbf{x}) = \operatorname{sign}\Big(\sum_{i=1}^{N}\alpha_i y_i\, k(\mathbf{x}_i, \mathbf{x}) + b\Big)$$
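A sketch reproducing this decision function from a trained scikit-learn model (library and dataset are assumptions; SVC stores the products $\alpha_i y_i$ in its dual_coef_ attribute):
```python
# Sketch: train an RBF-kernel SVM, then recompute
#   f(x) = sum_i alpha_i y_i k(x_i, x) + b
# from the stored support vectors and compare with the library.
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=200, noise=0.15, random_state=0)
gamma = 2.0
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

x_test = X[:5]
sq = ((clf.support_vectors_[:, None, :] - x_test[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)                         # k(sv_i, x) for each test x
f_manual = clf.dual_coef_ @ K + clf.intercept_  # sum_i alpha_i y_i k + b

print(np.allclose(f_manual.ravel(), clf.decision_function(x_test)))  # True
```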
Nonlinear SVM with Kernel (2)–(3)
[Figures only]
Nonlinear SVM with Kernel (4)
● The inner product in the feature space (a similarity score) is computed implicitly
● Any linear classification method can easily be extended to a nonlinear feature space (e.g., kernelized logistic regression)
● Non-vectorial data can be used as well (as long as the kernel matrix is PSD)
● Questions:
○ Which kernel to use? How to set its parameters?
○ One kernel for each feature type, or one for all?
Curse of Kernelization
● Challenge
○ Training kernel classifiers is often much more computationally expensive
○ For kernel SVM, typical QP solvers need O(N³) time; even faster solvers (e.g., SMO) typically need at least O(N²)
○ In contrast, linear classifiers can be trained much faster, typically in linear time O(N)
● Question
○ How to train kernel machines on large-scale datasets?
Kernel Approximation
● Our goal
○ To construct a new representation $\mathbf{z}(\mathbf{x})$ so that $\mathbf{z}(\mathbf{x})^\top\mathbf{z}(\mathbf{x}') \approx k(\mathbf{x}, \mathbf{x}')$
● Linear model
○ The hypothesis can be rewritten as $f(\mathbf{x}) = \mathbf{w}^\top\mathbf{z}(\mathbf{x})$, where $\mathbf{z}(\mathbf{x})$ is the new finite-dimensional representation
○ Then apply linear classifiers on the new representation $\mathbf{z}$
● Two methods (both sketched below)
○ Kernel Functional Approximation: the (random) Fourier method
○ Kernel Matrix Approximation: the Nyström method
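A sketch of both routes using scikit-learn's kernel_approximation module (RBFSampler implements random Fourier features, Nystroem the Nyström method; the library choice and all parameter values are illustrative assumptions):
```python
# Sketch: approximate the RBF kernel with a finite representation z(x),
# then train a fast linear SVM on z(x) instead of a kernel SVM.
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import RBFSampler, Nystroem
from sklearn.svm import LinearSVC, SVC

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

for mapper in (RBFSampler(gamma=2.0, n_components=300, random_state=0),
               Nystroem(gamma=2.0, n_components=100, random_state=0)):
    Z = mapper.fit_transform(X)                        # new representation z(x)
    acc = LinearSVC(C=1.0, max_iter=10000).fit(Z, y).score(Z, y)
    print(type(mapper).__name__, "train accuracy:", round(acc, 3))

print("exact kernel SVM:", SVC(kernel="rbf", gamma=2.0).fit(X, y).score(X, y))
```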
Multi-class Classification
● Consider k classes
● One-against-the-rest: train k binary SVMs:
○ 1st class vs. classes 2, …, k
○ 2nd class vs. classes 1, 3, …, k
○ …
● This yields k decision functions $f_j(\mathbf{x}) = \mathbf{w}_j^\top\mathbf{x} + b_j$, $j = 1, \dots, k$
Multi-class Classification
● Prediction: $\hat{y} = \arg\max_{j}\ \big(\mathbf{w}_j^\top\mathbf{x} + b_j\big)$
● Reason: if $\mathbf{x}$ belongs to the 1st class, then we should have $\mathbf{w}_1^\top\mathbf{x} + b_1 \ge +1$ and $\mathbf{w}_j^\top\mathbf{x} + b_j \le -1$ for $j \ne 1$
Multi-class Classification
● One-against-one: train k(k − 1)/2 binary SVMs, one per pair of classes:
(1,2), (1,3), . . . , (1,k), (2,3), (2,4), . . . , (k−1,k)
● Example: with k = 4 classes, 6 binary SVMs are trained
Multi-class Classification
● For a test point, evaluate all binary SVMs
● Select the class with the largest vote
● Decision values may be used as well, e.g., for tie-breaking (see the sketch below)
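A short scikit-learn sketch contrasting the two strategies (SVC implements one-against-one voting internally; LinearSVC uses one-against-the-rest; the dataset is just an example):
```python
# Sketch: multi-class SVMs on a 3-class problem.
from sklearn.datasets import load_iris
from sklearn.svm import SVC, LinearSVC

X, y = load_iris(return_X_y=True)                   # k = 3 classes

ovo = SVC(kernel="linear",
          decision_function_shape="ovo").fit(X, y)  # k(k-1)/2 pairwise voters
ovr = LinearSVC(max_iter=10000).fit(X, y)           # k one-vs-rest scores

print("one-vs-one accuracy: ", ovo.score(X, y))
print("one-vs-rest accuracy:", ovr.score(X, y))
# Pairwise decision values, shape (n_samples, k(k-1)/2) = (150, 3) here
print(ovo.decision_function(X).shape)
```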
Multi-class Classification
● There are many other methods
● A comparison in [Hsu and Lin, 2002]
● Accuracy is similar for many problems
● But 1-against-1 is fastest for training
● Assume the cost of one SVM optimization with n data points is O(n^q), q > 1 (a worked estimate follows)
● 1 vs. all
○ k problems, each with N data points
● 1 vs. 1
○ k(k − 1)/2 problems, each with 2N/k data points on average
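To make the comparison concrete, a worked estimate (the exponent q ≈ 2 is an illustrative assumption, not a figure from the slides):
```latex
% Assumed cost model: one binary SVM on n examples costs O(n^q), q \approx 2.
\begin{align*}
\text{1 vs. all:} &\quad k \cdot O(N^q) \approx O(k N^2) \\
\text{1 vs. 1:}   &\quad \frac{k(k-1)}{2} \cdot O\!\Big(\big(\tfrac{2N}{k}\big)^q\Big)
                   \approx \frac{k(k-1)}{2} \cdot \frac{4N^2}{k^2} \approx O(2N^2)
\end{align*}
```
So for superlinear solvers the one-against-one total cost is roughly independent of k, consistent with the observation that it is fastest for training.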
Chih-Wei Hsu and Chih-Jen Lin, “A comparison of methods for multiclass support vector machines,” IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415–425, March 2002.
Summary
● Problem and Intuition
● Formulation of Linear SVM
○ Hard Margin SVM
○ Soft Margin SVM
○ Primal/dual Problems
● Nonlinear SVM with Kernel
○ Kernel Tricks
○ SVM with Kernel
● Multi-class classification
UET
Since 2004
ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN
VNU-University of Engineering and Technology
Thank you
Email me
[email protected]