
UET

Since 2004

ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN


VNU-University of Engineering and Technology

INT3405 - Machine Learning


Lecture 5: Classification (P3) - SVM
Duc-Trong Le & Hoang Van Xiem

Hanoi, 03/2024
Outline
● Problem and Intuition
● Formulation of Linear SVM
○ Hard Margin SVM
○ Soft Margin SVM
○ Primal/dual Problems
● Nonlinear SVM with Kernel
○ Kernel Tricks
○ SVM with Kernel
● Multi-class classification

Recap: Bayes Theorem & Decision Boundary
● Bayes theorem: P(y | x) = P(x | y) P(y) / P(x), i.e., posterior ∝ likelihood × prior
● The decision boundary is the set of points where the class posteriors are equal


History
● SVMs were introduced in COLT-92 by Boser, Guyon & Vapnik and have been popular ever since
● Theoretically well-motivated algorithm: developed from Statistical Learning Theory (Vapnik & Chervonenkis) since the 1960s
● Empirically good performance: successful applications in many fields (bioinformatics, text, image recognition, ...)
● Centralized website: www.kernel-machines.org


Problem Setting
● Problem setting
○ Training data: {(x_i, y_i)}, i = 1, …, N, with x_i ∈ R^d
○ For two-class (binary) classification: y_i ∈ {−1, +1}
● Goal
○ To find an optimal hyperplane (decision boundary) w^T x + b = 0 that separates all the data


Intuition

● One possible solution



Intuition

● Another possible solution


Intuition

● Too many other possible solutions


Intuition

● Which one is better than the others?
● How do we define “better”?


Intuition: Maximum Margin
● Intuition of “margin”
○ The margin of a linear classifier is the width by which the boundary could be increased before hitting a data point
● Idea of SVM
○ Find the separating hyperplane that maximizes the margin


Support Vector Machines (SVM)
● The data points closest to the hyperplane (those lying on the margin) are the support vectors; they alone determine the maximum-margin solution


SVM: Optimization Formulation (1)
● From margin to norm
○ Margin: the distance between the two parallel hyperplanes w^T x + b = +1 and w^T x + b = −1, which equals 2 / ||w||
○ Maximizing the margin is therefore equivalent to minimizing (1/2) ||w||^2
● Constraints
○ Separation with margin, i.e.,
  w^T x_i + b ≥ +1 if y_i = +1
  w^T x_i + b ≤ −1 if y_i = −1
○ Simplified as the equivalent constraint y_i (w^T x_i + b) ≥ 1 for i = 1, …, N
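The margin formula follows from the point-to-hyperplane distance; a short derivation (standard, not shown explicitly on the slide):

```latex
d\bigl(x, \{w^\top x + b = 0\}\bigr) = \frac{|w^\top x + b|}{\lVert w \rVert}
\quad\Rightarrow\quad
\text{margin} = \frac{1}{\lVert w \rVert} + \frac{1}{\lVert w \rVert} = \frac{2}{\lVert w \rVert}.
```

(The two terms are the distances from the hyperplane to the closest positive and negative points, for which w^T x + b = ±1.)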


SVM: Optimization Formulation (2)
● SVM as a Quadratic Programming (QP) problem:
  min_{w,b} (1/2) ||w||^2   s.t.  y_i (w^T x_i + b) ≥ 1, i = 1, …, N
○ Convex problem; has a unique minimum
○ Quadratic objective function
○ Linear inequality constraints
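As a concrete (non-lecture) illustration, here is a minimal sketch of solving this QP with the cvxpy modeling library on a synthetic separable dataset; the data and variable names are made up for the example:

```python
import cvxpy as cp
import numpy as np

# Toy separable data: two well-separated 2-D clusters (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(20, 2)),
               rng.normal(+2.0, 0.5, size=(20, 2))])
y = np.array([-1.0] * 20 + [+1.0] * 20)

w = cp.Variable(2)
b = cp.Variable()

# Hard-margin primal: minimize (1/2)||w||^2  s.t.  y_i (w^T x_i + b) >= 1.
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)),
                  [cp.multiply(y, X @ w + b) >= 1])
prob.solve()

print("w =", w.value, " b =", b.value)
print("margin width = 2/||w|| =", 2 / np.linalg.norm(w.value))
```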


Linearly Non-separable Cases
● What if the data are not linearly separable?
● In such cases, the hard margin SVM cannot be applied directly


Soft Margin SVM
● Standard linear SVM
○ Introduce slack variables ξ_i ≥ 0
○ Relax the constraints: y_i (w^T x_i + b) ≥ 1 − ξ_i
○ Penalize the relaxation
● Primal problem:
  min_{w,b,ξ} (1/2) ||w||^2 + C Σ_i ξ_i   s.t.  y_i (w^T x_i + b) ≥ 1 − ξ_i, ξ_i ≥ 0
● C is a regularization parameter: the soft margin SVM trades off maximizing the margin against minimizing the misclassification (slack) penalty


Linearly Non-separable Case
● Re-written as an unconstrained optimization (hinge loss):
  min_{w,b} (1/2) ||w||^2 + C Σ_i max(0, 1 − y_i (w^T x_i + b))
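Since this objective is an unconstrained sum of convex terms, it can be minimized by (sub)gradient descent. A minimal NumPy sketch, assuming labels in {−1, +1}; the learning rate and epoch count are illustrative choices:

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.001, epochs=500):
    """Subgradient descent on (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w^T x_i + b))."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1  # points that violate the margin (nonzero hinge loss)
        # Subgradient of the hinge term is -y_i x_i on violating points, 0 elsewhere.
        grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```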


Linearly Non-separable Case
● The objective balances model complexity against training error:

  Model                            Model complexity   Training error
  Support Vector Machine           (1/2) ||w||^2      C Σ_i max(0, 1 − y_i (w^T x_i + b))
  Regularized logistic regression  λ ||w||^2          Σ_i log(1 + exp(−y_i (w^T x_i + b)))

● Choice of parameter C:
○ Large C: lower bias, high variance
○ Small C: higher bias, low variance
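To see the trade-off in practice, a quick scikit-learn sketch on synthetic overlapping clusters (the dataset and the parameter grid are illustrative):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters, so some training error is unavoidable.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:>6}: train acc={clf.score(X, y):.3f}, "
          f"support vectors={clf.n_support_.sum()}")
```

Smaller C tolerates more margin violations (more support vectors, wider margin); larger C fits the training data more aggressively.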
Dual Form of SVM
● Introducing Lagrange multipliers α_i ≥ 0 converts the primal problem into its dual:
  max_α Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j x_i^T x_j   s.t.  0 ≤ α_i ≤ C, Σ_i α_i y_i = 0
● The optimal weights are recovered as w = Σ_i α_i y_i x_i; only the support vectors have α_i > 0

https://www.quora.com/What-is-primal-and-dual-formulation-in-SVM


Suppose we’re in 1-dimension

● What would SVMs do with this data?


Suppose we’re in 1-dimension

● Not a big surprise: the maximum-margin separator is simply the point midway between the two classes


Harder 1-dimensional Dataset
● What if the two classes are interleaved on the line, so that no single threshold separates them?
● The classic remedy: map each point into a higher-dimensional space, e.g., x ↦ (x, x²), where the data become linearly separable


SVM: Nonlinear Case
● Limitation of linear SVM
○ Linear SVM classifiers are too restricted for complex classification tasks where the data are not linearly separable in the input space
● Basic idea of nonlinear SVM
○ Map the data into a richer feature space including nonlinear features, then construct a separating hyperplane in that space (in the same way as before)


SVM: Nonlinear Case
● First, define a feature mapping φ: R^d → R^D (typically D ≫ d)
● Then learn a hyperplane in the feature space: w^T φ(x) + b = 0
● Almost the same primal form of SVM:
  min_{w,b,ξ} (1/2) ||w||^2 + C Σ_i ξ_i   s.t.  y_i (w^T φ(x_i) + b) ≥ 1 − ξ_i, ξ_i ≥ 0


SVM: Nonlinear Case
● The dual problem:
  max_α Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j φ(x_i)^T φ(x_j)   s.t.  0 ≤ α_i ≤ C, Σ_i α_i y_i = 0
● The optimal solution: w = Σ_i α_i y_i φ(x_i)


How to choose the feature mapping?
• Polynomial mapping
• Example (degree 2, x ∈ R^2): φ(x) = (x_1^2, √2 x_1 x_2, x_2^2)
• Problem of using an explicit feature mapping:
  • The dimensionality of φ(x) can be very large, making w hard to represent explicitly in memory and the QP hard to solve


Kernel Tricks
• Idea: replace the dot product with a kernel function K(x, x′) = φ(x)^T φ(x′)
• Not all functions are kernel functions
• A function K can be a kernel if it is
○ Symmetric: K(x, x′) = K(x′, x)
○ Positive semi-definite (PSD): the “Gram matrix” K defined by K_ij = K(x_i, x_j) is PSD
  (PSD means c^T K c ≥ 0 for every vector c)
• Benefits
○ Efficiency: computing K(x, x′) is often cheaper than computing φ(x), φ(x′) and their dot product
○ Flexibility: various kernel functions can be chosen, as long as the existence of φ is guaranteed (Mercer’s condition)
Kernel Functions
• Linear kernel: K(x, x′) = x^T x′
• Polynomial kernel (degree d): K(x, x′) = (x^T x′ + c)^d
• Gaussian / RBF kernel: K(x, x′) = exp(−||x − x′||^2 / (2σ^2))
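All three are one-liners; a small NumPy sketch (the parameter defaults here are arbitrary):

```python
import numpy as np

def linear_kernel(x, z):
    return x @ z

def poly_kernel(x, z, c=1.0, d=3):
    return (x @ z + c) ** d

def rbf_kernel(x, z, sigma=1.0):
    return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))
```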


Kernel Functions
● Example: polynomial kernel of degree 2 for x, z ∈ R^2:
  K(x, z) = (x^T z)^2 = (x_1 z_1 + x_2 z_2)^2 = x_1^2 z_1^2 + 2 x_1 z_1 x_2 z_2 + x_2^2 z_2^2 = φ(x)^T φ(z)
  with φ(x) = (x_1^2, √2 x_1 x_2, x_2^2)
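The identity is easy to check numerically; a sketch with arbitrary test vectors:

```python
import numpy as np

def phi(x):
    # Explicit degree-2 feature map for 2-D input, matching the expansion above.
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x, z = np.array([1.0, 2.0]), np.array([3.0, 0.5])
print((x @ z) ** 2)     # kernel value: (x^T z)^2 = 16.0
print(phi(x) @ phi(z))  # explicit mapping + dot product: also 16.0
```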


Gaussian/RBF Kernel
● The kernel can be an inner product in an infinite-dimensional space. Assume x ∈ R and σ = 1:
  exp(−(x − z)^2 / 2) = exp(−x^2/2) exp(−z^2/2) exp(xz)
                      = exp(−x^2/2) exp(−z^2/2) Σ_{k≥0} (xz)^k / k!
                      = φ(x)^T φ(z)
  where φ(x) = exp(−x^2/2) · (1, x, x^2/√2!, x^3/√3!, …)
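The infinite expansion can be checked numerically by truncating it; a sketch for scalar inputs (the truncation level K = 20 is arbitrary):

```python
import numpy as np
from scipy.special import factorial

def phi(x, K=20):
    # Truncated feature map for K(x, z) = exp(-(x - z)^2 / 2), x a scalar.
    ks = np.arange(K)
    return np.exp(-x ** 2 / 2) * x ** ks / np.sqrt(factorial(ks))

x, z = 0.7, -0.3
print(np.exp(-(x - z) ** 2 / 2))  # exact kernel value
print(phi(x) @ phi(z))            # truncated inner product, nearly identical
```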


Nonlinear SVM with Kernel (1)
● Introduces nonlinearity into the model
● Computationally efficient
● The dual form:
  max_α Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j K(x_i, x_j)   s.t.  0 ≤ α_i ≤ C, Σ_i α_i y_i = 0
● The decision function:
  f(x) = sign( Σ_i α_i y_i K(x_i, x) + b )
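In practice the dual is solved by off-the-shelf libraries; a minimal scikit-learn sketch on a nonlinearly separable toy dataset:

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Internally, prediction uses sign(sum_i alpha_i y_i K(x_i, x) + b).
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print("train accuracy:", clf.score(X, y))
print("support vectors per class:", clf.n_support_)
```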


Nonlinear SVM with Kernel (2)–(3)
● (Figure-only slides.)


Nonlinear SVM with Kernel (4)
● The inner product in the feature space (a similarity score) is computed implicitly
● Any linear classification method can easily be extended to a nonlinear feature space (e.g., kernelized logistic regression)
● Non-vectorial data can be used as well (as long as the kernel matrix is PSD)
● Questions:
○ Which kernel should be used? How should its parameters be set?
○ One kernel for each feature type, or one for all?


Curse of Kernelization
● Challenge
○ Training kernel classifiers is often much more computationally expensive
○ For kernel SVM, a typical QP solver needs O(N^3) time; even faster solvers (e.g., SMO) typically need at least O(N^2)
○ Linear classifiers, in contrast, can be trained much faster, typically in linear time O(N)
● Question
○ How can kernel machines be trained on large-scale datasets?


Kernel Approximation
● Our goal
○ To construct a new representation z(x) so that z(x)^T z(x′) ≈ K(x, x′)
● Linear model
○ The hypothesis can be rewritten:
  f(x) = Σ_i α_i y_i K(x_i, x) ≈ Σ_i α_i y_i z(x_i)^T z(x) = w^T z(x),
  where w = Σ_i α_i y_i z(x_i)
○ Then apply linear classifiers to the new representation z
● Two methods
○ Kernel functional approximation: Fourier method (random Fourier features)
○ Kernel matrix approximation: Nyström method
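Both methods are available in scikit-learn; a minimal sketch comparing random Fourier features (RBFSampler) and the Nyström approximation, each feeding a linear SVM (all parameters are illustrative):

```python
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem, RBFSampler
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

for feat in (RBFSampler(gamma=1.0, n_components=100, random_state=0),
             Nystroem(gamma=1.0, n_components=100, random_state=0)):
    # feat.transform(x) produces z(x) with z(x)^T z(x') ~ K(x, x').
    clf = make_pipeline(feat, LinearSVC()).fit(X, y)
    print(type(feat).__name__, "accuracy:", clf.score(X, y))
```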


Multi-class Classification
● Consider k classes
● One-against-the-rest: train k binary SVMs:
○ 1st class vs. classes 2, …, k
○ 2nd class vs. classes 1, 3, …, k
○ …
● This yields k decision functions f_j(x) = w_j^T x + b_j, j = 1, …, k


Multi-class Classification
● Prediction: ŷ = arg max_j (w_j^T x + b_j)
● Reason: if x belongs to the 1st class, then we should have
  w_1^T x + b_1 ≥ +1   and   w_j^T x + b_j ≤ −1 for j ≠ 1


Multi-class Classification
● One-against-one: train k(k − 1)/2 binary SVMs, one per pair of classes:
  (1,2), (1,3), . . . , (1,k), (2,3), (2,4), . . . , (k−1,k)
● Example: with 4 classes, there are 6 binary SVMs


Multi-class Classification
● For a test point, evaluate all binary SVMs
● Select the class with the largest number of votes
● Decision values may be used as well (e.g., to break ties)
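Both schemes are implemented in scikit-learn; a quick sketch on the 3-class Iris dataset (SVC uses one-against-one internally, LinearSVC uses one-against-the-rest):

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC, LinearSVC

X, y = load_iris(return_X_y=True)

ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)
ovr = LinearSVC().fit(X, y)

print(ovo.decision_function(X[:1]).shape)  # (1, 3): k(k-1)/2 pairwise scores
print(ovr.decision_function(X[:1]).shape)  # (1, 3): one score per class
```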


Multi-class Classification
● There are many other methods
● A comparison in [Hsu and Lin, 2002]
● Accuracy is similar for many problems
● But 1-against-1 is fastest for training
● Assume solving one SVM optimization problem of size n takes super-linear time in n
● 1 vs. all
○ k problems, each with N data points
● 1 vs. 1
○ k(k − 1)/2 problems, each with 2N/k data points on average

Chih-Wei Hsu and Chih-Jen Lin, “A comparison of methods for multiclass support vector machines,” IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415–425, March 2002.
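To see why 1-against-1 tends to be faster, a back-of-the-envelope comparison, assuming (as is typical for QP-based solvers) that training on n points costs about O(n^q) with q ≈ 2:

```latex
\underbrace{k \cdot O(N^{q})}_{\text{1-vs-all}}
\quad\text{vs.}\quad
\underbrace{\tfrac{k(k-1)}{2} \cdot O\!\bigl((2N/k)^{q}\bigr)
  \approx O\!\bigl(2^{q-1}\,k^{2-q}\,N^{q}\bigr)}_{\text{1-vs-1}}
```

For q = 2 this is roughly kN^2 versus 2N^2, so one-against-one is cheaper whenever k > 2, despite training more classifiers.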


Summary
● Problem and Intuition
● Formulation of Linear SVM
○ Hard Margin SVM
○ Soft Margin SVM
○ Primal/dual Problems
● Nonlinear SVM with Kernel
○ Kernel Tricks
○ SVM with Kernel
● Multi-class classification
UET
Since 2004

ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN


VNU-University of Engineering and Technology

Thank you
Email me
[email protected]
