Machine Learning
Support Vector Machine
Lecturer: Duc Dung Nguyen, PhD.
Contact:
[email protected]
Faculty of Computer Science and Engineering
Ho Chi Minh City University of Technology
Contents
1. Analytical Geometry
2. Maximum Margin Classifiers
3. Lagrange Multipliers
4. Non-linearly Separable Data
5. Soft-margin
Analytical Geometry
Maximum Margin Classifiers
Maximum margin classifiers
• Assume that the data are linearly separable
• Decision boundary equation:
y(x) = w·x + b
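As a minimal sketch (not from the slides), the decision function can be evaluated directly; the weight vector and bias below are arbitrary placeholders, not learned values:

import numpy as np

w = np.array([2.0, -1.0])    # hypothetical weight vector
b = 0.5                      # hypothetical bias

def y(x):
    return np.dot(w, x) + b  # y(x) = w·x + b

print(y(np.array([1.0, 1.0])))    # 1.5  -> classified as +1
print(y(np.array([-1.0, 1.0])))   # -2.5 -> classified as -1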
Maximum margin classifiers
• Margin: the smallest distance between the decision boundary and any of the samples.
Maximum margin classifiers
• Support vectors: the samples lying on the two margin boundaries (the samples closest to the decision boundary).
Maximum margin classifiers
• Rescale w and b so that y(x) = +1 or −1 at the support vectors.
Maximum margin classifiers
• Signed distance between the decision boundary and a sample x_n:
y(x_n) / ||w||
• Absolute distance between the decision boundary and a sample x_n:
t_n·y(x_n) / ||w||
where t_n = +1 iff y(x_n) > 0 and t_n = −1 iff y(x_n) < 0
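A small sketch of the two distances; the hyperplane parameters below are placeholders, not learned values:

import numpy as np

w = np.array([2.0, -1.0])   # hypothetical weight vector
b = 0.5                     # hypothetical bias

def signed_distance(x):
    return (np.dot(w, x) + b) / np.linalg.norm(w)        # y(x) / ||w||

def absolute_distance(x, t):
    return t * (np.dot(w, x) + b) / np.linalg.norm(w)    # t·y(x) / ||w||, t in {+1, -1}

x = np.array([1.0, 1.0])
print(signed_distance(x), absolute_distance(x, +1))      # equal, since x lies on the +1 side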
Maximum margin classifiers
• Maximum margin:
arg max_{w,b} { (1/||w||) · min_n [ t_n·(w·x_n + b) ] }
with the constraint:
t_n·(w·x_n + b) ≥ 1
Maximum margin classifiers
• To be optimized:
arg min_{w,b} (1/2)·||w||^2
with the constraint:
t_n·(w·x_n + b) ≥ 1
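This is a quadratic program. As an illustrative sketch (not a method prescribed by the slides), it can be handed to a general-purpose constrained optimizer such as SciPy's SLSQP; the tiny data set below is assumed for illustration:

import numpy as np
from scipy.optimize import minimize

# Tiny linearly separable toy set (assumed data, for illustration only)
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])

def objective(p):                      # p = (w1, w2, b)
    w = p[:2]
    return 0.5 * np.dot(w, w)          # (1/2)·||w||^2

# One inequality constraint per sample: t_n·(w·x_n + b) − 1 ≥ 0
constraints = [{"type": "ineq",
                "fun": (lambda p, i=i: t[i] * (np.dot(p[:2], X[i]) + p[2]) - 1.0)}
               for i in range(len(X))]

res = minimize(objective, x0=np.zeros(3), method="SLSQP", constraints=constraints)
print("w =", res.x[:2], "b =", res.x[2])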
Lagrange Multipliers
Optimization using Lagrange multipliers
Joseph-Louis Lagrange (born 25 January 1736; died 10 April 1813 in Paris), also
reported as Giuseppe Luigi Lagrange, was an Italian Enlightenment-era
mathematician and astronomer. He made significant contributions to the fields of
analysis, number theory, and both classical and celestial mechanics.
Optimization using Lagrange multipliers
• Problem:
arg max_x f(x)
with the constraint:
g(x) = 0
Optimization using Lagrange multipliers
• Solution is the stationary point of the Lagrange function:
L(x, λ) = f(x) + λ·g(x)
such that:
∂L(x, λ)/∂x_n = ∂f(x)/∂x_n + λ·∂g(x)/∂x_n = 0
and
∂L(x, λ)/∂λ = g(x) = 0
Optimization using Lagrange multipliers
• Example:
f(x) = 1 − u^2 − v^2
with the constraint:
g(x) = u + v − 1 = 0
Optimization using Lagrange multipliers
• Lagrange function:
L(x, λ) = f(x) + λ·g(x) = (1 − u^2 − v^2) + λ·(u + v − 1)
∂L(x, λ)/∂u = ∂f(x)/∂u + λ·∂g(x)/∂u = −2u + λ = 0
∂L(x, λ)/∂v = ∂f(x)/∂v + λ·∂g(x)/∂v = −2v + λ = 0
∂L(x, λ)/∂λ = g(x) = u + v − 1 = 0
• Solution: u = 1/2 and v = 1/2
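The stationary point can be verified symbolically; a minimal SymPy sketch (an illustration, not part of the slides):

import sympy as sp

u, v, lam = sp.symbols("u v lam")
f = 1 - u**2 - v**2
g = u + v - 1
L = f + lam * g

# Stationarity in u and v plus the constraint g = 0
sol = sp.solve([sp.diff(L, u), sp.diff(L, v), sp.diff(L, lam)], [u, v, lam])
print(sol)   # {u: 1/2, v: 1/2, lam: 1}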
Optimization using Lagrange multipliers
• Problem:
arg max_x f(x)
with the inequality constraint:
g(x) ≥ 0
Optimization using Lagrange multipliers
• Solution is the stationary point of the Lagrange function:
L(x, λ) = f(x) + λ·g(x)
such that:
∂L(x, λ)/∂x_n = ∂f(x)/∂x_n + λ·∂g(x)/∂x_n = 0
and
g(x) ≥ 0
λ ≥ 0
λ·g(x) = 0
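For instance, revisiting the earlier example with an inequality constraint: maximize f(x) = 1 − u^2 − v^2 subject to g(x) = u + v − 1 ≥ 0. The unconstrained maximum (u, v) = (0, 0) violates the constraint, so the constraint must be active (λ > 0 and g(x) = 0), and the conditions above reduce to the equality-constrained case, again giving u = v = 1/2 with λ = 1.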
Optimization using Lagrange multipliers
• To be optimized:
arg min_{w,b} (1/2)·||w||^2
with the constraint:
t_n·(w·x_n + b) ≥ 1
• Lagrange function for the maximum margin classifier:
L(w, b, a) = (1/2)·||w||^2 − Σ_{n=1..N} a_n·(t_n·(w·x_n + b) − 1)
with the KKT conditions:
t_n·(w·x_n + b) − 1 ≥ 0
a_n ≥ 0
a_n·(t_n·(w·x_n + b) − 1) = 0
Optimization using Lagrange multipliers
• Lagrange function for the maximum margin classifier:
L(w, b, a) = (1/2)·||w||^2 − Σ_{n=1..N} a_n·(t_n·(w·x_n + b) − 1)
• Solution for w:
∂L(w, b, a)/∂w = 0
w = Σ_{n=1..N} a_n·t_n·x_n
∂L(w, b, a)/∂b = Σ_{n=1..N} a_n·t_n = 0
Optimization using Lagrange multipliers
• Lagrange function for the maximum margin classifier:
L(w, b, a) = (1/2)·||w||^2 − Σ_{n=1..N} a_n·(t_n·(w·x_n + b) − 1)
• Solution for a: dual representation to be optimized
L*(a) = Σ_{n=1..N} a_n − (1/2)·Σ_{n=1..N} Σ_{m=1..N} a_n·a_m·t_n·t_m·(x_n·x_m)
with the constraints:
a_n ≥ 0
Σ_{n=1..N} a_n·t_n = 0
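As a sketch (real implementations use dedicated solvers such as SMO), the dual can also be handed to a general-purpose constrained optimizer; the toy data below are assumed for illustration:

import numpy as np
from scipy.optimize import minimize

# Tiny linearly separable toy set (assumed data, for illustration only)
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
t = np.array([1.0, 1.0, -1.0, -1.0])
N = len(X)
Q = (t[:, None] * t[None, :]) * (X @ X.T)     # Q[n, m] = t_n·t_m·(x_n·x_m)

def neg_dual(a):
    return 0.5 * a @ Q @ a - a.sum()          # minimize the negative of L*(a)

res = minimize(neg_dual, x0=np.zeros(N), method="SLSQP",
               bounds=[(0.0, None)] * N,                              # a_n ≥ 0
               constraints=[{"type": "eq", "fun": lambda a: a @ t}])  # Σ a_n·t_n = 0
a = res.x
print(np.round(a, 4))   # non-zero entries correspond to the support vectors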
Optimization using Lagrange multipliers
• Lagrange function for the maximum margin classifier:
L(w, b, a) = (1/2)·||w||^2 − Σ_{n=1..N} a_n·(t_n·(w·x_n + b) − 1)
• Solution for a: dual representation to be optimized
L*(a) = Σ_{n=1..N} a_n − (1/2)·Σ_{n=1..N} Σ_{m=1..N} a_n·a_m·t_n·t_m·(x_n·x_m)
Why optimization via dual representation?
• Sparsity: a_n = 0 if x_n is not a support vector.
Optimization using Lagrange multipliers
• Lagrange function for the maximum margin classifier:
L(w, b, a) = (1/2)·||w||^2 − Σ_{n=1..N} a_n·(t_n·(w·x_n + b) − 1)
a_n·(t_n·(w·x_n + b) − 1) = 0
• Solution for b:
b = (1/|S|)·Σ_{n∈S} ( t_n − Σ_{m∈S} a_m·t_m·(x_m·x_n) )
where S is the set of support vectors (a_n ≠ 0)
Optimization using Lagrange multipliers
• Classification:
y(x) = w·x + b = Σ_{n=1..N} a_n·t_n·(x_n·x) + b
y(x) > 0 → +1
y(x) < 0 → −1
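A small helper (hypothetical, not from the slides) that recovers w and b from the dual multipliers and classifies new points; it can be applied to the a, X, t produced by the dual sketch above:

import numpy as np

def svm_from_dual(a, X, t, tol=1e-6):
    S = np.where(a > tol)[0]              # support vector indices (a_n > 0)
    w = (a[S] * t[S]) @ X[S]              # w = Σ a_n·t_n·x_n
    b = np.mean(t[S] - X[S] @ w)          # average of t_n − w·x_n over S
    predict = lambda x: 1 if np.dot(w, x) + b > 0 else -1
    return w, b, predict

With the dual sketch above, w, b, predict = svm_from_dual(a, X, t) reproduces the classification rule y(x) = w·x + b.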
Non-linearly Separable Data
Kernel trick for non-linearly separable data
• Map the data points into a higher-dimensional feature space.
• Example 1:
  • Original space: (x)
  • New space: (x, x^2)
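A minimal sketch of Example 1 with assumed 1-D data: the two classes interleave on the line, so no single threshold on x separates them, but a threshold on x^2 does in the new space:

import numpy as np

x = np.array([-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0])   # assumed 1-D samples
t = np.array([-1, -1, 1, 1, 1, -1, -1])                # +1 in the middle, -1 outside

phi = np.column_stack([x, x**2])              # map x -> (x, x^2)
w, b = np.array([0.0, -1.0]), 1.0             # hyperplane x^2 = 1 in the new space
print(np.sign(phi @ w + b))                   # matches t: the classes separate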
Kernel trick for non-linearly separable data
• Example 2:
  • Original space: (u, v)
  • New space: ((u^2 + v^2)^(1/2), arctan(v/u))
Kernel trick for non-linearly separable data
Example 3: XOR function
In1 In2 t
0 0 0
0 1 1
1 0 1
1 1 0
Kernel trick for non-linearly separable data
Example 3: XOR function
In1 In2 In3 Output
0 0 1 1
0 1 0 0
1 0 0 0
1 1 0 1
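A sketch of one common construction for XOR (adding the product x1·x2 as a third input; this is an assumption for illustration, not necessarily the slide's mapping):

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # XOR inputs
t = np.array([-1, 1, 1, -1])                                  # XOR targets as ±1

phi = np.column_stack([X, X[:, 0] * X[:, 1]])     # add the product feature x1*x2
w, b = np.array([1.0, 1.0, -2.0]), -0.5           # a separating hyperplane in 3-D
print(np.sign(phi @ w + b))                       # [-1.  1.  1. -1.] matches t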
Kernel trick for non-linearly separable data
• Classification in the new space:
y(x) = w·φ(x) + b = Σ_{n=1..N} a_n·t_n·(φ(x_n)·φ(x)) + b
• The computational cost of φ(x_n)·φ(x) is high due to the high dimension of φ(·).
• Kernel trick: replace the inner product with a kernel function
K(x_n, x_m) = φ(x_n)·φ(x_m)
Kernel trick for non-linearly separable data
• A typical kernel function:
K(u, v) = (1 + u·v)^2
The corresponding feature map, for u = (u_1, u_2, ..., u_d):
φ(u) = (1, √2·u_1, √2·u_2, ..., √2·u_d,
        √2·u_1·u_2, √2·u_1·u_3, ..., √2·u_{d−1}·u_d,
        u_1^2, u_2^2, ..., u_d^2)
φ(u)·φ(v) = 1 + 2·Σ_{i=1..d} u_i·v_i + 2·Σ_{i=1..d−1} Σ_{j=i+1..d} u_i·v_i·u_j·v_j + Σ_{i=1..d} u_i^2·v_i^2
          = (1 + u·v)^2 = K(u, v)
• Is the mapped data φ(x) guaranteed to be linearly separable?
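The identity above can be checked numerically; a small sketch for d = 2 (the ordering of the coordinates of φ is one possible choice):

import numpy as np

def K(u, v):
    return (1.0 + np.dot(u, v)) ** 2          # K(u, v) = (1 + u·v)^2

def phi(u):                                   # explicit feature map for d = 2
    u1, u2 = u
    return np.array([1.0,
                     np.sqrt(2) * u1, np.sqrt(2) * u2,
                     np.sqrt(2) * u1 * u2,
                     u1**2, u2**2])

u, v = np.array([0.3, -1.2]), np.array([2.0, 0.7])
print(np.dot(phi(u), phi(v)), K(u, v))        # both print the same value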
Soft-margin
Soft margin SVM
• Soft-margin SVM: allow some of the training samples to be misclassified.
• Slack variables: ξ_n, one per training sample.
Soft margin SVM
• New constraints:
t_n·(w·x_n + b) ≥ 1 − ξ_n
ξ_n ≥ 0
• To be minimized:
(1/2)·||w||^2 + C·Σ_{n=1..N} ξ_n
where C > 0 controls the trade-off between the margin and the slack-variable penalty.
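A short sketch with scikit-learn's SVC (linear kernel) on assumed toy data containing one overlapping point; varying C shows the trade-off:

import numpy as np
from sklearn.svm import SVC

# Assumed toy data: the last point is labelled -1 but sits inside the +1 cluster
X = np.array([[2.0, 2.0], [3.0, 3.0], [1.5, 2.5], [-2.0, -2.0], [-3.0, -1.0], [2.4, 2.1]])
t = np.array([1, 1, 1, -1, -1, -1])

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, t)
    print(C, clf.n_support_, clf.score(X, t))
# Small C tolerates slack (wider margin); large C penalizes slack heavily.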
Summary
• SVM is a sparse kernel method.
• The soft-margin SVM deals with data that remain non-linearly separable even after the kernel mapping.