Support vector machines (SVMs)
Dr. Saifullah Khalid
[email protected]
Slides Credit: Mostly based on the UofT Intro to Machine Learning course
Sequence
• Support vector machine (SVM)
• Optimal separating hyperplanes
• Non-separable data
• Kernel method
• Dual formulation of SVM
• From inner products to kernels
Separating Hyperplane?
Support Vector Machine (SVM)
[Figure: a separating hyperplane between two classes, with the support vectors labeled and the margin ("street") to maximize indicated]
• SVMs maximize the margin (the "street") around the separating hyperplane.
• The decision function is fully specified by a (usually very small) subset of the training samples, the support vectors.
Support Vectors
Three support vectors v1, v2, v3, rather than just the three circled points at the tails of those vectors; d denotes half of the street 'width'.
Optimal Separating Hyperplane
• Optimal separating hyperplane: a hyperplane that separates two classes and maximizes the distance to the closest point from either class, i.e., maximizes the margin of the classifier.
• Intuitively, ensuring that a classifier is not too close to any data points leads to better generalization on the test data.
Geometry of Points and Planes
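A minimal sketch of the key fact, assuming the hyperplane is written as wᵀx + b = 0 as in the rest of these slides: the signed distance of a point x' to the hyperplane is
\frac{w^\top x' + b}{\|w\|_2},
so a point with target t' ∈ {−1, +1} that is classified correctly lies at distance t'(wᵀx' + b)/‖w‖₂ from the boundary; this quantity is the geometric margin of that point.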
Maximizing Margin as an Optimization Problem
Algebraic max-margin objective:
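A minimal sketch of this objective, assuming N training examples with targets tᵢ ∈ {−1, +1}:
\min_{w,\,b}\ \tfrac{1}{2}\|w\|_2^2 \quad \text{s.t.} \quad t_i\,(w^\top x_i + b) \ge 1, \quad i = 1, \dots, N.
Minimizing ‖w‖₂ under these constraints is equivalent to maximizing the margin 1/‖w‖₂, so the closest points end up exactly on the margin boundaries.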
• This is a quadratic program: a quadratic objective with linear inequality constraints.
• The important training examples are the ones with algebraic margin 1; they are called support vectors.
• Hence, this algorithm is called the (hard-margin) Support Vector Machine (SVM).
• SVM-like algorithms are often called max-margin or large-margin methods.
Non-Separable Data Points
• How can we apply the max-margin principle if the data are not linearly separable?
Maximizing Margin for Non-Separable Data Points
Main idea:
• Allow some points to be within the margin or even misclassified; we represent this with slack variables ξᵢ.
• But constrain or penalize the total amount of slack.
Maximizing Margin for Non-Separable Data Points
• Soft-margin SVM objective:
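A sketch of one standard way to write it, with slack variables ξᵢ and penalty hyperparameter γ consistent with the bullets below:
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|_2^2 + \gamma \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad t_i\,(w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0, \quad i = 1, \dots, N.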
  • γ is a hyperparameter that trades off the margin with the amount of slack.
    ► For γ = 0, we'll get w = 0. (Why?)
    ► As γ → ∞, we recover the hard-margin objective.
  • Note: it is also possible to constrain ∑ᵢ ξᵢ instead of penalizing it.
From Margin Violation to Hinge Loss
Let's simplify the soft-margin constraints by eliminating ξᵢ.
Recall: tᵢ(wᵀxᵢ + b) ≥ 1 − ξᵢ and ξᵢ ≥ 0, for all i = 1, …, N.
• We would like to find the smallest slack variable ξᵢ that satisfies both ξᵢ ≥ 1 − tᵢ(wᵀxᵢ + b) and ξᵢ ≥ 0.
• Case 1: 1 − tᵢ(wᵀxᵢ + b) ≤ 0. The smallest non-negative ξᵢ that satisfies the constraints is ξᵢ = 0.
• Case 2: 1 − tᵢ(wᵀxᵢ + b) > 0. The smallest ξᵢ that satisfies the constraints is ξᵢ = 1 − tᵢ(wᵀxᵢ + b).
• Hence, ξᵢ = max{0, 1 − tᵢ(wᵀxᵢ + b)}.
• Therefore, the slack penalty can be written as
  ∑ᵢ ξᵢ = ∑ᵢ max{0, 1 − tᵢ(wᵀxᵢ + b)},  with the sums running over i = 1, …, N.
From Margin Violation to Hinge Loss
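Substituting ξᵢ = max{0, 1 − tᵢ(wᵀxᵢ + b)} back in gives an unconstrained training problem; a sketch, under the γ convention used above:
\min_{w,\,b}\ \tfrac{1}{2}\|w\|_2^2 + \gamma \sum_{i=1}^{N} \max\{0,\ 1 - t_i\,(w^\top x_i + b)\}.
The term max{0, 1 − z} is the hinge loss, so the soft-margin SVM amounts to hinge loss on the training data plus an L2 penalty on w.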
Kernel Methods, or the Kernel Trick
Nonlinear Decision Boundaries
• SV classifier: a margin-maximizing linear classifier.
• Linear models are restrictive.
• Q: How can we get nonlinear decision boundaries?
• Feature mapping x → φ(x).
• Q: How do we find good features?
Feature Maps
• For a quadratic decision boundary, what feature mapping do we need?
• One possibility (ignore the √2 factors for now) is sketched after this list.
• We have dim φ(x) = O(d²); in high dimensions, the computational cost might be large.
• Can we avoid the high computational cost?
• Let us take a closer look at the SVM.
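One such mapping, as a sketch for a two-dimensional input x = (x₁, x₂):
\varphi(x) = \big(1,\ \sqrt{2}\,x_1,\ \sqrt{2}\,x_2,\ x_1^2,\ x_2^2,\ \sqrt{2}\,x_1 x_2\big)^\top, \qquad \varphi(x)^\top \varphi(z) = (1 + x^\top z)^2.
The √2 factors are what let the O(d²)-dimensional inner product collapse to a simple function of xᵀz; this observation is exactly what the kernel trick exploits.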
From Primal to Dual Formulation of SVM
• Recall that the SVM is defined using the following constrained optimization problem:
• We can instead solve a dual optimization problem to obtain w.
  ► We do not derive it here in detail. The basic idea is to form the following Lagrangian, find w as a function of α (and the other variables), and express the Lagrangian only in terms of the dual variables:
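One way to write this Lagrangian, sketched for the hard-margin case with one dual variable αᵢ ≥ 0 per constraint:
\mathcal{L}(w, b, \alpha) = \tfrac{1}{2}\|w\|_2^2 - \sum_{i=1}^{N} \alpha_i \big[\, t_i\,(w^\top x_i + b) - 1 \,\big].
Setting ∂L/∂w = 0 gives w = ∑ᵢ αᵢ tᵢ xᵢ, and ∂L/∂b = 0 gives ∑ᵢ αᵢ tᵢ = 0; substituting these back leaves an objective in α alone.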
From Primal to Dual Formulation of SVM
• Primal Optimization Problem:
• Dual Optimization Problem:
• The weights become:
  which is a function of the dual variables αᵢ, i = 1, …, N.
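A sketch of these quantities for the soft-margin problem with penalty γ (the hard-margin case simply drops the upper bound on αᵢ):
\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i - \tfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j\, t_i t_j\, x_i^\top x_j \quad \text{s.t.} \quad 0 \le \alpha_i \le \gamma,\ \ \sum_{i=1}^{N} \alpha_i t_i = 0,
with the weights recovered as
w = \sum_{i=1}^{N} \alpha_i\, t_i\, x_i.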
From Primal to Dual Formulation of SVM
• Dual Optimization Problem:
• The weights become:
• The non-zero dual variables αᵢ correspond to observations that satisfy tᵢ(wᵀxᵢ + b) = 1 − ξᵢ; these are the support vectors.
• Observation: the input data only appear in the form of inner products xᵢᵀxⱼ.
SVM in Feature Space
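A sketch of how the dual looks after a feature map φ: every xᵢ is replaced by φ(xᵢ), giving
\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j\, t_i t_j\, \varphi(x_i)^\top \varphi(x_j), \qquad y(x) = \operatorname{sign}\Big( \sum_{i=1}^{N} \alpha_i\, t_i\, \varphi(x_i)^\top \varphi(x) + b \Big).
Both training and prediction touch the features only through inner products φ(xᵢ)ᵀφ(xⱼ), never through φ(x) itself.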
From Inner Products to Kernels
Kernels
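A kernel is a function k(x, z) = φ(x)ᵀφ(z) for some feature map φ, evaluated without constructing φ explicitly. A few standard examples, as a sketch (the degree d and width σ are design choices):
k(x, z) = x^\top z \ \ \text{(linear)}, \qquad k(x, z) = (1 + x^\top z)^d \ \ \text{(polynomial of degree } d\text{)}, \qquad k(x, z) = \exp\!\big( -\|x - z\|_2^2 / 2\sigma^2 \big) \ \ \text{(Gaussian / RBF)}.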
Kernelizing SVM
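As a concrete illustration, a minimal NumPy sketch of the kernelized decision rule sign(∑ᵢ αᵢ tᵢ k(xᵢ, x) + b). It assumes the dual variables alpha, targets t_train, and bias b have already been obtained by solving the dual; the variable names and the choice of an RBF kernel here are illustrative, not tied to any particular solver.

import numpy as np

def rbf_kernel(X1, X2, sigma=1.0):
    # Gaussian (RBF) kernel matrix: K[i, j] = exp(-||X1[i] - X2[j]||^2 / (2 sigma^2))
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def svm_predict(X_new, X_train, t_train, alpha, b, kernel=rbf_kernel):
    # Kernelized SVM prediction: sign( sum_i alpha_i * t_i * k(x_i, x) + b ).
    # Only the support vectors (alpha_i > 0) actually contribute to the sum.
    K = kernel(X_train, X_new)            # shape (N_train, N_new)
    scores = (alpha * t_train) @ K + b    # decision value for each new point
    return np.sign(scores)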
Example: Linear SVM
• Solid line: decision boundary; dashed lines: the +1/−1 margin boundaries; purple: the Bayes-optimal boundary.
• Solid dots: support vectors on the margin.
Example: Degree-4 Polynomial Kernel SVM
Example: Gaussian Kernel SVM
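A hedged sketch of how classifiers like the ones in these examples could be fit with scikit-learn. The dataset and the hyperparameter values (C, degree, gamma) are illustrative assumptions, not the settings used to produce the original figures.

from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Illustrative two-class data; the original figures use a different dataset.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Linear, degree-4 polynomial, and Gaussian (RBF) kernel SVMs.
models = {
    "linear": SVC(kernel="linear", C=1.0),
    "poly-4": SVC(kernel="poly", degree=4, coef0=1.0, C=1.0),
    "rbf":    SVC(kernel="rbf", gamma="scale", C=1.0),
}

for name, clf in models.items():
    clf.fit(X, y)
    # The support vectors are the training points with non-zero dual variables.
    print(name,
          "support vectors:", clf.support_vectors_.shape[0],
          "training accuracy:", clf.score(X, y))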