
Support Vector Machine

Jayaraj P B
Outline
1. Finite Dimensional Vector Space
2. Hyperplane
3. SVM – overview
4. Mathematical formulation of the SVM problem
Linear Separators
Binary classification can be viewed as the task of separating classes in feature space:

wᵀx + b = 0
wᵀx + b > 0
wᵀx + b < 0

f(x) = sign(wᵀx + b)
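As a quick sketch of this decision rule in code (a minimal NumPy example; the weight vector w and bias b below are arbitrary values chosen for illustration):

import numpy as np

# Hypothetical separator for a 2-D feature space.
w = np.array([2.0, -1.0])
b = 0.5

def predict(X):
    """Return +1 or -1 depending on which side of the hyperplane each row of X lies."""
    return np.sign(X @ w + b)

X = np.array([[1.0, 0.0],   # w.x + b = +2.5 -> class +1
              [0.0, 3.0]])  # w.x + b = -2.5 -> class -1
print(predict(X))           # [ 1. -1.]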
Linear Separators

Which of the linear separators is optimal?


Optimal Hyperplane
SVM
Support vector machine (SVM) is a supervised learning algorithm used for classification.

It selects a small number of boundary instances, called support vectors, and builds a discriminant function that separates the training examples with a wide margin.

This function is then used for prediction. Support vectors play a key role in the SVM algorithm.

To find the support vectors we use Lagrange duality.

Example
SVM can be understood with an example.

Suppose we see a strange cat that also has some features of a dog. If we want a model that can accurately identify whether it is a cat or a dog, such a model can be built using the SVM algorithm.

We first train the model with many images of cats and dogs so that it can learn the different features of cats and dogs, and then we test it on this strange creature.

The SVM creates a decision boundary between the two classes (cat and dog) using the extreme cases of each class, the support vectors.

On the basis of the support vectors, it will classify the new example as a cat.

Consider the diagram below:
Maximum Margin
Classification Margin
Distance from example xᵢ to the separator is r = |wᵀxᵢ + b| / ‖w‖.

Examples closest to the hyperplane are support vectors.

Margin ρ of the separator is the distance between the support vectors of the two classes.
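A minimal NumPy sketch of these quantities (w, b, and the two points are arbitrary values chosen so that each point sits on a margin boundary):

import numpy as np

# Hypothetical separator for a 2-D problem.
w = np.array([1.0, 1.0])
b = -3.0

def distance_to_separator(x):
    """Geometric distance r = |w.x + b| / ||w|| from a point to the hyperplane."""
    return abs(w @ x + b) / np.linalg.norm(w)

x_pos = np.array([2.0, 2.0])   # w.x + b = +1 (support vector of the + class)
x_neg = np.array([1.0, 1.0])   # w.x + b = -1 (support vector of the - class)

rho = distance_to_separator(x_pos) + distance_to_separator(x_neg)
print(rho, 2 / np.linalg.norm(w))   # both equal 2/||w|| ≈ 1.414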
Hyperplane
There can be multiple lines/decision boundaries that segregate the classes in n-dimensional space, but we need to find the best decision boundary for classifying the data points.

This best boundary is known as the hyperplane of the SVM.

The dimension of the hyperplane depends on the number of features in the dataset: with 2 features (as shown in the image), the hyperplane is a straight line.

With 3 features, the hyperplane is a 2-dimensional plane.

We always create the hyperplane that has the maximum margin, i.e. the maximum distance between the hyperplane and the nearest data points of either class.
Maximum Margin Classification
Maximizing the margin is good according to intuition and PAC theory.

It implies that only the support vectors matter; the other training examples can be ignored.
Support Vectors:

The data points or vectors that are closest to the hyperplane and that affect its position are termed support vectors. Since these vectors "support" the hyperplane, they are called support vectors.
Hyperplanes in 2D and 3D feature space
The goal of the SVM algorithm is to find the orientation of the separating hyperplane such that it separates the data with the widest gap. By finding such a margin, we obtain both the weight vector w and the offset b.
SVM can be of two types:
Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be divided into two classes by a single straight line, the data is termed linearly separable and the classifier used is called a linear SVM classifier.

Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset cannot be separated by a straight line, the data is termed non-linear and the classifier used is called a non-linear SVM classifier. A short sketch contrasting the two follows.
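A minimal sketch contrasting the two, using scikit-learn's SVC (assumes scikit-learn is installed; the ring-shaped toy data is made up for illustration):

import numpy as np
from sklearn.svm import SVC

# Toy data: class 1 is an inner ring, class 0 an outer ring,
# so no straight line can separate them.
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 200)
radii = np.where(rng.random(200) < 0.5, 1.0, 3.0)
X = np.c_[radii * np.cos(angles), radii * np.sin(angles)]
y = (radii == 1.0).astype(int)

linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

print("linear SVM accuracy:", linear_svm.score(X, y))  # poor: data is not linearly separable
print("RBF SVM accuracy:", rbf_svm.score(X, y))        # near 1.0 on this data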
Derivation
Maximum Margin

We have to maximize the margin while making sure that all the training data points are on the correct side of the separating hyperplane. This formulation is known as the primal problem.
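In its standard form, with labels yᵢ ∈ {−1, +1}, the hard-margin primal is:

$$
\begin{aligned}
\min_{w,\,b} \quad & \tfrac{1}{2}\,\lVert w \rVert^{2} \\
\text{subject to} \quad & y_i\,(w^{\top} x_i + b) \ge 1, \qquad i = 1, \dots, n
\end{aligned}
$$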
We can construct a Lagrangian for this optimization problem.

There will be a Lagrange multiplier corresponding to each constraint.

The Lagrangian L will be:
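Written out in standard form (with multipliers αᵢ ≥ 0):

$$
L(w, b, \alpha) = \tfrac{1}{2}\,\lVert w \rVert^{2} \;-\; \sum_{i=1}^{n} \alpha_i \,\bigl[\, y_i\,(w^{\top} x_i + b) - 1 \,\bigr]
$$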


We can apply the Karush-Kuhn-Tucker (KKT) conditions to the above constrained quadratic programming problem. The KKT conditions are as follows:
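In standard form, they are:

$$
\begin{aligned}
& \nabla_{w} L = 0, \qquad \frac{\partial L}{\partial b} = 0 && \text{(stationarity)} \\
& y_i\,(w^{\top} x_i + b) - 1 \ge 0 && \text{(primal feasibility)} \\
& \alpha_i \ge 0 && \text{(dual feasibility)} \\
& \alpha_i \,\bigl[\, y_i\,(w^{\top} x_i + b) - 1 \,\bigr] = 0 && \text{(complementary slackness)}
\end{aligned}
$$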
We now apply the stationarity conditions to the Lagrangian L, first with respect to w and then with respect to b.
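These conditions, and the dual problem obtained by substituting them back into L, take the standard form:

$$
\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{n} \alpha_i y_i x_i,
\qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0
$$

Substituting these into L gives the hard-margin dual:

$$
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i \;-\; \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^{\top} x_j
\quad \text{subject to} \quad \alpha_i \ge 0, \;\; \sum_{i=1}^{n} \alpha_i y_i = 0
$$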
Soft margin SVM
In the previous section we assumed that the data points were perfectly separable by a hyperplane. However, in most real-world situations that is not the case.

So in soft margin SVM we allow some misclassifications, penalizing the points that fall on the wrong side of the decision boundary, and we minimize this penalty.

Now the support vectors need not lie exactly on the margin; they can fall inside the margin or even beyond the margin hyperplane of the other class.
Soft Margin Classification
What if the training set is not linearly separable?
Slack variables ξᵢ can be added to allow misclassification of difficult or noisy examples; the resulting margin is called soft.

We add a penalty parameter, C, that controls how heavily training errors are penalized.

With the inclusion of the slack variables, our optimization problem becomes:
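In its standard form, with slack variables ξᵢ and penalty parameter C:

$$
\begin{aligned}
\min_{w,\,b,\,\xi} \quad & \tfrac{1}{2}\,\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad & y_i\,(w^{\top} x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \quad i = 1, \dots, n
\end{aligned}
$$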
Going through the same steps as in the hard-margin case, solving the Lagrangian and applying the KKT conditions, we obtain the following dual problem:
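In standard form, it differs from the hard-margin dual only in the box constraint on the multipliers:

$$
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i \;-\; \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^{\top} x_j
\quad \text{subject to} \quad 0 \le \alpha_i \le C, \;\; \sum_{i=1}^{n} \alpha_i y_i = 0
$$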
Kernels
Soft margin SVM can improve performance when the data points are not linearly separable; we can improve performance further by mapping the data points to a higher-dimensional feature space using a mapping function.

The hard-margin dual optimization problem involves the data only through the inner product between xᵢ and xⱼ.

This allows us to make the SVM non-linear.

Instead of computing inner products in the original input space x, we compute them in a new feature space φ(x). Given a feature mapping φ, the kernel function is K(xᵢ, xⱼ) = φ(xᵢ)ᵀφ(xⱼ), so φ never has to be evaluated explicitly.
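A minimal NumPy sketch of this idea, using the RBF kernel K(xᵢ, xⱼ) = exp(−γ‖xᵢ − xⱼ‖²) as an example kernel (γ and the data points are arbitrary illustrative values):

import numpy as np

def rbf_gram_matrix(X, gamma=0.5):
    """Gram matrix K with K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * X @ X.T
    return np.exp(-gamma * sq_dists)

X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 2.0]])
K = rbf_gram_matrix(X)
print(K.shape)   # (3, 3) -- the dual problem only needs these pairwise values
print(K[0, 1])   # exp(-0.5 * 1.0) ≈ 0.6065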
SVM - Pros
• Effective on datasets with many features, such as financial or medical data.

• Effective in cases where the number of features is greater than the number of data points.

• Uses a subset of the training points (the support vectors) in the decision function, which makes it memory efficient.

• Different kernel functions can be specified for the decision function. You can use common kernels, but it is also possible to specify custom kernels.
Cons
• If the number of features is much larger than the number of data points, avoiding over-fitting when choosing the kernel function and regularization term is crucial.

• SVMs do not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation.

• Works best on small sample sets because of its high training time.

• Since SVMs can use many different kernels, it is important to know about a few of them.
SVM applications
• SVMs were originally proposed by Boser, Guyon and Vapnik in 1992
and gained increasing popularity in the late 1990s.
• SVMs are currently among the best performers for a number of
classification tasks ranging from text to genomic data.
• SVMs can be applied to complex data types beyond feature vectors
(e.g. graphs, sequences, relational data) by designing kernel
functions for such data.
• SVM techniques have been extended to a number of tasks such as
regression [Vapnik et al. ’97], principal component analysis
[Schölkopf et al. ’99], etc.
• Most popular optimization algorithms for SVMs use decomposition
to hill-climb over a subset of αi’s at a time, e.g. SMO [Platt ’99] and
[Joachims ’99]
• Tuning SVMs remains a black art: selecting a specific kernel and its
parameters is usually done by trial and error.
