Complete-SVM Lecture Notes
Subject to:
$$y_i\,(w \cdot x_i + b) \ge 1 \quad \text{for all training examples } i$$
This formulation ensures that all data points are correctly classified with a margin of at least 1.
However, for real-world data that may not be perfectly separable, SVM introduces the concept of
soft margin.
RBF (Gaussian) Kernel:
$$K(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$$
- Creates smooth, non-linear boundaries and is effective for complex patterns.
Sigmoid Kernel: Similar to neural network activation functions, useful for specific applications.
The kernel trick is computationally efficient because it avoids explicitly calculating coordinates in
the higher-dimensional space, instead computing similarity measures between data points.
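As a quick, hedged illustration (not part of the original notes), the snippet below computes the RBF similarity between two points directly from their squared distance, both by hand and with scikit-learn's rbf_kernel helper. Note that scikit-learn parameterizes the kernel as exp(-gamma * ||x - y||^2), so gamma = 1 / (2 * sigma^2) makes the two calculations agree; the point values and sigma here are arbitrary.

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

x_i = np.array([[1.0, 2.0]])
x_j = np.array([[2.0, 0.5]])
sigma = 1.0
gamma = 1.0 / (2 * sigma**2)   # matches the 2*sigma^2 form of the formula above

# Similarity computed straight from the squared distance (no feature map needed)
manual = np.exp(-np.sum((x_i - x_j) ** 2) / (2 * sigma**2))
via_sklearn = rbf_kernel(x_i, x_j, gamma=gamma)[0, 0]
print(manual, via_sklearn)     # both print the same similarity value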
1.3 Soft Margin and Regularization (EXTRA if you wish you can skip)
Real-world data often contains noise and overlapping classes, making perfect separation impossible.
Soft margin SVM addresses this by allowing some misclassifications while still maximizing the
margin. This approach introduces slack variables $\xi_i$ and a regularization parameter C:
Minimize:
$$\frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i$$
Subject to:
$$y_i\,(w \cdot x_i + b) \ge 1 - \xi_i \quad \text{and} \quad \xi_i \ge 0$$
The parameter C controls the trade-off between maximizing the margin and minimizing classifi-
cation errors. A higher C value imposes stricter penalties for misclassifications, while a lower C
allows for a larger margin at the cost of some training accuracy.
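To make the role of C concrete, here is a small hedged sketch (the synthetic data and the C values are illustrative, not from the notes): two overlapping blobs are fitted with a linear-kernel SVC at several values of C, and the number of support vectors shrinks as C grows and the penalty on margin violations tightens.

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping blobs, so a perfectly separating hyperplane does not exist
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

for C in (0.01, 1, 100):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    # A softer penalty (small C) tolerates more margin violations,
    # which shows up as a larger number of support vectors.
    print(f"C={C}: support vectors = {clf.support_vectors_.shape[0]}")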
1.4 Key Hyperparameters
1.4.1 C Parameter
The C parameter acts as a regularization term that balances margin maximization with mis-
classification penalties. It determines how much the algorithm should avoid misclassifying training
examples:
• High C: Strict boundary with fewer misclassifications but potentially smaller margin
• Low C: Larger margin but more tolerance for misclassifications
1.5.2 Role of Gamma (Kernel Parameter for RBF, Poly, Sigmoid Kernels)
• Gamma (𝛾) defines the influence radius of a single training point in the feature space in
kernels like RBF.
• A high gamma value means that each point has a very local influence zone, leading to
complex decision boundaries that can wiggle around the training data points (high variance,
risk of overfitting).
• A low gamma value means that each training point’s influence is more spread out, resulting
in a smoother and simpler decision boundary (risk of underfitting).
• Gamma essentially controls the curvature of the decision boundary.
• You may find good combinations of C and gamma on a diagonal in parameter space: higher
gamma with lower C and vice versa can sometimes yield similarly good models.
• Proper tuning using techniques like grid search and cross-validation is essential to find the best pair; a minimal sketch follows below.
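A minimal sketch of such a search (the dataset, grid values, and fold count are illustrative choices, not prescriptions from the notes):

from sklearn import datasets
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import GridSearchCV

X, y = datasets.load_breast_cancer(return_X_y=True)

# Scale inside the pipeline so each CV fold is scaled on its own training part
pipe = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
param_grid = {
    'svc__C': [0.1, 1, 10, 100],
    'svc__gamma': [0.001, 0.01, 0.1, 1],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)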
1.5.4 Summary:
In practice, careful tuning of C and gamma is essential for optimal SVM performance, balancing
bias and variance for your specific dataset.
No Probability Estimates: SVMs don’t directly provide probability estimates, requiring addi-
tional computations like Platt scaling.
Feature Scaling Sensitivity: SVMs are sensitive to the scale of input features, requiring careful
preprocessing.
Limited Interpretability: The final model, especially with non-linear kernels, can be difficult to
interpret compared to simpler models like linear regression.
Parameter Tuning Complexity: Finding optimal values for C, gamma, and other hyperparam-
eters requires extensive cross-validation.
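The first two limitations above can be seen directly in code. The following hedged sketch (dataset and parameter choices are illustrative) enables Platt scaling via probability=True and compares cross-validated accuracy with and without feature scaling:

from sklearn import datasets
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split, cross_val_score

X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# probability=True makes scikit-learn fit an internal Platt-scaling step,
# so predict_proba becomes available (at extra training cost).
prob_clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', probability=True))
prob_clf.fit(X_train, y_train)
print(prob_clf.predict_proba(X_test[:3]))   # class probabilities for 3 samples

# Feature-scale sensitivity: cross-validated accuracy with and without scaling
raw = cross_val_score(SVC(kernel='rbf'), X, y, cv=5).mean()
scaled = cross_val_score(make_pipeline(StandardScaler(), SVC(kernel='rbf')), X, y, cv=5).mean()
print(f"unscaled: {raw:.3f}   scaled: {scaled:.3f}")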
1.11 Conclusion
Support Vector Machines represent a sophisticated and theoretically grounded approach to machine
learning that excels in many practical applications. Their ability to handle both linear and non-
linear classification problems through the kernel trick, combined with their strong mathematical
foundation and robustness to outliers, makes them a valuable tool in the machine learning toolkit.
While they require careful hyperparameter tuning and can be computationally intensive for large
datasets, their performance and versatility continue to make them relevant in modern machine
learning applications.
The key to successful SVM implementation lies in understanding the data characteristics, selecting
appropriate kernels, and carefully tuning hyperparameters through systematic validation proce-
dures. Despite the emergence of more complex algorithms like deep learning, SVMs remain an
excellent choice for many classification and regression tasks, particularly when interpretability and
theoretical guarantees are important considerations.
2 Python Codes
2.1 First code: synthetically produced data set using RBF kernel
2.2 Second code: breast cancer data set using RBF kernel (30 features)
2.3 Third code: breast cancer data set using RBF kernel with visualisation of decision boundary
2.4 Fourth code: Iris flower data, linear kernel, plot on two features
2.5 Fifth code: Iris flower data, plot on 4 features
2.6 Sixth code: Iris flower data, polynomial kernel, plot on 2 features
2.7 Seventh code: Iris flower data, polynomial kernel, plot on four features
[3]: #Code 1
#Synthetic Data
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
# The data-generation and model-fitting lines of this cell fall outside this
# excerpt; the two lines below are an assumed reconstruction so the cell runs.
X, y = datasets.make_moons(n_samples=200, noise=0.2, random_state=42)
model = svm.SVC(kernel='rbf', C=1.0, gamma='scale').fit(X, y)
# Build a mesh over the feature plane and evaluate the model on every grid point
h = 0.02  # mesh step size (assumed)
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
[2]: #Code 2
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
# Load breast cancer dataset
data = datasets.load_breast_cancer()
X = data.data # Use all 30 features
y = data.target
# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
2.8 Explanation:
The breast cancer dataset contains real clinical data for predicting malignant or benign tumors
using 30 features.
• We split the data into training and testing sets with an 80/20 ratio.
• Features are standardized using StandardScaler to improve the SVM performance.
• The SVM uses the RBF kernel which is suitable for non-linear boundaries typical in real-world
medical data.
• The model’s quality is measured using accuracy and a classification report with precision,
recall, and F1-score.
This serves as a practical example of applying SVM with an RBF kernel to a real dataset. You can
extend this to other datasets and tune hyperparameters like C and gamma for better performance.
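Because the training and evaluation lines of the cell above fall outside this excerpt, here is a self-contained sketch of the pipeline the explanation describes (the C and gamma values are illustrative, not the notebook's exact settings):

import numpy as np
from sklearn import svm, datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

data = datasets.load_breast_cancer()
X, y = data.data, data.target   # all 30 features
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features, then fit an RBF-kernel SVM
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
model = svm.SVC(kernel='rbf', C=1.0, gamma='scale').fit(X_train, y_train)

y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.3f}")
print(classification_report(y_test, y_pred, target_names=data.target_names))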
[4]: #Code 3
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
# Load breast cancer dataset and select first two features for 2D visualization
data = datasets.load_breast_cancer()
X = data.data[:, :2] # Use first two features: mean radius, mean texture
y = data.target
# Split dataset (assumed; the original split line is outside this excerpt)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Feature scaling
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
plt.xlabel(data.feature_names[0])
plt.ylabel(data.feature_names[1])
plt.title('SVM with RBF Kernel - Decision Boundary and Margins')
plt.show()
2.9 Explanation:
• The example above reduces the dataset to only two features for easy 2D plotting.
• The model is trained with the RBF kernel which can capture nonlinear boundaries.
• The decision boundary is plotted based on model predictions across a mesh grid.
• You see visually how the SVM separates malignant and benign samples with a curved bound-
ary using these two features.
You can extend this to more features by using dimensionality reduction or by visualizing projections, but plotting higher dimensions directly is not possible.
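The mesh-grid plotting step mentioned above is not visible in this excerpt, so here is a hedged, self-contained sketch of how such a boundary plot is typically produced for the two scaled features (variable names and the grid step are illustrative):

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

data = datasets.load_breast_cancer()
X, y = data.data[:, :2], data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
model = svm.SVC(kernel='rbf', C=1.0, gamma='scale').fit(X_train, y_train)

# Evaluate the model over a fine grid covering the (scaled) feature plane
x_min, x_max = X_train[:, 0].min() - 1, X_train[:, 0].max() + 1
y_min, y_max = X_train[:, 1].min() - 1, X_train[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.3)
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap=plt.cm.coolwarm, edgecolors='k')
plt.xlabel(f"{data.feature_names[0]} (scaled)")
plt.ylabel(f"{data.feature_names[1]} (scaled)")
plt.show()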
The following Python program demonstrates the basics of Support Vector Machines (SVM) using the popular scikit-learn library (this uses the Iris flower dataset).
[5]: #Code 4
# Step 1: Import libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Step 2: Load a sample dataset (for example, two classes of the Iris dataset)
iris = datasets.load_iris()
X = iris.data[:100, :2] # Take only two features for easy visualization
y = iris.target[:100] # Take first two classes only (binary)
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.title("SVM Decision Boundary and Margins")
plt.show()
plot_svm(X_train_scaled, y_train, clf)
Support Vectors:
[[-1.63250544 -1.68209104]
[-0.14840959 0.58226228]
[-0.14840959 0.58226228]
[-0.80800774 -0.24113893]
[ 0.01648995 0.78811259]
[-0.47820866 -0.85868983]
[-0.97290728 -1.47624074]
[ 0.84098765 0.58226228]
[-0.14840959 -0.24113893]
[ 0.18138949 -0.24113893]]
[6]: #Code 5
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, accuracy_score
y_pred = clf.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)
print(f"SVM (RBF kernel) accuracy: {accuracy:.2f}")
print(classification_report(y_test, y_pred, target_names=iris.target_names))
# Optional: Show which features are used and support vector stats
print(f"Support vectors per class: {clf.n_support_}")
print(f"Total support vectors: {clf.support_vectors_.shape[0]}")
accuracy 0.93 45
macro avg 0.93 0.93 0.93 45
weighted avg 0.93 0.93 0.93 45
Since our data now has four features, direct 2D or 3D plotting isn’t possible for all dimensions at
once. But you can visualize SVM results in two common ways:
1. Pairwise Feature Plots (2D projections)
2. PCA (Principal Component Analysis) projections to 2D
Here’s how to do both with code!
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
2.10 Visualize Data and SVM Results Using PCA (2D Projection)
What does this show?
• Projects 4D data to a 2D plane using PCA.
• Training points: circle markers; test points: star markers.
• Different colors for each class.
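The PCA-projection cell itself is not visible in this excerpt, so the following is a hedged sketch of the idea (marker sizes and variable names are illustrative); an SVM fitted in the projected space could then be overlaid, as Code 7 below does.

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

pca = PCA(n_components=2).fit(X_train_scaled)   # project 4D -> 2D
X_train_pca = pca.transform(X_train_scaled)
X_test_pca = pca.transform(X_test_scaled)

# Circles for training points, stars for test points, colored by class
plt.scatter(X_train_pca[:, 0], X_train_pca[:, 1], c=y_train, marker='o', label='Train')
plt.scatter(X_test_pca[:, 0], X_test_pca[:, 1], c=y_test, marker='*', s=120, label='Test')
plt.xlabel('PC 1')
plt.ylabel('PC 2')
plt.legend()
plt.show()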
2.11 Visualize Pairwise Feature Plots
Plot pairs of features (like sepal length vs. sepal width):
import pandas as pd
import seaborn as sns
df = pd.DataFrame(X, columns=iris.feature_names)
df['species'] = pd.Categorical.from_codes(y, iris.target_names)
sns.pairplot(df, hue='species')
plt.suptitle('Iris Feature Pairplots', y=1.02, fontsize=16)
plt.show()
print(f"Gamma={gamma_val}, Accuracy={accuracy:.3f}, Support␣
↪vectors={sum(clf.n_support_)}")
[8]: #Code 6
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
# Step 1: Load the Iris dataset, use only first two features for 2D visualization
iris = datasets.load_iris()
X = iris.data[:, :2] # Only first two features (for plotting)
y = iris.target
X = X[y != 2]  # (assumed line; cut in this excerpt) keep only classes 0 and 1
y = y[y != 2]  # Remove class 2 (so two-class, easier boundary plot)
# Split dataset (assumed; the original split line is outside this excerpt)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Step 5: Plot
plt.figure(figsize=(8,6))
plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.3)
plt.scatter(X_train_scaled[:, 0], X_train_scaled[:, 1], c=y_train, cmap=plt.cm.
↪coolwarm, edgecolors='k', label='Train')
plt.xlabel('Feature 1 (scaled)')
plt.ylabel('Feature 2 (scaled)')
plt.title('SVM with Polynomial Kernel: Decision Boundary')
plt.legend()
plt.show()
2.12 Explanation:
• Selects only two Iris features and two classes (for 2D decision boundary).
• Fits a polynomial SVM and predicts on a grid covering the feature space.
• Uses contourf to plot regions classified as class 0 or 1.
• Overlays train/test points with distinct markers/colors.
• Shows a clear nonlinear decision boundary determined by the polynomial kernel. You can change degree in SVC(kernel='poly', degree=3, ...) to see more complex boundaries, as in the sketch below.
If you want to visualize all three classes or more features, boundaries can only be shown in projected space (PCA/t-SNE), but they are less interpretable and not “true” boundaries.
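As a hedged sketch of the degree experiment suggested above (the degree values and split are illustrative, not the notebook's settings), the loop below refits the polynomial SVM at several degrees and prints the test accuracy:

from sklearn import datasets
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target
X, y = X[y != 2], y[y != 2]   # two classes, two features, as above
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

for degree in (2, 3, 5):
    # Higher degrees allow more curved boundaries, at the risk of overfitting
    clf = make_pipeline(StandardScaler(), SVC(kernel='poly', degree=degree, C=1.0))
    clf.fit(X_train, y_train)
    print(f"degree={degree}: test accuracy = {clf.score(X_test, y_test):.2f}")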
[9]: #Code 7
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.decomposition import PCA
# Step 1: Load Iris (all four features, three classes)
iris = datasets.load_iris()
X = iris.data
y = iris.target
target_names = iris.target_names
# Step 6: Plot
plt.figure(figsize=(10, 8))
plt.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.coolwarm)
# Training points
for idx, label in enumerate(target_names):
    plt.scatter(X_train_pca[y_train == idx, 0], X_train_pca[y_train == idx, 1],
                edgecolors='k', label=f"Train: {label}", s=40)
# Test points (larger, distinct marker)
for idx, label in enumerate(target_names):
    plt.scatter(X_test_pca[y_test == idx, 0], X_test_pca[y_test == idx, 1],
                edgecolors='k', marker='*', s=160, label=f"Test: {label}")
plt.title('SVM with Polynomial Kernel (degree=3) — Decision Boundaries in PCA Space')