SVM Hyperparameter Tuning using GridSearchCV | ML
Last Updated: 11 Jan, 2023
A Machine Learning model is defined as a mathematical model with a number of parameters that need to be learned from the data. However, there are other parameters, known as hyperparameters, that cannot be learned directly from the data. They are commonly chosen by humans based on intuition or trial and error before the actual training begins. These parameters matter because they control properties of the model such as its complexity or its learning rate. Models can have many hyperparameters, and finding the best combination of values can be treated as a search problem.
SVM also has some hyperparameters (such as the C and gamma values to use), and finding the optimal values is a hard task. But they can be found by simply trying all combinations and seeing which parameters work best. The main idea is to create a grid of hyperparameters and try all of their combinations (hence the name grid search). We don't have to do this manually, because Scikit-learn has this functionality built-in with GridSearchCV.
GridSearchCV takes a dictionary that describes the parameters to try on a model. The grid of parameters is defined as a dictionary, where the keys are the parameter names and the values are the settings to be tested.
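For example, a minimal parameter grid for an SVC might look like the dictionary below (the parameter values here are only illustrative, not the ones used later in this article):
Python3
# Each key is a hyperparameter name of the estimator,
# and each value is the list of settings to try for it.
param_grid = {'C': [0.1, 1, 10],
              'kernel': ['linear', 'rbf']}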
This article demonstrates how to use the GridSearchCV search method to find optimal hyperparameters and hence improve the accuracy/prediction results.
Import necessary libraries and get the Data:
We'll use the built-in breast cancer dataset from Scikit-learn. We can load it with the load_breast_cancer function:
Python3
import pandas as pd
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
cancer = load_breast_cancer()
print(cancer.keys())
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])
Now we will extract all features into a new data frame and the target into a separate data frame.
Python3
df_feat = pd.DataFrame(cancer['data'],
                       columns=cancer['feature_names'])
df_target = pd.DataFrame(cancer['target'],
                         columns=['Cancer'])

print("Feature Variables: ")
print(df_feat.info())
Python3
print ( "Dataframe looks like : " )
print (df_feat.head())

Train Test Split
Now we will split our data into training and test sets with a 70:30 ratio.
Python3
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    df_feat, np.ravel(df_target),
    test_size=0.30, random_state=101)
Train the Support Vector Classifier without Hyper-parameter Tuning –
First, we will train our model by calling the standard SVC() function without doing hyperparameter tuning and look at its classification report and confusion matrix.
Python3
model = SVC()
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print(classification_report(y_test, predictions))
print(confusion_matrix(y_test, predictions))
We got about 61% accuracy, but did you notice something strange?
Notice that the recall and precision for class 0 are always 0. This means the classifier is classifying everything into a single class, i.e., class 1! So our model needs to have its parameters tuned.
This is where GridSearchCV comes into the picture: we can search for the best parameters automatically.
Use GridSearchCV
One of the great things about GridSearchCV is that it is a meta-estimator. It takes an estimator like SVC and creates a new estimator that behaves exactly the same, in this case, like a classifier. You should add refit=True and set verbose to whatever number you want; the higher the number, the more verbose the output (verbose just means the text output describing the process).
Python3
from sklearn.model_selection import GridSearchCV

param_grid = {'C': [0.1, 1, 10, 100, 1000],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
              'kernel': ['rbf']}

grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=3)
grid.fit(X_train, y_train)
What fit does is a bit more involved than usual. First, it runs the same loop with cross-validation, to find the best parameter combination. Once it has the best combination, it runs fit again on all data passed to fit (without cross-validation), to build a single new model using the best parameter setting.
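If you want to see how each parameter combination performed during the cross-validation loop, the scores are stored in the cv_results_ attribute. A quick way to inspect them (shown here only as an illustrative sketch) is to load them into a pandas DataFrame:
Python3
# cv_results_ holds the cross-validated score of every parameter combination
results = pd.DataFrame(grid.cv_results_)
print(results[['params', 'mean_test_score', 'rank_test_score']])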
You can inspect the best parameters found by GridSearchCV in the best_params_ attribute, and the best estimator in the best_estimator_ attribute:
Python3
print(grid.best_params_)
print(grid.best_estimator_)
Then you can re-run predictions and see a classification report on this grid object just like you would with a normal model.
Python3
grid_predictions = grid.predict(X_test)
print(classification_report(y_test, grid_predictions))
We have got almost 95% accuracy with the tuned model, a big improvement over the untuned classifier.
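To make the improvement explicit, you can compare the test-set accuracy of the default model and the tuned grid object directly. A minimal sketch using the variables from the snippets above:
Python3
from sklearn.metrics import accuracy_score

# accuracy of the default SVC versus the grid-searched SVC on the test set
print("Default SVC accuracy:", accuracy_score(y_test, predictions))
print("Tuned SVC accuracy:  ", accuracy_score(y_test, grid_predictions))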