Introducing Grid Search
Hyperparameter Tuning in Python
Alex Scriven, Data Scientist
Automating 2 Hyperparameters
Your previous work:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

neighbors_list = [3, 5, 10, 20, 50, 75]
accuracy_list = []
for test_number in neighbors_list:
    model = KNeighborsClassifier(n_neighbors=test_number)
    predictions = model.fit(X_train, y_train).predict(X_test)
    accuracy = accuracy_score(y_test, predictions)
    accuracy_list.append(accuracy)
Which we then collated in a DataFrame to analyse.
Automating 2 Hyperparameters
What about testing values of 2 hyperparameters?
Using a GBM algorithm:
learn_rate: [0.001, 0.01, 0.05]
max_depth: [4, 6, 8, 10]
We could use a (nested) for loop!
Automating 2 Hyperparameters
Firstly a model creation function:
from sklearn.ensemble import GradientBoostingClassifier

def gbm_grid_search(learn_rate, max_depth):
    model = GradientBoostingClassifier(
        learning_rate=learn_rate,
        max_depth=max_depth)
    predictions = model.fit(X_train, y_train).predict(X_test)
    return [learn_rate, max_depth, accuracy_score(y_test, predictions)]
Automating 2 Hyperparameters
Now we can loop through our lists of hyperparameters and call our function:
learn_rate_list = [0.001, 0.01, 0.05]
max_depth_list = [4, 6, 8, 10]

results_list = []
for learn_rate in learn_rate_list:
    for max_depth in max_depth_list:
        results_list.append(gbm_grid_search(learn_rate, max_depth))
Automating 2 Hyperparameters
We can put these results into a DataFrame as well and print out:
results_df = pd.DataFrame(results_list, columns=['learning_rate', 'max_depth', 'accuracy'])
print(results_df)
How many models?
Adding more hyperparameters and values quickly builds many more models.
The growth is not linear, it is exponential: one more value of a hyperparameter is not just one more model.
5 values for hyperparameter 1 and 10 values for hyperparameter 2 means 50 models!
What about cross-validation?
10-fold cross-validation would make 50 x 10 = 500 models! (See the quick check below.)
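As a quick sanity check of that arithmetic (an illustrative sketch; the variable names here are made up), the model count is simply the product of the value counts and the number of folds:
n_values_hp1 = 5    # values tried for hyperparameter 1
n_values_hp2 = 10   # values tried for hyperparameter 2
n_folds = 10        # 10-fold cross-validation

# Every combination of values is one grid square, fit once per fold
print(n_values_hp1 * n_values_hp2 * n_folds)  # 500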
From 2 to N hyperparameters
What about adding more hyperparameters?
We could nest our loop!
# Adjust the list of values to test
learn_rate_list = [0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5]
max_depth_list = [4, 6, 8, 10, 12, 15, 20, 25, 30]
subsample_list = [0.4, 0.6, 0.7, 0.8, 0.9]
max_features_list = ['auto', 'sqrt']
From 2 to N hyperparameters
Adjust our function:
def gbm_grid_search(learn_rate, max_depth, subsample, max_features):
    model = GradientBoostingClassifier(
        learning_rate=learn_rate,
        max_depth=max_depth,
        subsample=subsample,
        max_features=max_features)
    predictions = model.fit(X_train, y_train).predict(X_test)
    return [learn_rate, max_depth, subsample, max_features,
            accuracy_score(y_test, predictions)]
From 2 to N hyperparameters
Adjusting our for loop (nesting):
results_list = []
for learn_rate in learn_rate_list:
    for max_depth in max_depth_list:
        for subsample in subsample_list:
            for max_features in max_features_list:
                results_list.append(gbm_grid_search(learn_rate, max_depth,
                                                    subsample, max_features))

results_df = pd.DataFrame(results_list, columns=['learning_rate',
    'max_depth', 'subsample', 'max_features', 'accuracy'])
print(results_df)
From 2 to N hyperparameters
How many models now?
7 x 9 x 5 x 2 = 630 models (6,300 with 10-fold cross-validation!)
We can't keep nesting forever!
Plus, what if we wanted:
Details on training times & scores
Details on cross-validation scores
Introducing Grid Search
Let's create a grid:
Down the left, all the values of max_depth
Across the top, all the values of learning_rate (a small sketch of this layout follows below)
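As an illustrative sketch (not part of the original example), the layout can be built with pandas, max_depth down the rows and learning_rate across the columns:
import pandas as pd

learn_rate_list = [0.001, 0.01, 0.05]
max_depth_list = [4, 6, 8, 10]

# Each cell is one (max_depth, learning_rate) combination to try
grid = pd.DataFrame([[(depth, lr) for lr in learn_rate_list] for depth in max_depth_list],
                    index=max_depth_list, columns=learn_rate_list)
print(grid)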
Introducing Grid Search
Working through each cell on the grid:
(4,0.001) is equivalent to making an estimator like so:
GradientBoostingClassifier(max_depth=4, learning_rate=0.001)
Grid Search Pros & Cons
Some advantages of this approach:
You don't have to write thousands of lines of code
Finds the best model within the grid (note: only the best among the values you specified!)
Easy to explain
Grid Search Pros & Cons
Some disadvantages of this approach:
Computationally expensive! Remember how quickly we made 6,000+ models?
It is 'uninformed': the results of one model don't help build the next one.
We will cover 'informed' methods later!
Let's practice!
Grid Search with Scikit Learn
GridSearchCV Object
Introducing a GridSearchCV object:
# Signature from an older scikit-learn release; iid and fit_params have since been removed
sklearn.model_selection.GridSearchCV(
    estimator, param_grid, scoring=None, fit_params=None,
    n_jobs=None, iid='warn', refit=True, cv='warn',
    verbose=0, pre_dispatch='2*n_jobs',
    error_score='raise-deprecating',
    return_train_score='warn')
Steps in a Grid Search
Steps in a Grid Search:
1. Choose an algorithm whose hyperparameters we will tune (sometimes called an 'estimator')
2. Define which hyperparameters we will tune
3. Define a range of values for each hyperparameter
4. Set a cross-validation scheme; and
5. Define a score function so we can decide which square on our grid was 'the best'
6. Include extra useful information or functions
GridSearchCV Object Inputs
The important inputs are:
estimator
param_grid
cv
scoring
refit
n_jobs
return_train_score
GridSearchCV 'estimator'
The estimator input:
Essentially our algorithm
You have already worked with KNN, Random Forest, GBM, Logistic Regression
Remember:
Only one estimator per GridSearchCV object
GridSearchCV 'param_grid'
The param_grid input:
Setting which hyperparameters and values to test
Rather than a list:
max_depth_list = [2, 4, 6, 8]
min_samples_leaf_list = [1, 2, 4, 6]
This would be:
param_grid = {'max_depth': [2, 4, 6, 8],
'min_samples_leaf': [1, 2, 4, 6]}
GridSearchCV 'param_grid'
The param_grid input:
Remember: The keys in your param_grid dictionary must be valid hyperparameters.
For example, for a Logistic regression estimator:
# Incorrect
param_grid = {'C': [0.1, 0.2, 0.5],
              'best_choice': [10, 20, 50]}
ValueError: Invalid parameter best_choice for estimator LogisticRegression
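A handy way to check which keys are valid (a small sketch, not from the original slides) is to list the estimator's own parameter names:
from sklearn.linear_model import LogisticRegression

# Every key in param_grid must appear in this list of tunable parameters
print(sorted(LogisticRegression().get_params().keys()))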
GridSearchCV 'cv'
The cv input:
Choice of how to undertake cross-validation
Passing an integer runs k-fold cross-validation, where 5 or 10 folds is standard (see the sketch below)
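For example (a sketch with an assumed estimator and grid; the variable names are illustrative), cv can be a plain integer or a splitter object for finer control:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold

# cv as an integer: plain 5-fold cross-validation
grid_cv_int = GridSearchCV(RandomForestClassifier(), {'max_depth': [2, 4]}, cv=5)

# cv as a splitter object: shuffled folds with a fixed random seed
grid_cv_kfold = GridSearchCV(RandomForestClassifier(), {'max_depth': [2, 4]},
                             cv=KFold(n_splits=5, shuffle=True, random_state=42))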
GridSearchCV 'scoring'
The scoring input:
Which score to use to choose the best grid square (model)
Use your own or Scikit Learn's metrics module
You can check all the built-in scoring functions this way:
from sklearn import metrics
sorted(metrics.SCORERS.keys())  # on newer scikit-learn, use metrics.get_scorer_names()
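To 'use your own' score, you can wrap any metric with make_scorer (an illustrative sketch; the choice of F-beta here is arbitrary):
from sklearn.metrics import fbeta_score, make_scorer

# Turn a plain metric function into a scorer GridSearchCV understands
f2_scorer = make_scorer(fbeta_score, beta=2)

# Then pass either a built-in name or the custom scorer:
#   GridSearchCV(..., scoring='accuracy')  or  GridSearchCV(..., scoring=f2_scorer)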
GridSearchCV 'refit'
The refit input:
Refits the estimator on the whole training data using the best hyperparameters found
Allows the GridSearchCV object to be used as an estimator (for prediction)
A very handy option!
GridSearchCV 'n_jobs'
The n_jobs input:
Assists with parallel execution
Allows multiple models to be created at the same time, rather than one after the other
Some handy code:
import os
print(os.cpu_count())
Careful using all your cores for modelling if you want to do other work!
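As a small related sketch (the estimator and grid here are placeholders), n_jobs=-1 asks scikit-learn to use every available core:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# n_jobs=-1 uses all cores; a positive integer caps the parallelism instead
grid_parallel = GridSearchCV(RandomForestClassifier(), {'max_depth': [2, 4]}, n_jobs=-1)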
GridSearchCV 'return_train_score'
The return_train_score input:
Logs statistics about the training runs that were undertaken
Useful for analyzing bias-variance trade-off but adds computational expense.
Does not assist in picking the best model, only for analysis purposes
Building a GridSearchCV object
Building our own GridSearchCV Object:
# Create the grid
param_grid = {'max_depth': [2, 4, 6, 8], 'min_samples_leaf': [1, 2, 4, 6]}
# Get a base classifier with some set parameters
# (note: newer scikit-learn versions use max_features='sqrt' in place of 'auto')
rf_class = RandomForestClassifier(criterion='entropy', max_features='auto')
Building a GridSearchCV Object
Putting the pieces together:
grid_rf_class = GridSearchCV(
    estimator=rf_class,
    param_grid=param_grid,
    scoring='accuracy',
    n_jobs=4,
    cv=10,
    refit=True,
    return_train_score=True)
Using a GridSearchCV Object
Because we set refit to True we can directly use the object:
# Fit the object to our data
grid_rf_class.fit(X_train, y_train)
# Make predictions
grid_rf_class.predict(X_test)
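Because the predictions come from the refit best model, they can be scored as usual (a sketch, assuming the same X_test and y_test split used above):
from sklearn.metrics import accuracy_score

# Evaluate the refit best model on the held-out test set
test_predictions = grid_rf_class.predict(X_test)
print(accuracy_score(y_test, test_predictions))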
Let's practice!
Understanding a grid search output
Analyzing the output
Let's analyze the GridSearchCV outputs.
There are three different groups of GridSearchCV properties:
A results log
cv_results_
The best results
best_score_ , best_params_ & best_index_
'Extra information'
scorer_ , n_splits_ & refit_time_
Accessing object properties
Properties are accessed using dot notation.
For example:
grid_search_object.property
Where property is the actual property you want to retrieve
The `.cv_results_` property
The cv_results_ property:
Read this into a DataFrame to print and analyze:
cv_results_df = pd.DataFrame(grid_rf_class.cv_results_)
print(cv_results_df.shape)
(12, 23)
The 12 rows correspond to the 12 squares in our grid, that is, the 12 models we ran
The `.cv_results_` 'time' columns
The time columns record how long it took to fit and score the model, summarized across the cross-validation folds as means and standard deviations (mean_fit_time, std_fit_time, mean_score_time, std_score_time):
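The original slide showed these columns as a screenshot; a small sketch that selects them from the DataFrame built above:
# Pick out the timing columns from cv_results_
time_columns = [col for col in cv_results_df.columns if 'time' in col]
print(cv_results_df[time_columns])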
The `.cv_results_` 'param_' columns
The param_ columns store the hyperparameter values tested in that row, one column per hyperparameter:
The `.cv_results_` 'params' column
The params column contains a dictionary of all the parameters for that row:
pd.set_option("display.max_colwidth", None)  # None = no truncation (older pandas used -1)
print(cv_results_df.loc[:, "params"])
The `.cv_results_` 'test_score' columns
The test_score columns contain the scores on our test set for each of our cross-folds as well as some
summary statistics:
The `.cv_results_` 'rank_test_score' column
The rank column, ordering the mean_test_score from best to worst:
Extracting the best row
We can select the best grid square easily from cv_results_ using the rank_test_score column:
best_row = cv_results_df[cv_results_df["rank_test_score"] == 1]
print(best_row)
The `.cv_results_` 'train_score' columns
The test_score columns are then repeated for the training scores (the train_score columns).
Some important notes to keep in mind:
return_train_score must be True to include the training score columns.
There is no ranking column for the training scores, as we only care about test set performance (see the comparison sketch below).
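A small comparison sketch (assumes return_train_score=True, as set when building grid_rf_class):
# Large gaps between mean train and test scores hint at over-fitting
print(cv_results_df[['mean_train_score', 'mean_test_score']])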
The best grid square
Information on the best grid square is neatly summarized in the following three properties (printed in the sketch after this list):
best_params_ , the dictionary of hyperparameters that gave the best score.
best_score_ , the actual best score.
best_index_ , the index of the row in cv_results_ whose rank_test_score is 1.
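For example (a sketch using the fitted grid_rf_class and the cv_results_df built earlier):
print(grid_rf_class.best_params_)  # e.g. {'max_depth': ..., 'min_samples_leaf': ...}
print(grid_rf_class.best_score_)   # the best mean cross-validated test score

# best_index_ picks out the winning row of cv_results_
print(cv_results_df.loc[grid_rf_class.best_index_])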
The `best_estimator_` property
The best_estimator_ property is an estimator built using the best parameters from the grid search.
For us this is a Random Forest estimator:
type(grid_rf_class.best_estimator_)
sklearn.ensemble.forest.RandomForestClassifier
We could also use this object directly as an estimator if we want (see the sketch below)!
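For instance (a sketch, assuming the same X_test as before), its full classifier API is available:
# best_estimator_ is a fitted RandomForestClassifier, so we can call predict_proba on it
print(grid_rf_class.best_estimator_.predict_proba(X_test)[:5])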
The `best_estimator_` property
print(grid_rf_class.best_estimator_)
Extra information
Some extra information is available in the following properties (printed in the sketch after this list):
scorer_
Which scoring function was used on the held-out data (accuracy in our example).
n_splits_
How many cross-validation splits were used (we set cv=10).
refit_time_
The number of seconds taken to refit the best model on the whole dataset.
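These can be inspected directly (a sketch using the fitted grid_rf_class from earlier):
print(grid_rf_class.scorer_)      # the scoring function used on the held-out folds
print(grid_rf_class.n_splits_)    # number of cross-validation splits
print(grid_rf_class.refit_time_)  # seconds taken to refit the best model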
Let's practice!