Model Parameters
Dr. Nabeela Kausar
Introduction
• Model optimization is one of the toughest challenges in the
implementation of machine learning solutions.
• Entire branches of machine learning and deep learning theory have
been dedicated to the optimization of models.
Model Parameters in Machine Learning
In a machine learning model, there are two types of parameters:
Model Parameters
Model Hyperparameters
Model Parameters
• Model Parameters: These are the parameters in the model that must
be determined using the training data set. These are the fitted
parameters.
• Hyperparameters: These are adjustable parameters that must be
tuned in order to obtain a model with optimal performance.
Model Parameters
A model parameter is a configuration variable that is internal to the
model and whose value can be estimated from data.
• They are required by the model when making predictions.
• Their values define the skill of the model on your problem.
• They are estimated or learned from data.
• They are often not set manually by the practitioner.
• They are often saved as part of the learned model.
• In classical machine learning literature, we may think of the model as
the hypothesis and the parameters as the tailoring of the hypothesis
to a specific set of data.
What is a Model Hyperparameter?
• A model hyperparameter is a configuration that is external to the
model and whose value cannot be estimated from data.
• They are often used in processes to help estimate model parameters.
• They are often specified by the practitioner.
• They can often be set using heuristics.
• They are often tuned for a given predictive modeling problem, as the sketch below illustrates.
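As a concrete illustration, here is a minimal sketch in Python using scikit-learn's LogisticRegression (the library, model, and synthetic dataset are my choice, not the slides'): the regularization strength C is a hyperparameter chosen by the practitioner, while the fitted coefficients are model parameters estimated from the data.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# C (inverse regularization strength) and max_iter are hyperparameters:
# set by the practitioner before training, not learned from the data.
model = LogisticRegression(C=1.0, max_iter=200)
model.fit(X, y)

# coef_ and intercept_ are model parameters: estimated from the
# training data and saved as part of the learned model.
print("learned parameters:", model.coef_, model.intercept_)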
Hyperparameter Optimization Methods
• Hyperparameters can have a direct impact on the training of machine
learning algorithms. Thus, to achieve maximal performance, it is
important to understand how to optimize them. Here are some
common strategies for optimizing hyperparameters:
Manual Hyperparameter Tuning
• Traditionally, hyperparameters were tuned manually by trial and error.
This is still commonly done, and experienced engineers can “guess”
parameter values that will deliver very high accuracy for ML models.
However, there is a continual search for better, faster, and more
automatic methods to optimize hyperparameters.
Grid Search
• Grid Search trains and scores the model on every combination of the candidate hyperparameter values. Suppose you defined the grid as:
a1 = [0, 1, 2, 3, 4, 5]
a2 = [10, 20, 30, 40, 50, 60]
a3 = [100, 105, 110, 115, 120, 125]
• Grid search would then evaluate all 6 × 6 × 6 = 216 combinations and keep the best-scoring one, as sketched below.
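Here is a minimal sketch of exhaustive grid search over this grid; evaluate_model is a hypothetical stand-in for training the model with the given hyperparameters and returning a validation score.

from itertools import product

a1 = [0, 1, 2, 3, 4, 5]
a2 = [10, 20, 30, 40, 50, 60]
a3 = [100, 105, 110, 115, 120, 125]

def evaluate_model(x1, x2, x3):
    # Toy stand-in: in practice, train the model with these
    # hyperparameter values and return its validation score.
    return -((x1 - 3) ** 2 + (x2 - 40) ** 2 + (x3 - 110) ** 2)

best_score, best_params = float("-inf"), None
for params in product(a1, a2, a3):  # all 6 * 6 * 6 = 216 combinations
    score = evaluate_model(*params)
    if score > best_score:
        best_score, best_params = score, params
print(best_params)  # (3, 40, 110) for the toy score above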
Random Search
• Often some of the hyperparameters matter much more than others.
Performing random search rather than grid search allows a much
more precise discovery of good values for the important ones.
• Random Search sets up a grid of hyperparameter values and selects random combinations to train and score the model, as sketched below.
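A minimal sketch of random search over the same grid, reusing the lists a1, a2, a3 and the toy evaluate_model from the grid search sketch above:

import random

random.seed(0)
best_score, best_params = float("-inf"), None
for _ in range(20):  # 20 random trials instead of all 216 combinations
    params = (random.choice(a1), random.choice(a2), random.choice(a3))
    score = evaluate_model(*params)
    if score > best_score:
        best_score, best_params = score, params
print(best_score, best_params)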
Grid Search vs Random Search
Evolutionary Optimization
• Evolutionary (genetic) algorithms treat hyperparameter configurations as a population: the best-scoring configurations survive each generation and are mutated or recombined to produce the next, as sketched below.
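A toy sketch of this idea (my own illustration, again reusing a1, a2, a3 and the toy evaluate_model from the grid search sketch):

import random

random.seed(1)
grids = (a1, a2, a3)

def mutate(params):
    # Variation: resample one hyperparameter from its grid.
    i = random.randrange(3)
    child = list(params)
    child[i] = random.choice(grids[i])
    return tuple(child)

# Start from a random population of 8 configurations.
population = [tuple(random.choice(g) for g in grids) for _ in range(8)]
for _ in range(10):  # 10 generations
    # Selection: keep the top half of the population by score.
    survivors = sorted(population, key=lambda p: evaluate_model(*p),
                       reverse=True)[:4]
    population = survivors + [mutate(p) for p in survivors]
print(max(population, key=lambda p: evaluate_model(*p)))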
Bias and Variance
• A supervised machine learning model aims to train itself on the input variables (X) in such a way that the predicted values (Y) are as close to the actual values as possible. This difference between the actual and predicted values is the error, and it is used to evaluate the model. The error for any supervised machine learning algorithm comprises three parts:
• Bias error
• Variance error
• The noise
• In supervised machine learning, an algorithm is trained on the training data to build a model that makes correct predictions on unseen data that was not available during training.
• A machine learning model is nothing but a mathematical function that describes the relationship between the predictors (features, in machine learning terminology) and the target variable.
• While the noise is the irreducible error that we cannot eliminate, the other two, i.e. bias and variance, are reducible errors that we can attempt to minimize as much as possible.
Bias
• In the simplest terms, Bias is the difference between the Predicted
Value and the Expected Value. To explain further, the model makes
certain assumptions when it trains on the data provided. When it is
introduced to the testing/validation data, these assumptions may not
always be correct.
What is Bias
• Bias is the difference between the average prediction of our model
and the correct value which we are trying to predict. Model with high
bias pays very little attention to the training data and oversimplifies
the model. It always leads to high error on training and test data.
What is variance?
• In contrast to bias, variance arises when the model takes into account the fluctuations in the data, i.e. the noise as well. So, what happens when our model has high variance?
• The model treats the noise as something to learn from. That is, the model learns too much from the training data, so much so that when confronted with new (testing) data, it is unable to predict accurately.
• Mathematically, the variance error in the model is:
• Variance[f̂(x)] = E[f̂(x)²] − (E[f̂(x)])²
Overfitting
• Because a model with high variance learns too much from the training data, this failure mode is called overfitting.
• To summarise,
• A model with a high bias error underfits data and makes very
simplistic assumptions on it
• A model with a high variance error overfits the data and learns too
much from it
• A good model is where both Bias and Variance errors are balanced
Bias-Variance Tradeoff
• Increasing a model's complexity typically lowers bias but raises variance, and decreasing it does the opposite; the goal is to choose the complexity at which the total error, Bias² + Variance + Irreducible Error, is smallest, as the toy example below shows.
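A toy illustration of the tradeoff (my own sketch, not from the slides): fitting noisy samples of a sine curve with polynomials of increasing degree. Degree 1 underfits (high bias, high error on both sets), degree 15 overfits (low training error, high test error), and degree 4 balances the two.

import numpy as np

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 1, 20))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 20)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (1, 4, 15):
    # Fit a polynomial of the given degree to the noisy training data.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, "
          f"test MSE {test_mse:.3f}")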