Machine Learning Unit 1 Overview
U20IT602-MACHINE LEARNING
UNIT I NOTES
UNIT I INTRODUCTION TO MACHINE LEARNING 9
Machine Learning - Machine Learning Foundations –Overview – applications - Types of machine
learning - basic concepts in machine learning Examples of Machine Learning -Applications - Linear
Models for Regression - Linear Basis Function Models - The Bias-Variance Decomposition -Bayesian
Linear Regression - Bayesian Model Comparison.
Machine learning is a subset of artificial intelligence that is mainly concerned with the
development of algorithms that allow a computer to learn from data and past experience on
its own. The term machine learning was first introduced by Arthur Samuel in 1959. We can
summarize it as follows:
With the help of sample historical data, which is known as training data, machine learning
algorithms build a mathematical model that helps in making predictions or decisions without
being explicitly programmed. Machine learning brings computer science and statistics together for
creating predictive models. Machine learning constructs or uses algorithms that learn from
historical data. In general, the more relevant data we provide, the better the model performs.
A machine has the ability to learn if it can improve its performance by gaining
more data.
A machine learning system learns from historical data, builds prediction models, and,
whenever it receives new data, predicts the output for it. The accuracy of the predicted output
depends partly on the amount of data, since a large amount of representative data helps to build
a better model that predicts the output more accurately.
Suppose we have a complex problem in which we need to make predictions. Instead of
writing code for it by hand, we feed the data to generic algorithms; with the help of these
algorithms, the machine builds the logic from the data and predicts the output. Machine learning
has changed the way we think about such problems. The block diagram below explains the working
of a machine learning algorithm:
Features of Machine Learning:
The need for machine learning is increasing day by day. The reason is that it is capable of
doing tasks that are too complex for a person to implement directly. As humans, we have
limitations: we cannot process huge amounts of data manually, so we need computer systems, and
this is where machine learning makes things easier for us.
We can train machine learning algorithms by providing them with a large amount of data and
letting them explore the data, construct models, and predict the required output automatically.
The performance of a machine learning algorithm depends on the amount of data, and it can be
measured with a cost function. With the help of machine learning, we can save both time and
money.
The importance of machine learning can be easily understood from its use cases. Currently,
machine learning is used in self-driving cars, cyber fraud detection, face recognition, friend
suggestion by Facebook, etc. Top companies such as Netflix and Amazon have built
machine learning models that use vast amounts of data to analyze user interest and
recommend products accordingly.
Following are some key points which show the importance of Machine
Learning:
Applications of Machine Learning
Machine learning is a buzzword in today's technology, and it is growing very rapidly.
We use machine learning in our daily life even without knowing it, for example in Google
Maps, Google Assistant, Alexa, etc. Below are some of the most popular real-world applications of
machine learning:
1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used
to identify objects, persons, places, etc. in digital images. A popular use case of image
recognition and face detection is automatic friend tagging suggestion:
Facebook provides a feature of automatic friend tagging suggestion. Whenever we upload a photo
with our Facebook friends, we automatically get a tagging suggestion with names, and the
technology behind this is machine learning's face detection and recognition algorithm.
It is based on the Facebook project named "DeepFace," which is responsible for face recognition
and person identification in the picture.
2. Speech Recognition
While using Google, we get the option "Search by voice." This comes under speech
recognition, which is a popular application of machine learning.
Speech recognition is the process of converting voice instructions into text, and it is also known
as "speech to text" or "computer speech recognition." At present, machine learning algorithms
are widely used in speech recognition applications. Google Assistant, Siri, Cortana,
and Alexa use speech recognition technology to follow voice instructions.
3. Traffic prediction:
If we want to visit a new place, we take the help of Google Maps, which shows us the correct
path with the shortest route and predicts the traffic conditions.
It predicts whether traffic is clear, slow-moving, or heavily congested
in two ways:
o Real-time location of vehicles from the Google Maps app and sensors
o Average time taken on past days at the same time of day.
Everyone who uses Google Maps helps to make the app better. It takes information from
the user and sends it back to its database to improve performance.
4. Product recommendations:
Machine learning is widely used by various e-commerce and entertainment companies such
as Amazon, Netflix, etc., for product recommendation to the user. Whenever we search for a
product on Amazon, we start getting advertisements for the same product while surfing the
internet in the same browser, and this is because of machine learning.
Google understands user interest using various machine learning algorithms and suggests
products according to customer interest.
Similarly, when we use Netflix, we find recommendations for series, movies,
etc., and this is also done with the help of machine learning.
5. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars, in which
machine learning plays a significant role. Tesla, a well-known car manufacturer, is working on
self-driving cars. Its models are trained to detect people and objects while driving (object
detection of this kind is typically treated as a supervised learning task).
6. Email Spam and Malware Filtering:
Whenever we receive a new email, it is filtered automatically as important, normal, or spam.
Important mail arrives in our inbox marked with the important symbol and spam emails go to
our spam box, and the technology behind this is machine learning. Below are some spam filters
used by Gmail:
o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
Some machine learning algorithms such as Multi-Layer Perceptron, Decision tree, and Naïve
Bayes classifier are used for email spam filtering and malware detection.
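As an illustration of how a Naïve Bayes classifier can separate spam from normal mail, here is a minimal sketch. It is not Gmail's filter: the four training messages, the "spam"/"ham" labels, and the function names train and classify are made up for this example, and add-one (Laplace) smoothing is assumed.

```python
import math
from collections import Counter

# Hypothetical toy training set: (message, label)
TRAIN = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting schedule today", "ham"),
    ("project meeting notes", "ham"),
]

def train(samples):
    """Count words per class and how many messages each class has."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in samples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def classify(text, model):
    """Pick the class with the highest log posterior score."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # log prior + sum of log likelihoods with add-one smoothing
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(classify("win free money", model))
print(classify("notes for the meeting", model))
```

Words that never appeared in training (such as "for" and "the") still get a small non-zero probability because of the smoothing, so one unseen word does not force the score of a class to zero.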
7. Virtual Personal Assistant:
Virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri record our voice
instructions, send them to a server in the cloud, decode them using ML algorithms, and act
accordingly.
8. Online Fraud Detection:
Machine learning makes our online transactions safer by detecting fraudulent
transactions. Whenever we perform an online transaction, fraud can take place in various ways,
such as fake accounts, fake IDs, and money stolen in the
middle of a transaction. To detect this, a feed-forward neural network checks
whether a transaction is genuine or fraudulent.
For each genuine transaction, the output is converted into hash values, and these values
become the input for the next round. Each genuine transaction follows a specific pattern, which
changes for a fraudulent transaction; the network detects this change and so makes our online
transactions more secure.
9. Stock Market Trading:
Machine learning is widely used in stock market trading. In the stock market, there is
always a risk of ups and downs in share prices, so long short-term memory (LSTM)
neural networks are used to predict stock market trends.
10. Medical Diagnosis:
In medical science, machine learning is used for disease diagnosis. With this, medical
technology is growing very fast and is able to build 3D models that can predict the position of
lesions in the brain.
11. Automatic Language Translation:
Nowadays, if we visit a new place and do not know the language, it is not a
problem at all, because machine learning helps us by converting the text into a language we
know. Google's GNMT (Google Neural Machine Translation) provides this feature. It is a
neural machine translation system that translates text into our familiar language, and this is
called automatic translation.
The technology behind automatic translation is a sequence-to-sequence learning algorithm,
which can be combined with image recognition to translate text from one language to
another.
Machine learning is a subset of AI, which enables the machine to automatically learn
from data, improve performance from past experiences, and make predictions. Machine
learning contains a set of algorithms that work on a huge amount of data. Data is fed to these
algorithms to train them, and on the basis of training, they build the model & perform a specific
task.
These ML algorithms help to solve different business problems such as regression, classification,
forecasting, clustering, and association.
Based on the methods and way of learning, machine learning is divided into four main types,
which are:
1. Supervised Machine Learning
As its name suggests, supervised machine learning is based on supervision. In the
supervised learning technique, we train the machine using a "labelled" dataset, and based on the
training, the machine predicts the output. Here, labelled data means that the inputs
are already mapped to their outputs. More precisely, we first train the machine with the
inputs and their corresponding outputs, and then we ask the machine to predict the output for a
test dataset.
Let's understand supervised learning with an example. Suppose we have an input dataset of cat
and dog images. First, we train the machine to understand the images, using features
such as the shape and size of the tail, the shape of the eyes, colour, and height (dogs are
usually taller, cats are smaller). After training is complete, we input the picture of a cat and
ask the machine to identify the object and predict the output. The machine is now trained, so it
checks the features of the object, such as height, shape, colour, eyes, ears, and tail, and finds
that it is a cat, so it puts it in the Cat category. This is how the machine identifies
objects in supervised learning.
The main goal of the supervised learning technique is to map the input variable(x) with the
output variable(y). Some real-world applications of supervised learning are Risk Assessment,
Fraud Detection, Spam filtering, etc.
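The train-then-predict workflow described above can be sketched with a 1-nearest-neighbour classifier. This is only one of many supervised methods, and the (height, weight) measurements, the labels, and the names TRAINING, TEST, and predict are hypothetical values chosen for illustration.

```python
# Hypothetical labelled training data: (height_cm, weight_kg) -> label
TRAINING = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 25.0), "dog"),
    ((55.0, 20.0), "dog"),
]

def predict(features, training=TRAINING):
    """Return the label of the closest training example (1-nearest neighbour)."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = min(training, key=lambda pair: squared_distance(pair[0], features))
    return nearest[1]

# Held-out test examples with known labels, used only to check the predictions
TEST = [((24.0, 4.2), "cat"), ((58.0, 22.0), "dog")]
correct = sum(predict(x) == y for x, y in TEST)
print(f"correct predictions on the test set: {correct}/{len(TEST)}")
```

Here the input variable x is the pair of measurements and the output variable y is the label; the "model" is simply the stored training set.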
Supervised machine learning can be classified into two types of problems, which are given
below:
o Classification
o Regression
a) Classification
Classification algorithms are used to solve classification problems, in which the output variable
is categorical, such as "Yes" or "No", Male or Female, Red or Blue, etc. The classification
algorithms predict the categories present in the dataset. Some real-world examples of
classification are spam detection, email filtering, etc.
b) Regression
Regression algorithms are used to solve regression problems, in which the output variable is a
continuous value that depends on the input variables. They are used to predict continuous
outputs, such as market trends, weather prediction, etc.
Advantages:
o Since supervised learning works with a labelled dataset, we can have an exact idea
about the classes of objects.
o These algorithms are helpful in predicting the output on the basis of prior experience.
Disadvantages:
Applications of Supervised Learning
o Image segmentation - Supervised learning algorithms are used in image segmentation. In this
process, image classification is performed on different image data with pre-defined labels.
o Medical diagnosis - Supervised algorithms are also used in the medical field for diagnosis
purposes. It is done by using medical images and past data labelled with disease conditions.
With such a process, the machine can identify a disease for new patients.
o Fraud Detection - Supervised Learning classification algorithms are used for identifying
fraud transactions, fraud customers, etc. It is done by using historic data to identify the
patterns that can lead to possible fraud.
o Spam detection - In spam detection & filtering, classification algorithms are used. These
algorithms classify an email as spam or not spam. The spam emails are sent to the spam
folder.
o Speech Recognition - Supervised learning algorithms are also used in speech recognition.
The algorithm is trained with voice data, and various identifications can be done using the
same, such as voice-activated passwords, voice commands, etc.
2. Unsupervised Machine Learning
Unsupervised learning is different from the supervised learning technique; as its name
suggests, there is no need for supervision. In unsupervised machine learning, the
machine is trained using an unlabeled dataset, and the machine predicts the output without any
supervision.
In unsupervised learning, the models are trained with the data that is neither classified nor labelled,
and the model acts on that data without any supervision.
The main aim of an unsupervised learning algorithm is to group or categorize the unsorted
dataset according to similarities, patterns, and differences. Machines are instructed to find
the hidden patterns in the input dataset.
Let's take an example to understand it more precisely. Suppose there is a basket of fruit images,
and we input it into the machine learning model. The images are totally unknown to the model,
and the task of the machine is to find the patterns and categories of the objects.
The machine discovers patterns and differences, such as differences in colour and
shape, and predicts the output when it is tested with the test dataset.
Categories of Unsupervised Machine Learning
Unsupervised Learning can be further classified into two types, which are given below:
o Clustering
o Association
1) Clustering
The clustering technique is used when we want to find the inherent groups from the data. It is a
way to group the objects into a cluster such that the objects with the most similarities remain in
one group and have fewer or no similarities with the objects of other groups. An example of the
clustering algorithm is grouping the customers by their purchasing behaviour.
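A minimal sketch of clustering, using k-means on one-dimensional data. The notes do not name a specific clustering algorithm, so k-means is an assumption here, and the spending figures, starting centroids, and the function name k_means_1d are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Plain k-means on one-dimensional data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (a centroid with an empty cluster stays where it is)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical monthly spend per customer
spend = [12, 15, 14, 90, 95, 88, 11, 92]
centroids, clusters = k_means_1d(spend, centroids=[0, 100])
print(centroids)
print(clusters)
```

No labels are supplied: the two groups (low spenders and high spenders) emerge only from the similarity of the values.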
2) Association
Some popular algorithms of Association rule learning are Apriori Algorithm, Eclat, FP-growth
algorithm.
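Association rule algorithms such as Apriori score candidate rules with measures like support and confidence. The sketch below computes only these two measures on a made-up set of shopping baskets; it is not an implementation of Apriori, Eclat, or FP-growth, and the names TRANSACTIONS, support, and confidence are illustrative.

```python
# Hypothetical shopping baskets
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions=TRANSACTIONS):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

print(support({"bread", "milk"}))
print(confidence({"bread"}, {"milk"}))
```

A rule such as "bread -> milk" is usually kept only if both values exceed thresholds chosen by the analyst.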
Advantages:
o Because these algorithms work on unlabeled data, they can be applied to more complicated
tasks than supervised ones.
o Unsupervised algorithms are preferable for various tasks because an unlabeled dataset is
easier to obtain than a labelled one.
Disadvantages:
o The output of an unsupervised algorithm can be less accurate because the dataset is not
labelled and the algorithms are not trained with the exact output beforehand.
o Working with unsupervised learning is more difficult because it works with an unlabelled
dataset that is not mapped to any output.
3. Semi-Supervised Learning
Semi-supervised learning is the middle ground between supervised and unsupervised
learning. It operates on data that consists mostly of unlabeled examples together with a few
labelled ones. Labels are costly to obtain, but for corporate purposes a small number of labels
may be available. Supervised and unsupervised learning are defined by the presence and
absence of labels respectively, whereas semi-supervised learning uses both kinds of data.
We can picture these approaches with an example. Supervised learning is where a student is under
the supervision of an instructor at home and at college. If that student analyses the
same concept on his own, without any help from the instructor, it comes under unsupervised
learning. Under semi-supervised learning, the student revises the concept by himself after
analysing it under the guidance of an instructor at college.
Advantages:
Disadvantages:
4. Reinforcement Learning
In reinforcement learning, there is no labelled data like supervised learning, and agents learn from
their experiences only.
The reinforcement learning process is similar to the way a human being learns; for example, a
child learns various things through experience in day-to-day life. An example of reinforcement
learning is playing a game, where the game is the environment, the moves of the agent at each
step define the states, and the goal of the agent is to get a high score. The agent receives
feedback in the form of rewards and punishments.
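The agent, environment, state, and reward described above can be sketched with tabular Q-learning on a tiny "corridor" game. Q-learning is one common reinforcement learning algorithm and is used here as an assumption; the five-state corridor, the reward of 1 for reaching the last state, and the constants ALPHA and GAMMA are all hypothetical choices.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (illustrative values)

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=200, seed=0):
    """Learn a table of action values from experience gathered by random moves."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)                      # explore with random moves
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted best future value
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q = train()
# Greedy policy: in each non-terminal state, pick the action with the highest value
policy = [ACTIONS[row.index(max(row))] for row in q[:-1]]
print(policy)
```

The agent is never told that "right" is the correct move; it infers this only from the reward it receives at the end of each episode.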
Due to its way of working, reinforcement learning is employed in different fields such as game
theory, operations research, information theory, and multi-agent systems.
o Positive Reinforcement Learning: Positive reinforcement learning specifies increasing
the tendency that the required behaviour would occur again by adding something. It
enhances the strength of the behaviour of the agent and positively impacts it.
o Negative Reinforcement Learning: Negative reinforcement learning works exactly
opposite to the positive RL. It increases the tendency that the specific behaviour would
occur again by avoiding the negative condition.
Advantages
o It helps in solving complex real-world problems which are difficult to be solved by general
techniques.
o The learning model of RL is similar to the way human beings learn; hence it can produce very
accurate results.
o Helps in achieving long term results.
Disadvantage
o The curse of dimensionality limits reinforcement learning for real physical systems.
Machine learning enables computers to behave like human beings by training them with the help
of past experience and data.
There are three key aspects of Machine Learning, which are as follows:
o Task: A task is defined as the main problem in which we are interested. This task/problem
can be related to the predictions and recommendations and estimations, etc.
o Experience: It is defined as learning from historical or past data and used to estimate and
resolve future tasks.
o Performance: It is defined as the capacity of a machine to resolve a machine learning
task or problem and provide the best outcome for it. However, performance
depends on the type of machine learning problem.
Machine learning technology has greatly changed the lifestyle of human beings, as we are
highly dependent on it. It is a subset of artificial intelligence, and we all use
it, either knowingly or unknowingly. For example, we use Google Assistant, which employs ML
concepts, and we take help from online customer support, which is also an example of machine
learning, and there are many more.
Machine learning uses statistical techniques to make a computer more intelligent, which helps to
fetch business data and utilize it automatically as required. There are many
examples of machine learning in the real world, which are as follows:
Voice search, voice dialing, and appliance control are some real-world examples of speech
recognition. Alexa and Google Home are among the most widely used speech recognition systems.
Like speech recognition, image recognition is a widely used example of machine
learning technology; it helps identify objects in digital images. Some
real-world examples of image recognition are:
Tagging names on photos, as we have seen on Facebook. It is also used in recognizing
handwriting by segmenting a single letter into smaller images.
Another major example of image recognition is facial recognition. Many
new-generation mobile phones use facial recognition to unlock the device,
which also helps to increase the security of the system.
2. Traffic alerts using Google Map
Google Maps is one of the most widely used applications whenever anyone goes out to reach a
destination. The map helps us find the best or fastest route, traffic conditions, and much more
information. But how does it provide this information? Google Maps uses different technologies,
including machine learning, which collects information from different users, analyzes that
information, updates it, and makes predictions. With the help of these predictions, it can
tell us about the traffic before we start our journey. Machine learning also helps identify the
best and fastest route while we are in traffic. Further, users can answer
questions such as "Does the route still have traffic?" This information is stored automatically
in the database, which machine learning then uses to give accurate information to other people
in traffic. Google Maps also helps find locations such as hotels, malls, restaurants, cinema
halls, buses, etc.
3. Chatbots (Online Customer Support)
Chatbots are widely used software in industries such as banking, medicine,
education, health, etc. You can see chatbots in many banking applications, where they provide
quick online support to customers. These chatbots also work on the concepts of machine learning.
The programmers feed in some basic questions and answers based on frequently asked queries.
Whenever a customer asks a query, the chatbot matches the question's keywords against a database
and then provides an appropriate resolution to the customer. This helps to deliver quick
customer service.
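The keyword-matching behaviour described above can be sketched as a simple lookup. This is a rule-based illustration rather than a trained model, and the FAQ entries, answers, and the function name answer are hypothetical.

```python
# Hypothetical FAQ entries: keyword set -> canned answer
FAQ = [
    ({"reset", "password"}, "Use the 'Forgot password' link on the login page."),
    ({"block", "card"}, "You can block your card from the Cards menu."),
    ({"balance"}, "Your balance is shown on the account overview screen."),
]

def answer(query):
    """Return the answer whose keywords overlap most with the words in the query."""
    words = set(query.lower().replace("?", " ").split())
    best_keywords, best_answer = max(FAQ, key=lambda entry: len(entry[0] & words))
    if not best_keywords & words:
        # No keyword matched at all: hand over to a human
        return "Sorry, I did not understand. Connecting you to an agent."
    return best_answer

print(answer("How do I reset my password?"))
print(answer("Hello there"))
```

A production chatbot would typically replace the keyword overlap with a learned text classifier, but the flow (query in, best-matching stored answer out) stays the same.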
4. Google Translation
Suppose you work on an international banking project with documents in French, German, etc.,
but you only know English. This would be a stressful moment, because you cannot
proceed without reviewing the documents. Google Translate helps to translate text from one
language into the desired language. In this way, you can convert French, German, etc., into
English, Hindi, or any other language. This makes work in different sectors much easier, as a
user can work on a project from any country hassle-free.
Google uses Google Neural Machine Translation to detect the source language and translate it
into the desired language.
5. Prediction
Prediction systems also use machine learning algorithms for making predictions. There are
various sectors where predictions are used. For example, in bank loan systems, the probability
of error can be estimated using predictions made with machine learning. For this, the available
data is classified into different groups using a set of rules provided by analysts, and once the
classification is done, the error probability is predicted.
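A rough sketch of the two steps in the loan example: first a rule assigns each case to a group, then a probability is estimated per group from past outcomes. The income thresholds, the historical records, and the reading of "error probability" as the observed default rate are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical historical loans: (annual_income, defaulted)
HISTORY = [
    (20_000, True), (25_000, True), (28_000, False),
    (50_000, False), (55_000, True), (60_000, False), (65_000, False),
    (120_000, False), (150_000, False),
]

def group_of(income):
    """Analyst-style rule that assigns an applicant to a group (thresholds are made up)."""
    if income < 30_000:
        return "low"
    if income < 100_000:
        return "medium"
    return "high"

def default_rates(history):
    """Estimate the default probability in each group from past outcomes."""
    totals, defaults = defaultdict(int), defaultdict(int)
    for income, defaulted in history:
        g = group_of(income)
        totals[g] += 1
        defaults[g] += defaulted      # True counts as 1, False as 0
    return {g: defaults[g] / totals[g] for g in totals}

rates = default_rates(HISTORY)
print(rates)
print(rates[group_of(58_000)])    # estimate for a new applicant
```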
6. Extraction
One of the best examples of machine learning is the extraction of information. In this process,
structured data is extracted from unstructured data and then used in predictive analytics tools.
Data is usually found in a raw or unstructured form that is not directly useful, and the
extraction process makes it useful. Some real-world examples of extraction are:
7. Statistical Arbitrage
Arbitrage is an automated trading process, which is used in the finance industry to manage a
large volume of securities. The process uses a trading algorithm to analyze a set of securities using
economic variables and correlations. Some examples of statistical arbitrage are as follows:
9. Self-driving cars
The future of the automobile industry is self-driving cars. These are driverless cars, which
are based on concepts of deep learning and machine learning. Some commonly used machine
learning algorithms in self-driving cars are Scale-invariant feature transform (SIFT), AdaBoost,
TextonBoost, YOLO(You only look once).
Nowadays, most people spend multiple hours on Google or surfing the internet. While
browsing any webpage or website, they see multiple ads on each page. These ads
differ from user to user, even when two users are on the same network and in the same location.
The ad recommendations are made with the help of machine learning algorithms and are
based on the search history of each user. For example, if a user searches
for a shirt on Amazon or any other e-commerce website, he will start getting ad recommendations
for shirts after some time.
o Facility protections
o Operation monitoring
o Parking lots
o Traffic monitoring
o Shopping patterns
Emails are filtered automatically when we receive any new email, and it is also an example of
machine learning. We always receive an important mail in our inbox with the important symbol
and spam emails in our spam box, and the technology behind this is Machine learning. Below are
some spam filters used by Gmail:
o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
Some machine learning algorithms that are used in email spam filtering and malware detection
are Multi-Layer Perceptron, Decision tree, and Naïve Bayes classifier.
Whenever we book an Uber during peak office hours in the morning or evening, we see
different prices compared to normal hours. Prices are raised through surge pricing, which
companies apply whenever demand is high. But how are these surge prices determined and applied?
The technologies behind this are AI and machine learning. These technologies
solve two main business queries, which are
Machine learning technology also helps in finding discounted prices, best prices, promotional
prices, etc., for each customer.
Machine learning technology is widely used in gaming and education. There are
various gaming and learning apps that use AI and machine learning. Among these
apps, Duolingo is a free language learning app, which is designed in a fun and interactive way.
While using this app, people feel as if they are playing a game on the phone.
It collects data from the user's answers and creates a statistical model to determine how long a
person can remember a word, and it provides that information again before a refresher is
required.
Virtual assistants are very popular in today's world. They are smart software
embedded in smartphones or laptops. These assistants work as personal assistants and help in
searching for information that is requested by voice. A virtual assistant understands human
language or natural language voice commands and performs the task for the user. Some examples
of virtual assistants are Siri, Alexa, Google Assistant, Cortana, etc. To start working with
these virtual assistants, they first need to be activated, and then we can ask them questions
such as "What's the date today?" or "Tell me a joke". The technologies behind virtual
assistants are AI, machine learning, natural language processing, etc. Machine learning
algorithms collect and analyze data based on the user's previous interactions and predict
responses according to the user's preferences.
Linear regression is one of the easiest and most popular machine learning algorithms. It is
a statistical method that is used for predictive analysis. Linear regression makes predictions
for continuous (real or numeric) variables such as sales, salary, age, product price, etc.
The linear regression algorithm models a linear relationship between a dependent variable (y)
and one or more independent variables (x), hence the name linear regression. Because the
relationship is linear, the model describes how the value of the dependent variable changes
with the value of the independent variable.
The linear regression model provides a sloped straight line representing the relationship between
the variables. Mathematically, we can represent a linear regression as:
y = a_0 + a_1 x + \varepsilon
Here,
y = dependent variable (target variable)
x = independent variable (predictor variable)
a_0 = intercept of the line (gives an additional degree of freedom)
a_1 = linear regression coefficient (scale factor applied to each input value)
\varepsilon = random error
The values of the x and y variables are the training data for the linear regression model.
Linear regression can be further divided into two types of algorithm:
o Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent
variable, then such a linear regression algorithm is called Simple Linear Regression.
o Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent
variable, then such a linear regression algorithm is called Multiple Linear Regression.
A straight line showing the relationship between the dependent and independent variables is
called a regression line. A regression line can show two types of relationship:
o Positive Linear Relationship:
If the dependent variable increases on the Y-axis as the independent variable increases on the
X-axis, then such a relationship is termed a positive linear relationship.
o Negative Linear Relationship:
If the dependent variable decreases on the Y-axis as the independent variable increases on the
X-axis, then such a relationship is called a negative linear relationship.
When working with linear regression, our main goal is to find the best fit line, which means
that the error between the predicted values and the actual values should be minimized. The best
fit line has the least error.
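For simple linear regression, the best fit line in the least-squares sense has a closed-form solution. The sketch below computes the intercept a0 and slope a1 that way; the function name fit_line and the sample points are made up for illustration.

```python
def fit_line(xs, ys):
    """Closed-form least-squares estimates of a0 (intercept) and a1 (slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    # The least-squares line always passes through the point (mean_x, mean_y)
    a0 = mean_y - a1 * mean_x
    return a0, a1

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]      # generated from y = 1 + 2x with no noise
a0, a1 = fit_line(xs, ys)
print(a0, a1)
```

A positive a1 corresponds to a positive linear relationship and a negative a1 to a negative one.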
Different values of the weights or line coefficients (a0, a1) give different regression lines,
so we need to calculate the best values of a0 and a1 to find the best fit line. To do this,
we use a cost function.
Cost function-
o Different values of the weights or line coefficients (a0, a1) give different regression
lines, and the cost function is used to estimate the coefficient values for the best
fit line.
o Cost function optimizes the regression coefficients or weights. It measures how a linear
regression model is performing.
o We can use the cost function to find the accuracy of the mapping function, which maps
the input variable to the output variable. This mapping function is also known
as Hypothesis function.
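For linear regression, we use the Mean Squared Error (MSE) cost function, which is the
average of the squared errors between the predicted values and the actual values. It can be
written as:
MSE = \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - (a_1 x_i + a_0) \bigr)^2
Where,
N = total number of observations
y_i = actual value
(a_1 x_i + a_0) = predicted value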
Residuals: The distance between an actual value and the corresponding predicted value is called
a residual. If the observed points are far from the regression line, the residuals will be
large, and so the cost function will be high. If the scatter points are close to the regression
line, the residuals will be small, and hence the cost function will be low.
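A small worked example of residuals and the MSE cost for a given line. The data points and the two candidate lines are hypothetical; the function name mse is chosen for this sketch.

```python
def mse(xs, ys, a0, a1):
    """Mean squared error of the line y = a0 + a1 * x on the given points."""
    residuals = [y - (a0 + a1 * x) for x, y in zip(xs, ys)]
    return sum(r ** 2 for r in residuals) / len(residuals)

xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

good = mse(xs, ys, a0=0, a1=2)   # line passes through every point
poor = mse(xs, ys, a0=1, a1=2)   # same slope, but shifted up by 1
print(good, poor)
```

The shifted line has a residual of -1 at every point, so its cost is the average of four squared residuals of size 1, while the line through all the points has zero cost.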
Gradient Descent:
o Gradient descent is used to minimize the MSE by calculating the gradient of the cost
function.
o A regression model uses gradient descent to update the coefficients of the line by reducing
the cost function.
o It starts from randomly selected coefficient values and then iteratively updates them
to reach the minimum of the cost function.
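The update loop described above can be sketched as follows. To keep the run reproducible, this sketch starts from a0 = a1 = 0 instead of random values; the learning rate, the number of steps, and the data are assumptions chosen so that the loop converges on this small example.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Minimise the MSE of y = a0 + a1 * x by repeated gradient steps."""
    a0, a1 = 0.0, 0.0          # fixed start instead of a random one, for reproducibility
    n = len(xs)
    for _ in range(steps):
        errors = [(a0 + a1 * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to a0 and a1
        grad_a0 = 2 / n * sum(errors)
        grad_a1 = 2 / n * sum(e * x for e, x in zip(errors, xs))
        # Step against the gradient to reduce the cost
        a0 -= learning_rate * grad_a0
        a1 -= learning_rate * grad_a1
    return a0, a1

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]      # generated from y = 1 + 2x with no noise
a0, a1 = gradient_descent(xs, ys)
print(round(a0, 4), round(a1, 4))
```

If the learning rate is too large the updates overshoot and diverge; if it is too small the loop needs many more steps to reach the minimum.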
Model Performance:
The Goodness of fit determines how the line of regression fits the set of observations. The process
of finding the best model out of various models is called optimization. It can be achieved by
below method:
1. R-squared method:
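The notes name the R-squared method without giving its formula. A commonly used definition, stated here as an assumption about what is intended, is R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the actual values from their mean. A minimal sketch with made-up numbers:

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot

actual = [3, 5, 7, 9]
perfect = r_squared(actual, [3, 5, 7, 9])    # predictions match exactly
baseline = r_squared(actual, [6, 6, 6, 6])   # always predict the mean
print(perfect, baseline)
```

A value close to 1 means the line explains most of the variation in the data, while a model that only predicts the mean scores 0.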
Below are some important assumptions of linear regression. These are formal checks to make
while building a linear regression model, which help to get the best possible result from the
given dataset.
o Homoscedasticity Assumption:
Homoscedasticity is the situation in which the variance of the error term is the same for all
values of the independent variables. With homoscedasticity, there should be no clear pattern
in the scatter plot of the residuals.
o Normal distribution of error terms:
Linear regression assumes that the error term follows a normal distribution.
If the error terms are not normally distributed, then confidence intervals will become either
too wide or too narrow, which may cause difficulties in finding coefficients.
This can be checked using a q-q plot. If the plot shows a straight line without any deviation,
the error is normally distributed.
o No autocorrelations:
The linear regression model assumes no autocorrelation in the error terms. If there is any
correlation in the error term, it will drastically reduce the accuracy of the model.
Autocorrelation usually occurs if there is a dependency between residual errors.
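One simple way to look for a dependency between consecutive residuals is the lag-1 autocorrelation. The notes do not name a specific test, so the sketch below is only an assumed illustration; the function name and the two residual sequences are hypothetical.

```python
def lag1_autocorrelation(residuals):
    """Correlation between each residual and the one immediately before it."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denominator = sum(c * c for c in centered)
    numerator = sum(centered[i] * centered[i - 1] for i in range(1, n))
    return numerator / denominator


# Hypothetical residuals that drift steadily upward (positive autocorrelation).
trending = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
# Hypothetical residuals that flip sign every step (negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(lag1_autocorrelation(trending))
print(lag1_autocorrelation(alternating))
```

Values near zero are consistent with the no-autocorrelation assumption, while values far from zero in either direction suggest the residuals depend on one another.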
o Reducible errors: These errors can be reduced to improve the model accuracy. Such
errors can further be classified into bias and Variance.
o Irreducible errors: These errors will always be present in the model regardless of which
algorithm has been used. The cause of these errors is unknown variables whose value can't
be reduced.
What is Bias?
In general, a machine learning model analyses the data, finds patterns in it, and makes
predictions. While training, the model learns these patterns in the dataset and applies them to test
data for prediction. While making predictions, a difference occurs between the values predicted
by the model and the actual (expected) values, and this difference is known as bias
error or error due to bias. It can be defined as the inability of a machine learning algorithm such
as Linear Regression to capture the true relationship between the data points. Every algorithm
begins with some amount of bias, because bias arises from the assumptions in the model that
make the target function simpler to learn. A model has either:
o Low Bias: A low bias model will make fewer assumptions about the form of the target
function.
o High Bias: A model with a high bias makes more assumptions, and the model becomes
unable to capture the important features of our dataset. A high bias model also cannot
perform well on new data.
Generally, a linear algorithm has high bias, which makes it fast to learn. The simpler the
algorithm, the more bias it is likely to introduce, whereas a nonlinear algorithm often
has low bias.
Some examples of machine learning algorithms with low bias are Decision Trees, k-Nearest
Neighbours and Support Vector Machines. Algorithms with high bias
include Linear Regression, Linear Discriminant Analysis and Logistic Regression.
High bias mainly occurs when the model is too simple. Below are some ways to reduce high
bias:
Variance specifies how much the prediction would change if different
training data were used. In simple words, variance tells how much a random variable
differs from its expected value. Ideally, a model should not vary too much from one training
dataset to another, which means the algorithm should be good at capturing the hidden
mapping between the input and output variables. Variance errors are either low variance or high
variance.
Low variance means there is a small variation in the prediction of the target function with changes
in the training data set. At the same time, High variance shows a large variation in the prediction
of the target function with changes in the training dataset.
A model that shows high variance learns a lot and performs well on the training dataset, but does
not generalize well to unseen data. As a result, such a model gives good results on the
training dataset but shows high error rates on the test dataset.
Because a high-variance model learns too much from the dataset, it tends to overfit.
A model with high variance has the below problems:
Usually, nonlinear algorithms, which have a lot of flexibility to fit the data, have high variance.
Some examples of machine learning algorithms with low variance are Linear Regression,
Logistic Regression, and Linear Discriminant Analysis. At the same time, algorithms with high
variance are Decision Trees, Support Vector Machines, and k-Nearest Neighbours.
1. Low-Bias,Low-Variance:
The combination of low bias and low variance shows an ideal machine learning model.
However, this is rarely achievable in practice.
2. Low-Bias, High-Variance: With low bias and high variance, model predictions are
inconsistent but accurate on average. This case occurs when the model learns with a large
number of parameters, and it leads to overfitting.
3. High-Bias, Low-Variance: With high bias and low variance, predictions are consistent
but inaccurate on average. This case occurs when a model does not learn well from the
training dataset or uses too few parameters. It leads to underfitting problems in
the model.
4. High-Bias,High-Variance:
With high bias and high variance, predictions are inconsistent and also inaccurate on
average.
o The training error is high, and the test error is close to the training error.
Bias-Variance Trade-Off
While building a machine learning model, it is really important to take care of bias and
variance in order to avoid overfitting and underfitting. If the model is very simple
with few parameters, it may have low variance and high bias, whereas if the model has a large
number of parameters, it will have high variance and low bias. So a balance between bias and
variance errors is required, and this balance between the bias error and the variance error is
known as the Bias-Variance trade-off.
For an accurate prediction of the model, algorithms need a low variance and low bias. But this is
not possible because bias and variance are related to each other:
The Bias-Variance trade-off is a central issue in supervised learning. Ideally, we need a model that
accurately captures the regularities in the training data and simultaneously generalizes well to
unseen data. Unfortunately, it is not possible to do both at once: a high variance
algorithm may perform well on training data, but it may overfit to noisy data,
whereas a high bias algorithm generates an overly simple model that may not even capture important
regularities in the data. So, we need to find a sweet spot between bias and variance to make an
optimal model.
Hence, the Bias-Variance trade-off is about finding the sweet spot that balances
bias and variance errors.
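The trade-off can be observed empirically by training two models on many different training sets and comparing their predictions at one fixed point. The sketch below is an illustration under assumptions that are not part of these notes: the true function is taken to be sin(2πx), the noise level, sample size, and seed are arbitrary, and a straight-line fit stands in for a simple (high bias) model while a 1-nearest-neighbour predictor stands in for a flexible (high variance) one.

```python
import math
import random


def true_function(x):
    # Assumed ground truth for this illustration only.
    return math.sin(2.0 * math.pi * x)


def sample_training_set(rng, n_points=20, noise_std=0.3):
    """Draw a fresh noisy training set from the assumed true function."""
    xs = [rng.random() for _ in range(n_points)]
    ys = [true_function(x) + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


def predict_line(xs, ys, x0):
    """Simple model: least-squares straight line evaluated at x0."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean + slope * (x0 - x_mean)


def predict_nearest_neighbour(xs, ys, x0):
    """Flexible model: return the target of the closest training point."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def bias_and_variance(predict, x0, n_trials=2000, seed=0):
    """Estimate squared bias and variance of a predictor at the point x0."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_trials):
        xs, ys = sample_training_set(rng)
        predictions.append(predict(xs, ys, x0))
    mean_prediction = sum(predictions) / n_trials
    bias_squared = (mean_prediction - true_function(x0)) ** 2
    variance = sum((p - mean_prediction) ** 2 for p in predictions) / n_trials
    return bias_squared, variance


x0 = 0.25
line_bias2, line_var = bias_and_variance(predict_line, x0)
knn_bias2, knn_var = bias_and_variance(predict_nearest_neighbour, x0)
print("line: bias^2 =", line_bias2, "variance =", line_var)
print("1-NN: bias^2 =", knn_bias2, "variance =", knn_var)
```

Under these assumptions the straight line should show the larger squared bias and the nearest-neighbour predictor the larger variance, which is the pattern the trade-off describes.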
Linear Regression | Logistic Regression
------------------|--------------------
Linear regression is used to predict a continuous dependent variable using a given set of independent variables. | Logistic regression is used to predict a categorical dependent variable using a given set of independent variables.
Linear regression is used for solving regression problems. | Logistic regression is used for solving classification problems.
In linear regression, we predict the value of continuous variables. | In logistic regression, we predict the values of categorical variables.
In linear regression, we find the best fit line, by which we can easily predict the output. | In logistic regression, we find the S-curve, by which we can classify the samples.
The least squares estimation method is used to estimate the model coefficients. | The maximum likelihood estimation method is used to estimate the model coefficients.
The output of linear regression must be a continuous value, such as price, age, etc. | The output of logistic regression must be a categorical value, such as 0 or 1, Yes or No, etc.
In linear regression, the relationship between the dependent variable and the independent variable must be linear. | In logistic regression, a linear relationship between the dependent and independent variables is not required.
In linear regression, there may be collinearity between the independent variables. | In logistic regression, there should not be collinearity between the independent variables.
Linear Regression and Logistic Regression are two well-known Machine Learning
algorithms that come under the supervised learning technique. Since both algorithms are
supervised in nature, they use labeled datasets to make predictions. The
main difference between them is how they are used: Linear Regression is used for
solving regression problems, whereas Logistic Regression is used for solving classification
problems. Both algorithms are described below, along with a table of differences.
Linear Regression:
o Linear Regression is one of the simplest Machine Learning algorithms that comes under the
Supervised Learning technique and is used for solving regression problems.
o It is used for predicting the continuous dependent variable with the help of independent
variables.
o The goal of the Linear regression is to find the best fit line that can accurately predict the
output for the continuous dependent variable.
o If a single independent variable is used for prediction, it is called Simple Linear
Regression, and if there is more than one independent variable, it is
called Multiple Linear Regression.
o By finding the best fit line, the algorithm establishes the relationship between the dependent
variable and the independent variable. This relationship should be linear.
o The output for Linear regression should only be the continuous values such as price, age,
salary, etc. The relationship between the dependent variable and independent variable can
be shown in below image:
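In the above image, the dependent variable (salary) is on the y-axis and the independent variable
(experience) is on the x-axis. The regression line can be written as:
y = a0 + a1x + ε
where a0 is the intercept, a1 is the slope of the line, and ε is the error term.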
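For a single independent variable, the coefficients of the best fit line can be computed directly with the least squares formulas, as in the sketch below. The experience and salary figures are hypothetical values invented for the example, and the function name `fit_simple_linear_regression` is likewise an assumption.

```python
def fit_simple_linear_regression(x, y):
    """Return (a0, a1) for the least-squares line y = a0 + a1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    a1 = sxy / sxx               # slope
    a0 = y_mean - a1 * x_mean    # intercept
    return a0, a1


# Hypothetical data: years of experience and salary (in thousands).
experience = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [32.0, 38.0, 41.0, 47.0, 52.0]

a0, a1 = fit_simple_linear_regression(experience, salary)
predicted_salary = a0 + a1 * 6.0  # prediction for 6 years of experience
print(a0, a1, predicted_salary)
```

The fitted a1 tells how much the predicted salary changes for each additional year of experience, and a0 is the predicted value when experience is zero.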
Logistic Regression:
o Logistic regression is one of the most popular Machine Learning algorithms that comes
under the Supervised Learning technique.
o It can be used for Classification as well as for Regression problems, but it is mainly used for
Classification problems.
o Logistic regression is used to predict the categorical dependent variable with the help of
independent variables.
o The output of logistic regression is a probability, so it always lies between 0 and 1.
o Logistic regression can be used where the probability of one of two classes is required,
such as whether it will rain today or not (0 or 1, true or false, etc.).
o Logistic regression is based on the concept of Maximum Likelihood estimation. According
to this estimation, the parameters are chosen so that the observed data is most probable.
o In logistic regression, we pass the weighted sum of inputs through an activation function
that maps values to between 0 and 1. This activation function is known as the sigmoid
function, and the curve obtained is called the sigmoid curve or S-curve. Consider the below
image:
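The mapping from a weighted sum to a value between 0 and 1 can be sketched as follows. The weights, bias, feature values, and the 0.5 decision threshold are hypothetical choices for illustration; in a real model the weights would be learned from data by maximum likelihood estimation.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    # Two branches avoid overflow in math.exp for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(weights, bias, features):
    """Pass the weighted sum of the inputs through the sigmoid."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


def predict_class(weights, bias, features, threshold=0.5):
    """Turn the probability into a 0/1 class label (threshold is an assumption)."""
    return 1 if predict_probability(weights, bias, features) >= threshold else 0


# Hypothetical, hand-picked parameters and one input sample.
weights = [0.8, -0.4]
bias = -0.2
sample = [2.0, 1.0]

p = predict_probability(weights, bias, sample)
label = predict_class(weights, bias, sample)
print(p, label)
```

Plotting `sigmoid(z)` over a range of z values gives the S-curve: it approaches 0 for large negative z, passes through 0.5 at z = 0, and approaches 1 for large positive z.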