ML Material
In the real world, we are surrounded by humans who can learn from their experiences,
and we have computers or machines that work on our instructions. But can a machine also
learn from experience or past data as a human does? This is where Machine Learning
comes in.
Machine learning algorithms build a mathematical model from sample historical data, known
as training data, and use it to make predictions or decisions without being explicitly
programmed. To develop predictive models, machine learning brings together statistics and
computer science. Machine learning constructs or uses algorithms that learn from
historical data. In general, performance improves as we provide more relevant data.
A machine can learn if it can improve its performance by gaining more data.
o Geoffrey Hinton and his group presented the idea of deep learning using deep
belief networks.
o The Elastic Compute Cloud (EC2) was launched by Amazon to provide scalable
computing resources that made it easier to create and implement machine
learning models.
2007:
o The goal of explainable AI, which focuses on making machine learning models
easier to understand, received some attention.
o Google's DeepMind created AlphaGo Zero, which achieved superhuman Go-playing
ability without human data, using only reinforcement learning.
2017:
Present-day AI models can be used for making various predictions, including weather
prediction, disease prediction, stock market analysis, and so on.
It is based on the Facebook project named "DeepFace," which is responsible for face
recognition and person identification in pictures.
2. Speech Recognition
While using Google, we get the option to "Search by voice"; this comes under speech
recognition, a popular application of machine learning.
Speech recognition is the process of converting voice instructions into text, and it is also
known as "speech to text" or "computer speech recognition." At present, machine
learning algorithms are widely used in speech recognition applications. Google
Assistant, Siri, Cortana, and Alexa use speech recognition technology to follow
voice instructions.
3. Traffic prediction:
If we want to visit a new place, we use Google Maps, which shows us the shortest route
and predicts the traffic conditions.
It predicts whether traffic is clear, slow-moving, or heavily congested in two ways:
o Real-time location of vehicles from the Google Maps app and sensors
o Average time taken on past days at the same time of day
Everyone who uses Google Maps helps to make the app better. It takes information from
the user and sends it back to its database to improve performance.
4. Product recommendations:
Machine learning is widely used by e-commerce and entertainment companies such as
Amazon and Netflix for recommending products to the user. Whenever we search for a
product on Amazon, we start seeing advertisements for the same product while browsing
the internet in the same browser, and this is because of machine learning.
Google infers the user's interests using various machine learning algorithms and
suggests products accordingly.
Similarly, when we use Netflix, we find recommendations for series, movies, etc., and
this is also done with the help of machine learning.
5. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars, in which
machine learning plays a significant role. Tesla, a well-known car manufacturer, is
working on self-driving cars. It uses machine learning to train its car models to detect
people and objects while driving.
6. Email Spam and Malware Filtering:
Whenever we receive a new email, it is automatically filtered as important, normal, or
spam. Important mail arrives in our inbox marked with the important symbol, and spam
goes to our spam box; the technology behind this is machine learning. Below are some
spam filters used by Gmail:
o Content filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
Some machine learning algorithms, such as Multi-Layer Perceptron, Decision Tree,
and Naïve Bayes classifier, are used for email spam filtering and malware detection.
These assistants record our voice instructions, send them to a server in the cloud,
decode them using ML algorithms, and act accordingly.
For each genuine transaction, the output is converted into hash values, and these
values become the input for the next round. Each genuine transaction follows a specific
pattern, which changes for a fraudulent transaction; the system detects this change,
which makes our online transactions more secure.
Example: In a driverless car, the training data fed to the algorithm covers how
to drive on a highway and on busy and narrow streets, with factors like the
speed limit, parking, stopping at signals, etc. A logical and mathematical
model is then created on the basis of that data, and the car works according
to this model. Also, the more data is fed, the more efficient the output
produced.
Designing a Learning System in Machine Learning:
According to Tom Mitchell, “A computer program is said to learn from
experience E with respect to some class of tasks T and performance measure P,
if its performance at tasks in T, as measured by P, improves with
experience E.”
Example: In spam e-mail detection,
• Task, T: To classify mails into Spam or Not Spam.
• Performance measure, P: Percentage of mails correctly classified as
“Spam” or “Not Spam”.
• Experience, E: A set of mails labelled “Spam” or “Not Spam”.
The steps for designing a learning system are:
Step 1- Choosing the Training Experience: The first and most important
task is to choose the training data or training experience that will be fed to
the machine learning algorithm. The data or experience that we feed to the
algorithm has a significant impact on the success or failure of the model, so
it should be chosen wisely.
Below are the attributes of the training experience that affect the success or
failure of the model:
• Whether the training experience provides direct or indirect feedback
regarding choices. For example, while playing chess, the training data
provides feedback such as: if this move is chosen instead of that one, the
chances of success increase.
• The second important attribute is the degree to which the learner controls
the sequence of training examples. For example, when training data is first
fed to the machine, its accuracy is very low, but as it gains experience by
playing again and again against itself or an opponent, the algorithm gets
feedback and controls the chess game accordingly.
• The third important attribute is how well the training experience
represents the distribution of examples over which performance will be
measured. For example, a machine learning algorithm gains experience by
going through many different cases and examples, and the more examples it
passes through, the more its performance improves.
Step 2- Choosing the Target Function: The next important step is choosing
the target function. Based on the knowledge fed to the algorithm, the system
chooses a NextMove function that describes which legal move should be
taken. For example, while playing chess, after the opponent plays, the
machine learning algorithm determines the possible legal moves it can take
in order to succeed.
Step 3- Choosing a Representation for the Target Function: Once the
algorithm knows all the possible legal moves, the next step is to choose the
optimized move using some representation, e.g. linear equations, a
hierarchical graph representation, or tabular form. The NextMove function
then selects, out of these moves, the target move that provides the highest
success rate. For example, if the machine has 4 possible moves while playing
chess, it chooses the optimized move that gives it the best chance of
success.
Step 4- Choosing a Function Approximation Algorithm: An optimized move
cannot be chosen from the training data alone. The algorithm has to go
through a set of examples; from these examples it approximates which steps
should be chosen, and it then receives feedback on them. For example, when
chess training data is fed to the algorithm, it is not known in advance
whether the algorithm will fail or succeed; from each failure or success it
estimates which step to choose for the next move and what its success rate
is.
Step 5- Final Design: The final design is created at the end, after the system
has gone through a number of examples, failures and successes, and correct
and incorrect decisions, and has learned what the next step should be.
Example: Deep Blue, an intelligent chess computer, won a chess match against
the world champion Garry Kasparov, becoming the first computer to beat a
reigning world chess champion in a match under standard time controls.
Let's say we have a complex problem in which we need to make predictions. Instead of
writing code, we just need to feed the data to generic algorithms, which build the logic
based on the data and predict the output. Machine learning has changed the way we think
about such problems. The Machine Learning algorithm's operation is depicted in the
following block diagram:
Features of Machine Learning:
o Machine learning uses data to detect various patterns in a given dataset.
o It can learn from past data and improve automatically.
o It is a data-driven technology.
o Machine learning is quite similar to data mining, as it also deals with huge amounts
of data.
We can train machine learning algorithms by providing them with a large amount of data
and letting them automatically explore the data, build models, and predict the required
output. The performance of a machine learning algorithm depends on the amount of data,
and it can be measured with a cost function. We can save both time and money by using
machine learning.
Following are some key points which show the importance of Machine Learning:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised Learning
In supervised learning, sample labelled data are provided to the machine learning system
for training, and the system then predicts the output based on the training data.
The system uses labelled data to build a model that understands the dataset and learns about each
data point. After training and processing, we test the model with sample data to check whether it
predicts the output accurately.
The objective of supervised learning is to map input data to output data. Supervised learning is
based on supervision, and it is the same as when a student learns under the supervision of a
teacher. Spam filtering is an example of supervised learning.
The working of Supervised learning can be easily understood by the below example and
diagram:
Suppose we have a dataset of different types of shapes which includes square, rectangle,
triangle, and Polygon. Now the first step is that we need to train the model for each shape.
o If the given shape has four sides, and all the sides are equal, then it will be
labelled as a Square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides then it will be labelled as hexagon.
Now, after training, we test our model using the test set, and the task of the model is to
identify the shape.
The machine is already trained on all types of shapes, and when it finds a new shape, it
classifies the shape on the basis of its number of sides and predicts the output.
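The shape example above can be sketched in code. This is a minimal illustration, assuming each shape is described by two made-up features (number of sides, and whether all sides are equal) and using scikit-learn's DecisionTreeClassifier as one possible supervised model; the training rows are illustrative, not data from this material.

```python
from sklearn.tree import DecisionTreeClassifier

# Each example: [number_of_sides, all_sides_equal (1 = yes, 0 = no)].
# These rows are made-up illustrative data.
X_train = [
    [4, 1], [4, 1],   # squares
    [4, 0], [4, 0],   # rectangles
    [3, 0], [3, 1],   # triangles
    [6, 1], [6, 1],   # hexagons
]
y_train = ["square", "square", "rectangle", "rectangle",
           "triangle", "triangle", "hexagon", "hexagon"]

# Training: the model learns the mapping from features to labels.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Testing: the model identifies shapes from their features.
X_test = [[3, 0], [4, 1], [6, 1], [4, 0]]
predictions = list(model.predict(X_test))
print(predictions)
```

The labels in y_train play the role of the "supervisor": without them, the model would have no target to map the inputs to.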
1. Regression
Regression algorithms are used if there is a relationship between the input variable and
the output variable. It is used for the prediction of continuous variables, such as Weather
forecasting, Market Trends, etc. Below are some popular Regression algorithms which
come under supervised learning:
o Linear Regression
o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression
2. Classification
Classification algorithms are used when the output variable is categorical, which means
it takes one of a limited set of classes, such as Yes-No, Male-Female, True-False, etc.
Spam filtering is an example. Below are some popular classification algorithms:
o Random Forest
o Decision Trees
o Logistic Regression
o Support vector Machines
2) Unsupervised Learning
As the name suggests, unsupervised learning is a machine learning technique in which
models are not supervised using a labelled training dataset. Instead, the models
themselves find hidden patterns and insights in the given data. It can be compared to the
learning that takes place in the human brain while learning new things. It can be defined as:
o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is similar to how a human learns to think from their own
experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes
it more important.
o In the real world, we do not always have input data with the corresponding output,
so to solve such cases, we need unsupervised learning.
Here, we have taken unlabeled input data, which means it is not categorized and the
corresponding outputs are not given. This unlabeled input data is fed to the machine
learning model in order to train it. First, the model interprets the raw data to find
hidden patterns, and then a suitable algorithm is applied, such as k-means clustering or
hierarchical clustering.
Once applied, the algorithm divides the data objects into groups according to the
similarities and differences between the objects.
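As a minimal sketch of this grouping step, the snippet below runs k-means clustering on a handful of unlabeled 2-D points. The points and the choice of two clusters are assumptions made for illustration; no labels are given to the algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled input data: six made-up points forming two visible groups.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]])

# Ask k-means to divide the points into 2 groups by similarity (distance).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)                   # cluster index assigned to each point
print(kmeans.cluster_centers_)  # centre of each discovered group
```

The cluster indices themselves are arbitrary (which group is called 0 or 1 may vary); what matters is which points end up grouped together.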
o K-means clustering
o KNN (k-nearest neighbors)
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular value decomposition
Supervised Learning | Unsupervised Learning
Supervised learning algorithms are trained using labeled data. | Unsupervised learning algorithms are trained using unlabeled data.
Supervised learning model takes direct feedback to check if it is predicting correct output or not. | Unsupervised learning model does not take any feedback.
Supervised learning model predicts the output. | Unsupervised learning model finds the hidden patterns in data.
In supervised learning, input data is provided to the model along with the output. | In unsupervised learning, only input data is provided to the model.
The goal of supervised learning is to train the model so that it can predict the output when it is given new data. | The goal of unsupervised learning is to find the hidden patterns and useful insights from the unknown dataset.
Supervised learning needs supervision to train the model. | Unsupervised learning does not need any supervision to train the model.
Supervised learning can be used for those cases where we know the input as well as corresponding outputs. | Unsupervised learning can be used for those cases where we have only input data and no corresponding output data.
Supervised learning is not close to true Artificial Intelligence, as in this we first train the model for each data, and then only it can predict the correct output. | Unsupervised learning is closer to true Artificial Intelligence, as it learns similarly to how a child learns daily routine things from experience.
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method, in which a learning agent
gets a reward for each right action and a penalty for each wrong action. The agent
learns automatically from this feedback and improves its performance. In reinforcement
learning, the agent interacts with the environment and explores it. The goal of the agent
is to collect the most reward points, and hence it improves its performance.
A robotic dog, which automatically learns the movement of its limbs, is an example of
reinforcement learning.
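The reward-and-penalty loop described above can be sketched with tabular Q-learning, one common reinforcement learning algorithm. The corridor environment, the reward values, and the constants below are hypothetical choices for illustration only: the agent starts at position 0, is rewarded for reaching position 4, and is penalized for bumping into the wall on the left.

```python
import random

N_STATES = 5                    # positions 0..4 on a line; position 4 is the goal
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9         # learning rate and discount factor (assumed values)

def step(state, action):
    """Return (next_state, reward, done) for the toy corridor environment."""
    if action == "right":
        next_state = state + 1
        if next_state == N_STATES - 1:
            return next_state, 1.0, True     # reward for reaching the goal
        return next_state, 0.0, False
    if state == 0:
        return state, -1.0, False            # penalty for bumping into the wall
    return state - 1, 0.0, False

rng = random.Random(0)
q = {(s, a): 0.0 for s in range(N_STATES - 1) for a in ACTIONS}

for _ in range(500):                         # episodes
    state, done = 0, False
    while not done:
        action = rng.choice(ACTIONS)         # explore the environment randomly
        next_state, reward, done = step(state, action)
        best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
        # Move the estimate towards reward + discounted best future value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# The learned policy picks, in each position, the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Here the agent explores with purely random actions and learns only from the feedback it receives; practical agents usually mix exploration with exploiting what they have already learned.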
We can understand the concept of regression analysis using the below example:
Now, the company wants to spend $200 on advertisement in the year 2019 and wants to
predict the sales for that year. To solve such prediction problems in machine learning,
we need regression analysis.
Regression is a supervised learning technique which helps in finding the correlation
between variables and enables us to predict a continuous output variable based on
one or more predictor variables. It is mainly used for prediction, forecasting, time series
modeling, and determining the cause-effect relationship between variables.
In regression, we fit a line or curve to the given datapoints on a graph of the variables;
using this fit, the machine learning model can make predictions about the data. In simple
words, "Regression shows a line or curve that passes through the datapoints on the
target-predictor graph in such a way that the vertical distance between the
datapoints and the regression line is minimum." The distance between the datapoints and
the line tells whether the model has captured a strong relationship or not.
Types of Regression
There are various types of regression which are used in data science and machine
learning. Each type has its own importance in different scenarios, but at the core, all
regression methods analyze the effect of the independent variables on the dependent
variable. Some important types of regression are given below:
o Linear Regression
o Logistic Regression
o Polynomial Regression
o Support Vector Regression
o Decision Tree Regression
o Random Forest Regression
o Ridge Regression
o Lasso Regression
Linear Regression:
o Linear regression is a statistical regression method which is used for predictive
analysis.
o It is one of the simplest regression algorithms and shows the relationship between
continuous variables.
o It is used for solving the regression problem in machine learning.
o Linear regression shows the linear relationship between the independent variable
(X-axis) and the dependent variable (Y-axis), hence called linear regression.
o If there is only one input variable (x), then such linear regression is called simple linear
regression. And if there is more than one input variable, then such linear regression
is called multiple linear regression.
o The relationship between variables in the linear regression model can be explained
using the below image. Here we are predicting the salary of an employee on the basis
of years of experience.
o Below is the mathematical equation for linear regression:
Y = aX + b
Here, Y = dependent variable (target variable),
X = independent variable (predictor variable),
a and b are the linear coefficients (a is the slope and b is the intercept).
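The equation Y = aX + b can be illustrated with a short sketch. The experience and salary numbers below are hypothetical and chosen to lie exactly on a line, so that the fitted coefficients are easy to check; scikit-learn's LinearRegression is used as one possible implementation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: years of experience (X) and salary in thousands (Y).
# The values follow Y = 5 * X + 30 exactly.
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([35, 40, 45, 50, 55])

model = LinearRegression().fit(X, y)

a = model.coef_[0]      # slope of the fitted line
b = model.intercept_    # intercept of the fitted line
predicted = model.predict(np.array([[6]]))[0]  # salary for 6 years of experience

print(a, b, predicted)
```

With real data the points would not lie exactly on a line, and the fitted a and b would be the values that minimize the squared vertical distances to the datapoints.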
Unlike regression, the output variable of classification is a category, not a value, such as
"Green or Blue", "fruit or animal", etc. Since classification is a supervised learning
technique, it takes labelled input data, which means it contains input with the
corresponding output.
The main goal of a classification algorithm is to identify the category of a given data
point, and these algorithms are mainly used to predict the output for categorical data.
Classification algorithms can be better understood using the below diagram. In the below
diagram, there are two classes, Class A and Class B. Members of each class have features
that are similar to each other and dissimilar to those of the other class.
1. Lazy Learners: A lazy learner first stores the training dataset and waits until it receives
the test dataset. Classification is then done on the basis of the most related data stored
in the training dataset. It takes less time in training but more time for predictions.
Example: K-NN algorithm, case-based reasoning
2. Eager Learners: Eager learners develop a classification model based on a training
dataset before receiving a test dataset. In contrast to lazy learners, an eager learner takes
more time in learning and less time in prediction. Example: Decision Trees, Naïve
Bayes, ANN.
o Linear Models
   o Logistic Regression
   o Support Vector Machines
o Non-linear Models
   o K-Nearest Neighbours
   o Kernel SVM
   o Naïve Bayes
   o Decision Tree Classification
   o Random Forest Classification
2. Confusion Matrix:
3. AUC-ROC curve:
o ROC curve stands for Receiver Operating Characteristic curve and AUC
stands for Area Under the Curve.
o It is a graph that shows the performance of the classification model at different
thresholds.
o To visualize the performance of the multi-class classification model, we use the
AUC-ROC curve.
o The ROC curve is plotted with TPR and FPR, where TPR (True Positive Rate) is
on the Y-axis and FPR (False Positive Rate) is on the X-axis.
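As a small sketch of how these quantities can be computed, the snippet below uses scikit-learn's roc_curve and roc_auc_score on four made-up examples. The labels and scores are illustrative assumptions, not results from this material.

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Made-up true labels and predicted scores (e.g. probabilities of class 1).
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# FPR (x-axis) and TPR (y-axis) at each decision threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Area under the ROC curve: 1.0 is a perfect ranking, 0.5 is chance level.
auc = roc_auc_score(y_true, y_score)

print(fpr, tpr)
print(auc)
```

Here the AUC is 0.75 because three of the four (positive, negative) pairs are ranked correctly: only the positive example scored 0.35 falls below the negative example scored 0.4.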
o But we need a range between -∞ and +∞, so we take the logarithm of the
equation, and it becomes:
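For reference, this step is usually written as follows. This is a sketch of the standard log-odds derivation; the symbols b_0, b_1, ..., b_n (coefficients) and x_1, ..., x_n (input variables) are assumed names, since the original equation is not shown here.

```latex
% y is the predicted probability, so 0 <= y <= 1.
% The odds y / (1 - y) range from 0 to +infinity:
\frac{y}{1-y} \in (0, +\infty)

% Taking the logarithm gives a quantity ranging over (-infinity, +infinity),
% which can be modelled by a linear expression:
\log\left[\frac{y}{1-y}\right] = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n

% Solving for y gives the sigmoid (logistic) function:
y = \frac{1}{1 + e^{-(b_0 + b_1 x_1 + \dots + b_n x_n)}}
```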
o Binomial: In binomial Logistic regression, there can be only two possible types
of the dependent variables, such as 0 or 1, Pass or Fail, etc.
o Multinomial: In multinomial Logistic regression, there can be 3 or more
possible unordered types of the dependent variable, such as "cat", "dogs", or
"sheep"
o Ordinal: In ordinal Logistic regression, there can be 3 or more possible ordered
types of dependent variables, such as "low", "Medium", or "High".
Example: A dataset is given which contains information about various users obtained
from social networking sites. A car-making company has recently launched a new SUV,
and it wants to check how many users from the dataset want to purchase the car.
For this problem, we will build a Machine Learning model using the Logistic regression
algorithm. The dataset is shown in the below image. In this problem, we will predict
the purchased variable (Dependent Variable) by using age and salary (Independent
variables).
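Steps in Logistic Regression: To implement logistic regression using Python,
below are the steps:
# Feature scaling: standardize the features to zero mean and unit variance.
# x_train and x_test are assumed to hold the age and salary columns of the
# training and test sets.
from sklearn.preprocessing import StandardScaler
st_x = StandardScaler()
# Fit the scaler on the training set only, then reuse it for the test set.
x_train = st_x.fit_transform(x_train)
x_test = st_x.transform(x_test)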
The scaled output is given below:
2. Fitting Logistic Regression to the Training set:
We have prepared our dataset, and now we will train the model using the training
set. To fit the model to the training set, we will import the LogisticRegression
class of the sklearn library.
After importing the class, we will create a classifier object and use it to fit the logistic
regression model to the training set. Below is the code for it:
Our model is now trained on the training set, so we will predict the result using the
test set data. Below is the code for it:
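The fitting and prediction steps described above can be sketched as follows. The original dataset is not reproduced in this material, so the snippet generates a synthetic stand-in with age and estimated salary columns and a hypothetical labelling rule; only the LogisticRegression, StandardScaler, and train_test_split calls reflect the workflow described in the text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the user dataset: (age, estimated salary) -> purchased.
rng = np.random.default_rng(0)
age = rng.integers(18, 60, size=400)
salary = rng.integers(15000, 150000, size=400)
x = np.column_stack([age, salary]).astype(float)
# Hypothetical labelling rule: older, better-paid users buy the SUV.
y = ((age - 18) / 42 + (salary - 15000) / 135000 > 1.0).astype(int)

# Hold out 25% of the users as the test set.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0)

# Feature scaling, fitted on the training set only.
st_x = StandardScaler()
x_train = st_x.fit_transform(x_train)
x_test = st_x.transform(x_test)

# Fit the logistic regression classifier to the training set.
classifier = LogisticRegression(random_state=0)
classifier.fit(x_train, y_train)

# Predict the purchased variable for the test set.
y_pred = classifier.predict(x_test)
print(y_pred[:10])
```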
Output: By executing the above code, a new vector (y_pred) will be created under the
variable explorer option. It can be seen as:
The above output image shows the corresponding predicted users who want to
purchase or not purchase the car.
4. Test Accuracy of the result
Now we will create the confusion matrix to check the accuracy of the classification.
To create it, we need to import the confusion_matrix function of the sklearn library. After
importing the function, we will call it and store the result in a new variable cm. The
function takes two main parameters: y_true (the actual values) and y_pred (the predicted
values returned by the classifier). Below is the code for it:
By executing the above code, a new confusion matrix will be created. Consider the below
image:
We can find the accuracy of the predicted result by interpreting the confusion matrix. From
the above output, we can see that 65+24 = 89 predictions are correct and 8+3 = 11 are
incorrect.
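A minimal sketch of this step is shown below. Since the original predictions are not available here, the label vectors are rebuilt by hand so that their counts match the numbers quoted in the text (65 and 24 correct, 8 and 3 incorrect); which off-diagonal cell holds the 3 and which holds the 8 is an assumption.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hand-built labels matching the counts in the text (assumed cell layout):
# 68 actual non-buyers (65 predicted 0, 3 predicted 1),
# 32 actual buyers (8 predicted 0, 24 predicted 1).
y_test = np.array([0] * 68 + [1] * 32)
y_pred = np.array([0] * 65 + [1] * 3 + [0] * 8 + [1] * 24)

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)

print(cm)
print(accuracy)
```

The diagonal of cm holds the correct predictions, so the accuracy is (65 + 24) / 100 = 0.89.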
Finally, we will visualize the training set result. To visualize the result, we will
use ListedColormap class of matplotlib library. Below is the code for it:
Output: By executing the above code, we will get the below output:
o In the above graph, we can see that there are some green points within the
green region and purple points within the purple region.
o All these data points are the observation points from the training set, which
show the result for the purchased variable.
o This graph is made using two independent variables, i.e., Age on the
x-axis and Estimated salary on the y-axis.
o The purple point observations are those for which purchased (dependent
variable) is 0, i.e., users who did not purchase the SUV car.
o The green point observations are those for which purchased (dependent
variable) is 1, i.e., users who purchased the SUV car.
o We can also estimate from the graph that users who are younger with a low
salary did not purchase the car, whereas older users with a high estimated salary
purchased the car.
o But there are some purple points in the green region (buying the car) and some
green points in the purple region (not buying the car). So we can say that some
younger users with a high estimated salary purchased the car, whereas some
older users with a low estimated salary did not purchase the car.
The goal of the classifier:
We have successfully visualized the training set result for the logistic regression, and our
goal for this classification is to divide the users who purchased the SUV car and who did
not purchase the car. So from the output graph, we can clearly see the two regions (Purple
and Green) with the observation points. The Purple region is for those users who didn't
buy the car, and Green Region is for those users who purchased the car.
Linear Classifier:
As we can see from the graph, the classifier is a straight line, i.e. linear in nature, as we
have used a linear model for logistic regression. In further topics, we will learn about
non-linear classifiers.
Visualizing the test set result:
Our model is well trained using the training dataset. Now, we will visualize the result for
new observations (test set). The code for the test set remains the same as above, except
that here we use x_test and y_test instead of x_train and y_train. Below is the code
for it:
The above graph shows the test set result. As we can see, the graph is divided into two
regions (Purple and Green). And Green observations are in the green region, and Purple
observations are in the purple region. So we can say it is a good prediction and model.
Some of the green and purple data points are in different regions, which can be ignored
as we have already calculated this error using the confusion matrix (11 Incorrect output).
Hence our model is pretty good and ready to make new predictions for this classification
problem.
Naïve Bayes Classifier Algorithm
o Naïve Bayes algorithm is a supervised learning algorithm, which is based on Bayes'
theorem and used for solving classification problems.
o It is mainly used in text classification, which involves a high-dimensional training dataset.
o Naïve Bayes Classifier is one of the simplest and most effective classification algorithms,
and it helps in building fast machine learning models that can make quick
predictions.
o It is a probabilistic classifier, which means it predicts on the basis of the
probability of an object.
o Some popular examples of the Naïve Bayes algorithm are spam filtering, sentiment
analysis, and classifying articles.
o Naïve: It is called Naïve because it assumes that the occurrence of a certain feature
is independent of the occurrence of other features. For example, if a fruit is identified on
the basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized
as an apple. Each feature individually contributes to identifying it as an apple,
without depending on the others.
o Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.
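The text-classification use mentioned above (for example, spam filtering) can be sketched with scikit-learn's MultinomialNB. The four training messages and their labels are made up for illustration; CountVectorizer turns each message into word counts, which the classifier treats as independent features.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training messages with labels.
messages = ["win free money now", "claim your free prize now",
            "meeting schedule for tomorrow", "lunch with the team tomorrow"]
labels = ["spam", "spam", "ham", "ham"]

# Convert each message into a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)

# Train the Naive Bayes classifier on the word counts.
model = MultinomialNB()
model.fit(X, labels)

# Classify new messages using the same vocabulary.
new_messages = ["free money prize", "team meeting tomorrow"]
predictions = list(model.predict(vectorizer.transform(new_messages)))
print(predictions)
```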
Bayes' Theorem:
o Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used to
determine the probability of a hypothesis with prior knowledge. It depends on the
conditional probability.
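o The formula for Bayes' theorem is given as:
P(A|B) = P(B|A)*P(A)/P(B)
Where,
P(A|B) is Posterior probability: Probability of hypothesis A given the observed evidence B.
P(B|A) is Likelihood probability: Probability of the evidence given that the
hypothesis is true.
P(A) is Prior probability: Probability of the hypothesis before observing the evidence.
P(B) is Marginal probability: Probability of the evidence.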
Problem: If the weather is sunny, should the player play or not?
     Outlook    Play
0    Rainy      Yes
1    Sunny      Yes
2    Overcast   Yes
3    Overcast   Yes
4    Sunny      No
5    Rainy      Yes
6    Sunny      Yes
7    Overcast   Yes
8    Rainy      No
9    Sunny      No
10   Sunny      Yes
11   Rainy      No
12   Overcast   Yes
13   Overcast   Yes
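Frequency table for the weather conditions:
Weather    Yes   No
Overcast   5     0
Rainy      2     2
Sunny      3     2
Total      10    4
Likelihood table for the weather conditions:
Weather    No           Yes
Overcast   0            5             5/14=0.36
Rainy      2            2             4/14=0.29
Sunny      2            3             5/14=0.36
All        4/14=0.29    10/14=0.71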
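Applying Bayes' theorem:
P(Yes|Sunny)= P(Sunny|Yes)*P(Yes)/P(Sunny)
P(Sunny|Yes)= 3/10= 0.3
P(Sunny)= 5/14= 0.36
P(Yes)= 10/14= 0.71
So P(Yes|Sunny)= (3/10)*(10/14)/(5/14)= 3/5= 0.60
P(No|Sunny)= P(Sunny|No)*P(No)/P(Sunny)
P(Sunny|No)= 2/4= 0.5
P(No)= 4/14= 0.29
P(Sunny)= 5/14= 0.36
So P(No|Sunny)= (2/4)*(4/14)/(5/14)= 2/5= 0.40
Since P(Yes|Sunny) > P(No|Sunny), the player can play on a sunny day.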
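The application of Bayes' theorem to this problem can be checked with a short script that counts directly from the 14-row Outlook/Play dataset given above. The helper name posterior is a hypothetical choice for this sketch; working from raw counts avoids the rounding of the intermediate probabilities.

```python
from collections import Counter

# The 14 observations from the dataset, in row order (0 to 13).
outlook = ["Rainy", "Sunny", "Overcast", "Overcast", "Sunny", "Rainy", "Sunny",
           "Overcast", "Rainy", "Sunny", "Sunny", "Rainy", "Overcast", "Overcast"]
play = ["Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes",
        "Yes", "No", "No", "Yes", "No", "Yes", "Yes"]

n = len(play)
pair_counts = Counter(zip(outlook, play))   # e.g. ("Sunny", "Yes") -> 3
outlook_counts = Counter(outlook)           # e.g. "Sunny" -> 5
play_counts = Counter(play)                 # e.g. "Yes" -> 10

def posterior(label, weather):
    """P(label | weather) = P(weather | label) * P(label) / P(weather)."""
    likelihood = pair_counts[(weather, label)] / play_counts[label]
    prior = play_counts[label] / n
    evidence = outlook_counts[weather] / n
    return likelihood * prior / evidence

p_yes = posterior("Yes", "Sunny")
p_no = posterior("No", "Sunny")
print(round(p_yes, 2), round(p_no, 2))
```

The two posteriors sum to 1, and the larger one gives the predicted class for a sunny day.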