Q1. Describe the Supervised Learning technique with an example.
.Supervised learning is a type of machine learning in which a
computer algorithm learns to make predictions or decisions
based on labelled data.
.Labelled data is made up of previously known input variables
(also known as features) and output variables (also known as
labels).
.By analysing patterns and relationships between input and
output variables in labelled data, the algorithm learns to make
predictions.
.Image and speech recognition, recommendation systems, and
fraud detection are all examples of how supervised learning is
used.
.The following example helps explain what supervised
learning is:
.Supervised learning is commonly used in email filtering to
classify incoming emails as spam or legitimate.
.A machine learning algorithm is trained using a labelled
dataset containing examples of both spam and legitimate
emails.
.The algorithm extracts relevant information from each
email, such as the sender’s information, the subject, the
message body, and so on, and learns which patterns
distinguish spam from legitimate emails.
.If a new email is predicted to be spam, it can be automatically
filtered into a spam folder, keeping the user’s inbox uncluttered.
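The spam-filter workflow above can be sketched in a few lines. This is only an illustration: the notes do not name a specific algorithm, so a small naive Bayes word-count classifier is assumed here, and the four training emails are made up.

```python
import math
from collections import Counter

# Hypothetical labelled dataset: (email text, label)
train_emails = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("project report attached", "legitimate"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in text.split():
            # Laplace smoothing so unseen words do not give log(0)
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(train_emails)
print(predict(model, "free money"))
print(predict(model, "monday meeting report"))
```

The labelled examples play the role of the training data, and `predict` is what would decide whether a new email goes to the spam folder.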
Q2. Write and explain the cost function of Logistic Regression.
Logistic Regression is one of the simplest classification
algorithms we meet while exploring machine learning
algorithms.
Cost function for Logistic Regression
If the squared-error cost of Linear Regression is reused with the
sigmoid hypothesis hθ(x), it results in a non-convex cost function.
So, for Logistic Regression we instead use the cost function
known as the cross entropy or the log loss:
Cost(hθ(x), y) = -log(hθ(x)) if y = 1
Cost(hθ(x), y) = -log(1 - hθ(x)) if y = 0
Case: If y = 0, that is, the true label of the class is 0.
Cost = 0 if the predicted value of the label is 0 as well. But as
hθ(x) deviates from 0 and approaches 1, the cost increases
without bound and tends to infinity, which can also be seen
from the graph of -log(1 - hθ(x)).
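The behaviour of the log loss can be checked numerically. The sketch below assumes the standard cross-entropy form for a single example; the function names and the clipping constant `eps` are illustrative choices, not part of the notes.

```python
import math

def log_loss_single(y, h, eps=1e-12):
    """Cross-entropy cost for one example with true label y and prediction h."""
    # Clip h away from 0 and 1 so that log() is always defined
    h = min(max(h, eps), 1 - eps)
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))

def cost(ys, hs):
    """Average log loss over a dataset."""
    return sum(log_loss_single(y, h) for y, h in zip(ys, hs)) / len(ys)

# Case y = 0: the cost grows as the prediction moves from 0 towards 1
for h in (0.01, 0.5, 0.9, 0.999):
    print(h, log_loss_single(0, h))
```

With y = 0 the printed cost is close to 0 for h = 0.01 and keeps growing as h approaches 1, matching the case described above.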
Q3. Explain Logistic Regression using the Sklearn technique.
.Logistic regression, despite its name, is a classification
algorithm rather than a regression algorithm.
.Based on a given set of independent variables, it is used to
estimate a discrete value (0 or 1, yes/no, true/false).
.It is also called the logit or MaxEnt classifier.
.Basically, it measures the relationship between the
categorical dependent variable and one or more independent
variables by estimating the probability of occurrence of an
event using the logistic function.
.sklearn.linear_model.LogisticRegression is the class used
to implement logistic regression.
Scikit-learn (Sklearn) is one of the most widely used libraries for
machine learning in Python.
.It provides a selection of efficient tools for machine learning
and statistical modelling including classification, regression,
clustering and dimensionality reduction via a consistent
interface in Python.
.This library, which is largely written in Python, is built
upon NumPy, SciPy and Matplotlib.
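A minimal usage sketch of the class follows. The one-feature dataset is made up for illustration, and the default settings of LogisticRegression are used; it assumes scikit-learn is installed.

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical data: one feature, label 0 for small values and 1 for large ones
X = [[1.0], [2.0], [3.0], [4.0], [6.0], [7.0], [8.0], [9.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

model = LogisticRegression()
model.fit(X, y)  # learn the coefficients from the labelled data

new_points = [[0.0], [10.0]]
labels = model.predict(new_points)               # predicted class per row
probabilities = model.predict_proba(new_points)  # one probability per class

print(labels)
print(probabilities)
```

`predict` returns the discrete class, while `predict_proba` exposes the estimated probabilities that the logistic function produces.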
Q4. Explain Reinforcement Learning with a suitable example.
.Reinforcement learning is an area of Machine Learning.
.It is about taking suitable actions to maximize reward in a
particular situation.
.It is employed by various software and machines to find the
best possible behaviour or path to take in a specific
situation.
.Reinforcement Learning (RL) is the science of decision
making. It is about learning the optimal behaviour in an
environment to obtain maximum reward.
Reinforcement learning uses algorithms that learn from
outcomes and decide which action to take next.
. After each action, the algorithm receives feedback that helps
it determine whether the choice it made was correct, neutral or
incorrect.
.It is a good technique to use for automated systems that have
to make a lot of small decisions without human guidance.
.Reinforcement learning is an autonomous, self-teaching
system that essentially learns by trial and error.
. It performs actions with the aim of maximizing rewards, or
in other words, it is learning by doing in order to achieve the
best outcomes.
Examples of reinforcement learning applications:
1. Automated Robots
2. Natural Language Processing
3. Marketing and Advertising
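The trial-and-error loop described above can be illustrated with a toy problem. The sketch below assumes tabular Q-learning, which the notes do not name, on a made-up five-cell corridor where the agent starts in cell 0 and is rewarded only for reaching cell 4. All names and parameter values are illustrative.

```python
import random

N_STATES = 5            # corridor cells 0..4 (hypothetical environment)
GOAL = N_STATES - 1     # reaching the last cell ends the episode
MOVES = (-1, +1)        # action 0 = step left, action 1 = step right

def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Explore at random sometimes (or on ties), otherwise exploit
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + MOVES[action], 0), GOAL)
            reward = 1.0 if next_state == GOAL else 0.0
            best_next = 0.0 if next_state == GOAL else max(q[next_state])
            # Feedback step: move the estimate towards reward + discounted future value
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q

q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:GOAL]]
print(policy)
```

The agent is never told that moving right is correct; it only receives a reward at the goal and gradually learns which action is better in each cell.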
Q5. Explain Linear Regression using suitable analysis.
Linear regression analysis is used to predict the value of a
variable based on the value of another variable.
. The variable you want to predict is called the dependent
variable.
.The variable you are using to predict the other variable's
value is called the independent variable.
.This form of analysis estimates the coefficients of the linear
equation, involving one or more independent variables that
best predict the value of the dependent variable.
.Linear regression fits a straight line or surface that minimizes
the discrepancies between predicted and actual output values.
.There are simple linear regression calculators that use a “least
squares” method to discover the best-fit line for a set of paired
data. You then estimate the value of Y (dependent variable)
from X (independent variable).
Simple Linear Regression
We could also describe this relationship with the equation for
a line, Y = a + bX, where 'a' is the Y-intercept and 'b' is the
slope of the line.
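The intercept 'a' and slope 'b' can be estimated with the least squares method. The sketch below uses made-up height and weight pairs, not data from the notes, so the fitted numbers are purely illustrative.

```python
# Hypothetical paired data: height in inches (X) and weight in pounds (Y)
heights = [60, 62, 64, 66, 68, 70, 72]
weights = [115, 120, 130, 136, 142, 150, 158]

def fit_line(xs, ys):
    """Least squares estimates of the intercept a and slope b in Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx             # slope
    a = mean_y - b * mean_x   # intercept
    return a, b

a, b = fit_line(heights, weights)
predicted_weight = a + b * 70  # prediction for a height of 70 inches
print(a, b, predicted_weight)
```

Once 'a' and 'b' are known, predicting Y for a new X is a single evaluation of the line equation.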
.We could use the equation to predict weight if we knew an
individual's height.
. In this example, if an individual was 70 inches tall, we
would predict his weight to be:
Multiple Linear Regression Analysis
Q6. Explain the variations of gradient descent.
.Gradient Descent is known as one of the most commonly used
optimization algorithms to train machine learning models by
means of minimizing errors between actual and expected
results.
.Further, gradient descent is also used to train Neural Networks.
Types of Gradient Descent:
.Based on how much of the training data is used to compute the
error for each update, the gradient descent learning algorithm
can be divided into
1. Batch gradient descent, 2. Stochastic gradient descent, and
3. Mini-batch gradient descent.
.Let's understand these different types of gradient descent:
1. Batch Gradient Descent:
Batch gradient descent (BGD) finds the error for each
point in the training set and updates the model only after
evaluating all training examples.
.One such pass over the whole training set is known as a training epoch.
.In simple words, it is a greedy approach where we have to sum
over all examples for each update.
2. Stochastic gradient descent
Stochastic gradient descent (SGD) is a type of gradient descent
that processes one training example per iteration.
In other words, within each training epoch it goes through the
examples in the dataset one at a time and updates the model's
parameters after each example.
As it requires only one training example at a time, it is
easier to fit in the allocated memory.
3. Mini-batch gradient descent:
Mini-batch gradient descent is a combination of both
batch gradient descent and stochastic gradient descent.
It divides the training dataset into small batches and then
performs an update on each batch separately.
.Splitting the training dataset into smaller batches strikes a
balance between the computational efficiency of batch
gradient descent and the speed of stochastic gradient descent.
.Hence, we obtain a type of gradient descent with higher
computational efficiency and less noisy updates than
stochastic gradient descent.
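The three variants differ only in how many examples feed each update, which the following sketch makes explicit with a single `batch_size` parameter. It assumes a one-feature linear model y = w*x + b with a squared-error loss and made-up noise-free data; the learning rate and epoch count are illustrative.

```python
import random

def gradient_descent(xs, ys, batch_size, lr=0.05, epochs=2000, seed=0):
    """Fit y = w*x + b; batch_size selects the variant of gradient descent."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average gradient of the squared error over the current batch
            grad_w = sum((w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum((w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Hypothetical data generated from y = 2x + 1
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [2 * x + 1 for x in xs]

results = {
    "batch": gradient_descent(xs, ys, batch_size=len(xs)),  # one update per epoch
    "stochastic": gradient_descent(xs, ys, batch_size=1),   # one update per example
    "mini-batch": gradient_descent(xs, ys, batch_size=4),   # one update per small batch
}
for name, (w, b) in results.items():
    print(name, w, b)
```

On this toy data all three variants end up near w = 2 and b = 1; they differ in how many updates they make per epoch and how noisy each update is.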
Q7. Explain the concepts of training data, validation data and
testing data with an example of each.
While all three are typically split from one large dataset, each
one typically has its own distinct use in ML modelling.
. Let’s start with a high-level definition of each term:
Training data. This type of data builds up the machine
learning algorithm.
The data scientist feeds the algorithm input data, which
corresponds to an expected output.
The model evaluates the data repeatedly to learn more about
the data’s behaviour and then adjusts itself to serve its
intended purpose.
Validation data. During training, validation data exposes the
model to new data that it hasn’t evaluated before.
.Validation data provides the first test against unseen data,
allowing data scientists to evaluate how well the model makes
predictions based on the new data.
.Not all data scientists use validation data, but it can provide
some helpful information to optimize hyperparameters, which
influence how the model assesses data.
Test data. After the model is built, testing data once again
checks that it can make accurate predictions.
.Whereas training and validation data include labels that the
model uses or that guide its tuning, the labels of the testing
data are withheld from the model and used only to score its
predictions.
. Test data provides a final, real-world check of an unseen
dataset to confirm that the ML algorithm was trained
effectively.
.While each of the three datasets has its place in creating and
training ML models, it’s easy to see some overlap between
them.
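Splitting one large dataset into the three parts can be sketched as follows. The 70/15/15 proportions, the function name and the stand-in dataset are assumptions made for illustration; the notes do not prescribe specific ratios.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the rows and cut them into training, validation and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # shuffle so each part is a random sample
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]      # whatever remains is held out for testing
    return train, val, test

data = list(range(100))  # stand-in for 100 labelled examples
train_set, val_set, test_set = split_dataset(data)
print(len(train_set), len(val_set), len(test_set))
```

The model is fitted on `train_set`, tuned against `val_set`, and scored once on `test_set`, and no example appears in more than one part.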
Q8. Explain Logistic Regression with an example.
It is a predictive algorithm that uses independent variables to
predict the dependent variable, just like Linear Regression, but
with the difference that the dependent variable must be a
categorical variable.
.Independent variables can be numeric or categorical variables,
but the dependent variable will always be categorical.
.Logistic regression is a statistical model that uses the logistic function
to model the conditional probability P(Y=1|X) or P(Y=0|X).
.This is read as the conditional probability of Y=1, given X or
conditional probability of Y=0, given X.
An example of logistic regression is predicting whether a person
will default on their credit card payment or not.
.The probability of a person defaulting on their credit card
payment can be based on the pending credit card balance,
income, etc.
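The credit card example can be made concrete with the logistic function. In the sketch below the coefficients and the bias are invented for illustration; in practice they would be learned from labelled data.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def default_probability(balance, income,
                        w_balance=0.002, w_income=-0.00005, bias=-3.0):
    """P(Y=1 | X): probability of default, using hypothetical coefficients."""
    z = bias + w_balance * balance + w_income * income
    return sigmoid(z)

p_low = default_probability(balance=500, income=60000)
p_high = default_probability(balance=3000, income=20000)
print(p_low, p_high)
```

A higher pending balance and a lower income push z upwards and therefore raise the predicted probability of default; P(Y=0|X) is simply one minus this value.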