
UNIT – 4 CLASSIFICATION AND PREDICTION

CLASSIFICATION

• A classification technique is a systematic approach to building classification models from an input data set; the model is then used to assign a class label to new records.
• The input data set is called the training data set.
• In classification, we assign new data to a class with the help of current or past data.

There are two stages of classification

1. Model Construction
2. Model Usage.

Model Construction: A classification model is built from the training data set, in which every record already carries a known class label.

Model Usage / Testing: The model's accuracy is estimated on an independent test set; if the accuracy is acceptable, the model is used to assign class labels to new, unseen data.


APPLICATIONS OF CLASSIFICATION:
1. Sentiment Analysis: Sentiment analysis is highly helpful in social media monitoring. We can use it
to extract social media insights.
2. Document Classification: We can use document classification to organize the documents into
sections according to the content. Document classification refers to text classification; we can
classify the words in the entire document.
3. Image Classification: Image classification assigns an image to one of a set of trained categories.
4. Machine Learning Classification: It uses statistically demonstrable algorithm rules to execute analytical tasks that would take humans hundreds of hours to perform.
ISSUES IN CLASSIFICATION & PREDICTION:
The main issues concern preparing the data (data cleaning, relevance analysis, and data transformation) and comparing methods on criteria such as accuracy, speed, robustness, scalability, and interpretability.
DECISION TREE INDUCTION ALGORITHM (ID3, CART ALGORITHM)
The machine learning researcher J. Ross Quinlan developed the decision tree algorithm known as ID3 (Iterative Dichotomiser) in the early 1980s. The algorithm involves no backtracking; trees are constructed in a top-down, recursive, divide-and-conquer manner.
A decision tree is a structure that includes a root node, branches, and leaf nodes. Decision tree
algorithm creates classification or regression models as a tree structure to solve the problem.
Decision Tree Terminologies:
1. Root Node: The topmost node in the tree; it contains the complete data set and its attributes.
2. Internal Node: A node between the root and the leaves; it denotes a test on an attribute.
3. Leaf Node: A terminal node; it represents an output or class label.
4. Splitting: The process of dividing a decision node/root node into sub-nodes according to the given conditions.
5. Branch / Sub-Tree: A tree formed by splitting the tree.
6. Pruning: The process of removing unwanted branches from the tree.
The root node and internal nodes are drawn as rectangles; leaf nodes are drawn as ovals.
Attribute Selection Measure / Key Factors:
Entropy: A common way to measure impurity. In a decision tree, it measures the degree of randomness or uncertainty in the data set; for class proportions p_i it is computed as Entropy = -Σ p_i log2(p_i).
Information Gain: The decline in entropy after the data set is split, also called entropy reduction. It measures the reduction in entropy (or variance) that results from splitting a data set on a specific attribute.
Gini Impurity (or Index): A score that evaluates how pure a split is among the classified groups, computed as Gini = 1 - Σ p_i^2. It lies in the range between 0 and 1: 0 when all observations belong to one class, with higher values indicating that the elements are distributed more evenly (randomly) across the classes.
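These three measures can be computed directly from class labels. The following is a minimal sketch (function names and the 9/5 example split are my own choices, not from the notes):

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels (log base 2)."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return 1 - sum((k / n) ** 2 for k in counts.values())

def information_gain(parent, subsets):
    """Entropy of the parent minus the weighted entropy of the split subsets."""
    n = len(parent)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent) - weighted

labels = ["yes"] * 9 + ["no"] * 5       # a 9-vs-5 class split
print(round(entropy(labels), 2))        # 0.94
print(round(gini(labels), 2))           # 0.46
```

A perfectly pure split of a balanced two-class set gives the maximum possible gain of 1.0, since each subset's entropy drops to 0.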
ALGORITHM:
Step-1: Begin the tree with the root node, say S, which contains the complete data set.

Step-2: Find the best attribute in the data set using an Attribute Selection Measure (ASM).
Step-3: Divide S into subsets that contain the possible values of the best attribute.
Step-4: Generate the decision tree node that contains the best attribute.
Step-5: Recursively build new decision trees using the subsets of the data set created in Step-3.
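The five steps above can be sketched as a small recursive ID3-style implementation. This is a minimal illustration rather than the full textbook algorithm; the toy records and attribute names are invented for the example, and information gain serves as the ASM:

```python
import math
from collections import Counter

def entropy(rows, target):
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def best_attribute(rows, attrs, target):
    # Step-2: choose the attribute with the highest information gain
    def gain(a):
        remainder = 0.0
        for v in set(r[a] for r in rows):
            subset = [r for r in rows if r[a] == v]
            remainder += len(subset) / len(rows) * entropy(subset, target)
        return entropy(rows, target) - remainder
    return max(attrs, key=gain)

def id3(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                  # pure node -> leaf with class label
        return labels[0]
    if not attrs:                              # no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = best_attribute(rows, attrs, target)          # Steps 2 and 4
    tree = {best: {}}
    for v in set(r[best] for r in rows):                # Step-3: split S into subsets
        subset = [r for r in rows if r[best] == v]
        tree[best][v] = id3(subset, [a for a in attrs if a != best], target)  # Step-5
    return tree

# Hypothetical toy records (values made up for illustration)
rows = [
    {"weather": "sunny", "play": "no"},
    {"weather": "sunny", "play": "no"},
    {"weather": "cloudy", "play": "yes"},
    {"weather": "rainy", "play": "yes"},
]
print(id3(rows, ["weather"], "play"))
```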
EXAMPLE:
Advantages of Decision Tree algorithm

• Results are simple to interpret.


• Classification and regression trees are Nonparametric and Nonlinear.
• Classification and regression trees implicitly perform feature selection.
• Outliers have no meaningful effect on CART.
• It requires minimal supervision and produces easy-to-understand models.
Limitations of Decision Tree algorithm

• Overfitting.
• High variance.
• Low bias.
• The tree structure may be unstable.
Applications of the Decision Tree algorithm

• For quick Data insights.


• In Blood Donors Classification.
• For environmental and ecological data.
• In the financial sectors.

BAYES CLASSIFICATION
Bayesian classification uses Bayes' theorem to predict the probability of an event. Bayesian classifiers are statistical classifiers grounded in Bayesian probability.
BAYES THEOREM: Bayes' Theorem is named after Thomas Bayes, who first used conditional probability to provide an algorithm that uses evidence to calculate limits on an unknown parameter. For a hypothesis H and evidence E it states: P(H | E) = P(E | H) * P(H) / P(E).
1. Prior Probability: The probability of an event occurring before new data is collected.
2. Posterior Probability: Once new data or information is collected, the prior probability of an event is revised to produce a more accurate measure of a possible outcome.
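The prior-to-posterior update can be shown with a short calculation. The numbers below (a 1% prior and hypothetical test characteristics) are assumed purely for illustration:

```python
# Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)
# Assumed numbers: hypothesis H holds with 1% prior probability;
# the evidence appears 95% of the time when H holds and 10% when it does not.
p_h = 0.01                    # prior probability of H
p_e_given_h = 0.95            # P(E | H)
p_e_given_not_h = 0.10        # P(E | not H)

# Total probability of the evidence over both cases
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Posterior: the prior revised in light of the evidence
posterior = p_e_given_h * p_h / p_e
print(round(posterior, 3))    # 0.088
```

Even strong evidence only raises a 1% prior to about 8.8%, which is exactly the prior-vs-posterior distinction the two definitions above describe.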
NAÏVE BAYES ALGORITHM

• The Naïve Bayes algorithm is a supervised learning algorithm, based on Bayes' theorem, used to solve classification problems.
• It is widely used for text classification, which involves high-dimensional training data sets.
• It is a simple and effective algorithm for building fast ML models that can make quick predictions.
• It is a probabilistic classifier.
EXAMPLE:

P(YES | FLU, COVID) = P(FLU | YES) * P(COVID | YES) * P(YES)
                    = 3/7 * 4/7 * 7/10
                    ≈ 0.17

P(NO | FLU, COVID) = P(FLU | NO) * P(COVID | NO) * P(NO)
                   = 2/3 * 2/3 * 3/10
                   ≈ 0.13

Since P(YES | FLU, COVID) > P(NO | FLU, COVID), the given person (Flu = Yes, Covid = Yes) is classified as Fever = YES.
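The worked example can be checked with a few lines of code. The conditional probabilities are taken from the frequency table assumed in the notes:

```python
# Conditional probabilities from the (assumed) flu/covid frequency table
p_flu_yes, p_covid_yes, p_yes = 3 / 7, 4 / 7, 7 / 10
p_flu_no, p_covid_no, p_no = 2 / 3, 2 / 3, 3 / 10

# Naive Bayes scores: product of attribute likelihoods and the class prior
score_yes = p_flu_yes * p_covid_yes * p_yes   # ≈ 0.17
score_no = p_flu_no * p_covid_no * p_no       # ≈ 0.13

# Predict the class with the larger score
prediction = "YES" if score_yes > score_no else "NO"
print(round(score_yes, 2), round(score_no, 2), prediction)
```

Note these products are unnormalized posteriors; since both share the same denominator P(FLU, COVID), comparing them directly is enough to pick the class.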

RULE BASED CLASSIFICATION


IF-THEN Rules

A rule-based classifier makes use of a set of IF-THEN rules for classification. We can express a rule in the following form:

IF condition THEN conclusion


Let us consider a rule R1,
R1: IF age = youth AND student = yes THEN buy_computer = yes

• The IF part of the rule is called rule antecedent or precondition.


• The THEN part of the rule is called rule consequent.
• The antecedent part (the condition) consists of one or more attribute tests, and these tests are logically ANDed.
• The consequent part consists of the class prediction.

Assessment of a Rule
In rule-based classification in data mining, there are two factors on which we can assess the rules. These are:
Coverage of Rule: The fraction of the records which satisfy the antecedent conditions of a particular rule
is called the coverage of that rule.
Coverage(R) = n_covers / n
n_covers = number of records satisfying the rule's antecedent
n = total number of records in the data set


Accuracy of a rule: The fraction of the records that satisfy the antecedent conditions and meet the
consequent values of a rule is called the accuracy of that rule.
Accuracy(R) = n_correct / n_covers
n_correct = number of covered records that also satisfy the consequent
n_covers = number of records satisfying the rule's antecedent
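These two formulas can be sketched as a small function; the rule R1 and the four hypothetical records below are invented for illustration:

```python
def coverage_and_accuracy(antecedent, consequent, records):
    """Coverage(R) = n_covers / n ; Accuracy(R) = n_correct / n_covers."""
    n = len(records)
    covered = [r for r in records if antecedent(r)]     # records matching the IF part
    correct = [r for r in covered if consequent(r)]     # also matching the THEN part
    coverage = len(covered) / n
    accuracy = len(correct) / len(covered) if covered else 0.0
    return coverage, accuracy

# Hypothetical records for R1: IF age = youth AND student = yes
# THEN buy_computer = yes
records = [
    {"age": "youth", "student": "yes", "buy_computer": "yes"},
    {"age": "youth", "student": "yes", "buy_computer": "no"},
    {"age": "youth", "student": "no", "buy_computer": "no"},
    {"age": "senior", "student": "yes", "buy_computer": "yes"},
]
cov, acc = coverage_and_accuracy(
    lambda r: r["age"] == "youth" and r["student"] == "yes",
    lambda r: r["buy_computer"] == "yes",
    records,
)
print(cov, acc)   # 0.5 0.5  (2 of 4 records covered; 1 of those 2 correct)
```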
Rule Extraction

Here we will learn how to build a rule-based classifier by extracting IF-THEN rules from a decision tree.
To extract a rule from a decision tree –

• One rule is created for each path from the root to the leaf node.
• To form a rule antecedent, each splitting criterion is logically ANDed.
• The leaf node holds the class prediction, forming the rule consequent
RULES:
1. If weather = 'cloudy' then play = 'yes'
2. If weather = 'sunny' and humidity = 'high' then play = 'no'
3. If weather = 'sunny' and humidity = 'normal' then play = 'yes'
4. If weather = 'rainy' and wind = 'strong' then play = 'no'
5. If weather = 'rainy' and wind = 'weak' then play = 'yes'
For a new test record such as Day = 11, weather = cloudy, temp = hot, humidity = high AND wind = weak, rule 1 fires, so play = 'yes'.
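The rule set above can be encoded as a first-match rule list. In this sketch each antecedent's tests are logically ANDed (per the rule-antecedent definition earlier), and the `classify` helper and its default label are my own additions, not part of the notes:

```python
# Each rule is (antecedent predicate, class label); first match wins
rules = [
    (lambda r: r["weather"] == "cloudy", "yes"),
    (lambda r: r["weather"] == "sunny" and r["humidity"] == "high", "no"),
    (lambda r: r["weather"] == "sunny" and r["humidity"] == "normal", "yes"),
    (lambda r: r["weather"] == "rainy" and r["wind"] == "strong", "no"),
    (lambda r: r["weather"] == "rainy" and r["wind"] == "weak", "yes"),
]

def classify(record, rules, default="yes"):
    """Return the label of the first rule whose antecedent matches."""
    for antecedent, label in rules:
        if antecedent(record):
            return label
    return default   # fallback when no rule fires

# The Day = 11 test record from the notes
new = {"weather": "cloudy", "temp": "hot", "humidity": "high", "wind": "weak"}
print(classify(new, rules))   # yes (rule 1 fires)
```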

LAZY LEARNER ALGORITHM (KNN)


Lazy Learners: Lazy learners are also known as instance-based learners; they do not learn a model during the training phase. Instead, they simply store the training data and use it to classify new instances at prediction time.
Training is very fast because no model is built, but prediction can be expensive since all the computation is deferred until a query arrives. Lazy learners are also less effective in high-dimensional spaces or when the number of training instances is large.
Examples of lazy learners include k-nearest neighbours (KNN)
K – NEAREST NEIGHBOURS (KNN) ALGORITHM
• K-Nearest Neighbour is one of the simplest Machine Learning algorithms based on Supervised
Learning technique.
• K-NN algorithm stores all the available data and classifies a new data point based on the
similarity
• K-NN is a non-parametric algorithm, which means it does not make any assumption on
underlying data.
• It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the data set and performs the computation at classification time.

WORKING OF KNN ALGORITHM

• Step-1: Select the number K of the neighbours


• Step-2: Calculate the Euclidean distance of K number of neighbours
• Step-3: Take the K nearest neighbours as per the calculated Euclidean distance.
• Step-4: Among these k neighbours, count the number of the data points in each category.
• Step-5: Assign the new data points to that category for which the number of the neighbour is
maximum.
• Step-6: Our model is ready.
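The six steps above can be sketched in a few lines. The 2-D training points and class labels below are invented for illustration:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Steps 1-5: rank training points by Euclidean distance to the query,
    take the k nearest, and assign the majority class among them."""
    dists = sorted((math.dist(point, query), label) for point, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training points with two classes
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((7.0, 7.5), "B"), ((6.5, 6.0), "B"),
]
print(knn_predict(train, (2.0, 2.0), k=3))   # A (all 3 nearest neighbours are A)
```

Note there is no training step at all; `knn_predict` scans the stored data at query time, which is exactly why KNN is called a lazy learner.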
Advantages of KNN Algorithm:

• It is simple to implement.
• It is robust to the noisy training data
• It can be more effective if the training data is large.
Disadvantages of KNN Algorithm:

• The value of K always needs to be determined, which may be complex at times.
• The computation cost is high because distances must be calculated between the query point and all the training samples.
PREDICTION

• Another process of data analysis is prediction, which is used to find a numerical output. As in classification, the training data set contains the inputs and their corresponding numerical output values.
• The algorithm derives a model, or predictor, from the training data set. The model should produce a numerical output when new data is given.
• Unlike classification, this method does not use a class label.
• The model predicts a continuous-valued function or an ordered value.
• Regression is generally used for prediction.
ACCURACY
• Accuracy is a metric that measures how often a machine learning model correctly predicts the
outcome.
• You can calculate accuracy by dividing the number of correct predictions by the total number of
predictions.
PRECISION

• Precision is a metric that measures how often a machine learning model correctly predicts the
positive class.
• You can calculate precision by dividing the number of correct positive predictions (true positives)
by the total number of instances the model predicted as positive (both true and false positives).
RECALL

• Recall is a metric that measures how often a machine learning model correctly identifies positive instances (true positives) out of all the actual positive samples in the data set.
• You can calculate recall by dividing the number of true positives by the total number of actual positive instances (true positives plus false negatives).
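The three metrics follow directly from the confusion-matrix counts. The counts below are hypothetical, chosen only to make the arithmetic easy to follow:

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # correct / all predictions
    precision = tp / (tp + fp)                   # correct positives / predicted positives
    recall = tp / (tp + fn)                      # correct positives / actual positives
    return accuracy, precision, recall

# Assumed counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives
acc, prec, rec = metrics(tp=40, fp=10, fn=20, tn=30)
print(acc, prec, round(rec, 3))   # accuracy 0.7, precision 0.8, recall ≈ 0.667
```

The example shows why the three metrics differ: the model is right 70% of the time overall, 80% of its positive calls are correct, but it finds only two-thirds of the actual positives.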
