Decision Tree & ID3 Algorithm
Reference book:
R2.Tom Mitchell, Machine Learning, McGraw-Hill, 1997
Tree Versus ML Decision Tree
(Figure: a natural tree next to an ML decision tree, with the root and leaf nodes labelled in each.)
1. Does it satisfy your salary criteria?
2. Is it a dream company?
3. Is the commute/travel time less than an hour?
4. Does it offer free breakfast & coffee?
You need to buy apples…
How would you choose fresh apples in the market?
• A decision tree (DT) is a method for approximating discrete-valued functions that is robust to noisy data and capable of learning disjunctive expressions.
• Learned trees can also be re-represented as sets of if-then rules to improve human readability.
Definition – Decision Trees
1. Decision trees classify instances by sorting
them down the tree from the root to
some leaf node, which provides the
classification of the instance.
2. Each node in the tree specifies a test of
some attribute of the instance, and each branch descending from that node corresponds to one of the possible values for this attribute.
3. An instance is classified by starting at the root node of the tree, testing the attribute specified by this node, then moving down the tree branch corresponding to the value of the attribute.
4. This process is then repeated for the sub-tree rooted at the new node.
Types of Decision Trees
• Classification Tree: A classification tree is a decision tree for a categorical response (commonly known as the target) with many categorical or continuous predictors (factors). The categorical response can be binomial or multinomial (e.g. Pass/Fail; high, medium & low). It exposes important patterns and relationships between a categorical response and important predictors within highly complicated data, without using parametric methods. It also identifies groups in the data with desirable characteristics and predicts response values for new observations. For example, a credit card company can use a classification tree to predict whether or not a customer will take a credit card, based on several predictors.
• Regression Tree: A regression tree is a decision tree for a continuous response (commonly known as the target) with many categorical or continuous predictors (factors). The continuous response is a real number (e.g. piston diameter, blood pressure level). It likewise exposes the important patterns and relationships between a continuous response and predictors within highly complicated data, without using parametric methods, identifies groups in the data with desirable characteristics, and predicts response values for new observations. For example, a pharmaceutical company can use a regression tree to identify the potential predictors affecting a dissolution rate. (A small illustrative sketch of both tree types follows this list.)
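To make the distinction concrete, here is a minimal scikit-learn sketch (not from the slides); it fits one classification tree and one regression tree on tiny made-up datasets. The feature names, values and targets are invented purely for illustration.

```python
# A minimal sketch of the two tree types; the data below are hypothetical.
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification tree: categorical target (e.g. take credit card? 0 = no, 1 = yes)
X_cls = [[25, 40000], [45, 90000], [35, 60000], [52, 30000]]   # [age, income]
y_cls = [0, 1, 1, 0]
clf = DecisionTreeClassifier(criterion="entropy", max_depth=2).fit(X_cls, y_cls)
print(clf.predict([[40, 70000]]))        # predicted class for a new customer

# Regression tree: continuous target (e.g. dissolution rate)
X_reg = [[1.0, 20], [1.5, 25], [2.0, 30], [2.5, 35]]           # [dose, temperature]
y_reg = [12.1, 14.8, 17.9, 21.2]
reg = DecisionTreeRegressor(max_depth=2).fit(X_reg, y_reg)
print(reg.predict([[1.8, 28]]))          # predicted real value
```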
Decision Tree Algorithms
The three classic decision tree algorithms are ID3, C4.5 and CART.
A decision tree is built top-down from a root node and involves partitioning the data into subsets that contain instances with similar (homogeneous) values.
Decision Tree algorithms
1. ID3 (Iterative Dichotomiser 3): ID3 cannot handle continuous variables directly; it works only with categorical data. It is also prone to overfitting. (Splitting criterion: Information Gain.)
2. C4.5: an extension of ID3 that can handle both categorical and continuous attributes, by converting continuous attributes into categorical ones through thresholding. (Splitting criterion: Gain Ratio.)
3. CART (Classification and Regression Trees): CART splits nodes into exactly two branches. (Splitting criterion: Gini Index for classification trees, Variance Reduction for regression trees.)
Terminologies
• Root Node: represents the entire population or sample; it is further divided into two or more homogeneous sets.
• Leaf Node: a node that cannot be segregated into further nodes.
• Parent/Child node: the root node is the parent node, and all the nodes branched from it are known as child nodes.
• Branch/Sub-tree: formed by splitting the tree/node.
• Splitting: dividing the root node/sub-node into different parts on the basis of some condition.
• Pruning: the opposite of splitting; removing unwanted branches from the tree.
• Entropy: a measure of the purity/impurity of the samples.
• Information Gain: the decrease in entropy after a dataset is split on the basis of an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (useful in deciding which attribute to use as the root node).
• Reduction in variance: an algorithm used for continuous target variables (regression problems); the split with the lower variance is selected as the criterion to split the population.
• Gini index: the measure of purity or impurity used in building a decision tree in CART.
• Chi-square: an algorithm to find the statistical significance of the differences between sub-nodes and the parent node.
Decision trees represent a disjunction of conjunctions of constraints on the attribute values of instances.
DECISION TREE REPRESENTATION
• Consider the instance (Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong): the tree sorts it down its leftmost branch and classifies it as a negative instance (Play Tennis = No).
• In general, decision trees represent a disjunction of conjunctions of constraints on the attribute
values of instances.
• Each path from the tree root to a leaf corresponds to a conjunction of attribute tests, and the tree
itself to a disjunction of these conjunctions.
(Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)
If (Outlook = Sunny AND Humidity = Normal) OR (Outlook = Overcast) OR (Outlook = Rain AND Wind = Weak)
Then: Play Tennis = Yes
Else: Play Tennis = No
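As a quick sketch (not from the slides), this disjunction of conjunctions can be written directly as a boolean rule; the attribute values are passed as plain strings.

```python
def play_tennis(outlook, humidity, wind):
    """PlayTennis rule: (Sunny AND Normal) OR Overcast OR (Rain AND Weak)."""
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))

# The instance (Sunny, Hot, High, Strong) satisfies none of the conjunctions,
# so the rule returns False, i.e. Play Tennis = No.
print(play_tennis("Sunny", "High", "Strong"))   # False
```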
Test instance: will we play tennis?
Instance: Outlook = Sunny, Humidity = High (Temperature and Wind not specified).
The Yes paths of the tree are (Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak).
Only "Outlook = Sunny ∧ Humidity = Normal" gives Play Tennis = Yes; since this instance has Humidity = High, the decision is Play Tennis = No.
Concept of Decision Trees
(Figure: a toy dataset with Attribute 1 to Attribute 4 and Class = {M, H}, and the corresponding decision tree that tests Attribute 1 at the root, then Attribute 2 and Attribute 3, ending in Class = M and Class = H leaves.)
Fig. 3.1 Representation of objects (samples) using features.
Fig. 3.2 Measuring features for the domain of interest: colour {green, brown, gray, other}, has wings?, abdomen length, thorax length, antennae length, mandible size, spiracle diameter, leg length.
Table 3.1 Instances, Features and Class

Insect ID   Abdomen length   Antennae length   Insect class
1           2.7              5.5               Grasshopper
2           8.0              9.1               Katydid
3           0.9              4.7               Grasshopper
4           1.1              3.1               Grasshopper
5           5.4              8.5               Katydid
6           2.9              1.9               Grasshopper
7           6.1              6.6               Katydid
8           0.5              1.0               Grasshopper
9           8.3              6.6               Katydid
10          8.1              4.7               Katydid
An Example from Medicine
The main purpose of the decision tree is to expose the structural information contained in the data.

Table 4.1 Medical Data

No   Gender   Age   BP       Drug
1    Male     20    Normal   A
2    Female   73    Normal   B
3    Male     37    High     A
4    Male     33    Low      B
5    Female   48    High     A
6    Male     29    Normal   A
7    Female   52    Normal   B
8    Male     42    Low      B
9    Male     61    Normal   B
10   Female   30    Normal   A
11   Female   26    Low      B
12   Male     54    High     A
Fig. 4.8 Feature space and the decision tree for the insect data: grasshoppers and katydids plotted by abdomen length (x-axis, 1 to 10) against antenna length (y-axis, 1 to 10). The tree first asks "Abdomen length > 7.1?" (yes: Katydid); otherwise it asks "Antenna length > 6.0?" (yes: Katydid, no: Grasshopper).
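The two threshold tests in Fig. 4.8 translate directly into a tiny classifier; the sketch below is my rendering of the figure's logic (not code from the source), checked against Table 3.1.

```python
def classify_insect(abdomen_length, antenna_length):
    """Decision tree of Fig. 4.8: two threshold tests on the insect features."""
    if abdomen_length > 7.1:
        return "Katydid"
    if antenna_length > 6.0:
        return "Katydid"
    return "Grasshopper"

# Check against Table 3.1: insect 1 (2.7, 5.5) is a grasshopper, insect 2 (8.0, 9.1) a katydid.
print(classify_insect(2.7, 5.5))   # Grasshopper
print(classify_insect(8.0, 9.1))   # Katydid
```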
Concept of Decision Tree ML
Training Dataset
• Input features are also called Attributes
• This example dataset has 2 attributes – Colour & Diameter
• Instances are defined by categorical levels/numerical
values of attributes
• A dataset is called a labelled dataset if a class is defined for each input instance. This is the setting of supervised classification: here the output is categorical.
Decision tree - procedure
1. Start with one of the best attributes available in the dataset.
2. Start with the full dataset in the root node. (If you take Diameter as the best attribute, ask a question: Is diameter >= 3?)
3. Based on the attribute values [True/False], the dataset is divided into 2 subsets, and those subsets become the input to 2 new child nodes.
4. On the False side, the data subset has a single label (Grape), so there is no uncertainty (no confusion in predicting the label); stop growing the tree on that side.
5. On the True side, the subset has a mixture of labels (Mango & Lemon), so uncertainty exists; continue splitting the dataset and the node.
(Figure: the full dataset is split into two subsets: one containing only Grape, where we stop growing the tree and make a leaf node, and one containing Mango & Lemon, where we continue growing the tree.)
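A sketch of that first split in Python: the rows below are made up (the slides only name the attributes Colour and Diameter and the labels Grape, Mango and Lemon), while the split question "Is diameter >= 3?" is the one from the slides.

```python
# Hypothetical fruit rows: (colour, diameter, label)
dataset = [
    ("Green",  3, "Mango"),
    ("Yellow", 3, "Lemon"),
    ("Red",    1, "Grape"),
    ("Red",    1, "Grape"),
    ("Yellow", 3, "Mango"),
]

true_side  = [row for row in dataset if row[1] >= 3]   # mixture of Mango & Lemon: keep splitting
false_side = [row for row in dataset if row[1] < 3]    # only Grape: pure subset, make it a leaf

print({row[2] for row in true_side})    # {'Mango', 'Lemon'}
print({row[2] for row in false_side})   # {'Grape'}
```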
Decision tree - procedure
Based on the characteristics of the attributes, identify the different set of possible questions to ask.
How do we identify whether a question is a good indicator for continuing to grow the tree? → the information gain metric.
Step I
• Identify the different set of possible questions to ask.
Method to identify the best attributes
Entropy – a metric to identify the best attribute
A decision tree is built top-down from a root node and involves partitioning the data into subsets that contain instances with similar (homogeneous) values. The ID3 algorithm uses entropy to calculate the homogeneity of a sample. If the sample is completely homogeneous the entropy is zero, and if the sample is equally divided it has an entropy of one.
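A quick check of those two extreme cases (a sketch, not from the slides), using the binary entropy formula given later in the deck:

```python
import math

def entropy(p_yes, p_no):
    """Binary entropy: -p(yes)*log2(p(yes)) - p(no)*log2(p(no)), treating 0*log2(0) as 0."""
    total = 0.0
    for p in (p_yes, p_no):
        if p > 0:
            total -= p * math.log2(p)
    return total

print(entropy(1.0, 0.0))   # 0.0 -> completely homogeneous sample
print(entropy(0.5, 0.5))   # 1.0 -> equally divided sample
```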
CLASSIFICATION METHODS
Challenges
How do we represent the entire information in the dataset using the minimum number of rules?
How do we develop the smallest tree?
Solution
Select the variable with the maximum information (the highest relation with Y) for the first split.
ID3 Vs C4.5 Vs CART
Decision Trees
▪ Decision trees are a type of supervised machine learning.
▪ They use well-labelled training data and, on the basis of that data, predict the output. This process can then be used to predict the results for unknown data.
▪ Decision trees can be applied to both regression and classification problems.
▪ A decision tree sorts data into groups based on the values of the features it is provided.
▪ Decision trees can be used for regression to get a real numeric value, or for classification to split data into different categories.
Decision Trees
A decision tree has three types of nodes:
▪ A root node that has no incoming edges and zero or more outgoing edges
▪ Internal nodes, each of which has exactly one incoming edge and two or more outgoing edges
▪ Leaf or terminal nodes, each of which has exactly one incoming edge and no outgoing edges
Decision Trees
▪ In a decision tree, each leaf node is assigned a class label.
▪ The non-terminal nodes, which include the root and other internal nodes, contain attribute test conditions to separate records that have different characteristics.
Decision Trees
▪ Classifying a test record is straightforward once a decision tree has been constructed.
▪ Starting from the root node, we apply the test condition to the record and follow the appropriate branch based on the outcome of the test.
▪ This leads us either to another internal node, for which a new test condition is applied, or to a leaf node.
▪ The class label associated with that leaf node is then assigned to the record.
Steps in Decision Trees
Step 1: Begin the tree with the root node, say S, which contains the complete dataset.
Step 2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
Step 3: Divide S into subsets that contain the possible values of the best attribute.
Step 4: Generate the decision tree node which contains the best attribute.
Step 5: Recursively make new decision trees using the subsets of the dataset created in Step 3. Continue this process until a stage is reached where you cannot further classify the nodes; call the final node a leaf node.
Decision Trees
Why use Decision Trees?
1. Decision trees usually mimic human thinking ability while making a decision, so they are easy to understand.
2. The logic behind a decision tree can be easily understood because it shows a tree-like structure.
How many attributes? Which attribute is significant?
Why do we need to find the significant attribute?
• Because to start constructing the decision tree, one of the best attributes has to be assigned as the root node.
Decision Trees
Attribute Selection Measures
While implementing a decision tree, the main issue that arises is how to select the best attribute for the root node and for the sub-nodes. To solve such problems there is a technique called the Attribute Selection Measure, or ASM. With this measure, we can easily select the best attribute for the nodes of the tree. There are two popular techniques for ASM:
1. Information Gain
2. Gini Index
Decision Trees
1. Information Gain:
Information gain is the measurement of the change in entropy after the segmentation of a dataset based on an attribute.
It calculates how much information a feature provides about a class.
According to the value of information gain, we split the node and build the decision tree.
A decision tree algorithm always tries to maximize the value of information gain, and the node/attribute having the highest information gain is split first. It can be calculated using the formula below:
Information Gain = Entropy(S) - [(Weighted Avg) * Entropy(each feature)]
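A hedged Python sketch of this formula (the entropy measure it relies on is defined on the next slide); the helper functions and the toy split below are illustrative, not code from the slides.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum_i p_i * log2(p_i) over the class proportions in S."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent_labels, subsets):
    """Entropy(S) minus the weighted average entropy of the subsets produced by a split."""
    total = len(parent_labels)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent_labels) - weighted

# Toy check: a perfect split of a 50/50 node into two pure subsets gains a full bit.
print(information_gain(["Yes", "Yes", "No", "No"], [["Yes", "Yes"], ["No", "No"]]))   # 1.0
```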
Decision Trees
Entropy: Entropy is a metric to measure the impurity in a given attribute. It specifies the randomness in the data.
Entropy can be calculated as:
Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)
where
• S = the set of samples
• P(yes) = the probability of yes
• P(no) = the probability of no
Decision Trees
2. Gini Index:
• The Gini index is a measure of impurity or purity used while creating a decision tree in the CART (Classification and Regression Tree) algorithm.
• An attribute with a low Gini index should be preferred over one with a high Gini index.
• CART creates only binary splits, and it uses the Gini index to create them.
• The Gini index can be calculated using the formula below:
Gini Index = 1 - Σj (Pj)^2
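As a small sketch (not from the slides), the Gini index of a node can be computed from its class counts; for the PlayTennis root node with 9 Yes and 5 No examples this gives about 0.459.

```python
def gini_index(class_counts):
    """Gini index of a node: 1 - sum_j p_j^2, where p_j is the fraction of class j."""
    total = sum(class_counts)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

print(gini_index([9, 5]))   # ~0.459 for the PlayTennis root node (9 Yes, 5 No)
print(gini_index([4, 0]))   # 0.0 for a pure node such as Outlook = Overcast
```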
Information Gain and Entropy (illustrative figures, including a split in which only a few samples are mixed, and a numerical example of the entropy metric)
Concept of Decision Trees
ID3 ALGORITHM (Iterative Dichotomiser 3)
• The ID3 algorithm learns decision trees by constructing them top-down, beginning with the question "which attribute should be tested at the root of the tree?" To answer this question, each instance attribute is evaluated using a statistical test to determine how well it alone classifies the training examples.
• The best attribute is selected and used as the test at the root node of the tree.
• A descendant of the root node is then created for each possible value of this attribute, and the training examples are sorted to the appropriate descendant node.
• The entire process is then repeated using the training examples associated with each descendant node, to select the best attribute to test at that point in the tree.
• This forms a greedy search for an acceptable decision tree, in which the algorithm never backtracks to reconsider earlier choices.
• A simplified version of the algorithm, specialized to learning boolean-valued functions (i.e., concept learning), is given below.
WHICH ATTRIBUTE IS THE BEST CLASSIFIER?
• The central choice in the ID3 algorithm is selecting which attribute to
test at each node in the tree.
• What is a good quantitative measure of the worth of an attribute?
• We will define a statistical property, called information gain, that
measures how well a given attribute separates the training examples
according to their target classification.
• ID3 uses this information gain measure to select among the
candidate attributes at each step while growing the tree.
ID3(Examples, Target_attribute, Attributes)
Examples are the training examples. Target_attribute is the attribute whose value is to be predicted by the tree. Attributes is a list of other attributes that may be tested by the learned decision tree. Returns a decision tree that correctly classifies the given Examples.
• Create a Root node for the tree
• If all Examples are positive, return the single-node tree Root, with label = +
• If all Examples are negative, return the single-node tree Root, with label = -
• If Attributes is empty, return the single-node tree Root, with label = the most common value of Target_attribute in Examples
• Otherwise begin:
  • A <- the attribute from Attributes that best classifies Examples
  • The decision attribute for Root <- A
  • For each possible value vi of A:
    • Add a new tree branch below Root, corresponding to the test A = vi
    • Let Examples_vi be the subset of Examples that have value vi for A
    • If Examples_vi is empty, then below this new branch add a leaf node with label = the most common value of Target_attribute in Examples
    • Else below this new branch add the subtree ID3(Examples_vi, Target_attribute, Attributes - {A})
• End
• Return Root
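Below is a runnable Python sketch of this pseudocode; it is my own rendering, not code from Mitchell or the slides. The data table itself is not reproduced in the extracted text, so the 14 rows below are the standard Quinlan/Mitchell PlayTennis examples; their per-attribute counts match the frequency tables in the slides that follow.

```python
import math
from collections import Counter

def entropy(examples, target):
    """Entropy of the target attribute over a list of example dicts."""
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target):
    """Entropy(S) minus the weighted average entropy after splitting on `attribute`."""
    total = len(examples)
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(examples, target, attributes):
    """Return a nested dict {attribute: {value: subtree_or_label}} or a class label."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:                         # all positive or all negative
        return labels[0]
    if not attributes:                                # no attributes left: majority label
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    for value in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, target, remaining)
    return tree

# The 14-example PlayTennis dataset used throughout these slides (standard Mitchell data).
data = [
    ("Sunny","Hot","High","Weak","No"),     ("Sunny","Hot","High","Strong","No"),
    ("Overcast","Hot","High","Weak","Yes"), ("Rain","Mild","High","Weak","Yes"),
    ("Rain","Cool","Normal","Weak","Yes"),  ("Rain","Cool","Normal","Strong","No"),
    ("Overcast","Cool","Normal","Strong","Yes"), ("Sunny","Mild","High","Weak","No"),
    ("Sunny","Cool","Normal","Weak","Yes"), ("Rain","Mild","Normal","Weak","Yes"),
    ("Sunny","Mild","Normal","Strong","Yes"), ("Overcast","Mild","High","Strong","Yes"),
    ("Overcast","Hot","Normal","Weak","Yes"), ("Rain","Mild","High","Strong","No"),
]
cols = ["Outlook", "Temperature", "Humidity", "Wind", "PlayTennis"]
examples = [dict(zip(cols, row)) for row in data]

tree = id3(examples, "PlayTennis", ["Outlook", "Temperature", "Humidity", "Wind"])
print(tree)   # Outlook at the root, Humidity under Sunny, Wind under Rain
```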
Numerical problem
Given a dataset of historical tennis-playing data, including features such as outlook, temperature, humidity and windy, design a decision tree classifier to predict the play-tennis decision from these input features. How would you determine the optimal split criterion at each node of the decision tree, so as to make accurate predictions about whether a player should play tennis or not under specific weather conditions?
DATASET FOR PLAYING TENNIS
(Table: the PlayTennis dataset with candidate attributes outlook, temperature, humidity and windy; + marks positive instances and - marks negative instances.)
Take the X1: outlook attribute, analyse what the sub-attributes in it are, and count the number of samples in each.
• There are 5 samples in the Sunny sub-attribute, with 2 Yes (positive) labels + 3 No (negative) labels.
• There are 4 samples in the Overcast sub-attribute, with 4 Yes (positive) labels + 0 No (negative) labels.
• There are 5 samples in the Rainy sub-attribute, with 3 Yes (positive) labels + 2 No (negative) labels.

Frequency Table for X1 (Outlook):
        Sunny  Overcast  Rainy
Yes       2       4        3
No        3       0        2
Total     5       4        5
Take the X2: temp attribute, analyse what the sub-attributes in it are, and count.

Frequency Table for X2 (Temp):
        hot  mild  cool
Yes      2    4     3
No       2    2     1
Total    4    6     4
Frequency Table for X3 (Humidity):
        high  normal
Yes       3     6
No        4     1
Total     7     7
Frequency Table for X4 (Windy):
        false  true
Yes       6     3
No        2     3
Total     8     6

Frequency Table for the entire dataset: 9 Yes + 5 No = 14 samples.
DATASET FOR PLAYING TENNIS
There are 4 attributes (outlook, temperature, humidity and windy) in the given dataset. Which one should be considered as the root node?
+ Positive instances = 9/14
- Negative instances = 5/14
Step 1
• Measure the entropy of the overall set of samples S in the given dataset, using the entropy formula.
Step 2
• For each attribute in the dataset, compute the entropy and information gain measures, to identify which attribute is significant enough to be the root node when starting to build the decision tree.
• In this dataset there are 4 attributes, namely outlook, temperature, humidity and windy.
• Let's start by considering outlook as the first choice in our computation of information gain.
For X1: Outlook, compute all the following measures (refer to the basic entropy formula). The sub-attribute counts are Sunny: 5, Overcast: 4, Rainy: 5.

Note: on a calculator, log2(x) = log(x) / log(2).

Step 1: Calculate the overall entropy value.
E(S) = -(9/14) log2 (9/14) - (5/14) log2 (5/14) = 0.940

Step 2: How many sub-attributes are in Outlook?
3 = Sunny, Overcast & Rainy

Step 3: Calculate the Outlook=Sunny entropy value (2/5 positive, 3/5 negative).
E(Outlook=Sunny) = -(2/5) log2 (2/5) - (3/5) log2 (3/5)
                 = -0.4(-1.3219) - 0.6(-0.7369)
                 = 0.52876 + 0.44214
                 = 0.971

Step 4: Calculate the Outlook=Overcast entropy value (4/4 positive, 0/4 negative).
E(Outlook=Overcast) = -(4/4) log2 (4/4) - (0/4) log2 (0/4) = 0 (taking 0 log2 0 = 0)

Step 5: Calculate the Outlook=Rainy entropy value (3/5 positive, 2/5 negative).
E(Outlook=Rainy) = -(3/5) log2 (3/5) - (2/5) log2 (2/5) = 0.971

Step 6: Information for Outlook (the weighted average of the sub-attribute entropies).
I(Outlook) = (5/14) * E(Outlook=Sunny) + (4/14) * E(Outlook=Overcast) + (5/14) * E(Outlook=Rainy)
           = (5/14) * 0.971 + (4/14) * 0 + (5/14) * 0.971
           = 0.693

Step 7: Calculate the Gain for Outlook.
Gain(Outlook) = E(S) - I(Outlook) [Step 1 - Step 6] = 0.940 - 0.693 = 0.247
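These seven steps can be verified in a few lines of Python (a sketch, not from the slides), reproducing the 0.940, 0.971, 0.693 and 0.247 values:

```python
import math

log2 = lambda x: math.log(x) / math.log(2)          # the calculator trick noted above

E_S        = -(9/14) * log2(9/14) - (5/14) * log2(5/14)    # overall entropy, ~0.940
E_sunny    = -(2/5)  * log2(2/5)  - (3/5)  * log2(3/5)     # ~0.971
E_overcast = 0.0                                            # pure node: 4 Yes, 0 No
E_rainy    = -(3/5)  * log2(3/5)  - (2/5)  * log2(2/5)     # ~0.971
I_outlook  = (5/14) * E_sunny + (4/14) * E_overcast + (5/14) * E_rainy   # ~0.693
print(round(E_S - I_outlook, 3))                            # Gain(Outlook) ~ 0.247
```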
Take the X2: temp attribute, analyse what the sub-attributes in it are, and count the samples in each (hot: 2 Yes / 2 No; mild: 4 Yes / 2 No; cool: 3 Yes / 1 No).

Step 1: Calculate the overall entropy value.
E(S) = -(9/14) log2 (9/14) - (5/14) log2 (5/14) = 0.940
Step 2: How many sub-attributes are in temp?
3 = hot, mild, cool
Step 3: Calculate the temp=hot entropy value.
E(temp=hot) = -(2/4) log2 (2/4) - (2/4) log2 (2/4) = 1
Step 4: Calculate the temp=mild entropy value.
E(temp=mild) = -(4/6) log2 (4/6) - (2/6) log2 (2/6) = 0.9183
Step 5: Calculate the temp=cool entropy value.
E(temp=cool) = -(3/4) log2 (3/4) - (1/4) log2 (1/4) = 0.8113
Step 6: Information for temp.
I(temp) = (4/14) * 1 + (6/14) * 0.9183 + (4/14) * 0.8113 = 0.911
Step 7: Calculate the Gain for temp.
Gain(Temp) = E(S) - I(Temp) = 0.940 - 0.911 = 0.029
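The same kind of check for the temperature attribute (again a sketch, not from the slides) reproduces the values above:

```python
import math

E_S    = 0.940                                                    # overall entropy from Step 1
E_hot  = 1.0                                                       # 2 Yes, 2 No
E_mild = -(4/6) * math.log2(4/6) - (2/6) * math.log2(2/6)          # ~0.918
E_cool = -(3/4) * math.log2(3/4) - (1/4) * math.log2(1/4)          # ~0.811
I_temp = (4/14) * E_hot + (6/14) * E_mild + (4/14) * E_cool        # ~0.911
print(round(E_S - I_temp, 3))                                      # Gain(Temp) ~ 0.029
```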
Take the X3: humidity attribute (high: 3 Yes / 4 No; normal: 6 Yes / 1 No) and repeat the same steps:
E(Humidity=high) = -(3/7) log2 (3/7) - (4/7) log2 (4/7) = 0.985
E(Humidity=normal) = -(6/7) log2 (6/7) - (1/7) log2 (1/7) = 0.592
I(Humidity) = (7/14) * 0.985 + (7/14) * 0.592 = 0.788
Gain(Humidity) = E(S) - I(Humidity) = 0.940 - 0.788 = 0.152
Take the X4: windy attribute (false: 6 Yes / 2 No; true: 3 Yes / 3 No) and repeat the same steps:
E(Windy=false) = -(6/8) log2 (6/8) - (2/8) log2 (2/8) = 0.811
E(Windy=true) = -(3/6) log2 (3/6) - (3/6) log2 (3/6) = 1
I(Windy) = (8/14) * 0.811 + (6/14) * 1 = 0.892
Gain(Windy) = E(S) - I(Windy) = 0.940 - 0.892 = 0.048
Step 3
We have to check which attribute has the highest information gain value: Gain(Outlook) = 0.247, Gain(Temp) = 0.029, Gain(Humidity) = 0.152, Gain(Windy) = 0.048.
The outlook attribute, which has the maximum value of information gain, is assigned as the root node to grow the decision tree.
Step 4
Having identified that the outlook attribute is the root node, we still need to make 3 branches from it: sunny, overcast and rain.
On checking, the overcast sub-attribute has only (Yes) positive labels with a high purity measure (the labels are similar and pure), so it is considered a leaf node: it has no possibility to grow further. Whereas sunny and rain have both Yes and No labels, which means impurity is there and they have the possibility to branch out as Yes and No.
Below the sunny attribute we will grow the tree by choosing one of the pending attributes (temp, humidity or wind).
Which one to choose? Measure the information gain and choose the one with the highest value.
Step 5: So, let's consider the data samples D1, D2, D8, D9 and D11 pertaining to the sunny sub-attribute, with respect to the other attributes, namely temp, humidity and windy.
Steps 5a, 5b, 5c: compute the entropy and information gain of temp, humidity and windy on the sunny subset (D1, D2, D8, D9, D11).
We have to check which of these attributes has the highest information gain value.
The humidity attribute, which has the maximum value of information gain on this subset, is assigned as the node at Level 1 (below sunny) to grow the decision tree further.
Step 6
To grow the tree further below the rain sub-attribute, let's consider the data samples pertaining to the rain sub-attribute with respect to the other main attributes, namely temp, humidity and windy.
(On checking, under the humidity node the High sub-attribute has only (No) negative labels and the Normal sub-attribute has only (Yes) positive labels, both with a high purity measure, so they are considered leaf nodes.)
Step 7
On the rain branch, the windy attribute is chosen next. On checking, the Strong sub-attribute has only (No) negative labels with a high purity measure and the Weak sub-attribute has only (Yes) positive labels with a high purity measure, so they are considered leaf nodes.
Finally, the decision tree is fully grown: outlook at the root, humidity below sunny, windy below rain, and overcast as a direct Yes leaf.
DECISION TREE REPRESENTATION
• (Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong)
• would be sorted down the leftmost branch of this decision tree and would therefore be classified
as a negative instance (i.e., the tree predicts that Play Tennis = no).
• In general, decision trees represent a disjunction of conjunctions of constraints on the attribute
values of instances.
• Each path from the tree root to a leaf corresponds to a conjunction of attribute tests, and the tree
itself to a disjunction of these conjunctions.
(Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)