Gini Index
The Gini index is a metric that measures how often a randomly chosen element would be incorrectly labeled if it were labeled at random according to the class distribution in the subset. It means an attribute with a lower Gini index should be preferred.
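As a quick illustration (not part of the original text), the Gini index of a node can be computed directly from its class counts; the small Python sketch below does exactly that and reproduces the branch values used in the worked example that follows.

# Minimal sketch: Gini impurity of a node, given the class counts in that node.
# gini = 1 - sum(p_i^2), where p_i is the fraction of records belonging to class i.
def gini(*counts):
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini(5, 7))   # ~0.4861, the "A >= 5" branch below
print(gini(3, 1))   # 0.375, the "A < 5" branch below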
Example: Construct a Decision Tree by using “gini index” as a criterion
We are going to use the same data sample that we use for the information gain example below. Let's try to use the Gini index as a criterion. Here we have 5 columns, out of which 4 columns contain continuous data and the 5th column consists of class labels.
Attributes A, B, C, and D can be considered as predictors, and the class labels in column E can be considered as the target variable. To construct a decision tree from this data, we have to convert the continuous data into categorical data.
We have chosen some arbitrary cut-off values to categorize each attribute:
A         B         C         D
>= 5.0    >= 3.0    >= 4.2    >= 1.4
<  5.0    <  3.0    <  4.2    <  1.4
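For illustration only, this categorization step could be done with pandas as sketched below; the DataFrame df and its column names A-E are assumptions based on the description above, since the raw data table is not reproduced in this text.

import pandas as pd

# Hypothetical: df holds the 16 records, with continuous columns A-D and class label E.
def categorize(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame()
    out["A"] = df["A"].apply(lambda v: ">= 5.0" if v >= 5.0 else "< 5.0")
    out["B"] = df["B"].apply(lambda v: ">= 3.0" if v >= 3.0 else "< 3.0")
    out["C"] = df["C"].apply(lambda v: ">= 4.2" if v >= 4.2 else "< 4.2")
    out["D"] = df["D"].apply(lambda v: ">= 1.4" if v >= 1.4 else "< 1.4")
    out["E"] = df["E"]
    return out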
Gini Index for Var A
Var A has value >= 5 for 12 records out of 16, and 4 records have value < 5.
For Var A >= 5 & class == positive: 5/12
For Var A >= 5 & class == negative: 7/12
o gini(5,7) = 1 - ( (5/12)² + (7/12)² ) = 0.4861
For Var A <5 & class == positive: 3/4
For Var A <5 & class == negative: 1/4
o gini(3,1) = 1 - ( (3/4)² + (1/4)² ) = 0.375
By weighting and summing each of these Gini indices:
Gini(Target, A) = (12/16) * 0.4861 + (4/16) * 0.375 = 0.4583
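The Var A numbers can be reproduced with a few lines of Python (a sketch using the counts from the worked example above):

# Var A: 12 records fall in the ">= 5" branch (5 positive, 7 negative),
# 4 records fall in the "< 5" branch (3 positive, 1 negative).
gini_ge = 1 - ((5/12) ** 2 + (7/12) ** 2)    # ~0.4861
gini_lt = 1 - ((3/4) ** 2 + (1/4) ** 2)      # 0.375
gini_A = (12/16) * gini_ge + (4/16) * gini_lt
print(round(gini_A, 4))                      # 0.4583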
Gini Index for Var B
Var B has value >= 3 for 12 records out of 16, and 4 records have value < 3.
For Var B >= 3 & class == positive: 8/12
For Var B >= 3 & class == negative: 4/12
o gini(8,4) = 1 - ( (8/12)² + (4/12)² ) = 0.4444
For Var B <3 & class == positive: 0/4
For Var B <3 & class == negative: 4/4
o gini(0,4) = 1 - ( (0/4)² + (4/4)² ) = 0
Gini(Target, B) = (12/16) * 0.4444 + (4/16) * 0 = 0.3333
Gini Index for Var C
Var C has value >= 4.2 for 6 records out of 16, and 10 records have value < 4.2.
For Var C >= 4.2 & class == positive: 0/6
For Var C >= 4.2 & class == negative: 6/6
o gini(0,6) = 1 - ( (0/6)² + (6/6)² ) = 0
For Var C < 4.2 & class == positive: 8/10
For Var C < 4.2 & class == negative: 2/10
o gini(8,2) = 1 - ( (8/10)² + (2/10)² ) = 0.32
Gini(Target, C) = (6/16) * 0 + (10/16) * 0.32 = 0.2
Gini Index for Var D
Var D has value >= 1.4 for 5 records out of 16, and 11 records have value < 1.4.
For Var D >= 1.4 & class == positive: 0/5
For Var D >= 1.4 & class == negative: 5/5
o gini(0,5) = 1 - ( (0/5)² + (5/5)² ) = 0
For Var D < 1.4 & class == positive: 8/11
For Var D < 1.4 & class == negative: 3/11
o gini(8,3) = 1 - ( (8/11)² + (3/11)² ) = 0.3967
Gini(Target, D) = (5/16) * 0 + (11/16) * 0.3967 = 0.2727
Class counts of the target for each split:

Var A      Positive   Negative
>= 5.0         5          7
<  5.0         3          1
Gini Index of A = 0.4583

Var B      Positive   Negative
>= 3.0         8          4
<  3.0         0          4
Gini Index of B = 0.3333

Var C      Positive   Negative
>= 4.2         0          6
<  4.2         8          2
Gini Index of C = 0.2

Var D      Positive   Negative
>= 1.4         0          5
<  1.4         8          3
Gini Index of D = 0.2727

Var C has the lowest weighted Gini index (0.2), so it is the preferred attribute to split on first.
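The four weighted Gini indices above can be checked with a short Python sketch that works directly from the class counts in these tables:

# Class counts (positive, negative) for each branch of every candidate split.
splits = {
    "A": [(5, 7), (3, 1)],
    "B": [(8, 4), (0, 4)],
    "C": [(0, 6), (8, 2)],
    "D": [(0, 5), (8, 3)],
}

def gini(pos, neg):
    total = pos + neg
    return 1 - (pos / total) ** 2 - (neg / total) ** 2

def weighted_gini(branches):
    n = sum(pos + neg for pos, neg in branches)
    return sum((pos + neg) / n * gini(pos, neg) for pos, neg in branches)

for name, branches in splits.items():
    print(name, round(weighted_gini(branches), 4))
# Expected: A 0.4583, B 0.3333, C 0.2, D 0.2727 -> C has the lowest Gini index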
Entropy
Example: Construct a Decision Tree by using “information gain” as a criterion
We are going to use the same 16-record data sample. Let's try to use information gain as a criterion. Here we have 5 columns, out of which 4 columns contain continuous data and the 5th column consists of class labels.
Attributes A, B, C, and D can be considered as predictors, and the class labels in column E can be considered as the target variable. To construct a decision tree from this data, we have to convert the continuous data into categorical data.
We have chosen some arbitrary cut-off values to categorize each attribute:
A         B         C         D
>= 5.0    >= 3.0    >= 4.2    >= 1.4
<  5.0    <  3.0    <  4.2    <  1.4
There are 2 steps for calculating the information gain for each attribute:
1. Calculate the entropy of the target.
2. Calculate the entropy of the target for every attribute A, B, C, D (the weighted sum of the entropies of its branches). Using the information gain formula, we subtract this entropy from the entropy of the target; the result is the information gain:
Information Gain(Attribute) = Entropy(Target) - Entropy(Target, Attribute)
The entropy of the target: we have 8 records with the negative class and 8 records with the positive class, so we can directly estimate the entropy of the target as 1.
Variable E
Positive Negative
8 8
Calculating entropy using the formula:
E(8,8) = -1 * ( p(+ve)*log2(p(+ve)) + p(-ve)*log2(p(-ve)) )
= -1 * ( (8/16)*log2(8/16) + (8/16)*log2(8/16) )
= 1
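As a side note, the same calculation can be written as a small Python helper (a sketch; log base 2, with the usual convention that 0 * log2(0) = 0):

import math

def entropy(*counts):
    # Entropy of a node, given the class counts in that node.
    total = sum(counts)
    result = 0.0
    for c in counts:
        if c > 0:                      # skip empty classes: 0 * log2(0) is taken as 0
            p = c / total
            result -= p * math.log2(p)
    return result

print(entropy(8, 8))   # 1.0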
Information gain for Var A
Var A has value >= 5 for 12 records out of 16, and 4 records have value < 5.
For Var A >= 5 & class == positive: 5/12
For Var A >= 5 & class == negative: 7/12
o Entropy(5,7) = -1 * ( (5/12)*log2(5/12) + (7/12)*log2(7/12)) = 0.9799
For Var A <5 & class == positive: 3/4
For Var A <5 & class == negative: 1/4
o Entropy(3,1) = -1 * ( (3/4)*log2(3/4) + (1/4)*log2(1/4)) = 0.81128
Entropy(Target, A) = P(>=5) * E(5,7) + P(<5) * E(3,1)
= (12/16) * 0.9799 + (4/16) * 0.81128 = 0.937745
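The Var A figures can be reproduced in Python as follows (a sketch based on the counts above):

import math

e_ge = -((5/12) * math.log2(5/12) + (7/12) * math.log2(7/12))   # ~0.9799
e_lt = -((3/4) * math.log2(3/4) + (1/4) * math.log2(1/4))       # ~0.8113
e_A = (12/16) * e_ge + (4/16) * e_lt                            # ~0.9377
print(round(e_A, 4), round(1 - e_A, 4))                         # entropy and information gain of A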
Information gain for Var B
Var B has value >= 3 for 12 records out of 16, and 4 records have value < 3.
For Var B >= 3 & class == positive: 8/12
For Var B >= 3 & class == negative: 4/12
o Entropy(8,4) = -1 * ( (8/12)*log2(8/12) + (4/12)*log2(4/12) ) = 0.9183
For Var B < 3 & class == positive: 0/4
For Var B < 3 & class == negative: 4/4
o Entropy(0,4) = -1 * ( (0/4)*log2(0/4) + (4/4)*log2(4/4) ) = 0 (taking 0*log2(0) = 0)
Entropy(Target, B) = P(>=3) * E(8,4) + P(<3) * E(0,4)
= (12/16) * 0.9183 + (4/16) * 0 = 0.6887
Information gain for Var C
Var C has value >= 4.2 for 6 records out of 16, and 10 records have value < 4.2.
For Var C >= 4.2 & class == positive: 0/6
For Var C >= 4.2 & class == negative: 6/6
o Entropy(0,6) = 0
For Var C < 4.2 & class == positive: 8/10
For Var C < 4.2 & class == negative: 2/10
o Entropy(8,2) = 0.72193
Entropy(Target, C) = P(>=4.2) * E(0,6) + P(< 4.2) * E(8,2)
= (6/16) * 0 + (10/16) * 0.72193 = 0.4512
Information gain for Var D
Var D has value >= 1.4 for 5 records out of 16, and 11 records have value < 1.4.
For Var D >= 1.4 & class == positive: 0/5
For Var D >= 1.4 & class == negative: 5/5
o Entropy(0,5) = 0
For Var D < 1.4 & class == positive: 8/11
For Var D < 1.4 & class == negative: 3/11
o Entropy(8,3) = -1 * ( (8/11)*log2(8/11) + (3/11)*log2(3/11)) = 0.84532
Entropy(Target, D) = P(>=1.4) * E(0,5) + P(< 1.4) * E(8,3)
= 5/16 * 0 + (11/16) * 0.84532 = 0.5811575
Class counts of the target for each split:

Var A      Positive   Negative
>= 5.0         5          7
<  5.0         3          1
Information Gain of A = 0.0623

Var B      Positive   Negative
>= 3.0         8          4
<  3.0         0          4
Information Gain of B = 0.3113

Var C      Positive   Negative
>= 4.2         0          6
<  4.2         8          2
Information Gain of C = 0.5488

Var D      Positive   Negative
>= 1.4         0          5
<  1.4         8          3
Information Gain of D = 0.4188
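All four information gains can be verified with the sketch below, which computes the weighted entropy of each split from the class counts in these tables and subtracts it from the target entropy:

import math

def entropy(pos, neg):
    total = pos + neg
    result = 0.0
    for c in (pos, neg):
        if c > 0:                      # convention: 0 * log2(0) = 0
            p = c / total
            result -= p * math.log2(p)
    return result

splits = {
    "A": [(5, 7), (3, 1)],
    "B": [(8, 4), (0, 4)],
    "C": [(0, 6), (8, 2)],
    "D": [(0, 5), (8, 3)],
}

target_entropy = entropy(8, 8)         # 1.0
for name, branches in splits.items():
    n = sum(pos + neg for pos, neg in branches)
    weighted = sum((pos + neg) / n * entropy(pos, neg) for pos, neg in branches)
    print(name, round(target_entropy - weighted, 4))
# Expected: A 0.0623, B 0.3113, C 0.5488, D 0.4188 -> C gives the largest information gain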
From the above information gain calculations, we can build the decision tree. We should place the attributes on the tree according to their values: the attribute with the highest information gain (here, Var C) is positioned as the root, a branch with entropy 0 is converted to a leaf node, and a branch with entropy greater than 0 needs further splitting.
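In practice these criteria are rarely computed by hand; for example, scikit-learn's DecisionTreeClassifier accepts criterion="gini" or criterion="entropy". The sketch below uses randomly generated stand-in data, since the original 16-record sample is not reproduced in this text.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in data: 16 rows, 4 continuous attributes A-D (NOT the sample from the text).
rng = np.random.default_rng(0)
X = rng.uniform(low=[4.0, 2.0, 1.0, 0.1], high=[8.0, 4.5, 7.0, 2.5], size=(16, 4))
y = (X[:, 2] >= 4.2).astype(int)       # toy labels, only so the tree has something to learn

clf = DecisionTreeClassifier(criterion="entropy")   # or criterion="gini"
clf.fit(X, y)
print(export_text(clf, feature_names=["A", "B", "C", "D"]))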
Reference: https://dataaspirant.com/2017/01/30/how-decision-tree-algorithm-works/