DECISION TREE
• Data that can be separated by a single decision boundary is easier to classify using logistic
regression.
DECISION TREE
• Complex datasets cannot be classified using a single decision boundary.
• We need to split the dataset again and again to create multiple decision boundaries.
• A decision tree helps us build those boundaries.
DECISION TREE
• An example decision tree.
DECISION TREE
• Humans can supply rules to logical reasoning programs.
• Another way is to give the machines themselves the ability to construct rules.
• The machine is given raw data and is expected to form rules (i.e. a model or concept) about the process from which the data was generated.
DECISION TREES
One of the most widely used learning methods
It induces concepts from examples
The learning is supervised: i.e. the classes or categories of the
data instances are known
It represents concepts as decision trees, a representation that
allows us to determine the classification of an object by
testing its values for certain properties
DECISION TREES
We may think of each property of an instance as
contributing a certain amount of information to its
classification.
For example, if our goal is to determine the species of an
animal, the discovery that it lays eggs contributes a certain
amount of information to that goal
Definition
A decision tree is a classifier in the form of a tree structure:
– Decision node: specifies a test on a single attribute
– Leaf node: indicates the value of the target attribute
– Arc/edge: one outcome (value) of the split on an attribute
– Path: a conjunction of tests that leads to a final decision
Decision trees classify instances or examples by starting
at the root of the tree and moving through it until a leaf
node is reached.
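A minimal sketch in Python of this idea (the class and function names are illustrative, not from the slides): a tree of decision nodes and leaves, and classification by following the branch matching each test from the root to a leaf. The tree shown is the sunburn tree built later in these slides.

class Leaf:
    def __init__(self, label):
        self.label = label          # value of the target attribute

class DecisionNode:
    def __init__(self, attribute, branches):
        self.attribute = attribute  # attribute tested at this node
        self.branches = branches    # dict: attribute value -> child node

def classify(node, instance):
    """Start at the root and follow the branch matching each test until a leaf."""
    while isinstance(node, DecisionNode):
        node = node.branches[instance[node.attribute]]
    return node.label

# The sunburn tree derived later in these slides: test Hair first, then Lotion for blondes.
tree = DecisionNode("Hair", {
    "Blonde": DecisionNode("Lotion", {"No": Leaf("Sunburned"), "Yes": Leaf("None")}),
    "Red": Leaf("Sunburned"),
    "Brown": Leaf("None"),
})

print(classify(tree, {"Hair": "Blonde", "Lotion": "No"}))  # -> Sunburned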
Why decision tree?
Decision trees are powerful and popular
tools for classification and prediction.
Decision trees represent rules, which can
be understood by humans and used in
knowledge systems such as databases.
Key requirements
Attribute-value description: an object or case must
be expressible in terms of a fixed collection of
properties or attributes (e.g., hot, mild, cold).
Predefined classes (target values): the target
function has discrete output values (boolean or
multiclass).
Sufficient data: enough training cases should
be provided to learn the model.
Example – Average Disorder
Name Hair Height Weight Lotion Result
Sarah Blonde Average Light No Sunburned
Dana Blonde Tall Average Yes None
Alex Brown Short Average Yes None
Annie Blonde Short Average No Sunburned
Emily Red Average Heavy No Sunburned
Pete Brown Tall Heavy No None
John Brown Average Heavy No None
Katie Blonde Short Light Yes None
Average Disorder
Average Disorder = Σ_b (N_b / N_t) × ( Σ_c − (N_bc / N_b) log2 (N_bc / N_b) )
where N_b is the number of samples in branch b,
N_t is the total number of samples across all branches,
and N_bc is the number of samples of class c in branch b.
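As a sketch of this formula in Python (the function and variable names are mine, not from the slides; the data is the sunburn table above), the average disorder of every attribute can be computed and compared with the worked numbers that follow:

from math import log2

# Sunburn training data from the table above: (Hair, Height, Weight, Lotion, Result)
data = [
    ("Blonde", "Average", "Light",   "No",  "Sunburned"),  # Sarah
    ("Blonde", "Tall",    "Average", "Yes", "None"),       # Dana
    ("Brown",  "Short",   "Average", "Yes", "None"),       # Alex
    ("Blonde", "Short",   "Average", "No",  "Sunburned"),  # Annie
    ("Red",    "Average", "Heavy",   "No",  "Sunburned"),  # Emily
    ("Brown",  "Tall",    "Heavy",   "No",  "None"),       # Pete
    ("Brown",  "Average", "Heavy",   "No",  "None"),       # John
    ("Blonde", "Short",   "Light",   "Yes", "None"),       # Katie
]
attributes = ["Hair", "Height", "Weight", "Lotion"]

def average_disorder(rows, attr_index):
    """Sum over branches b of (N_b / N_t) times the entropy of the class labels in branch b."""
    total = len(rows)
    branches = {}
    for row in rows:
        branches.setdefault(row[attr_index], []).append(row[-1])  # group class labels by attribute value
    disorder = 0.0
    for labels in branches.values():
        entropy = 0.0
        for c in set(labels):
            p = labels.count(c) / len(labels)
            entropy -= p * log2(p)
        disorder += len(labels) / total * entropy
    return disorder

for i, name in enumerate(attributes):
    print(name, round(average_disorder(data, i), 4))
# Hair 0.5, Height 0.6887, Weight 0.9387, Lotion 0.6069 (matching the slides up to rounding)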
Example (cont.)
Attribute Name   Attribute Values   Attribute Occurrences
Hair             Blonde             4
                 Brown              3
                 Red                1
Example (cont.)
Blonde = 4/8 (-2/4 log2 2/4 -2/4 log2 2/4)
= 4/8 (0.5 + 0.5)
= 0.5
Brown = 3/8 (-3/3 log2 3/3)
= 3/8 (-1 log2 1)
=0
Red = 1/8 (-1 log2 1)
=0
Example (cont.)
Average Disorder (Hair) = Blonde + Brown + Red
                        = 0.5 + 0 + 0
                        = 0.5
Example (cont.)
Similarly, the average disorder for the other
attributes can be calculated; it turns
out to be
Average Disorder (Hair) = 0.5
Average Disorder (Height) = 0.6886
Average Disorder (Weight) = 0.9386
Average Disorder (Lotion) = 0.6067
Example (cont.)
Hair gives the most homogeneous split (the minimum
average disorder), so Hair is the first test. The tree will be:
Hair
├── Blonde
├── Red
└── Brown
Example (cont.)
With red and brown hair color the training set is already
completely classified, so the only remaining problem is
the blonde branch.
Attribute Name                Attribute Values   Attribute Occurrences
Height (with hair = blonde)   Tall               1
                              Average            1
                              Short              2
Example (cont.)
Tall = 1/4 (-1 log2 1)
=0
Average = 1/4 (-1 log2 1)
=0
Short = 2/4 (-1/2 log2 1/2 -1/2 log2 1/2)
= 2/4 (0.5 + 0.5)
= 0.5
Average Disorder (Height with “hair = blonde”) = 0
+ 0 + 0.5 = 0.5
Example (cont.)
Similarly, for the other attributes restricted to hair =
blonde, the average disorder is:
Average Disorder (Height with “hair = blonde”) = 0.5
Average Disorder (Weight with “hair = blonde”) = 1
Average Disorder (Lotion with “hair = blonde”) = 0
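These numbers can be reproduced with a short, self-contained Python sketch (the helper name and the column layout are mine; the four blonde rows come from the table above):

from math import log2

# Blonde rows only, as (Height, Weight, Lotion, Result).
blonde = [("Average", "Light",   "No",  "Sunburned"),   # Sarah
          ("Tall",    "Average", "Yes", "None"),        # Dana
          ("Short",   "Average", "No",  "Sunburned"),   # Annie
          ("Short",   "Light",   "Yes", "None")]        # Katie

def disorder(rows, col):
    """Size-weighted entropy of the class labels in each branch of the split on column col."""
    branches = {}
    for r in rows:
        branches.setdefault(r[col], []).append(r[-1])
    result = 0.0
    for labels in branches.values():
        result += len(labels) / len(rows) * -sum(
            labels.count(c) / len(labels) * log2(labels.count(c) / len(labels))
            for c in set(labels))
    return result

for col, name in enumerate(["Height", "Weight", "Lotion"]):
    print(name, round(disorder(blonde, col), 4))   # Height 0.5, Weight 1.0, Lotion 0.0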
Example (cont.)
Lotion has the minimum average disorder,
so it will be the next test.
Now the tree will become:
Hair
├── Blonde ── Lotion Used
│               ├── No
│               └── Yes
├── Red
└── Brown
Example (cont.)
Hair
├── Blonde ── Lotion Used
│               ├── No  → Sunburned (Sarah, Annie)
│               └── Yes → None (Dana, Katie)
├── Red   → Sunburned (Emily)
└── Brown → None (Alex, Pete, John)
Entropy
A measure of the (im)purity, or homogeneity, of a set of examples.
Given a set S of positive and negative examples of
some target concept (a 2-class problem), the entropy
of set S relative to this binary classification is
E(S) = -p(P) log2 p(P) - p(N) log2 p(N)
Entropy
Suppose S has 25 examples, 15 positive and 10
negative [15+, 10-]. Then the entropy of S relative to
this classification is
E(S) = -(15/25) log2 (15/25) - (10/25) log2 (10/25) ≈ 0.971
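A quick numerical check of this value with plain Python (only the numbers come from the slide):

from math import log2
p_pos, p_neg = 15/25, 10/25
print(-p_pos * log2(p_pos) - p_neg * log2(p_neg))   # ≈ 0.971 bits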
Some Intuitions
The entropy is 0 if the outcome is "certain".
The entropy is maximum if we have no knowledge of the system
(i.e. any outcome is equally possible).
(Figure: entropy of a 2-class problem with regard to the proportion of one of the two groups.)
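These intuitions can be checked directly; a small sketch (the helper name is mine) evaluating the 2-class entropy at a certain outcome and at the 50/50 point:

from math import log2

def entropy2(p):
    """Entropy of a 2-class set in which one class has proportion p."""
    if p in (0.0, 1.0):
        return 0.0                    # the outcome is certain
    return -p * log2(p) - (1 - p) * log2(1 - p)

print(entropy2(1.0))   # 0.0 -> certain outcome, zero entropy
print(entropy2(0.5))   # 1.0 -> equally likely outcomes, maximum entropy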
Information Gain
Information gain measures the expected reduction in
entropy, or uncertainty.
Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) Entropy(S_v)
Values(A) is the set of all possible values for attribute A, and
S_v is the subset of S for which attribute A has value v, i.e. S_v = {s in S | A(s) = v}.
The first term in the equation for Gain is just the entropy of the
original collection S.
The second term is the expected value of the entropy after S is
partitioned using attribute A.
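A sketch of this computation for the sunburn data (the helper name and the inlined class labels are mine; the second term is exactly the average disorder computed earlier):

from math import log2

def entropy(labels):
    return -sum(labels.count(c) / len(labels) * log2(labels.count(c) / len(labels))
                for c in set(labels))

# Class labels of the full sunburn set S, and the subsets S_v induced by Hair.
S = ["Sunburned"] * 3 + ["None"] * 5
hair_subsets = {"Blonde": ["Sunburned", "None", "Sunburned", "None"],
                "Red": ["Sunburned"],
                "Brown": ["None", "None", "None"]}

gain = entropy(S) - sum(len(Sv) / len(S) * entropy(Sv) for Sv in hair_subsets.values())
print(round(gain, 4))   # Entropy(S) ≈ 0.9544, expected entropy after the split = 0.5, Gain ≈ 0.4544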
Information Gain
It is simply the expected reduction in
entropy caused by partitioning the
examples according to this attribute.
It is the number of bits saved when
encoding the target value of an arbitrary
member of S, by knowing the value of
attribute A.
Example – Entropy/ I.G.
Gini Index - Example
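The worked Gini example was presented graphically on the original slides. As a hedged sketch of the standard computation (Gini impurity of a set is 1 − Σ_c p_c², and a split is scored by the size-weighted Gini of its branches; this definition is standard, not taken from the slides), applied to the Hair split of the sunburn data:

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum((labels.count(c) / len(labels)) ** 2 for c in set(labels))

# Class labels in each branch of the Hair split of the sunburn data.
branches = {"Blonde": ["Sunburned", "None", "Sunburned", "None"],
            "Red": ["Sunburned"],
            "Brown": ["None", "None", "None"]}

total = sum(len(b) for b in branches.values())
weighted = sum(len(b) / total * gini(b) for b in branches.values())
print(round(weighted, 4))   # 0.25 - only the mixed Blonde branch (4/8 * 0.5) contributes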
From Decision Trees to Rules
DECISION TREES
From Decision Trees to Rules
Hair
├── Blonde ── Lotion Used
│               ├── No  → Sunburned (Sarah, Annie)
│               └── Yes → None (Dana, Katie)
├── Red   → Sunburned (Emily)
└── Brown → None (Alex, Pete, John)
IDENTIFICATION TREES
From Decision Trees to Rules
Step 3: Make rules from the identification tree
For our example we have:
If the person’s hair is blonde
and the person uses lotion
then the person is not sunburned
If the person’s hair is blonde
and the person uses no lotion
then the person is sunburned
IDENTIFICATION TREES
From Decision Trees to Rules
Step 3: Make rules from the identification tree
For our example we have:
If the person’s hair is red
then the person is sunburned
If the person’s hair is brown
then the person is not sunburned
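Putting the four rules together, they could be written as a single function (a sketch; the function form and names are mine, the rules themselves come from the tree):

def diagnose(hair, lotion):
    """Rules read off the identification tree, before pruning."""
    if hair == "Blonde" and lotion == "Yes":
        return "not sunburned"
    if hair == "Blonde" and lotion == "No":
        return "sunburned"
    if hair == "Red":
        return "sunburned"
    if hair == "Brown":
        return "not sunburned"

print(diagnose("Blonde", "No"))   # sunburned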
IDENTIFICATION TREES
From Decision Trees to Rules
Step 4: Optimize the rules (prune the antecedents)
To simplify a rule, you ask whether any of the antecedents
can be eliminated without changing what the rule does on the
samples
Example:
If hair is blonde and person uses lotion then no sunburn
If we eliminate the 1st antecedent, and check the rule over the
whole database, we find that there are no misclassifications
Hence we can drop this antecedent as unnecessary
IDENTIFICATION TREES
From Decision Trees to Rules
Step 4: Optimize the rules (prune the antecedents)
Example:
If hair is blonde and person uses lotion then no sunburn
If we eliminate the 2nd antecedent, the resulting
shortened rule is not consistent with the data, so this
antecedent cannot be eliminated
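A sketch of this consistency check over the eight training cases (the tuples restate the Hair, Lotion, and Result columns of the table; the code itself is mine):

# (hair, lotion, result) for the eight samples; the other attributes are not used by these rules.
cases = [("Blonde", "No", "Sunburned"), ("Blonde", "Yes", "None"),
         ("Brown", "Yes", "None"), ("Blonde", "No", "Sunburned"),
         ("Red", "No", "Sunburned"), ("Brown", "No", "None"),
         ("Brown", "No", "None"), ("Blonde", "Yes", "None")]

# Drop the 1st antecedent: "if lotion is used then no sunburn" - still consistent with the data.
print(all(result == "None" for hair, lotion, result in cases if lotion == "Yes"))    # True

# Drop the 2nd antecedent: "if hair is blonde then no sunburn" - misclassifies Sarah and Annie.
print(all(result == "None" for hair, lotion, result in cases if hair == "Blonde"))   # False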
IDENTIFICATION TREES
From Decision Trees to Rules
Step 4: Optimize the rules (eliminate unnecessary rules)
Rules leading to one label can be replaced by a default rule:
If no other rule applies
then label x
However, this makes a very strong assumption: that all of the
uncovered concept space belongs to label x
This may lead to misclassification of unknown instances
Strengths
can generate understandable rules
perform classification without much computation
can handle continuous and categorical variables
provide a clear indication of which fields are most
important for prediction or classification
Weaknesses
Not suitable for predicting continuous attributes.
Perform poorly with many classes and small data.
Computationally expensive to train.
At each node, each candidate splitting field must be sorted
before its best split can be found.
In some algorithms, combinations of fields are used and a
search must be made for optimal combining weights.
Pruning algorithms can also be expensive since many
candidate sub-trees must be formed and compared.
Do not handle non-rectangular regions well.