Lecture 14
Decision Trees
KASHIF JAVED
EED, UET, Lahore
Readings:
▪ https://people.eecs.berkeley.edu/~jrs/189/
▪ Chapter 3 of Tom Mitchell, “Machine Learning”, McGraw Hill, 1997
▪ Luis G. Serrano, “Grokking Machine Learning”, 2021
Human Reasoning
• Decision tree learning very much resembles human reasoning.
• Consider the scenario: We want to decide whether we should wear a jacket today.
• The decision process looks like:
  ▪ Look outside and check if it’s raining
    − If it’s raining
      • then wear a jacket
    − If it’s not
      • then we may check the temperature
      • If it is hot, then don’t wear a jacket
      • If it is cold, then wear a jacket
Human Reasoning
• The decision process can be represented as a tree.
• The decisions are made by traversing the tree from top to bottom.
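As a minimal sketch, this jacket decision can be written directly as nested conditionals and traversed top to bottom; the function name, the argument names, and the 25 °C “hot” threshold are illustrative assumptions, not part of the slides.

```python
def wear_jacket(is_raining: bool, temperature_c: float) -> bool:
    """Traverse the jacket decision tree from the root down to a leaf."""
    if is_raining:               # root test: look outside
        return True              # raining -> wear a jacket
    # not raining -> check the temperature
    if temperature_c >= 25:      # "hot" cutoff; this threshold is an assumption
        return False             # hot -> don't wear a jacket
    return True                  # cold -> wear a jacket

print(wear_jacket(is_raining=False, temperature_c=30.0))  # False
print(wear_jacket(is_raining=True, temperature_c=30.0))   # True
```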
DT Terminology
Decision Trees
• Nonlinear method for classification and regression.
• Uses a tree with 2 node types:
  – internal nodes test feature values (usually just one) & branch accordingly
  – leaf nodes specify the class ℎ(𝑥)
Decision Trees
Deciding whether to go out for a picnic.
Decision Trees
• Cuts x-space into rectangular cells
• Works well with both categorical (e.g., outlook) and quantitative features (e.g., humidity)
• Interpretable result (inference)
• Decision boundary can be arbitrarily complicated
Decision Trees
Comparison of linear classifiers vs. decision trees on two examples: a linearly separable dataset and a non-linearly separable dataset.
Decision Trees
• Consider classification first.
• Greedy, top-down learning heuristic:
• This algorithm is more or less obvious and has been rediscovered many times. It’s naturally recursive.
• I’ll show how it works for classification first; later I’ll talk about how it works for regression.
Greedy Algorithms
• At every point, the algorithm makes the best possible available move.
• Greedy algorithms tend to work well, but there is no guarantee that making the best possible move at each timestep gets you to the best overall outcome.
• The algorithm never backtracks to reconsider earlier choices.
The Basic Algorithm
• Evaluate each attribute using a statistical test to determine how well it alone classifies the training examples
• Select the best attribute and use it as the test at the root node of the tree
• Create a descendant for each possible value of this attribute
• Sort the training examples to the appropriate descendant node
• Repeat the entire process using the training examples associated with each descendant node
Decision Trees
• Let 𝑆 ⊆ {1, 2, . . . , 𝑛} be a set of sample point indices
• Top-level call: 𝑆 = {1, 2, . . . , 𝑛}
Decision Trees
GrowTree(S)
  if (𝑦𝑖 = C for all i ∈ S and some class C) then {
    return new leaf(C) [We say the leaves are pure]
  } else {
    choose best splitting feature j and splitting value 𝛽 (*)
    𝑆𝑙 = {𝑖 ∈ 𝑆 ∶ 𝑋𝑖𝑗 < 𝛽} [Or you could use ≤ and >]
    𝑆𝑟 = {𝑖 ∈ 𝑆 ∶ 𝑋𝑖𝑗 ≥ 𝛽}
    return new node(j, 𝛽, GrowTree(𝑆𝑙), GrowTree(𝑆𝑟))
  }
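A possible Python sketch of GrowTree, assuming the data live in NumPy arrays X (n × d) and y (labels), and that a choose_split(S, X, y) helper, the step marked (*) and discussed below, returns the best feature j and splitting value β. The Leaf/Node classes and function names are assumptions for illustration, not a prescribed implementation.

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class Leaf:
    label: object                 # the class C stored at a pure leaf

@dataclass
class Node:
    feature: int                  # splitting feature j
    beta: float                   # splitting value
    left: object                  # subtree for X[i, j] <  beta
    right: object                 # subtree for X[i, j] >= beta

def grow_tree(S: np.ndarray, X: np.ndarray, y: np.ndarray,
              choose_split: Callable) -> object:
    """S holds sample indices; the top-level call uses S = np.arange(n)."""
    classes = np.unique(y[S])
    if classes.size == 1:                         # all labels equal: pure leaf
        return Leaf(label=classes[0])
    j, beta = choose_split(S, X, y)               # step (*): best j and beta
    Sl = S[X[S, j] < beta]                        # left child indices
    Sr = S[X[S, j] >= beta]                       # right child indices
    if Sl.size == 0 or Sr.size == 0:              # no useful split: majority leaf
        values, counts = np.unique(y[S], return_counts=True)
        return Leaf(label=values[np.argmax(counts)])
    return Node(j, beta,
                grow_tree(Sl, X, y, choose_split),
                grow_tree(Sr, X, y, choose_split))

# Top-level call (sketch): tree = grow_tree(np.arange(len(y)), X, y, choose_split)
```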
How to Choose Best Split?
Is this a good attribute to split on? Which one should we pick?
Which attribute/split made more progress in helping us classify points correctly?
How to Choose Best Split?
• Try all splits.
• For a set 𝑆, let 𝐽(𝑆) be the cost of 𝑆.
• Choose the split that minimizes 𝐽(𝑆𝑙) + 𝐽(𝑆𝑟); or the split that minimizes the weighted average
  (|𝑆𝑙| 𝐽(𝑆𝑙) + |𝑆𝑟| 𝐽(𝑆𝑟)) / (|𝑆𝑙| + |𝑆𝑟|)
• We use vertical brackets | · | to denote set cardinality.
Decision Trees
• How to choose cost 𝐽(𝑆)?
• I’m going to start by suggesting a mediocre cost function, so you can see why it’s mediocre.
• Idea 1 (bad): Label 𝑆 with the class 𝐶 that labels the most points in 𝑆.
  𝐽(𝑆) ← # of points in 𝑆 not in class 𝐶
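A small sketch tying the weighted-average criterion to this Idea 1 cost; the function names are assumptions, and the cost function J is passed in as a parameter so the same helper also works with the entropy-based cost introduced shortly.

```python
from collections import Counter
from typing import Callable, Sequence

def misclass_cost(labels: Sequence) -> int:
    """Idea 1: number of points not in the majority class of the subset."""
    return len(labels) - max(Counter(labels).values())

def weighted_cost(y_left: Sequence, y_right: Sequence,
                  J: Callable[[Sequence], float]) -> float:
    """Weighted average (|S_l| J(S_l) + |S_r| J(S_r)) / (|S_l| + |S_r|)."""
    n_l, n_r = len(y_left), len(y_right)
    return (n_l * J(y_left) + n_r * J(y_right)) / (n_l + n_r)

# Example: left child has labels C, C, C, D; right child has D, D.
print(weighted_cost(['C', 'C', 'C', 'D'], ['D', 'D'], misclass_cost))  # 0.666...
```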
Decision Trees
Problem: 𝐽(𝑆𝑙) + 𝐽(𝑆𝑟) = 10 for both splits, but the left split is much better. The weighted average prefers the right split! There are many different splits that all have the same total cost. We want a cost function that better distinguishes between them.
How to Choose Best Split?
A better split is the one that splits the data into purer subsets.
How to Choose Best Split?
A perfect attribute would ideally divide the examples into subsets that are all positive or all negative.
Decision Trees
• Idea 2 (good): Measure the entropy. [An idea from information theory.]
• Let 𝑌 be a random class variable and suppose 𝑃(𝑌 = 𝐶) = 𝑝𝐶
• The surprise of 𝑌 being class 𝐶 is −log₂ 𝑝𝐶. [Always nonnegative.]
  – event w/ prob. 1 gives us zero surprise
  – event w/ prob. 0 gives us infinite surprise!
Decision Trees
• The entropy of an index set 𝑆 is the average surprise. (It characterizes the (im)purity of an arbitrary collection of examples.)
  𝐻(𝑆) = − Σ𝐶 𝑝𝐶 log₂ 𝑝𝐶
  𝑝𝐶 = |{𝑖 ∈ 𝑆 : 𝑦𝑖 = 𝐶}| / |𝑆|
• 𝑝𝐶 is the proportion of points in 𝑆 that are in class 𝐶
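A direct translation of this definition into Python; the labels are assumed to be hashable values in a sequence, and the function name is an assumption for illustration.

```python
import math
from collections import Counter
from typing import Sequence

def entropy(labels: Sequence) -> float:
    """H(S) = -sum over classes C of p_C * log2(p_C)."""
    n = len(labels)
    counts = Counter(labels)                 # class frequencies in S
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(entropy(['C', 'C', 'D', 'D']))         # 1.0 (half C, half D)
```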
Decision Trees
• If all points in 𝑆 belong to the same class? 𝐻(𝑆) = −1 log₂ 1 = 0.
• Half class 𝐶, half class 𝐷? 𝐻(𝑆) = −0.5 log₂ 0.5 − 0.5 log₂ 0.5 = 1
• 𝑛 points, all different classes? 𝐻(𝑆) = −log₂(1/𝑛) = log₂ 𝑛
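These three special cases can be checked directly from the definition, for example:

```python
import math

# All points in one class: p_C = 1.
print(-1 * math.log2(1))                                   # -0.0 (i.e., zero)

# Half class C, half class D: p_C = p_D = 0.5.
print(-0.5 * math.log2(0.5) - 0.5 * math.log2(0.5))        # 1.0

# n points, all in different classes: H(S) = -log2(1/n) = log2 n.
n = 8
print(-math.log2(1 / n), math.log2(n))                     # 3.0 3.0
```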
Decision Trees
Plot of the entropy 𝐻(𝑝𝐶) when there are only two classes. The probability of the second class is 𝑝𝐷 = 1 − 𝑝𝐶, so the entropy can be plotted as a function of 𝑝𝐶 alone.
Decision Trees
If you have > 2 classes, you would need a multidimensional chart to plot the entropy, but the entropy is still strictly concave.
Decision Trees
• Weighted avg entropy after the split is
  𝐻𝑎𝑓𝑡𝑒𝑟 = (|𝑆𝑙| 𝐻(𝑆𝑙) + |𝑆𝑟| 𝐻(𝑆𝑟)) / (|𝑆𝑙| + |𝑆𝑟|)
• Gives us the remaining uncertainty after getting info on an attribute
• Choose the attribute/split that minimizes 𝐻𝑎𝑓𝑡𝑒𝑟
Information Gain
• Information gain – expected reduction in entropy caused by partitioning the examples according to an attribute
• Choose the split that maximizes information gain: 𝐻(𝑆) − 𝐻𝑎𝑓𝑡𝑒𝑟
• Same as minimizing 𝐻𝑎𝑓𝑡𝑒𝑟
• Information gain can never be negative
• Information gain – measures how well a given attribute separates the training examples according to their target classification
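A sketch that puts the last two slides together, computing 𝐻𝑎𝑓𝑡𝑒𝑟 for a candidate split and the resulting information gain; the entropy helper from the earlier sketch is repeated so this block stands alone, and the example split is made up for illustration.

```python
import math
from collections import Counter
from typing import Sequence

def entropy(labels: Sequence) -> float:
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def h_after(y_left: Sequence, y_right: Sequence) -> float:
    """Weighted average entropy of the two children."""
    n_l, n_r = len(y_left), len(y_right)
    return (n_l * entropy(y_left) + n_r * entropy(y_right)) / (n_l + n_r)

def info_gain(y_parent: Sequence, y_left: Sequence, y_right: Sequence) -> float:
    """H(S) - H_after: maximizing this is the same as minimizing H_after."""
    return entropy(y_parent) - h_after(y_left, y_right)

parent = ['C'] * 5 + ['D'] * 5
print(info_gain(parent, ['C'] * 5, ['D'] * 5))           # 1.0 (perfect split)
print(info_gain(parent, ['C', 'C', 'D', 'D', 'D'],
                ['C', 'C', 'C', 'D', 'D']))              # ~0.03 (poor split)
```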
Example
Information Gain
• Info gain is always positive, except that it is zero
  ▪ when one child is empty, or
  ▪ when for all 𝐶, 𝑃(𝑦𝑖 = 𝐶 | 𝑖 ∈ 𝑆𝑙) = 𝑃(𝑦𝑖 = 𝐶 | 𝑖 ∈ 𝑆𝑟).
Another Example
Calculate Entropy
Calculate Info Gain
Select the root node
Create branches below the root for each of its possible values
Repeat the process for each nonterminal descendant node
Decision Trees
• Suppose we pick two points on the entropy curve
• One represents the left child and the other represents the right child
• The parent also has entropy on the curve
Decision Trees
• If you unite the two sets into one parent set, the parent set’s value 𝑝𝐶 is the weighted average of the children’s 𝑝𝐶’s.
• Therefore, the point directly above that point on the curve represents the parent’s entropy.
Decision Trees
• Now draw a line segment connecting them.
• Because the entropy curve is strictly concave, the interior of the line segment is strictly below the curve.
• Any point on that segment represents a weighted average of the two entropies for suitable weights.
Decision Trees
• The information gain is the vertical distance between them.
• So the information gain is positive unless the two child sets both have exactly the same 𝑝𝐶 and lie at the same point on the curve.
Decision Trees
• Now, contrast the entropy curve against a naïve curve: a plot of the % misclassified.
• If we draw a line segment connecting two points on the curve, the segment might lie entirely on the curve.
Decision Trees
• The problem is that many different splits will get the same weighted average cost.
• This test doesn’t distinguish the quality of different splits well.
Alternative Measures for Selecting Attributes
• A natural bias in the information gain measure is that it favors attributes with many values over those with few values.
• Consider Day as an attribute:
  ▪ It has a very large number of possible values.
  ▪ It will have the highest IG value, as it alone perfectly predicts the target attribute over the training data.
  ▪ It will be selected for the root node and will lead to a (quite broad) tree of depth one, which perfectly classifies the training data.
Alternative Measures for Selecting Attributes
• We need to penalize attributes such as Day.
• Split information is sensitive to how broadly and uniformly the attribute splits the data:
  𝑆𝑝𝑙𝑖𝑡𝐼𝑛𝑓𝑜𝑟𝑚𝑎𝑡𝑖𝑜𝑛 = − Σ𝑖=1..𝑐 (|𝑆𝑖|/|𝑆|) log₂(|𝑆𝑖|/|𝑆|)
• where 𝑆1 through 𝑆𝑐 are the 𝑐 subsets of examples resulting from partitioning 𝑆 by the 𝑐-valued attribute
Gain Ratio
• The Gain Ratio measure is defined in terms of the Gain measure and Split Information, as follows:
  𝐺𝑎𝑖𝑛 𝑅𝑎𝑡𝑖𝑜 = 𝐼𝑛𝑓𝑜 𝐺𝑎𝑖𝑛 / 𝑆𝑝𝑙𝑖𝑡 𝐼𝑛𝑓𝑜𝑟𝑚𝑎𝑡𝑖𝑜𝑛
• The Split Information term discourages the selection of attributes with many uniformly distributed values.
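A sketch of both quantities for a c-valued categorical attribute; the attribute values are assumed to be given as one value per training example, and the names are illustrative. Note that the ratio is undefined when Split Information is zero (an attribute with a single value).

```python
import math
from collections import Counter
from typing import Sequence

def split_information(attribute_values: Sequence) -> float:
    """-sum over the c subsets S_i of (|S_i|/|S|) * log2(|S_i|/|S|)."""
    n = len(attribute_values)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(attribute_values).values())

def gain_ratio(info_gain: float, attribute_values: Sequence) -> float:
    """Gain Ratio = Info Gain / Split Information."""
    return info_gain / split_information(attribute_values)

# An attribute like Day, unique for every one of 14 examples, has split
# information log2(14) ~ 3.81, so its gain ratio is heavily penalized.
days = [f'D{i}' for i in range(1, 15)]
print(split_information(days))          # 3.807...
```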
Incorporating Continuous-Valued Attributes
• More on choosing a split:
  – For a binary feature 𝑥𝑖: children are 𝑥𝑖 = 0 & 𝑥𝑖 = 1
  – If 𝑥𝑖 has 3+ discrete values: the split depends on the application
    ▪ Sometimes it makes sense to use multiway splits; sometimes binary splits.
  – If 𝑥𝑖 is quantitative (continuous): sort the 𝑥𝑖 values in 𝑆; try splitting between each pair of unequal consecutive values
    ▪ We can radix sort the points in linear time, and if 𝑛 is huge we should.
Incorporating Continuous-Valued Attributes
• Clever bit: As you scan the sorted list from left to right, you can update the entropy in 𝑂(1) time per point!
• This is important for obtaining a fast tree-building time.
• Draw a row of 𝐶’s and 𝑋’s; show how we update the # of 𝐶’s and # of 𝑋’s in each of 𝑆𝑙 and 𝑆𝑟 as we scan from left to right.
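A sketch of that scan for a two-class problem with classes labeled 'C' and 'X' as on the slides: sort the points by the feature once, then move the split point from left to right, shifting one point at a time from 𝑆𝑟 to 𝑆𝑙; only the four class counts are updated per point (O(1)), and the weighted entropy is recomputed from them. The function and variable names are assumptions.

```python
import math

def entropy2(a: int, b: int) -> float:
    """Entropy of a subset containing a points of one class and b of the other."""
    n = a + b
    h = 0.0
    for c in (a, b):
        if c > 0:
            h -= (c / n) * math.log2(c / n)
    return h

def best_threshold(values, labels):
    """Return (beta, H_after) for the best split of one quantitative feature."""
    pairs = sorted(zip(values, labels))          # sort once by feature value
    n = len(pairs)
    cl = xl = 0                                  # counts of 'C' and 'X' in S_l
    cr = sum(1 for _, y in pairs if y == 'C')    # counts of 'C' and 'X' in S_r
    xr = n - cr
    best_beta, best_h = None, float('inf')
    for i in range(n - 1):
        # move point i from S_r to S_l: an O(1) update of the four counts
        if pairs[i][1] == 'C':
            cl, cr = cl + 1, cr - 1
        else:
            xl, xr = xl + 1, xr - 1
        if pairs[i][0] == pairs[i + 1][0]:
            continue                             # only split between unequal values
        h_after = ((i + 1) * entropy2(cl, xl)
                   + (n - i - 1) * entropy2(cr, xr)) / n
        if h_after < best_h:
            best_beta = (pairs[i][0] + pairs[i + 1][0]) / 2
            best_h = h_after
    return best_beta, best_h

print(best_threshold([1, 2, 3, 4], ['C', 'C', 'X', 'X']))   # (2.5, 0.0)
```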
Incorporating Continuous-Valued Attributes
We need these 4 numbers (the counts of each class in 𝑆𝑙 and 𝑆𝑟) to compute the entropy.
Time Complexity
• Algs & running times:
• Classify test point: Walk down the tree until a leaf. Return its label. Worst-case time is 𝑂(𝑡𝑟𝑒𝑒 𝑑𝑒𝑝𝑡ℎ).
  ▪ For binary features, that’s ≤ 𝑑. (Quantitative features may go deeper.) Usually (not always) ≤ 𝑂(log 𝑛).
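A sketch of that walk, assuming the Leaf/Node representation from the GrowTree sketch earlier (repeated here so the block is self-contained); the depth-1 example tree is made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    label: object

@dataclass
class Node:
    feature: int        # index j of the feature tested at this node
    beta: float         # go left if x[j] < beta, otherwise go right
    left: object
    right: object

def classify(tree, x):
    """Walk down the tree until a leaf and return its label: O(tree depth)."""
    while isinstance(tree, Node):
        tree = tree.left if x[tree.feature] < tree.beta else tree.right
    return tree.label

tree = Node(0, 2.5, Leaf('C'), Leaf('X'))    # depth-1 tree testing feature 0
print(classify(tree, [1.0]))                 # 'C'
print(classify(tree, [4.0]))                 # 'X'
```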
Time Complexity
• Training:
  ▪ For binary features, try 𝑂(𝑑) splits at each node.
  ▪ For quantitative features, try 𝑂(𝑛′𝑑) splits; 𝑛′ = number of points in the node.
  ▪ Either way ⇒ 𝑂(𝑛′𝑑) time at this node.
  ▪ Each point participates in 𝑂(𝑑𝑒𝑝𝑡ℎ) nodes and costs 𝑂(𝑑) time in each node. Running time ≤ 𝑂(𝑛𝑑 · 𝑑𝑒𝑝𝑡ℎ).