UNIT 3
MACHINE LEARNING
TREE MODELS
• Feature Tree:
A compact way of representing a number of
conjunctive concepts in the hypothesis space.
• Tree:
1. Internal nodes are features.
2. Edges are labelled with literals.
3. Split: the set of literals at a node.
4. Leaf: labelled with the logical expression formed by the conjunction
of the literals on the path from the root to that leaf.
TREE MODELS
• Generic algorithm
Three functions:
1. Homogeneous(D): returns true if all instances in D belong
to a single class (true/false).
2. Label(D): returns the class label to assign to D (the majority class).
3. BestSplit(D,F): returns the feature from F on which the dataset D
is best split (into two or more subsets).
TREE MODELS
• Divide-and-conquer algorithm:
divides the data into subsets, builds a tree for each of
those, and then combines the subtrees into a
single tree.
• Greedy:
whenever there is a choice (such as choosing the
best split), the best alternative is selected on the basis
of the information then available, and this choice is
never reconsidered.
• Backtracking search algorithm:
can return an optimal tree, at the expense
of increased computation time and memory
requirements.
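A minimal sketch of the generic greedy divide-and-conquer tree learner built from the three functions above; the (instance, class)-pair data representation and the signature of best_split are assumptions made here for illustration, not part of the slides.

from collections import Counter

def homogeneous(D):
    # D is a list of (instance, class) pairs; true if only one class occurs
    return len({c for _, c in D}) <= 1

def majority_label(D):
    # label to put in a leaf: the majority class of D
    return Counter(c for _, c in D).most_common(1)[0][0]

def grow_tree(D, features, best_split):
    # generic divide-and-conquer learner; best_split(D, features) is assumed
    # to return the feature to split on, and instances are dicts
    if homogeneous(D) or not features:
        return ("leaf", majority_label(D))
    f = best_split(D, features)
    children = {}
    for v in {x[f] for x, _ in D}:                 # one child per observed value
        Dv = [(x, c) for x, c in D if x[f] == v]   # divide ...
        children[v] = grow_tree(Dv, [g for g in features if g != f], best_split)  # ... and conquer
    return ("node", f, children)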
DECISION TREES
• Classification task on a dataset D:
1. Homogeneous(D): label the leaf with the single class occurring in D.
2. Non-homogeneous D: label the leaf with the majority class of D.
3. Empty child (Di = Ø after splitting D into D1, D2, ...): label it
with the majority class of the parent D.
DECISION TREES
• The ideal split sends all positives one way and all negatives the other:
D1+ = D+ and D1- = Ø, D2- = D- and D2+ = Ø,
i.e. both children are pure.
• Impurity: depends only on the relative magnitude of n+ and n-,
• so it can be measured through the proportion ṗ = n+ / (n+ + n-), the empirical
probability of the positive class.
• Aim: we need a function of ṗ that returns
0 if ṗ = 0 or ṗ = 1 (a pure set), and
its maximum value at ṗ = ½.
FUNCTIONS
1. MINORITY CLASS (error rate)
2. GINI INDEX (expected error rate)
3. ENTROPY (expected information)
MINORITY CLASS
• min(ṗ, 1-ṗ): the error rate obtained when every instance in the leaf
is labelled with the majority class.
• The minority class is proportional to the number of misclassified examples.
• Example: spam = 40 (majority class), ham = 10 (minority class, misclassified);
ṗ = 40/50 = 0.8, so the error rate is min(0.8, 0.2) = 0.2.
• A pure set has no minority class and hence no errors.
• Written as an impurity function: min(ṗ, 1-ṗ) = ½ - |ṗ - ½|.
GINI INDEX
• The expected error rate when labels are assigned to instances at random,
with probability ṗ for the positive class and 1-ṗ for the negative class.
• Probability of a false negative: ṗ (1-ṗ) (a positive labelled negative).
• Probability of a false positive: (1-ṗ) ṗ (a negative labelled positive).
• Expected error rate: 2 ṗ (1-ṗ).
ENTROPY
• The expected information, in bits.
• Formula: -ṗ log2 ṗ - (1-ṗ) log2 (1-ṗ)
(Figure: the impurity measures for decision trees plotted as functions of ṗ; the entropy and Gini index curves.)
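A short Python sketch of the three binary impurity measures above as functions of the empirical probability ṗ; the function names are chosen here for illustration.

from math import log2

def minority_class(p):
    # error rate of labelling with the majority class
    return min(p, 1 - p)

def gini_index(p):
    # expected error rate of random labelling: p(1-p) + (1-p)p
    return 2 * p * (1 - p)

def entropy(p):
    # expected information in bits; 0*log2(0) is taken to be 0
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

# all three are 0 for a pure set and maximal at p = 0.5
for p in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(p, minority_class(p), gini_index(p), round(entropy(p), 3))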
Decision Tree
• K > 2 classes: either use a one-vs-rest scheme (each class against the
union of the rest), or generalise the impurity measures directly:
• K-class entropy = - Σ_{i=1..K} p_i log2 p_i
• K-class Gini index = Σ_{i=1..K} p_i (1 - p_i)
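These K-class measures generalise the binary sketch above; a brief illustration, assuming a vector of class proportions that sums to 1:

from math import log2

def k_class_entropy(ps):
    return -sum(p * log2(p) for p in ps if p > 0)

def k_class_gini(ps):
    return sum(p * (1 - p) for p in ps)

print(k_class_entropy([0.5, 0.25, 0.25]))  # 1.5 bits
print(k_class_gini([0.5, 0.25, 0.25]))     # 0.625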
RANKING AND PROBABILITY ESTIMATION
• Grouping classifiers divide the instance space into segments
(for a tree, the segments are its leaves).
• They can be turned into rankers by learning an ordering on those segments.
• Decision trees have access to the local class distribution in each leaf,
which can be used directly to construct the leaf ordering in an optimal way.
• Using the empirical probabilities it is easy to calculate the leaf ordering:
• give the highest rank to the leaves with the highest proportion of positives.
• On the training data this ordering yields a convex ROC curve.
• The empirical probability of a parent is a weighted average of the
empirical probabilities of its children; but this only tells us that
ṗ1 ≤ ṗ ≤ ṗ2 or ṗ2 ≤ ṗ ≤ ṗ1.
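In symbols, with n1 and n2 the numbers of training instances in the two children, the quoted property is:

\[
\dot{p} \;=\; \frac{n_1}{n_1+n_2}\,\dot{p}_1 \;+\; \frac{n_2}{n_1+n_2}\,\dot{p}_2,
\qquad\text{hence}\qquad
\min(\dot{p}_1,\dot{p}_2) \;\le\; \dot{p} \;\le\; \max(\dot{p}_1,\dot{p}_2).
\]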
• Consider a feature tree whose leaves are not yet labelled.
• Question: in how many ways can the leaves be labelled, and how do
these labellings perform?
• Assume we know the number of positives and negatives covered by each leaf.
• With L leaves and C classes there are CL (C to the power L) ways to label the leaves.
• Example: two classes and four leaves give 2^4 = 16 labellings.
• The resulting points in coverage space show a symmetry property:
• complementary labellings such as +-+- and -+-+ give symmetrically
placed points (swapping the two classes mirrors the point).
• The labellings on the upper-left path through the corners of the
coverage curve contain the optimal labelling for any operating condition:
• ----, --+-, +-+-, +-++, ++++
• Ordering rather than labelling the leaves: with L leaves there are
L! possible orderings (permutations).
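A tiny sketch enumerating the 2^4 = 16 labellings of four leaves mentioned above (binary classes, leaf order fixed):

from itertools import product

labellings = [''.join(t) for t in product('+-', repeat=4)]
print(len(labellings))                              # 16 = 2**4
print('+-+-' in labellings, '-+-+' in labellings)   # the complementary pair from the slide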
• A feature tree can be turned into
-- a ranker: order the leaves in descending order of their
empirical probabilities;
-- a probability estimator: predict the empirical probability in each
leaf, possibly smoothed with the Laplace or m-estimate correction;
-- a classifier: given the operating conditions (class and cost
distribution), find the operating point on the ROC curve that fits
those conditions and label the leaves accordingly.
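A small sketch of the first two conversions, assuming each leaf is summarised by its counts of positive and negative training instances; the counts below are invented for illustration:

# hypothetical leaf counts: (positives, negatives) per leaf
leaves = {"L1": (20, 5), "L2": (10, 10), "L3": (3, 12), "L4": (0, 8)}

def empirical(pos, neg):
    return pos / (pos + neg)

def laplace(pos, neg):
    # Laplace correction: one pseudo-count per class
    return (pos + 1) / (pos + neg + 2)

# ranker: leaves in descending order of empirical probability
print(sorted(leaves, key=lambda l: empirical(*leaves[l]), reverse=True))

# probability estimator: smoothed probability per leaf
print({l: round(laplace(*leaves[l]), 3) for l in leaves})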
• Example: suppose the optimal labelling under the given operating
conditions is +-++.
• Then only the second leaf is used to filter out negatives.
• In other words, the right two leaves can be merged into
one: their parent.
• The operation of merging all leaves in a subtree is called
pruning the subtree.
• The advantage of pruning is that we can simplify the
tree without affecting the chosen operating point,
which is sometimes useful if we want to communicate
the tree model to somebody else.
• The disadvantage is that we lose ranking performance.
Sensitivity to Skewed Class Distribution
• Gini index of the parent: 2 (n+ / n)(n- / n).
• Size-weighted Gini index of a child with n1 = n1+ + n1- instances:
(n1 / n) · 2 (n1+ / n1)(n1- / n1).
• Using √Gini as the impurity measure instead, the relative impurity
(weighted child impurity divided by parent impurity) becomes
sqrt( (n1+ · n1-) / (n+ · n-) ),
which is unaffected when the class distribution is skewed (e.g. every
negative count multiplied by the same factor); plain Gini and entropy
are sensitive to such skew.
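A small numerical check of this insensitivity claim; the counts are invented for illustration:

from math import sqrt

def rel_impurity_gini(n1p, n1n, np, nn):
    # (size-weighted child Gini) / (parent Gini)
    n1, n = n1p + n1n, np + nn
    child = (n1 / n) * 2 * (n1p / n1) * (n1n / n1)
    parent = 2 * (np / n) * (nn / n)
    return child / parent

def rel_impurity_sqrt_gini(n1p, n1n, np, nn):
    # the same ratio with sqrt(Gini) as the impurity measure
    return sqrt((n1p * n1n) / (np * nn))

# a child with 30+/10- inside a parent with 50+/50-
print(rel_impurity_gini(30, 10, 50, 50), rel_impurity_sqrt_gini(30, 10, 50, 50))
# multiply every negative count by 10 (skewed class distribution):
# the Gini-based ratio changes, the sqrt(Gini)-based ratio does not
print(rel_impurity_gini(30, 100, 50, 500), rel_impurity_sqrt_gini(30, 100, 50, 500))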
How would you train decision trees for
a dataset?
• Aim for a good ranking estimator.
• Use a distribution-insensitive impurity measure (e.g. √Gini).
• Disable pruning while growing the tree.
• Use the operating condition to pick the operating point on the ROC curve.
• Afterwards, prune only subtrees whose leaves all receive the same label.
Tree Learning as Variance Reduction
• The Gini index 2p(1-p) is the expected error rate when instances
are labelled positive or negative at random.
• Compare tossing a coin with probability p of heads:
the outcome is a Bernoulli variable with variance p(1-p)
(p: probability of the positive outcome occurring,
1-p: probability of it not occurring).
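The variance claim follows from the standard Bernoulli calculation:

\[
\operatorname{Var}(X) \;=\; E[X^2] - E[X]^2 \;=\; p - p^2 \;=\; p(1-p),
\]

which is the Gini index up to the factor 2; this is why splitting on the Gini index can be read as variance reduction, and why the same idea carries over to regression trees below.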
REGRESSION TREE
Regression Tree
Split on Model (A100, B3, E112, M102, T202): target values and group means
• A100 [1051, 1770, 1900], mean ≈ 1574
• B3 [4513], mean = 4513
• E112 [77], mean = 77
• M102 [870], mean = 870
• T202 [99, 270, 625], mean ≈ 331
Calculate the size-weighted variances (each group's variance times its
share of the 9 instances, i.e. the sum of squared deviations divided by 9):
• A100
(523² + 196² + 326²)/9 = (273529 + 38416 + 106276)/9 ≈ 46469
• B3
(4513 - 4513)²/9 = 0
• E112
(77 - 77)²/9 = 0
• M102
(870 - 870)²/9 = 0
• T202
(232² + 61² + 294²)/9 = (53824 + 3721 + 86436)/9 ≈ 15998
• Weighted average variance for Model:
46469 + 0 + 0 + 0 + 15998 ≈ 62467
• Similarly for Condition (excellent, good, fair):
excellent [1770, 4513], mean ≈ 3142
good [270, 870, 1051, 1900], mean ≈ 1023
fair [77, 99, 625], mean = 267
Size-weighted variances:
• excellent
(1372² + 1371²)/9 = (1882384 + 1879641)/9 ≈ 418003
• good
(753² + 153² + 28² + 877²)/9 = (567009 + 23409 + 784 + 769129)/9 ≈ 151148
• fair
(190² + 168² + 358²)/9 = (36100 + 28224 + 128164)/9 ≈ 21388
• Weighted average variance for Condition:
418003 + 151148 + 21388 ≈ 590539
• Similarly for Leslie (yes, no):
yes [625, 870, 1900], mean ≈ 1132
no [77, 99, 270, 1051, 1770, 4513], mean ≈ 1297
Size-weighted variances:
• yes
(507² + 262² + 768²)/9 = (257049 + 68644 + 589824)/9 ≈ 101724
• no
(1220² + 1198² + 1027² + 246² + 473² + 3216²)/9
= (1488400 + 1435204 + 1054729 + 60516 + 223729 + 10342656)/9 ≈ 1622804
• Weighted average variance for Leslie:
101724 + 1622804 ≈ 1724528
Weighted average variances:
1. Model ≈ 62467
2. Condition ≈ 590539
3. Leslie ≈ 1724528
Model gives the lowest weighted variance, so it is chosen for the first split.
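A short sketch reproducing these numbers; the target values and groupings are taken from the slides, and the weighted variance of a split is the size-weighted average of its groups' variances (small differences from the hand calculation come from rounding the group means):

data = {
    "Model":     {"A100": [1051, 1770, 1900], "B3": [4513], "E112": [77],
                  "M102": [870], "T202": [99, 270, 625]},
    "Condition": {"excellent": [1770, 4513], "good": [270, 870, 1051, 1900],
                  "fair": [77, 99, 625]},
    "Leslie":    {"yes": [625, 870, 1900], "no": [77, 99, 270, 1051, 1770, 4513]},
}

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

n = 9
for feature, groups in data.items():
    weighted = sum(len(xs) / n * variance(xs) for xs in groups.values())
    print(feature, round(weighted))
# Model has the lowest weighted variance, so it is chosen for the first split.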
• For the A100 child the candidate splits are:
Condition (excellent, good, fair): [1770] [1051, 1900] [] (the empty group is ignored)
Leslie (yes, no): [1900] [1051, 1770] (calculate the variances to choose between them)
• For the T202 child the candidate splits are:
Condition (excellent, good, fair): [] [270] [99, 625] (the empty group is ignored)
Leslie (yes, no): [625] [99, 270] (calculate the variances to choose between them)
Regression Tree
Clustering Trees
• A regression tree finds instance-space segments in which the
target values are tightly clustered around the mean.
• The variance of a set of target values (or vectors) is the average
squared Euclidean distance to the mean:
• Var(D) = (1/|D|) Σ_{x in D} ||x - μ||²
• A clustering tree applies the same idea to whole feature vectors and
can be learned using
1. a dissimilarity matrix, or
2. Euclidean distance between feature vectors.
• For A100 the values of the three numerical
features (price, reserve, bids) of its three instances are:
(11, 8, 13)
(18, 15, 15)
(19, 19, 1)
• The mean vector is (16, 14, 9.7).
• Per-feature variances:
price: ((16-11)² + (16-18)² + (16-19)²)/3 = (25 + 4 + 9)/3 ≈ 12.7
reserve: ((14-8)² + (14-15)² + (14-19)²)/3 = (36 + 1 + 25)/3 ≈ 20.7
bids: ((9.7-13)² + (9.7-15)² + (9.7-1)²)/3 = (10.9 + 28.1 + 75.7)/3 ≈ 38.2
• The cluster variance (average squared Euclidean distance to the mean)
is their sum: 12.7 + 20.7 + 38.2 ≈ 71.6.
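A minimal sketch of this computation, using the three A100 feature vectors above:

def cluster_variance(vectors):
    # average squared Euclidean distance to the mean vector
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    return sum(sum((v[j] - mean[j]) ** 2 for j in range(d)) for v in vectors) / n

a100 = [(11, 8, 13), (18, 15, 15), (19, 19, 1)]
print(round(cluster_variance(a100), 1))   # about 71.6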
RULE MODELS
• Logical Models:
1. Tree models.
2. Rule models.
• Rule models consist of a collection of implications
or if–then rules.
• The if-part defines a segment, and the then-part
defines the behaviour of the model in that
segment.
• Two approaches:
1. Find a combination of literals (the body of the
rule, which is called a concept) that covers a
sufficiently homogeneous set of examples, and
find a class label to put in the head of the rule.
This gives an ordered sequence of rules: a rule list.
2. First select a class you want to learn, and then find
rule bodies that cover (large subsets of) the
examples of that class.
This gives an unordered collection of rules: a rule set.
Learning Ordered Rule Lists
• Grow the rule body one literal at a time, each time adding the literal
that most improves the homogeneity of the covered examples.
• Difference from decision trees: a tree split creates a child for every
outcome (e.g. true and false, classes C1 and C2) and evaluates impurity
over all of them, whereas a rule only cares about the purity of the one
child it covers.
• Separate-and-conquer: once a sufficiently pure rule is found, the
examples it covers are removed and the next rule is learned on the
remainder, until no examples are left; see the sketch below.
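A minimal sketch of the separate-and-conquer loop, in which (as a simplification) each rule body is a single literal rather than a grown conjunction; the data representation and the toy examples are assumptions for illustration:

from collections import Counter

def best_literal(D):
    # greedily pick the (feature, value) test whose covered subset is purest
    best, best_purity = None, -1.0
    for f in D[0][0]:                                  # instances are dicts
        for v in {x[f] for x, _ in D}:
            covered = [c for x, c in D if x[f] == v]
            purity = Counter(covered).most_common(1)[0][1] / len(covered)
            if purity > best_purity:
                best, best_purity = (f, v), purity
    return best

def learn_rule_list(D):
    # separate-and-conquer: learn a rule, remove covered examples, repeat
    rules = []
    while D:
        f, v = best_literal(D)
        covered = [(x, c) for x, c in D if x[f] == v]
        head = Counter(c for _, c in covered).most_common(1)[0][0]
        rules.append(((f, v), head))                   # "if f = v then predict head"
        D = [(x, c) for x, c in D if x[f] != v]        # separate: drop covered examples
    return rules

D = [({"teeth": "many", "gills": "no"},  "pos"),
     ({"teeth": "many", "gills": "yes"}, "neg"),
     ({"teeth": "few",  "gills": "no"},  "pos"),
     ({"teeth": "few",  "gills": "yes"}, "neg")]
print(learn_rule_list(D))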
Learning Unordered Rule Sets
• An alternative approach to rule learning:
• rules are learned for one class at a time;
• instead of minimising the impurity min(ṗ, 1 - ṗ),
• we maximise ṗ, the empirical probability of the
class being learned.
Descriptive Rule Learning
• Descriptive models can be learned in either a
supervised or an unsupervised way.
• Supervised:
adapting the rule learning algorithms above
leads to subgroup discovery.
• Unsupervised:
frequent item sets and association rule
discovery.
Subgroup Discovery
• A subgroup whose proportion of positives equals that of the overall
population is uninteresting; quality measures score the deviation:
1. Precision-based:
|prec - pos|
2. Average-recall-based:
|avgrec - 0.5|
3. Weighted Relative Accuracy:
WRAcc = pos · neg · (tpr - fpr)
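A small sketch of these three quality measures computed from a subgroup's contingency counts; the counts below are invented for illustration:

def subgroup_quality(tp, fp, Pos, Neg):
    pos, neg = Pos / (Pos + Neg), Neg / (Pos + Neg)
    prec = tp / (tp + fp)                      # precision of the subgroup
    tpr, fpr = tp / Pos, fp / Neg
    avg_rec = (tpr + (1 - fpr)) / 2            # average recall
    wracc = pos * neg * (tpr - fpr)            # weighted relative accuracy
    return abs(prec - pos), abs(avg_rec - 0.5), wracc

# a hypothetical subgroup covering 30 of 50 positives and 10 of 50 negatives
print(subgroup_quality(30, 10, 50, 50))        # approx (0.25, 0.2, 0.1)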
Association Rule Mining