Valentina Alto
Dec 31, 2019
Ensemble Methods for Decision Trees
Decision Trees are popular Machine Learning algorithms used for
both regression and classification tasks. Their popularity mainly
arises from their interpretability and ease of representation, as they
mimic the way the human brain makes decisions.
However, this interpretability comes at a price in terms of prediction
accuracy. To overcome this limitation, several techniques have been
developed with the goal of building strong and robust models
starting from 'weak' ones. These techniques are known as
'ensemble' methods and, in this article, I'm going to talk about three
of them: Bagging, Random Forest and Boosting.
Bagging
The idea of bagging is that, if we were able to train multiple trees on
different datasets and then use the average of their outputs (or, in the
case of classification, the majority vote) to predict the label of
a new observation, we would get more accurate results. Namely,
imagine we train 4 decision trees, on four different datasets drawn
from the same population, to classify an e-mail as spam or not spam.
Then a new e-mail arrives and three of them classify it as spam,
while one classifies it as not spam.
Thanks to the distribution of the outputs, we can be almost sure that
the e-mail is spam. However, if we had trained only one tree, perhaps
on the dataset which returned 'not spam', we would have made a
wrong prediction.
Nevertheless, it is often not feasible to have several different datasets
available. A solution to this problem is bootstrapping: given an
initial dataset, we draw from it (with replacement) B new samples of
the same size, and then train a new tree on each of them.
The procedure is then the same as described above: when a new
test observation arrives, we feed it to all the trees and take the
average response (for regression) or the class with the majority vote
(for classification).
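To make this concrete, here is a minimal sketch of the bootstrap-and-vote procedure in Python. The use of scikit-learn's DecisionTreeClassifier and a synthetic dataset from make_classification are my own illustrative assumptions, not part of the original example:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative binary classification data (assumption: any dataset would do).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

B = 100                           # number of bootstrapped samples
rng = np.random.default_rng(0)
trees = []

for _ in range(B):
    # Draw a bootstrap sample of the same size as the training set (with replacement).
    idx = rng.integers(0, len(X_train), size=len(X_train))
    trees.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

# Aggregate by majority vote across the B trees (labels are 0/1 here).
all_preds = np.stack([t.predict(X_test) for t in trees])   # shape (B, n_test)
majority_vote = (all_preds.mean(axis=0) >= 0.5).astype(int)
print("Bagged accuracy:", (majority_vote == y_test).mean())
```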
With this procedure, we can also estimate the test error in order to
evaluate our full model. Indeed, when we proceed with
bootstrapping, it turns out that, on average, each bootstrapped
sample contains about 2/3 of the distinct observations of the original
dataset. Hence, since the goal of test error estimation is feeding the
model with data it has never seen before, for each bootstrapped
sample we can use the remaining, so-called 'out of bag' (OOB)
portion of the dataset to test the corresponding tree. So, if n is the
number of observations in the original dataset, each decision tree is
trained on roughly 2n/3 distinct observations and tested on the
remaining n/3.
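As a hedged sketch of how the OOB estimate can be obtained in practice, scikit-learn's BaggingClassifier (whose default base learner is a decision tree) exposes an oob_score option; the dataset below is again a synthetic placeholder:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# oob_score=True evaluates each tree on the observations left out of its
# bootstrap sample (roughly 1/3 of the data), giving a test-error estimate
# without a separate validation set.
bag = BaggingClassifier(
    n_estimators=100,
    oob_score=True,
    random_state=0,
).fit(X, y)

print("OOB accuracy estimate:", bag.oob_score_)
```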
Random Forest
Random Forest relies on the same intuition as bagging, but with
an additional constraint. Every time a tree is grown from a
bootstrapped sample, the algorithm allows each split to consider
only a random subset of size m of the entire covariate space of size p
(with m < p). By doing so, the trees become far less correlated with
one another.
The reason why it might be more accurate than bagging can be
grasped with the following example. Imagine we are using the bagging
algorithm and that, among the predictors, there is one whose split
greatly decreases the Gini index (in other words, it makes the nodes
more 'pure', so it moves quickly towards the final output). If this is
the case, all the trees built on the B samples will probably elect that
predictor as the root of the tree. As a result, all the trees will be
similar to each other and the final prediction might be less accurate.
Now, if the same task is performed using Random Forest, there will
be splits where that predictor is excluded from the candidate subset,
hence the resulting trees will differ from one another. So, thanks to
this constraint, the model is able to explore many more possible
combinations of predictors, without being 'anchored' to the
strongest ones.
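As a rough illustration (again on a synthetic dataset, which is my own assumption), the constraint m < p corresponds to the max_features parameter of scikit-learn's RandomForestClassifier:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# max_features controls m, the number of predictors considered at each split;
# 'sqrt' (m ~ sqrt(p)) is a common choice for classification.
rf = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",
    oob_score=True,
    random_state=0,
).fit(X, y)

print("OOB accuracy estimate:", rf.oob_score_)
```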
Boosting
With boosting, the strategy is a bit different. The idea behind
boosting is to build a sequence of trees, each of which is an updated
version of the previous one. We can say that, with boosting, we
capture the very essence of Machine Learning algorithms, since each
tree is built by learning from the errors of its predecessors.
Basically, we start with a very naive model, called f, which predicts 0
for every input, so that the residuals equal the actual values (indeed,
residuals are the difference between actual and fitted values, and the
fitted values are 0). Then, for each iteration 1, 2, … up to a previously
set limit B:
• we fit a tree f(b) on the residuals rather than on the output
variable Y. We can choose as many nodes as we fancy;
however, in this phase, to keep things simple, it is a rule of
thumb to choose only one split (hence two terminal nodes).
Such a tree is called a stump.
• we update the previous model by adding a shrunken version of
the tree obtained above:
f = f + lambda*f(b)
where lambda is a shrinkage parameter whose aim is to let the
algorithm learn slowly, which leads to higher accuracy.
• we update the residuals as well:
residuals = residuals - lambda*f(b)
By doing so iteratively, at each step the final model is updated with
a new stump fitted on the updated residuals, which become smaller
and smaller.
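A minimal sketch of this loop is shown below, assuming a regression setting, scikit-learn's DecisionTreeRegressor for the stumps, and a synthetic dataset (all of these are my illustrative choices):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

B = 500          # number of boosting iterations
lam = 0.01       # shrinkage parameter lambda

# Start from the naive model f(x) = 0, so the residuals equal the actual values.
prediction = np.zeros_like(y, dtype=float)
residuals = y.astype(float)
stumps = []

for _ in range(B):
    # Fit a stump (one split, two terminal nodes) on the current residuals.
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residuals)
    stumps.append(stump)
    update = lam * stump.predict(X)
    prediction += update          # f = f + lambda * f(b)
    residuals -= update           # residuals = residuals - lambda * f(b)

print("Training MSE after boosting:", np.mean((y - prediction) ** 2))
```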
The advantage of this procedure is that, since it does not involve
building a large and complex tree for each sample, it can prevent
overfitting (provided that the number of iterations B is not too
large). Plus, as it learns slowly (with lambda normally set between
0.001 and 0.01), it guarantees a good fit without the risk of
over-parametrizing the model, so again it prevents the tree from
overfitting.
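If one prefers a library implementation, scikit-learn's GradientBoostingRegressor exposes the same knobs; the particular values below (small learning rate, stumps, a few hundred iterations) are illustrative, not a recommendation from the original text:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# learning_rate plays the role of lambda; a small value combined with a
# moderate number of iterations (n_estimators = B) keeps the model learning slowly.
gbr = GradientBoostingRegressor(
    n_estimators=500,
    learning_rate=0.01,
    max_depth=1,        # stumps: one split, two terminal nodes
    random_state=0,
).fit(X, y)

print("Training R^2:", gbr.score(X, y))
```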
Conclusion
Ensemble methods are powerful techniques that can largely improve
the predictive accuracy of decision trees. Their drawback, however,
is that they make the final results a bit less easy to present and
interpret. Indeed, as we said at the very beginning, the most
appreciated feature of decision trees is their interpretability and ease
of understanding, which, in a world of algorithms that look like
'black boxes', is an important value. Nevertheless, some visual
alternatives are available and, if ensembling multiple trees improves
accuracy that much, it is certainly worth it.