Ensemble learning from theory to practice
Lecture 2: Bagging and Random Forests
Myriam Tami
[email protected]
January - March, 2023
1 Motivations
2 Bagging
3 Random forests
4 Extremely randomized trees
Reminder
In the first course, we saw
Goal of ensemble learning methods:
combine multiple weak learners in order to improve robustness and prediction performance
Decision trees are an example of weak learners
Figure: source: Hands-On Machine Learning with Scikit-Learn and TensorFlow, A. Géron
Why combine? An intuition
Binary classification: Y ∈ {−1, 1}, input variables X ∈ R^d
We have a set of B independent initial classification methods (f_b)_{1≤b≤B} such that, for all b,
P{f_b(X) ≠ Y} = ε, with ε < 1/2
By aggregating these methods and predicting
F(X) = sign( Σ_{b=1}^{B} f_b(X) )
By the Hoeffding inequality (not on the course program), the probability of error of F satisfies
P{F(X) ≠ Y} ≤ exp( −(1/2) B (2ε − 1)² )
which tends towards zero exponentially fast in B
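As a rough numeric illustration of this bound (the error rate ε = 0.3 and the values of B below are illustrative assumptions, not from the slides):

```python
import numpy as np

# Hypothetical common error rate of the independent base classifiers (< 1/2)
eps = 0.3

# Hoeffding upper bound on the error of the majority vote as B grows
for B in (1, 10, 50, 100):
    bound = np.exp(-0.5 * B * (2 * eps - 1) ** 2)
    print(f"B = {B:3d}  ->  P(F(X) != Y) <= {bound:.4f}")
```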
Why combine? Another intuition
Assume that X_1, …, X_B are B i.i.d. random variables with mean µ = E[X_1] and variance σ² = V[X_1] = E[(X_1 − µ)²]
Consider the new variable/estimator (empirical mean)
X̄ := (1/B) Σ_{b=1}^{B} X_b
The expectation does not change: E[X̄] = µ (no bias, i.e., E[X̄ − µ] = 0; the expected value of the estimator matches the parameter)
The variance is reduced thanks to the independence of the random variables: V[X̄] = σ²/B
This is of interest for decision trees: we have seen (cf. Course 1) that large, unpruned trees have a small bias but a large variance
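A minimal Monte Carlo check of this variance reduction (the Gaussian distribution and the constants below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
B, n_repeats, sigma = 50, 100_000, 2.0

# Draw n_repeats independent samples of B i.i.d. variables and average each one
X = rng.normal(loc=0.0, scale=sigma, size=(n_repeats, B))
X_bar = X.mean(axis=1)

print("empirical   V[X_bar]:", X_bar.var())      # close to sigma**2 / B
print("theoretical sigma^2/B:", sigma ** 2 / B)
```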
Bagging (Bootstrap AGGregatING)
Introduced by Breiman [Breiman, 1996]
Based on two key points: bootstrap and aggregation
We know that aggregating independent initial predictive methods (base learners) leads to a significant reduction in prediction error and variance
⇒ We want initial methods that are as independent as possible
Naive idea: train our "base learners" (e.g., CART) on disjoint subsets of observations from the training set
Problem: the training set is not infinite → the "base learners" will
have too little data and poor performance
That is where bootstrapping is useful
Bagging idea
Bagging creates training subsets using bootstrap sampling [Tibshirani and Efron, 1993]
Bootstrapping
To create a new "base learner" f_b,
1 we randomly draw, with replacement, a dataset D_b of n_train observations from the training set
2 we learn the method (e.g., CART) on it
→ the "base learner" f_b is obtained
Note: each Db has the same size as the original training set
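A minimal sketch of the bootstrap draw of step 1 (the helper name bootstrap_sample and the toy data are purely illustrative):

```python
import numpy as np

def bootstrap_sample(X, y, rng):
    """Draw n_train observations with replacement from the training set."""
    n_train = X.shape[0]
    idx = rng.integers(0, n_train, size=n_train)   # indices drawn with replacement
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.arange(10, dtype=float)
X_b, y_b = bootstrap_sample(X, y, rng)             # D_b, same size as the training set
```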
Bagging idea
Bagging
Consists in
1 Bootstrapping B times, producing
▸ B bootstrap datasets D_b
▸ then B predictors ("base learners") f_b, one for each of these datasets
2 Aggregating the predictors
▸ In the regression case (average),
f_bag(x) = (1/B) Σ_{b=1}^{B} f_b(x)   (1)
▸ In the classification case (majority vote over trees),
f_bag(x) = argmax_{1≤c≤C} (1/B) Σ_{b=1}^{B} 1{f_b(x)=c}   (2)
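A minimal regression sketch of this procedure, written with scikit-learn's CART as base learner; the function names and constants are illustrative (in practice, sklearn.ensemble.BaggingRegressor implements the same idea):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_bagging(X, y, B=100, seed=0):
    """Fit B CART regressors, each on its own bootstrap sample D_b."""
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(B):
        idx = rng.integers(0, len(X), size=len(X))        # bootstrap indices
        trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    return trees

def predict_bagging(trees, X):
    """Aggregate by averaging the B predictions, as in Eq. (1)."""
    return np.mean([tree.predict(X) for tree in trees], axis=0)
```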
Bagging diagram
Figure: Illustration of the bagging principle (with L_n^{Θ_b} = D_b, ĥ(·, Θ_b) = f̂_b and ĥ_bag = f̂_bag)
Random forests
Method introduced by Leo Breiman in 2001 [Breiman, 2001]
Based on older ideas: bagging [Breiman, 1996] and CART decision trees [Breiman et al., 1984]
Convergence proofs in [Biau et al., 2008]
The random forests method belongs to the family of ensemble methods
Random forests (notations)
D = {(x_1, y_1), …, (x_n, y_n)} the learning set; each (x_i, y_i) is an independent realization of the random variables (X, Y)
X ∈ R^d the input variables; Y ∈ 𝒴 the output variable, with 𝒴 = R for regression and 𝒴 = {1, …, C} for classification
Goal: build a predictor f̂ : R^d → 𝒴
Random forests idea
{f̂_b(·, Θ_b), 1 ≤ b ≤ B} a set of decision tree predictors, where (Θ_b)_{1≤b≤B} characterises the b-th tree in terms of split variables, cut-points at each node and terminal-node values
Random forests predictor f̂ obtained by aggregating the set of trees
for regression,
f̂(x) := (1/B) Σ_{b=1}^{B} f̂_b(x, Θ_b)   (3)
for classification,
f̂(x) := argmax_{1≤c≤C} (1/B) Σ_{b=1}^{B} 1{f̂_b(x, Θ_b)=c}   (4)
Random forests
Random forests consist of growing a large number (e.g., 400) of randomly constructed decision trees and then aggregating them
In statistical terms, if the trees are decorrelated, this reduces the
variance of the predictions
Naïve example with 2 trees: averaging these two regression trees gives the prediction (24.7 + 23.3)/2 = 24
source: Arbres de décision et forêts aléatoires, P. Gaillard, 2014
The problem of correlation between trees
Bagging idea: aggregate many noisy but approximately unbiased¹ tree models to reduce the variance
However, there is necessarily some overlap between
bootstrapped datasets
⇒ the trees corresponding to each of them are correlated
Intuition: if the B trees f_b(x) are identically distributed, with variance σ² and pairwise correlation coefficient ρ = Corr(f_b(x), f_b′(x)) for all b ≠ b′, the variance of their mean is
V[f_bag(X)] = ρσ² + (1 − ρ)σ²/B   (5)
Thus, the variance cannot be shrunk below ρσ²
⇒ Disadvantage of bagging
¹ if sufficiently deep
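A small numeric sketch of Eq. (5), with illustrative values of ρ and B, showing that the variance of the average plateaus at ρσ²:

```python
sigma2 = 1.0  # variance of a single tree (illustrative)

for rho in (0.0, 0.3, 0.7):                               # pairwise correlation between trees
    for B in (10, 100, 1000):
        var_bag = rho * sigma2 + (1 - rho) * sigma2 / B   # Eq. (5)
        print(f"rho = {rho:.1f}  B = {B:4d}  V[f_bag] = {var_bag:.4f}")
```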
Creating weakly correlated trees
We saw the disadvantage of
Bootstrapping: rather than using all the data to build the trees, we randomly choose, for each tree, a subset (with possible repetitions) of the training data.
Let us introduce the improvement proposed by random forests: lowering the correlation between trees (without increasing the variance too much) through an additional randomization step,
a random choice of the candidate input features considered to split each node during the tree-growing process
Definition of Random forests
Definition (Random forests)
A random forest is a set of trees, each grown on a bootstrapped learning data set, where, before each split, a set of m ≤ d input variables (or features) is randomly selected as candidates for splitting
Note: m is the same for all the nodes of all the trees of the forest but, of course, the variables considered at each node for the choice of the best split change randomly
Algorithm of Random forests
1: Require: a dataset D = {(x_i, y_i)}_{1≤i≤n}, the size B of the ensemble, the number m of candidate features for splitting
2: for b = 1 to B do
3: Draw a bootstrap dataset Db of size n from the original training set D
4: Grow a random tree f̂b using the bootstrapped dataset:
5: repeat
6: for all terminal nodes do
7: Select m variables among d, at random
8: Pick the best variable and split-point couple among the m candidates
9: Split the node into two child nodes
10: end for
11: until the stopping criterion is met (e.g., minimum number of samples per node reached)
12: end for
return: the ensemble of B trees
Algorithm 1: Pseudo-code to build random forests for regression or classification
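A minimal sketch of Algorithm 1 built on scikit-learn's CART, where the max_features argument delegates the per-node random selection of m candidate features (lines 6-8) to the tree itself; the helper names and integer-coded class labels are assumptions, and in practice sklearn.ensemble.RandomForestClassifier wraps all of this:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_random_forest(X, y, B=400, m="sqrt", seed=0):
    """Grow B fully developed CARTs, each on a bootstrap sample,
    with m random candidate features considered at every split."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(B):
        idx = rng.integers(0, len(X), size=len(X))       # line 3: bootstrap D_b
        tree = DecisionTreeClassifier(max_features=m)    # lines 6-8: random candidates
        forest.append(tree.fit(X[idx], y[idx]))
    return forest

def predict_forest(forest, X):
    """Majority vote over the B trees, as in Eq. (4); assumes integer labels."""
    votes = np.stack([tree.predict(X) for tree in forest]).astype(int)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```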
Random forests in practice
Intuitively, reducing m will reduce the correlation between any pair of trees in the ensemble
⇒ reduce the variance of the average (cf. Eq. (5))
However, the corresponding hypothesis space will be smaller,
leading to an increased bias
Heuristics:
for regression, choose m = ⌊d/3⌋ and a minimum node size of 5
for classification, choose m = ⌊√d⌋ and a minimum node size of 1
For further information about random forests, you can refer to
[Hastie et al., 2009] (Chap. 15) that provides a bias-variance analysis
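These heuristics map directly onto scikit-learn's hyper-parameters; a hedged sketch, using min_samples_leaf as an approximation of the minimum node size:

```python
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Classification heuristic: m = floor(sqrt(d)), minimum node size of 1
clf = RandomForestClassifier(n_estimators=400, max_features="sqrt",
                             min_samples_leaf=1)

# Regression heuristic: m = floor(d / 3), minimum node size of 5
reg = RandomForestRegressor(n_estimators=400, max_features=1 / 3,
                            min_samples_leaf=5)
```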
OOB-error (Out-Of-Bag error)
OOB error (computed on the out-of-bag samples, i.e., the observations left out of each bootstrap sample) - the principle
To predict yi , we only aggregate the predictors f̂b (., Θb ) built on
bootstrap samples not containing (xi , yi )
for regression,
OOB-error = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²
where ŷ_i is obtained by aggregating the f̂_b whose bootstrap samples D_b do not contain (x_i, y_i)
for classification,
OOB-error = (1/n) Σ_{i=1}^{n} 1{y_i ≠ ŷ_i}
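A short usage sketch of this estimate with scikit-learn (the synthetic dataset is illustrative); oob_score=True makes each observation be predicted only by the trees whose bootstrap sample did not contain it:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

rf = RandomForestClassifier(n_estimators=400, oob_score=True,
                            random_state=0).fit(X, y)

print("OOB accuracy:", rf.oob_score_)     # OOB-error = 1 - OOB accuracy
```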
OOB-error
Many advantages,
An OOB-error estimate is almost identical to that obtained by
N-fold cross-validation
Random forests can be fit in one sequence, with cross-validation
being performed along the way
Once the OOB-error stabilizes, the training can be terminated and the value of B is thereby obtained/tuned
Variable Importance (VI)
Random forests (RF) make it possible to rank the explanatory variables in order of importance for the prediction
In the RF framework, permutation importance indices are preferred to the total decrease of node impurity measures already introduced in Breiman et al. (1984)
Scikit-learn's default feature importance is based on the decrease of node impurity
Variable Importance based on impurity decrease
For one tree,
the Variable Importance (VI) of X_j is calculated as the sum of the decreases in error/impurity over the splits made on X_j
e.g., if X_j is used twice to split a node in the tree → you sum these two decreases in Gini index (or cross-entropy, etc.) to obtain its VI
The relative importance is the VI divided by the highest VI value
(normalization)
⇒ Values are bounded between 0 and 1
In the case of RF,
the decreases in impurity are averaged over the trees
Variable Importance based on impurity decrease
Pros,
Fast calculation
Easy to obtain via Scikit-learn: one command
feature_importances_
Cons,
Biased approach: it has a tendency to inflate the importance of
continuous features or high-cardinality categorical variables
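A short sketch of the one-command access mentioned above (the dataset choice is illustrative; note that scikit-learn normalizes the importances to sum to 1 rather than dividing by the largest value):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
rf = RandomForestClassifier(n_estimators=400, random_state=0)
rf.fit(data.data, data.target)

# Mean decrease in impurity, averaged over the trees of the forest
ranked = sorted(zip(data.feature_names, rf.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, vi in ranked[:5]:
    print(f"{name:25s} {vi:.3f}")
```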
Variable Importance based on permutation
This approach directly measures the importance of an input variable X_j by observing how randomly permuting its values (which preserves the marginal distribution of the variable while breaking its link with the output) influences model performance
Goal: measure the prediction strength of each variable
Figure: Illustration of the permutation of the values of one variable (source: Arbres CART et Forêts aléatoires - Importance et sélection de variables, R. Genuer and J-M. Poggi, 2016)
Variable Importance based on permutation
The process is the following,
1 Grow the RF on the learning set
2 Record the OOB-error E
3 Randomly permute the values of the j-th variable in these data
4 Pass this modified dataset to the RF again to obtain predictions
5 Compute the OOB-error on this modified dataset
6 The VI of X_j is the difference between the benchmark score E and the one obtained on the modified (permuted) dataset
⇒ The larger the increase in OOB error, the more important the variable
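In practice, scikit-learn's permutation_importance follows the same recipe but evaluates the score drop on a held-out set rather than on the OOB samples described above; a minimal sketch (dataset and constants are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=400, random_state=0).fit(X_tr, y_tr)

# Each feature is permuted n_repeats times; the VI is the mean drop in score
result = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
print(result.importances_mean)
```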
Variable Importance based on permutation
Pros,
Reasonably efficient
Reliable technique
No need to re-train the model at each modification of the dataset
Cons,
More computationally expensive than the default
feature_importances_
Permutation importance overestimates the importance of
correlated predictors [Strobl et al., 2008]
Anomaly detection
RFs are well suited to detecting outliers [Liu et al., 2008]
These are indeed quickly isolated in a separate leaf
The anomaly score of an observation x_i is determined approximately by the average length of the path from the root to the leaf containing x_i, over the trees of the forest
The shorter the path, the more likely the observation is atypical
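A minimal sketch of this idea with scikit-learn's IsolationForest (the data and the injected outlier are illustrative):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(500, 2)), [[8.0, 8.0]]])   # one obvious anomaly

iso = IsolationForest(random_state=0).fit(X)

# score_samples is low when the average path length is short, i.e. for anomalies
scores = iso.score_samples(X)
print("most anomalous index:", scores.argmin())            # expected: the last point
```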
Pros and cons of Random forests
Pros
no overfitting as the number of trees B grows
usually: better performance than decision trees
direct computation of the "Out-of-Bag" error: cross-validation not required
hyper-parameters (B, m) easy to tune
Cons
black box: difficult to interpret
slower training
Extremely randomized trees
Randomization can be pushed further with extremely randomized
forests
Method introduced by [Geurts et al., 2006]
It is an RF with two differences:
▸ m < d of the input variables are selected at random and, for each of these variables, a split-point is chosen at random
▸ The full learning set D is used to grow each tree (instead of a bootstrapped learning set (D_b)_{1≤b≤B})
Impact on correlation and bias
These two differences:
1 Using the full learning set
⇒ achieves a lower bias
But the price is an increased variance, which should be compensated by the randomization of split-points:
2 choosing the split-point at random as well
⇒ reduces the correlation between trees, and hence the variance of the ensemble average, more strongly
Algorithm of Extremely Randomized Forests
1: Require: a dataset D = {(x_i, y_i)}_{1≤i≤n}, the size B of the ensemble, the number m of candidate features for splitting
2: for b = 1 to B do
3: Grow a random tree using the original dataset D:
4: repeat
5: for all terminal nodes do
6: Select m variables among d, at random
7: for all sampled variables do
8: Select a split at random
9: end for
10: Pick the best variable and split-point couple among the m candidates
11: Split the node into two child nodes
12: end for
13: until the stopping criterion is met (e.g., minimum number of samples per node reached)
14: end for
return: the ensemble of B trees
Algorithm 2: Pseudo-code describing the Extremely Randomized Forest approach
Extremely Randomized Forest
Advantages of this approach,
Empirically, it often provides better results than RFs
Lower computational complexity compared to RFs (one chooses
the split among the m randomly drawn split-points)
Disadvantages
Black box: difficult to interpret
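A minimal comparison sketch with scikit-learn's implementation (the synthetic dataset is illustrative, and the scores it produces are not a general claim about which method wins); note that ExtraTreesClassifier uses bootstrap=False by default, i.e. each tree is grown on the full learning set, with split-points drawn at random:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

et = ExtraTreesClassifier(n_estimators=400, random_state=0)
rf = RandomForestClassifier(n_estimators=400, random_state=0)

print("Extra-Trees  :", cross_val_score(et, X, y, cv=5).mean())
print("Random forest:", cross_val_score(rf, X, y, cv=5).mean())
```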
References I
Biau, G., Devroye, L., and Lugosi, G. (2008).
Consistency of random forests and other averaging classifiers.
Journal of Machine Learning Research, 9(Sep):2015–2033.
Breiman, L. (1996).
Bagging predictors.
Machine learning, 24(2):123–140.
Breiman, L. (2001).
Random forests.
Machine learning, 45(1):5–32.
Breiman, L., Friedman, J., Stone, C. J., and Olshen, R. A. (1984).
Classification and regression trees.
CRC press.
Geurts, P., Ernst, D., and Wehenkel, L. (2006).
Extremely randomized trees.
Machine learning, 63(1):3–42.
Hastie, T., Tibshirani, R., and Friedman, J. (2009).
The elements of statistical learning: data mining, inference, and prediction.
Springer Science & Business Media.
Liu, F. T., Ting, K. M., and Zhou, Z.-H. (2008).
Isolation forest.
In 2008 Eighth IEEE International Conference on Data Mining, pages 413–422. IEEE.
References II
Strobl, C., Boulesteix, A.-L., Kneib, T., Augustin, T., and Zeileis, A. (2008).
Conditional variable importance for random forests.
BMC bioinformatics, 9(1):307.
Tibshirani, R. J. and Efron, B. (1993).
An introduction to the bootstrap.
Monographs on statistics and applied probability, 57:1–436.