
MODULE 7: Ensemble learning methods

Three common methods of ensemble learning are boosting, bagging and stacking.
In boosting, training is sequential: each new model attempts to correct the errors made by the previously trained models. This is done by assigning higher weights to the previously misclassified examples. Examples of boosting algorithms are AdaBoost, Gradient Boosting and XGBoost.
Bagging involves training several independent models on different random subsets of the training data drawn with replacement. The predictions of the different models are then combined by voting for classification and by averaging for regression. The most common bagging algorithm is the random forest.
Stacking is another ensemble learning method in which training happens in stages. In stacking, the final learner uses the outputs of the base learners as its inputs. The base learners are trained using different algorithms. This is also known as meta-learning, and the final model is known as a meta-model.
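Stacking is not revisited later in these notes, so a minimal sketch is given here. It is an illustration under assumptions rather than a prescribed implementation: it assumes scikit-learn's StackingClassifier, and the particular base learners (a random forest and a support vector machine) and the logistic regression meta-model are chosen only for illustration.

from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

x, y = load_iris(return_X_y=True)

# base learners trained with different algorithms
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svm", SVC(random_state=0)),
]

# the meta-model (final learner) is trained on the base learners' outputs
stackModel = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
)

print(cross_val_score(stackModel, x, y, cv=5).mean())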
We will review two ensemble learning algorithms: AdaBoost and random forests.

AdaBoost
The Adaptive Boosting (AdaBoost) algorithm iteratively creates a set of weak learners. In each iteration, the weak learner being trained focuses on the samples that were wrongly classified in previous iterations. This is achieved by selecting a bootstrapped dataset consisting of samples drawn from the original dataset, where the probability that a sample is included depends on its weight. A sample that was previously misclassified has a higher weight than a sample that was correctly classified.
The following is the sequence of steps in the AdaBoost algorithm (a from-scratch sketch follows the steps):
Given the sample dataset
$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$
(i) Assign equal weights $w_i = 1/n$ to each of the samples


(ii) Training
(1) Select a bootstrapped sample of size $n$ by random selection with replacement, with the probability of selection determined by the weight of each sample
(2) Train a weak learner ($f_t$), such as a decision stump, using the bootstrapped sample
(3) Calculate the error rate of the weak learner using the formula
$$E_{f_t} = \frac{\sum_{j=1}^{n} w_j \,\mathbb{1}\left(f_t(x_j) \neq y_j\right)}{\sum_{k=1}^{n} w_k}$$

(4) Compute the weight of the weak learner using the formula below
$$\alpha_t = \frac{1}{2} \ln\left(\frac{1 - E_{f_t}}{E_{f_t}}\right)$$
(5) Update the weights of the original samples using the formula
$$w_i = w_i \times e^{\pm\alpha_t}$$
where the exponent is negative for correctly classified examples and positive for incorrectly classified examples.
(6) Normalize the weights using the formula below
$$w_i = \frac{w_i}{\sum_{j=1}^{n} w_j}$$
(iii) Repeat step (ii) until the desired number $T$ of weak learners has been trained
To classify a new sample $x$, use the formula below:
$$F(x) = \arg\max_{c \in C} \sum_{t=1}^{T} \alpha_t \,\mathbb{1}\left(f_t(x) = c\right)$$
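The steps above can be made concrete with a short from-scratch sketch. The code below is a minimal illustration under stated assumptions, not a production implementation: it uses scikit-learn's DecisionTreeClassifier with max_depth=1 as the decision stump, and the helper names adaboost_train and adaboost_predict are chosen here only for illustration.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_train(x, y, T=10, seed=0):
    # (i) assign equal weights to each of the n samples
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)
    stumps, alphas = [], []
    for _ in range(T):
        # (ii)(1) bootstrapped sample of size n, drawn with probability w
        idx = rng.choice(n, size=n, replace=True, p=w)
        # (ii)(2) train a decision stump on the bootstrapped sample
        stump = DecisionTreeClassifier(max_depth=1).fit(x[idx], y[idx])
        pred = stump.predict(x)
        # (ii)(3) weighted error rate on the original samples
        err = np.sum(w * (pred != y)) / np.sum(w)
        err = np.clip(err, 1e-10, 1 - 1e-10)  # guard against division by zero
        # (ii)(4) weight of the weak learner
        alpha = 0.5 * np.log((1 - err) / err)
        # (ii)(5) raise weights of misclassified samples, lower the rest
        w = w * np.exp(np.where(pred != y, alpha, -alpha))
        # (ii)(6) normalize the weights
        w = w / np.sum(w)
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, x):
    # F(x) = argmax over classes c of the alpha-weighted votes for c
    classes = np.unique(np.concatenate([s.classes_ for s in stumps]))
    votes = np.zeros((len(x), len(classes)))
    for stump, alpha in zip(stumps, alphas):
        pred = stump.predict(x)
        for ci, c in enumerate(classes):
            votes[:, ci] += alpha * (pred == c)
    return classes[np.argmax(votes, axis=1)]

As a usage sketch, the model could be trained with stumps, alphas = adaboost_train(x_train, y_train, T=50) and evaluated with adaboost_predict(stumps, alphas, x_test) on data such as the iris split used in the implementation section below.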

Random Forest Classifier


Random forest is an ensemble learning method that was introduced by Leo Breiman (Cutler A., Cutler D., & Stevens, 2014). Random forests can be used for both classification and regression. According to Cutler A., Cutler D., & Stevens (2014), the following are the advantages of random forests:
i. They can be used for both classification and regression
ii. They are fast to train
iii. They are fast in predicting
iv. They depend on only one or two tuning parameters
v. They have a built-in estimate of the generalization error
vi. They can be used for high-dimensional problems
vii. They can be implemented in parallel
viii. They can provide measures of variable importance
The following is the sequence of steps in training a random forest classifier (a configuration sketch follows this list):
i. Randomly select M bootstrap samples, each of size n, from the dataset with replacement
ii. Train M decision trees using the process below:
At each node, given the total number of features p, randomly select k features that will compete for the creation of the node. k is usually the square root of p for classification and p/3 for regression.
Grow each tree to the greatest extent possible without pruning.
iii. Combine the trees as necessary for classification or regression. For classification, the majority class is taken as the predicted value. For regression, the outputs of all the trees are averaged.
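As a hedged illustration, the choice of k described in step ii corresponds to the max_features parameter of scikit-learn's random forest estimators; the mapping below is an assumption about how the library would be configured, not part of the original notes.

from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# classification: k = sqrt(p) candidate features at each node
rf_clf = RandomForestClassifier(
    n_estimators=100,      # M trees, each grown on a bootstrap sample
    max_features="sqrt",   # k = square root of p
    random_state=0,
)

# regression: k = p/3 candidate features at each node (a float is a fraction of p)
rf_reg = RandomForestRegressor(
    n_estimators=100,
    max_features=1/3,      # roughly k = p/3
    random_state=0,
)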

IMPLEMENTATION OF ENSEMBLE ALGORITHMS USING PYTHON
You can implement AdaBoost and random forest classifiers using Python as shown below.
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import metrics

# load the iris dataset
data = load_iris()
x = data.data
y = data.target
labels = data.target_names

# split the data into training and test sets
x_train, x_test, y_train, y_test = train_test_split(
    x, y, random_state=42, test_size=0.2, shuffle=True)

# create the classifiers
adBoostModel = AdaBoostClassifier(n_estimators=100, random_state=0)
rfModel = RandomForestClassifier(n_estimators=100, random_state=0)

# fit the models
adBoostModel.fit(x_train, y_train)
rfModel.fit(x_train, y_train)

# predict on the test set
y_pred_adBoost = adBoostModel.predict(x_test)
y_pred_rf = rfModel.predict(x_test)

# get the accuracy
AdaBoostAccuracy = metrics.accuracy_score(y_test, y_pred_adBoost)
rfAccuracy = metrics.accuracy_score(y_test, y_pred_rf)
print("AdaBoost Accuracy:", AdaBoostAccuracy)
print("Random Forest Accuracy:", rfAccuracy)
