Improving Classification with the AdaBoost
meta-algorithm
Hawking Bear
February 5, 2022
National Institute of Science Education and Research
Problem Statement
• For a classification problem (assume binary), we are given
a "weak classifier".
• Weak classifier - Classifier that performs just slightly
better than random guessing (> 50% accuracy).
• Can we combine multiple instances of the weak classifier
to obtain a strong classifier?
Meta-algorithms
• Methods that combine multiple classifiers are called
ensemble methods or meta-algorithms.
• Bagging and boosting are two common types.
Bagging and Boosting
Bagging
• Given a dataset X, we randomly sample from X (with
replacement) S times to make S new datasets, each the same
size as X.
• The weak classifier is applied to each dataset individually.
• To classify a new data point, we apply all S classifiers to
it and take a majority vote.
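The steps above can be sketched in a few lines of NumPy. This is illustrative only: the helper name `bagging_predict`, the toy mean-threshold weak learner, and the seed are our own, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def bagging_predict(X_train, y_train, x_new, fit_weak, S=5):
    """Train S weak classifiers on bootstrap resamples of the
    training set, then classify x_new by majority vote."""
    n = len(X_train)
    votes = []
    for _ in range(S):
        idx = rng.integers(0, n, size=n)       # sample with replacement
        clf = fit_weak(X_train[idx], y_train[idx])
        votes.append(clf(x_new))
    return 1 if sum(votes) > 0 else -1

def fit_weak(X, y):
    """Toy weak learner: threshold at the mean of the sample."""
    t = X[:, 0].mean()
    return lambda x: 1 if x[0] >= t else -1

X = np.array([[1.0], [2.0], [8.0], [9.0]])
y = np.array([-1, -1, 1, 1])
print(bagging_predict(X, y, np.array([9.0]), fit_weak))  # 1
```

Each bootstrap dataset is the same size as the original but omits some points and repeats others, so the S classifiers differ and the vote smooths out their individual errors.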
Boosting
• Sequential use of classifiers over T rounds.
• In each subsequent round, the data points that were
misclassified in the previous round are given higher
priority.
• AdaBoost is the most popular boosting algorithm.
AdaBoost
• To demonstrate the algorithms, we’ll use decision stumps
as the weak classifier.
• Decision stumps are decision trees of depth one which
classify data points based on just one feature and one
threshold.
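A decision stump fits in a few lines. In this sketch the function name and toy data are ours; `polarity` just flips which side of the threshold is labelled positive.

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Depth-one decision tree: label each row +1 or -1 by
    comparing a single feature against a single threshold."""
    preds = np.ones(len(X))
    if polarity == 1:
        preds[X[:, feature] < threshold] = -1
    else:
        preds[X[:, feature] >= threshold] = -1
    return preds

X = np.array([[1.0], [2.0], [3.0], [4.0]])
print(stump_predict(X, feature=0, threshold=2.5, polarity=1))  # [-1. -1.  1.  1.]
```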
Figure 1: Sample data for decision stumps.
AdaBoost Pseudocode
Figure 2: AdaBoost pseudocode [1].
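The Figure 2 loop can be sketched in NumPy with decision stumps as the weak learner. The helper names are ours and the stump search is brute force; a real implementation would vectorise it.

```python
import numpy as np

def fit_stump(X, y, w):
    """Exhaustively pick the (feature, threshold, polarity) stump
    with the lowest weighted error -- the weak learner."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thr, pol)
    return best, best_err

def adaboost(X, y, T=10):
    """Each round: fit a stump to the weighted data, compute its
    vote weight alpha, then up-weight the misclassified points."""
    n = len(y)
    w = np.full(n, 1 / n)                  # uniform initial weights
    ensemble = []
    for _ in range(T):
        (j, thr, pol), err = fit_stump(X, y, w)
        err = max(err, 1e-10)              # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)     # mistakes get larger weight
        w /= w.sum()
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def predict(ensemble, X):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * np.where(p * (X[:, j] - t) >= 0, 1, -1)
                for a, j, t, p in ensemble)
    return np.sign(score)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
ens = adaboost(X, y, T=5)
print((predict(ens, X) == y).all())  # True
```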
AdaBoost Schematic
Figure 3: Schematic representation of AdaBoost.
Why this formula for α?
• With εt the weighted error of round t, αt = ½ ln((1 − εt)/εt);
if αt > 0, it can be shown that the training error decreases
exponentially over multiple rounds [2].
• αt ≥ 0 if and only if εt ≤ 1/2, which is why we require the weak
classifier to have greater than 50% classification accuracy.
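Plugging numbers into αt = ½ ln((1 − εt)/εt) makes the sign behaviour concrete (a quick numerical check, not from the slides):

```python
import math

def alpha(eps):
    """AdaBoost round weight: alpha = 0.5 * ln((1 - eps) / eps)."""
    return 0.5 * math.log((1 - eps) / eps)

print(round(alpha(0.3), 3))  # 0.424  -> better than chance, positive vote weight
print(alpha(0.5))            # 0.0    -> random guessing, zero vote weight
print(round(alpha(0.7), 3))  # -0.424 -> worse than chance, negative vote weight
```

A classifier at exactly 50% accuracy contributes nothing, and one below 50% would have its vote inverted.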
Class Imbalance
What is it?
• Let’s say we’re building a classifier to detect a rare brain
tumor from MRI scans.
• In the dataset, for every positive sample there are 100,000
negative samples.
• A model that simply minimizes overall classification error can
label every scan negative and still be over 99.999% accurate, so
it will perform poorly at detecting the tumor cases.
How do we detect it?
• Classification error alone doesn’t cut it; we need alternative
performance metrics.
• The confusion matrix is useful here.
Figure 4: Confusion matrix for a binary classification problem.
How do we detect it?
• Precision = TP/(TP + FP) = fraction of records that were positive
from the group that the classifier predicted to be positive.
• Recall = TP/(TP + FN) = fraction of positive examples the classifier
got right.
• Very useful when used together.
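Both metrics are one-liners given the confusion-matrix counts; the numbers below are a made-up example, not from the slides.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); Recall = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# Made-up counts: the classifier flags 10 records, 8 truly
# positive, and misses 2 positives.
p, r = precision_recall(tp=8, fp=2, fn=2)
print(p, r)  # 0.8 0.8
```

Note that a degenerate all-negative classifier gets recall 0, which is exactly the failure that raw accuracy hides.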
How do we address it?
1. Manipulate the cost matrix.
2. Resample during training.
Figure 5: Typical (top) and
modified (bottom) cost
matrices.
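Option 2 can be as simple as random oversampling of the minority class; this sketch (function name and seed are ours) duplicates minority rows until the classes balance.

```python
import numpy as np

rng = np.random.default_rng(1)

def oversample_minority(X, y, minority_label=1):
    """Resample minority-class rows with replacement until the
    two classes are equally represented."""
    minority = np.flatnonzero(y == minority_label)
    majority = np.flatnonzero(y != minority_label)
    extra = rng.choice(minority, size=len(majority), replace=True)
    idx = np.concatenate([majority, extra])
    return X[idx], y[idx]

X = np.arange(12).reshape(6, 2)
y = np.array([0, 0, 0, 0, 0, 1])
Xb, yb = oversample_minority(X, y)
print(np.bincount(yb))  # [5 5]
```

Undersampling the majority class is the mirror-image alternative when the dataset is large enough to discard rows.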
References i
1. Freund, Y., Schapire, R. & Abe, N. A short introduction to
boosting. Journal-Japanese Society For Artificial
Intelligence 14, 1612 (1999).
2. Freund, Y. & Schapire, R. E. A Decision-Theoretic
Generalization of On-Line Learning and an Application to
Boosting. Journal of Computer and System Sciences 55,
119–139. issn: 0022-0000 (1997).
https://www.sciencedirect.com/science/article/pii/S002200009791504X
Why the name?
• Let the training error εt of ht be given by εt = 1/2 − γt.
• Previous learning algorithms required that γt be known a
priori before boosting begins.
• AdaBoost adapts to the error rates of the individual weak
hypotheses, thus the name ’adaptive’.