Sparse vs. Ensemble Approaches to Supervised Learning
Greg Grudic
Modified by Longin Jan Latecki
Some slides are by Piyush Rai
Goal of Supervised Learning?
• Minimize the probability of model
prediction errors on future data
• Two Competing Methodologies
– Build one really good model
• Traditional approach
– Build many models and average the results
• Ensemble learning (more recent)
The Single Model Philosophy
• Motivation: Occam’s Razor
– “one should not increase, beyond what is
necessary, the number of entities required to
explain anything”
• Infinitely many models can explain any
given dataset
– Might as well pick the smallest one…
Which Model is Smaller?
• In this case (the slide compares two equivalent models of different sizes; figure omitted)
• It’s not always easy to define small!
Exact Occam’s Razor Models
• Exact approaches find optimal solutions
• Examples:
– Support Vector Machines
• Find a model structure that uses the smallest
percentage of training data (to explain the rest of it).
– Bayesian approaches
• Minimum description length
How Do Support Vector Machines Define Small?
Minimize the number of Support Vectors!
(Figure: a maximum-margin separating hyperplane; the margin is maximized, and the support vectors are the training points that lie on it.)
Approximate Occam's Razor Models
• Approximate solutions use a greedy search
approach which is not optimal
• Examples
– Kernel Projection Pursuit algorithms
• Find a minimal set of kernel projections
– Relevance Vector Machines
• Approximate Bayesian approach
– Sparse Minimax Probability Machine Classification
• Find a minimum set of kernels and features
Other Single Models: Not Necessarily Motivated by Occam's Razor
• Minimax Probability Machine (MPM)
• Trees
– Greedy approach to sparseness
• Neural Networks
• Nearest Neighbor
• Basis Function Models
– e.g. Kernel Ridge Regression
Ensemble Philosophy
• Build many models and combine them
• Only through averaging do we get at the
truth!
• It’s too hard (impossible?) to build a single
model that works best
• Two types of approaches:
– Models that don’t use randomness
– Models that incorporate randomness
Ensemble Approaches
• Bagging
– Bootstrap aggregating
• Boosting
• Random Forests
– Bagging reborn
Bagging
• Main Assumption:
– Combining many unstable predictors produces an ensemble (stable) predictor.
– Unstable Predictor: small changes in the training data produce large changes in the model.
• e.g. Neural Nets, trees
• Stable: SVM (sometimes), Nearest Neighbor.
• Hypothesis Space
– Variable size (nonparametric):
• Can model any function if you use an appropriate predictor
(e.g. trees)
The Bagging Algorithm
Given data: D = {(x_1, y_1), ..., (x_N, y_N)}
For m = 1, ..., M
• Obtain bootstrap sample D_m from the training data D
• Build a model G_m(x) from the bootstrap data D_m
The Bagging Model
• Regression: average the model outputs
ŷ(x) = (1/M) Σ_{m=1..M} G_m(x)
• Classification:
– Vote over the classifier outputs G_1(x), ..., G_M(x)
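A minimal sketch of bagging for classification, assuming numpy arrays and scikit-learn decision trees as the unstable base predictor; names such as bagging_fit and n_models are illustrative, and integer class labels 0, 1, ... are assumed for the vote.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_models=50):
    """Fit n_models trees, each on a bootstrap sample of (X, y)."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = np.random.randint(0, n, size=n)   # draw N instances with replacement
        tree = DecisionTreeClassifier()          # full-depth tree: an unstable predictor
        tree.fit(X[idx], y[idx])
        models.append(tree)
    return models

def bagging_predict(models, X):
    """Majority vote over the individual tree predictions (integer labels assumed)."""
    votes = np.stack([m.predict(X) for m in models])   # shape (n_models, n_samples)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)

Using full-depth trees keeps the base models unstable, which is exactly what the averaging step exploits.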
Bagging Details
• A bootstrap sample of N instances is obtained by drawing N examples at random, with replacement.
• On average each bootstrap sample contains about 63% of the distinct training instances
– Encourages predictors to have uncorrelated errors
• This is why it works
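A quick check of that 63% figure: the chance that a particular instance is never drawn in N draws with replacement is (1 - 1/N)^N, which approaches e^(-1) ≈ 0.368 for large N, so roughly 63.2% of the distinct instances appear in each bootstrap sample. A two-line sketch:

N = 1000                          # any reasonably large training-set size
p_missing = (1 - 1 / N) ** N      # probability a given instance is never drawn
print(1 - p_missing)              # -> about 0.632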
Bagging Details 2
• Usually the number of models M is fixed in advance
– Or use validation data to pick M
• The models need to be unstable
– Usually full-length (or slightly pruned) decision trees.
Boosting
• Main Assumption:
– Combining many weak predictors (e.g. tree stumps or 1-R predictors) produces an ensemble predictor
– The weak predictors or classifiers need to be stable
• Hypothesis Space
– Variable size (nonparametric):
• Can model any function if you use an appropriate predictor (e.g. trees)
Commonly Used Weak Predictor (or classifier)
A Decision Tree Stump (1-R)
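A decision stump (1-R) makes a single split on one feature and predicts a class on each side. Below is a minimal weighted version, assuming numpy arrays and labels in {-1, +1}; all names are illustrative. A learner like this (or an off-the-shelf depth-1 tree) is the kind of weak classifier that boosting re-weights.

import numpy as np

def fit_stump(X, y, weights):
    """Weighted 1-R stump: choose the (feature, threshold, polarity)
    with the lowest weighted error. Labels assumed to be in {-1, +1}."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (+1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = np.sum(weights * (pred != y))
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best[1:]                  # (feature index, threshold, polarity)

def stump_predict(stump, X):
    j, thr, sign = stump
    return np.where(X[:, j] <= thr, sign, -sign)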
Boosting
Each classifier is trained from a weighted sample of the training data
Boosting (Continued)
• Each predictor is created by using a biased
sample of the training data
– Instances (training examples) with high error
are weighted higher than those with lower error
• Difficult instances get more attention
– This is the motivation behind boosting
Background Notation
• The indicator function I(s) is defined as: I(s) = 1 if the statement s is true, and 0 otherwise
• The function log is the natural logarithm
The AdaBoost Algorithm
(Freund and Schapire, 1996)
Given data: {(x_1, y_1), ..., (x_N, y_N)}, with y_i ∈ {-1, +1}
1. Initialize weights w_i = 1/N, i = 1, ..., N
2. For m = 1, ..., M
a) Fit classifier G_m(x) to the data using weights w_i
b) Compute the weighted error err_m = Σ_i w_i I(y_i ≠ G_m(x_i)) / Σ_i w_i
c) Compute α_m = log((1 - err_m) / err_m)
d) Set w_i ← w_i exp(α_m I(y_i ≠ G_m(x_i))), i = 1, ..., N
The AdaBoost Model
The final model is a weighted vote over the M classifiers: G(x) = sign( Σ_{m=1..M} α_m G_m(x) )
AdaBoost is NOT used for Regression!
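A minimal sketch of the algorithm and model above, using scikit-learn depth-1 trees as the weak learner (the hand-rolled stump from the earlier sketch would work equally well); numpy arrays and labels in {-1, +1} are assumed, and names such as adaboost_fit are illustrative.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, M=100):
    """AdaBoost with decision stumps; y must be in {-1, +1}."""
    N = len(y)
    w = np.full(N, 1.0 / N)                      # 1. initialize weights
    stumps, alphas = [], []
    for _ in range(M):                            # 2. for m = 1, ..., M
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)          # a) fit weak classifier using weights
        miss = (stump.predict(X) != y).astype(float)
        err = np.sum(w * miss) / np.sum(w)        # b) weighted error err_m
        err = np.clip(err, 1e-10, 1 - 1e-10)      #    guard against 0 or 1
        alpha = np.log((1 - err) / err)           # c) alpha_m
        w = w * np.exp(alpha * miss)              # d) up-weight misclassified instances
        w = w / np.sum(w)                         #    renormalize
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # weighted vote: G(x) = sign( sum_m alpha_m * G_m(x) )
    scores = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(scores)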
The Updates in Boosting
Boosting Characteristics
Simulated data: test error rate for boosting with stumps, as a function of the number of iterations. Also shown are the test error rates for a single stump and for a 400-node tree.
Loss Functions for Classification
The losses are plotted as functions of the margin y·f(x): a negative margin means an incorrect classification, a positive margin a correct one.
• Misclassification
• Exponential (Boosting)
• Binomial Deviance (Cross Entropy)
• Squared Error
• Support Vector (hinge)
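For reference, commonly used forms of these losses, written as functions of the margin m = y·f(x) for labels in {-1, +1} (a sketch; scalings may differ from the original figure):

import numpy as np

# m: numpy array of margins y * f(x)
def misclassification(m): return (m < 0).astype(float)           # 0-1 loss
def exponential(m):       return np.exp(-m)                       # minimized by AdaBoost
def binomial_deviance(m): return np.log(1.0 + np.exp(-2.0 * m))   # cross entropy (one common parameterization)
def squared_error(m):     return (1.0 - m) ** 2                   # equals (y - f)^2 when y is +/-1
def hinge(m):             return np.maximum(0.0, 1.0 - m)         # support vector loss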
Other Variations of Boosting
• Gradient Boosting
– Can use any differentiable cost function
• Stochastic (Gradient) Boosting
– Bootstrap Sample: Uniform random sampling
(with replacement)
– Often outperforms the non-random version
Gradient Boosting
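The details on this slide are graphical; as a rough sketch of the idea with a squared-error cost, each new tree is fit to the residuals (the negative gradient) of the current ensemble. scikit-learn regression trees and numpy arrays are assumed, and the subsample argument (illustrative, like the other names) gives the stochastic variant from the previous slide.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(X, y, M=200, learning_rate=0.1, max_depth=3, subsample=1.0):
    """Gradient boosting for regression with squared-error loss."""
    f0 = np.mean(y)                          # initial constant prediction
    f = np.full(len(y), f0)
    trees = []
    n = len(y)
    for _ in range(M):
        residuals = y - f                    # negative gradient of 0.5*(y - f)^2
        if subsample < 1.0:                  # stochastic gradient boosting:
            idx = np.random.choice(n, size=int(subsample * n), replace=True)
        else:                                # plain gradient boosting uses all data
            idx = np.arange(n)
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X[idx], residuals[idx])     # weak learner fit to the residuals
        f += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees

def gradient_boost_predict(f0, trees, X, learning_rate=0.1):
    # learning_rate must match the value used during fitting
    return f0 + learning_rate * sum(t.predict(X) for t in trees)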
Boosting Summary
• Good points
– Fast learning
– Capable of learning any function (given appropriate weak learner)
– Feature weighting
– Very little parameter tuning
• Bad points
– Can overfit data
– Only for binary classification
• Learning parameters (picked via cross validation)
– Size of tree
– When to stop
• Software
– http://www-stat.stanford.edu/~jhf/R-MART.html