Introduction to regression trees
Erin LeDell
Instructor
Train a Regression Tree in R
rpart(formula = ___,
      data = ___,
      method = ___)
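As a concrete sketch of how the blanks might be filled in (grade_train and the final_grade response are assumed names for illustration, matching the grade data used later in these slides):

library(rpart)

# Assumed names: grade_train (training data), final_grade (numeric response)
grade_model <- rpart(formula = final_grade ~ .,   # model final_grade on all other columns
                     data = grade_train,          # training set
                     method = "anova")            # "anova" fits a regression tree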
Train/Validation/Test Split
training set: used to fit the model
validation set: used to compare models and tune hyperparameters
test set: held out until the end, to estimate performance on new data
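A minimal base-R sketch of a 70/15/15 split, assuming a data frame named grade:

set.seed(1)                      # for reproducibility
n <- nrow(grade)
shuffled <- sample(n)            # shuffle the row indices

# Carve the shuffled indices into 70% / 15% / 15%
train_idx <- shuffled[1:floor(0.70 * n)]
valid_idx <- shuffled[(floor(0.70 * n) + 1):floor(0.85 * n)]
test_idx  <- shuffled[(floor(0.85 * n) + 1):n]

grade_train <- grade[train_idx, ]
grade_valid <- grade[valid_idx, ]
grade_test  <- grade[test_idx, ]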
Let's practice!
Performance metrics for regression
Common metrics for regression
Mean Absolute Error (MAE)

MAE = (1/n) Σ |actual − predicted|

Root Mean Square Error (RMSE)

RMSE = √( (1/n) Σ (actual − predicted)² )
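Both metrics follow directly from the definitions; a quick sketch with toy vectors:

# Toy vectors, for illustration only
actual    <- c(3.0, 5.5, 4.2, 6.1)
predicted <- c(2.8, 5.0, 4.6, 5.7)

mean(abs(actual - predicted))        # MAE
sqrt(mean((actual - predicted)^2))   # RMSE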
Evaluate a regression tree model
pred <- predict(object = model,   # model object
                newdata = test)   # test dataset
library(Metrics)
# Compute the RMSE
rmse(actual = test$response,   # the actual values
     predicted = pred)         # the predicted values
2.278249
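The Metrics package computes MAE the same way, via mae():

# Compute the MAE on the same predictions
mae(actual = test$response,
    predicted = pred)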
Let's practice!
What are the hyperparameters for a decision tree?
Decision tree hyperparameters
?rpart.control
Decision tree hyperparameters
minsplit: the minimum number of observations a node must contain before a split is attempted
cp: complexity parameter; a split is kept only if it improves the fit by at least cp
maxdepth: the maximum depth of the decision tree
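These can be passed straight to rpart(), which forwards them to rpart.control(); the values below are illustrative, not tuned:

model <- rpart(formula = response ~ .,
               data = train,
               method = "anova",
               minsplit = 20,    # require 20 observations before attempting a split
               maxdepth = 10,    # grow the tree at most 10 levels deep
               cp = 0.01)        # keep a split only if it improves fit by at least cp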
Cost-Complexity Parameter (CP)
# Plot the cross-validated error for each cp value in the cp table
plotcp(grade_model)
Cost-Complexity Parameter (CP)
print(model$cptable)
CP nsplit rel error xerror xstd
1 0.06839852 0 1.0000000 1.0080595 0.09215642
2 0.06726713 1 0.9316015 1.0920667 0.09543723
3 0.03462630 2 0.8643344 0.9969520 0.08632297
4 0.02508343 3 0.8297080 0.9291298 0.08571411
5 0.01995676 4 0.8046246 0.9357838 0.08560120
6 0.01817661 5 0.7846679 0.9337462 0.08087153
7 0.01203879 6 0.7664912 0.9092646 0.07982862
8 0.01000000 7 0.7544525 0.9407895 0.08399125
Cost-Complexity Parameter (CP)
# Retrieve the cp value with the lowest cross-validated error (xerror)
cp_opt <- model$cptable[which.min(model$cptable[, "xerror"]), "CP"]
# Prune the model to the optimized cp value
model_opt <- prune(tree = model, cp = cp_opt)
Let's practice!
Grid Search for model selection
Grid Search
What is a model hyperparameter?
What is a "grid"?
What is the goal of a grid search?
How is the best model chosen?
Set up the grid
# Establish a list of possible values for minsplit & maxdepth
splits <- seq(1, 30, 5)
depths <- seq(5, 40, 10)

# Create a data frame containing all combinations
hyper_grid <- expand.grid(minsplit = splits,
                          maxdepth = depths)

hyper_grid[1:10, ]

   minsplit maxdepth
1         1        5
2         6        5
3        11        5
4        16        5
5        21        5
6        26        5
7         1       15
8         6       15
9        11       15
10       16       15
Grid Search in R: Train models
# Create an empty list to store models
models <- list()
# Execute the grid search
for (i in 1:nrow(hyper_grid)) {

    # Get minsplit, maxdepth values at row i
    minsplit <- hyper_grid$minsplit[i]
    maxdepth <- hyper_grid$maxdepth[i]

    # Train a model and store it in the list
    models[[i]] <- rpart(formula = response ~ .,
                         data = train,
                         method = "anova",
                         minsplit = minsplit,
                         maxdepth = maxdepth)
}
# Create an empty vector to store RMSE values
rmse_values <- c()
# Compute validation RMSE
for (i in 1:length(models)) {

    # Retrieve the i^th model from the list
    model <- models[[i]]

    # Generate predictions on the validation set
    pred <- predict(object = model,
                    newdata = valid)

    # Compute validation RMSE and add it to the vector
    rmse_values[i] <- rmse(actual = valid$response,
                           predicted = pred)
}
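The best model is then the one with the lowest validation RMSE; a sketch using the models list and rmse_values from above:

# Identify the model with the smallest validation RMSE
best_model <- models[[which.min(rmse_values)]]

# Inspect the winning hyperparameter values
best_model$control$minsplit
best_model$control$maxdepth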
Let's practice!