README updated 18 Sept 2025.
This repository contains materials (scripts and some data) for a Physalia short course demonstrating the uses of Bayesian additive regression tree (BART) methods for species distribution modeling. All scripts are run in R, with BART methods implemented in dbarts and supporting utilities from embarcadero. The first day's material also uses gbm3 and brms for demonstration of key concepts.
The course will introduce and demonstrate the use of Bayesian Additive Regression Tree (BART) methods for species distribution modeling (SDM) and other ecological applications. We will explain how BART modeling works, and identify how BARTs improve over other commonly used SDM methods. Participants will work through all steps necessary for conducting an SDM study with BARTs in R, using utilities in the embarcadero and dbarts packages to select informative environmental predictors, train and evaluate BART models of species occurrence, and use trained models to predict species presence or absence from new data.
The course is aimed at advanced students, researchers, and practitioners who have some familiarity with species distribution modeling and want to expand their experience and skills. Participants will need a laptop with a webcam and a good internet connection to participate in interactive live sessions. They should be comfortable working in R, particularly the RStudio environment, and they should be prepared to install specialized packages, edit and write simple scripts, and manage and read in downloaded datasets.
By the end of the course, participants will
- Understand the structure of BART machine-learning models and how they compare to similar ML methods, especially for species distribution modeling
- Use embarcadero and dbarts utilities to select predictors and train BART models of species occurrence
- Visualize and interpret predictor effects and interactions in a trained BART model
- Use trained BART SDM models to project species distributions into new regions or times
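For a rough sense of what the training and prediction objectives above look like in practice, here is a minimal sketch that calls dbarts directly on simulated data. It is not taken from the course scripts: the simulated data, the object names (`env`, `occ`, `new_sites`), and the MCMC settings are illustrative assumptions, and it assumes a recent dbarts release whose `predict()` method accepts `type = "ev"`.

```r
# Minimal sketch: probit BART on simulated presence/absence data.
# All data and object names here are hypothetical, not course materials.
library(dbarts)

set.seed(42)

# Simulate two environmental predictors and a 0/1 occurrence response.
n   <- 300
env <- data.frame(temp   = runif(n, 0, 30),
                  precip = runif(n, 100, 1000))
# Assumed "true" occurrence probability, peaking at intermediate temperature.
p_true <- plogis(2 - 0.03 * (env$temp - 15)^2 + 0.002 * (env$precip - 500))
occ    <- as.numeric(rbinom(n, size = 1, prob = p_true))

# Train the model. A 0/1 response is treated as binary (probit BART).
# keeptrees = TRUE is needed so the fit can predict from new data later.
fit <- bart(x.train = env, y.train = occ,
            ntree = 100, ndpost = 500, nskip = 200,
            keeptrees = TRUE, verbose = FALSE)

# Predict occurrence probability at three new sites.
# Rows of `draws` are posterior samples; columns are sites.
new_sites <- data.frame(temp = c(5, 15, 25), precip = c(500, 500, 500))
draws     <- predict(fit, newdata = new_sites, type = "ev")

# Summarize the posterior: mean and 95% credible interval per site.
p_mean <- colMeans(draws)
p_ci   <- apply(draws, 2, quantile, probs = c(0.025, 0.975))
print(round(rbind(mean = p_mean, p_ci), 2))
```

Because every prediction is a set of posterior draws rather than a single number, uncertainty summaries such as the credible intervals here come at no extra modeling cost.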
- Pre-course: Self-guided introduction and installation of necessary packages
- Day 1: Species Distribution Models and Bayesian stats
  - Lecture 1500-1800, 4h practical
  - SDMs overview/review
  - Worked demonstration of SDMs with Boosted Regression Trees using gbm3 (Topic 1, below)
  - Bayesian stats overview/review
  - Worked demonstration of Bayesian regression with brms (Topic 2)
- Day 2: BARTs and embarcadero
  - Lecture 1500-1800, 4h practical
  - BARTs, theory and comparison to other ML methods for SDMs
  - Worked demonstration of an SDM with BARTs using embarcadero: predictor selection, model training, and prediction with new data (Topic 3)
- Day 3: BART SDM workflow: predictor selection, model evaluation, troubleshooting
  - Lecture 1500-1800, 4h practical
  - Inspecting predictor partial effects and spatial partial effects (Topic 4)
  - Random-intercept BARTs (Topic 5)
Repository contents:
- topics: code and slides for worked examples, organized by topic (see below)
- output: materials output by scripts in the worked examples. This folder is generated by the example code under Topic 01; saved models and other materials are not in the version-controlled repo due to file size constraints
- data: data sets needed for the worked examples
The code provided should let you dip into any example and take what you want from it. However, this material was developed for a multi-day course, and that experience is best reproduced if you work through it in the same order as presented. In some cases, results of computationally intensive (slow) analyses are saved in an earlier module for use in a later module.
Here are the topics and brief descriptions of their contents.
- Topic 1: Species distribution modeling with boosted regression trees
  - Preparing presence and pseudoabsence data to train an SDM for Joshua trees, Yucca brevifolia and Y. jaegeriana
  - Training boosted regression tree (BRT) SDMs from Joshua tree data using gbm3
  - Predictor selection
  - Evaluating a trained model
  - Using a trained model to predict from new data
- Topic 2: Bayesian regression with brms, as an example of Bayesian methods generally
- Topic 3: Species distribution modeling with BARTs
  - Predictor selection using stepwise training and the varimp.diag() method
  - Evaluating a trained working model
  - Using a trained model to predict from new data
  - Comparison to results from a previously trained BRT
- Topic 4: Inspecting predictor partial effects
  - Generating and interpreting partial effect plots
  - Generating and interpreting spatial partial effects ("spartials")
- Topic 5: Random-intercept BART modeling
  - Training a BART model with a random-intercept (RI) effect included
  - Predicting from new data without the RI effect
  - Comparison of results from models with and without an RI effect controlling for sampling heterogeneity
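The partial effect plots listed under the fourth topic rest on a simple idea: fix one predictor at a grid of values, predict for every site, and average. The sketch below hand-rolls that calculation with dbarts on simulated data. It is an illustration of the concept only, not embarcadero's own partial-effect implementation; the data, object names (`env`, `occ`, `temp_grid`, `pd`), and MCMC settings are assumptions, as is a recent dbarts release whose `predict()` method accepts `type = "ev"`.

```r
# Minimal sketch: a hand-computed partial effect curve for one predictor.
# Simulated data and all object names are hypothetical.
library(dbarts)

set.seed(1)

n   <- 300
env <- data.frame(temp   = runif(n, 0, 30),
                  precip = runif(n, 100, 1000))
# Assumed truth: occurrence depends on temperature only.
occ <- as.numeric(rbinom(n, 1, plogis(2 - 0.03 * (env$temp - 15)^2)))

fit <- bart(x.train = env, y.train = occ,
            ntree = 100, ndpost = 300, nskip = 200,
            keeptrees = TRUE, verbose = FALSE)

# For each grid value, overwrite temp at every site, predict, and
# average over sites within each posterior draw.
temp_grid <- seq(0, 30, length.out = 15)
pd <- sapply(temp_grid, function(t) {
  x_mod      <- env
  x_mod$temp <- t
  draws <- predict(fit, newdata = x_mod, type = "ev")  # draws x sites
  rowMeans(draws)                                      # one value per draw
})
# pd has one row per posterior draw and one column per grid value.

pd_mean <- colMeans(pd)
pd_ci   <- apply(pd, 2, quantile, probs = c(0.025, 0.975))

# Plot the posterior mean curve with a 95% credible band.
plot(temp_grid, pd_mean, type = "l", ylim = c(0, 1),
     xlab = "temp", ylab = "Partial effect on P(presence)")
lines(temp_grid, pd_ci[1, ], lty = 2)
lines(temp_grid, pd_ci[2, ], lty = 2)
```

Averaging within each posterior draw before summarizing is what lets the curve carry a credible band instead of a single line.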