DiffAero: A GPU-Accelerated Differentiable Simulation Framework for Efficient Quadrotor Policy Learning

This repository contains the code for the paper: DiffAero: A GPU-Accelerated Differentiable Simulation Framework for Efficient Quadrotor Policy Learning

Introduction

DiffAero is a GPU-accelerated differentiable quadrotor simulator that parallelizes both physics and rendering. It achieves orders-of-magnitude performance improvements over existing platforms while consuming little GPU memory. It provides a modular and extensible framework supporting four differentiable dynamics models, three sensor modalities, and three flight tasks. Its PyTorch-based interface unifies four learning formulations and three learning paradigms. This flexibility enables DiffAero to serve as a benchmark for learning algorithms and allows researchers to investigate a wide range of problems, from differentiable policy learning to multi-agent coordination. Users can combine components almost arbitrarily to launch a custom-configured training run with minimal effort.

Features

Module                              Currently Supported
Tasks                               Position Control, Obstacle Avoidance, Racing
Differentiable Learning Algorithms  BPTT, SHAC, SHA2C
Reinforcement Learning Algorithms   PPO, Dreamer V3
Sensors                             Depth Camera, LiDAR
Dynamics Models                     Full Quadrotor, Continuous Point-Mass, Discrete Point-Mass

Environments

DiffAero now supports three flight tasks:

  • Position Control (env=pc): The goal is to navigate from random initial positions to specified target positions and hover there, without colliding with other agents.
  • Obstacle Avoidance (env=oa): The goal is to navigate to target positions and hover there while avoiding collisions with environmental obstacles and other quadrotors, given exteroceptive information:
    • relative positions of obstacles w.r.t. the quadrotor, or
    • images from the depth camera attached to the quadrotor, or
    • ray distances from the LiDAR attached to the quadrotor.
  • Racing (env=racing): The goal is to navigate through a series of gates in the shortest time, without colliding with the gates.
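As a purely illustrative sketch of the Position Control objective, a shaped reward might penalize distance to the target and residual speed (hovering means low speed). The function name and weights below are hypothetical, not DiffAero's actual reward:

```python
import math

# Illustrative sketch (not DiffAero's actual reward) of a "navigate to the
# target and hover" objective: penalize position error and residual speed.
def position_control_reward(pos, target, vel, w_pos=1.0, w_vel=0.1):
    dist = math.dist(pos, target)                 # position error
    speed = math.sqrt(sum(v * v for v in vel))    # hovering means low speed
    return -(w_pos * dist + w_vel * speed)
```

At the target with zero velocity the reward is maximal (zero); it decreases as the quadrotor drifts away or speeds up.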

Learning algorithms

We have implemented several learning algorithms, including reinforcement learning algorithms (PPO, Dreamer V3) and algorithms that exploit the differentiability of the simulator (BPTT, SHAC, SHA2C).
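As a toy illustration of the idea behind BPTT, the sketch below rolls a 1-D point mass out from rest and descends a hand-derived gradient of the terminal-position error through the rollout. DiffAero instead uses PyTorch autograd over its batched simulator; every name and constant here is illustrative:

```python
# Toy illustration of backpropagation through time (BPTT) through a
# differentiable 1-D point-mass rollout. The gradient is derived by hand here;
# DiffAero obtains it with PyTorch autograd over the batched simulator.

def rollout(a: float, steps: int, dt: float) -> float:
    """Roll a point mass out from rest under constant acceleration a;
    return the final position."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        pos += vel * dt
        vel += a * dt
    return pos

def grad_a(a: float, target: float, steps: int, dt: float) -> float:
    """d/da of the squared terminal-position error. The final position is
    a * dt^2 * steps * (steps - 1) / 2, so the chain rule gives:"""
    sens = dt * dt * steps * (steps - 1) / 2.0   # d(final pos) / d(a)
    return 2.0 * (rollout(a, steps, dt) - target) * sens

# Gradient descent on the action parameter, i.e. first-order policy learning.
a, target, steps, dt = 0.0, 1.0, 10, 0.1
for _ in range(200):
    a -= 0.5 * grad_a(a, target, steps, dt)
```

After the descent loop, `rollout(a, steps, dt)` converges to the target position; the same mechanism, with a neural policy in the loop, is what BPTT-style algorithms exploit.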

Dynamical models

We have implemented four types of dynamics models for the quadrotor, ranging from the full quadrotor dynamics to simplified point-mass models.
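A minimal sketch of what a discrete point-mass model computes (illustrative only; DiffAero's actual models are batched, differentiable PyTorch modules):

```python
# Minimal sketch of one discrete point-mass dynamics step: explicit-Euler
# integration of an acceleration command. Illustrative only; DiffAero's
# real dynamics models are batched PyTorch modules with more detail.
from dataclasses import dataclass

@dataclass
class PointMassState:
    pos: tuple  # (x, y, z) position
    vel: tuple  # (vx, vy, vz) velocity

def step(state: PointMassState, accel: tuple, dt: float) -> PointMassState:
    """Advance the point mass one timestep under an acceleration command."""
    new_pos = tuple(p + v * dt for p, v in zip(state.pos, state.vel))
    new_vel = tuple(v + a * dt for v, a in zip(state.vel, accel))
    return PointMassState(pos=new_pos, vel=new_vel)
```

Starting from rest, one step with an upward acceleration changes the velocity but not yet the position, as expected from explicit Euler integration.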

Sensors

DiffAero supports two types of exteroceptive sensors:

  • Depth Camera (sensor=camera): Provides depth information about the environment.
  • LiDAR (sensor=lidar): Provides distance measurements to nearby obstacles.
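Conceptually, each LiDAR ray returns the distance to the first obstacle it hits. A minimal 2-D sketch with a circular obstacle (illustrative only, not DiffAero's GPU implementation):

```python
import math

# Conceptual sketch of a LiDAR ray-distance measurement: the distance along
# a ray to the nearest obstacle, here a single 2-D circle. DiffAero's actual
# LiDAR is a batched GPU implementation; this only illustrates the geometry.
def ray_circle_distance(origin, direction, center, radius):
    """Distance along a unit-length `direction` from `origin` to a circle,
    or math.inf if the ray misses it."""
    ox, oy = origin
    dx, dy = direction
    cx, cy = center
    tox, toy = cx - ox, cy - oy                      # origin -> circle center
    proj = tox * dx + toy * dy                       # projection onto the ray
    closest2 = tox * tox + toy * toy - proj * proj   # squared perpendicular distance
    if closest2 > radius * radius or proj < 0:
        return math.inf                              # miss, or circle behind the ray
    return proj - math.sqrt(radius * radius - closest2)
```

A ray fired from the origin along +x toward a unit circle centered at (5, 0) returns distance 4; a circle off to the side is reported as a miss.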

Installation

System requirements

  • System: Ubuntu.
  • PyTorch 2.x.

Installing DiffAero

Clone this repo and install the Python package:

git clone https://github.com/zxh0916/diffaero.git
cd diffaero && pip install -e .

Usage

Basic usage

Under the repo's root directory, run the following command to train a policy (the bracket notation [a,b,c] means choose one of a, b, or c):

python script/train.py env=[pc,oa,racing] algo=[apg,apg_sto,shac,sha2c,ppo,world]

Once the training is done, run the following command to test the trained policy:

python script/test.py env=[pc,oa,racing] checkpoint=/absolute/path/to/checkpoints/directory use_training_cfg=True n_envs=64

To list all configuration choices, run:

python script/train.py -h

To enable tab-completion in command line, run:

eval "$(python script/train.py -sc install=bash)"

Visualization

Visualization with taichi GUI

DiffAero supports real-time visualization using the Taichi GGUI system. To enable the GUI, set headless=False in the training or testing command. Note that on workstations with multiple GPUs, the Taichi GUI can only be used with GPU 0 (device=0). For example, to visualize the training process of the Position Control task, run:

python script/train.py env=pc headless=False device=0

Visualize the depth camera and LiDAR data

To visualize the depth camera and LiDAR data in the Obstacle Avoidance task, set display_image=True in the training or testing command. For example, to visualize the depth camera data during training, run:

python script/train.py env=oa display_image=True

Record First-Person View Videos

The Obstacle Avoidance task supports recording videos from the quadrotor's first-person perspective. To record videos, set record_video=True in the testing command:

python script/test.py env=oa checkpoint=/absolute/path/to/checkpoints/directory use_training_cfg=True n_envs=16 record_video=True

The recorded videos will be saved in the outputs/test/YYYY-MM-DD/HH-MM/video directory under the repo's root directory.

Sweep across multiple configurations

DiffAero supports sweeping across multiple configurations using Hydra. Specify multiple values for an argument by separating them with commas, and Hydra will automatically generate all combinations of the specified values. For example, to sweep across different environments and algorithms, run:

python script/train.py -m env=pc,oa,racing algo=apg,apg_sto,shac,sha2c,ppo,world # generate 3x6=18 combinations, executed sequentially
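Hydra's multirun mode expands comma-separated values into a Cartesian product; the 3x6=18 count above can be reproduced with a quick sketch:

```python
from itertools import product

# Hydra's -m (multirun) sweeps the Cartesian product of comma-separated values.
envs = ["pc", "oa", "racing"]
algos = ["apg", "apg_sto", "shac", "sha2c", "ppo", "world"]
combinations = list(product(envs, algos))
print(len(combinations))  # 3 environments x 6 algorithms = 18 runs
```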

Sweep across multiple GPUs in parallel

For workstations with multiple GPUs, you can sweep through configuration combinations in parallel using hydra-joblib-launcher and joblib: set device to a string containing multiple GPU indices and set n_jobs greater than 1. For example, to use the first 4 GPUs (GPU 0, GPU 1, GPU 2, GPU 3), run:

# generate 2x2x3=12 combinations, executed in parallel on 4 GPUs, with 3 jobs each
python script/train.py -m env=pc,oa algo=apg_sto,shac algo.l_rollout=16,32,64 n_jobs=4 device="0123" 
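The round-robin idea behind spreading jobs over a device string like "0123" can be sketched as follows (illustrative only; the actual scheduling is handled by hydra-joblib-launcher, and this helper is not part of DiffAero):

```python
# Hypothetical sketch: map a GPU-index string like "0123" round-robin onto
# a list of sweep jobs. DiffAero's real scheduling is done by
# hydra-joblib-launcher; this only illustrates the assignment pattern.
def assign_gpus(device: str, n_combinations: int) -> list:
    gpus = [int(ch) for ch in device]
    return [gpus[i % len(gpus)] for i in range(n_combinations)]
```

With device="0123" and 12 combinations, each of the four GPUs receives three jobs.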

Automatic Hyperparameter Tuning

DiffAero supports automatic hyperparameter tuning using hydra-optuna-sweeper and Optuna. To search for the hyperparameter configuration that maximizes the success rate, uncomment the override hydra/sweeper: optuna_sweep line in cfg/config_train.yaml, specify the hyperparameters to be optimized in cfg/hydra/sweeper/optuna_sweep.yaml, and run:

python script/train.py -m

This feature can be combined with multi-device parallel sweep to further speed up the hyperparameter search.

Deploy

If you want to evaluate and deploy your trained policy in Gazebo or in the real world, please refer to this repository (Coming soon).

TODO-List

  • Add a simplified quadrotor dynamics model.
  • Add support for training policies with rsl_rl (maybe).
  • Update the LiDAR sensor to be more realistic.

Citation

If you find DiffAero useful in your research, please consider citing:

@misc{zhang2025diffaero,
      title={DiffAero: A GPU-Accelerated Differentiable Simulation Framework for Efficient Quadrotor Policy Learning}, 
      author={Xinhong Zhang and Runqing Wang and Yunfan Ren and Jian Sun and Hao Fang and Jie Chen and Gang Wang},
      year={2025},
      eprint={2509.10247},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2509.10247}, 
}
