rl_training
We've released the following tutorials for training and deploying a reinforcement learning policy. Please check them out on Bilibili or YouTube!

rl_training is an RL training library for Deep Robotics robots, based on Isaac Lab. The table below lists all available environments:
| Robot Model | Environment Name (ID) |
|---|---|
| Deeprobotics Lite3 | Rough-Deeprobotics-Lite3-v0 |
| Deeprobotics M20 | Rough-Deeprobotics-M20-v0 |
Note
If you want to deploy policies in MuJoCo or on real robots, please use the corresponding deploy repo in the Deep Robotics GitHub Center.
Everyone is welcome to contribute to this repo. If you discover a bug or optimize our training config, just submit a pull request and we will look into it.
- Install Isaac Lab by following the installation guide. We recommend using the conda installation as it simplifies calling Python scripts from the terminal.

- Clone this repository separately from the Isaac Lab installation (i.e. outside the `IsaacLab` directory):

  ```bash
  git clone --recurse-submodules https://github.com/DeepRoboticsLab/rl_training.git
  ```

- Using a Python interpreter that has Isaac Lab installed, install the library:

  ```bash
  python -m pip install -e source/rl_training
  ```

- Verify that the extension is correctly installed by running the following command to print all the available environments in the extension:

  ```bash
  python scripts/tools/list_envs.py
  ```
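Alternatively, you can query the environment registry from Python yourself. Below is a minimal sketch, not part of the repository: it assumes the extension registers its tasks with Gymnasium when a task package such as `rl_training.tasks` (a hypothetical module path) is imported, and it starts Isaac Sim headless because Isaac Lab modules require a running simulation app.

```python
"""Minimal sketch: list the Deeprobotics tasks registered by this extension.

Assumption: importing the (hypothetical) `rl_training.tasks` package triggers the
`gymnasium.register` calls for the environments shown in the table above.
"""
from isaaclab.app import AppLauncher

# Isaac Lab modules can only be imported after the simulation app has started.
app_launcher = AppLauncher(headless=True)
simulation_app = app_launcher.app

import gymnasium as gym
import rl_training.tasks  # noqa: F401  # hypothetical import path; registers the tasks

# Print every registered task id that belongs to this extension.
for task_id in gym.registry:
    if "Deeprobotics" in task_id:
        print(task_id)

simulation_app.close()
```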
Setup as Omniverse Extension (Optional)

We provide an example UI extension that will load upon enabling your extension, defined in `source/rl_training/rl_training/ui_extension_example.py`.

To enable your extension, follow these steps:

- Add the search path of your repository to the extension manager:
  - Navigate to the extension manager using `Window` -> `Extensions`.
  - Click on the Hamburger Icon (☰), then go to `Settings`.
  - In the `Extension Search Paths`, enter the absolute path to `rl_training/source`.
  - If not already present, in the `Extension Search Paths`, enter the path that leads to Isaac Lab's extension directory (`IsaacLab/source`).
  - Click on the Hamburger Icon (☰), then click `Refresh`.
- Search and enable your extension:
  - Find your extension under the `Third Party` category.
  - Toggle it to enable your extension.
Deeprobotics Lite3:

```bash
# Train
python scripts/reinforcement_learning/rsl_rl/train.py --task=Rough-Deeprobotics-Lite3-v0 --headless
# Play
python scripts/reinforcement_learning/rsl_rl/play.py --task=Rough-Deeprobotics-Lite3-v0 --num_envs=10
```

Deeprobotics M20:

```bash
# Train
python scripts/reinforcement_learning/rsl_rl/train.py --task=Rough-Deeprobotics-M20-v0 --headless
# Play
python scripts/reinforcement_learning/rsl_rl/play.py --task=Rough-Deeprobotics-M20-v0 --num_envs=10
```

Note
If you want to control a SINGLE ROBOT with the keyboard during playback, add `--keyboard` at the end of the play command.
Key bindings:

| Command | Key (+ve axis) | Key (-ve axis) |
|---|---|---|
| Move along x-axis | Numpad 8 / Arrow Up | Numpad 2 / Arrow Down |
| Move along y-axis | Numpad 4 / Arrow Right | Numpad 6 / Arrow Left |
| Rotate along z-axis | Numpad 7 / Z | Numpad 9 / X |
- Record a video of a trained agent (requires installing `ffmpeg`), add `--video --video_length 200`
- Play/Train with 32 environments, add `--num_envs 32`
- Play from a specific run folder or checkpoint, add `--load_run run_folder_name --checkpoint model.pt`
- Resume training from a run folder or checkpoint, add `--resume --load_run run_folder_name --checkpoint model.pt`
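The `--load_run` and `--checkpoint` values above refer to a run folder and a `.pt` file written during training. If you want to inspect such a checkpoint outside of the play script, here is a minimal sketch; the path is a placeholder for your own run, and the exact keys stored in the file depend on the rsl_rl version:

```python
import torch

# Placeholder path: substitute the run folder and checkpoint name you would
# otherwise pass via --load_run and --checkpoint.
ckpt_path = "logs/rsl_rl/run_folder_name/model.pt"

# rsl_rl checkpoints are ordinary torch save files; load on CPU so no GPU is required.
# weights_only=False because the file may contain more than raw tensors (only load files you trust).
checkpoint = torch.load(ckpt_path, map_location="cpu", weights_only=False)

# Typically this contains the policy/optimizer state dicts and the training iteration.
for key, value in checkpoint.items():
    print(key, type(value))
```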
You can download our trained examples for M20 and Lite3 as a reference for the training process; similar results should be expected if you run our training process properly.
- To train with multiple GPUs, use the following command, where `--nproc_per_node` represents the number of available GPUs:

  ```bash
  python -m torch.distributed.run --nnodes=1 --nproc_per_node=2 scripts/reinforcement_learning/rsl_rl/train.py --task=<ENV_NAME> --headless --distributed
  ```

  For example:

  ```bash
  python -m torch.distributed.run --nnodes=1 --nproc_per_node=2 scripts/reinforcement_learning/rsl_rl/train.py --task=Rough-Deeprobotics-Lite3-v0 --headless --distributed --num_envs=2048
  ```
- Note: each GPU will run the number of environments specified in the config. To keep the previous total number of environments, divide it by the number of GPUs; for example, a 4096-environment single-GPU run corresponds to `--num_envs=2048` on each of two GPUs.
- To scale up training beyond multiple GPUs on a single machine, it is also possible to train across multiple nodes. To train across multiple nodes/machines, it is required to launch an individual process on each node.

  For the master node, use the following command, where `--nproc_per_node` represents the number of available GPUs and `--nnodes` represents the number of nodes:

  ```bash
  python -m torch.distributed.run --nproc_per_node=2 --nnodes=2 --node_rank=0 --rdzv_id=123 --rdzv_backend=c10d --rdzv_endpoint=localhost:5555 scripts/reinforcement_learning/rsl_rl/train.py --task=<ENV_NAME> --headless --distributed
  ```

  Note that the port (`5555`) can be replaced with any other available port. For non-master nodes, use the following command, replacing `--node_rank` with the index of each machine:

  ```bash
  python -m torch.distributed.run --nproc_per_node=2 --nnodes=2 --node_rank=1 --rdzv_id=123 --rdzv_backend=c10d --rdzv_endpoint=ip_of_master_machine:5555 scripts/reinforcement_learning/rsl_rl/train.py --task=<ENV_NAME> --headless --distributed
  ```
To view logs in TensorBoard, run:

```bash
tensorboard --logdir=logs
```

In some VS Code versions, the indexing of part of the extensions is missing. In this case, add the path to your extension in `.vscode/settings.json` under the key `"python.analysis.extraPaths"`.
Note: Replace `<path-to-isaac-lab>` with your own IsaacLab path.

```json
{
    "python.languageServer": "Pylance",
    "python.analysis.extraPaths": [
        "${workspaceFolder}/source/rl_training",
        "/<path-to-isaac-lab>/source/isaaclab",
        "/<path-to-isaac-lab>/source/isaaclab_assets",
        "/<path-to-isaac-lab>/source/isaaclab_mimic",
        "/<path-to-isaac-lab>/source/isaaclab_rl",
        "/<path-to-isaac-lab>/source/isaaclab_tasks"
    ]
}
```

Temporary USD files are generated in `/tmp/IsaacLab/usd_{date}_{time}_{random}` during simulation runs. These files can consume significant disk space and can be cleaned by:

```bash
rm -rf /tmp/IsaacLab/usd_*
```

The project uses some code from the following open-source code repositories:

