rl_sar

This repository provides a framework for simulation verification and physical deployment of robot reinforcement learning algorithms, suitable for quadruped robots, wheeled robots, and humanoid robots. "sar" stands for "simulation and real".

Supports both IsaacGym and IsaacSim

Supports both ROS-Noetic and ROS2-Foxy/Humble

Supports both libtorch and onnxruntime

Supports both Linux and macOS (macOS supports Mujoco simulation only)

Supports both Gazebo and Mujoco (partial support)

Supports both Locomotion and Dance

Support List:

Robot Name (rname:=)	Pre-Trained Policy	Gazebo	Mujoco	Real
Unitree-A1 (a1)	legged_gym (IsaacGym)	✅	❌	✅
Unitree-Go2 (go2)	himloco (IsaacGym); robot_lab (IsaacSim)	✅	✅	✅ ✅
Unitree-Go2W (go2w)	robot_lab (IsaacSim)	✅	✅	✅
Unitree-B2 (b2)	robot_lab (IsaacSim)	✅	✅	⚪
Unitree-B2W (b2w)	robot_lab (IsaacSim)	✅	✅	⚪
Unitree-G1 (g1)	robomimic/locomotion (IsaacGym); robomimic/charleston (IsaacGym); whole_body_tracking/dance_102 (IsaacSim); whole_body_tracking/gangnam_style (IsaacSim)	✅	✅	✅
FFTAI-GR1T1 (gr1t1) (Only available on Ubuntu20.04)	legged_gym (IsaacGym)	✅	❌	⚪
FFTAI-GR1T2 (gr1t2) (Only available on Ubuntu20.04)	legged_gym (IsaacGym)	✅	❌	⚪
zhinao-L4W4 (l4w4)	legged_gym (IsaacGym)	✅	❌	✅
Deeprobotics-Lite3 (lite3)	himloco (IsaacGym)	✅	❌	✅
DDTRobot-Tita (tita)	robot_lab (IsaacSim)	✅	❌	⚪

Important

Maintenance of the Python version is temporarily suspended. If you need it, use v2.3. It may be re-released in the future.

Note

To train a policy with IsaacLab (IsaacSim), use the robot_lab project.

The joint order in joint_names in the robot_lab cfg file is the same as the order defined in xxx/robot_lab/config.yaml in this project.

Join the discussion on GitHub Discussions or Discord.

Caution

Disclaimer: The user acknowledges that all risks and consequences arising from the use of this code are borne solely by the user. The author assumes no liability for any direct or indirect damages. Proper safety measures must be implemented before operation.

Preparation

Clone the repository

git clone --recursive --depth 1 https://github.com/fan-ziqi/rl_sar.git

To update

git pull
git submodule update --init --recursive --recommend-shallow --progress

Dependency

Install the required packages:

# Ubuntu
sudo apt install cmake g++ build-essential libyaml-cpp-dev libeigen3-dev libboost-all-dev libspdlog-dev libfmt-dev libtbb-dev liblcm-dev

# macOS
brew install boost lcm yaml-cpp tbb libomp pkg-config glfw

If you need to use ROS, install the following dependency packages:

# ros-noetic (Ubuntu20.04)
sudo apt install ros-noetic-teleop-twist-keyboard ros-noetic-controller-interface ros-noetic-gazebo-ros-control ros-noetic-joint-state-controller ros-noetic-effort-controllers ros-noetic-joint-trajectory-controller ros-noetic-joy ros-noetic-ros-control ros-noetic-ros-controllers ros-noetic-controller-manager

# ros2-foxy (Ubuntu20.04) / ros2-humble (Ubuntu22.04)
sudo apt install ros-$ROS_DISTRO-teleop-twist-keyboard ros-$ROS_DISTRO-ros2-control ros-$ROS_DISTRO-ros2-controllers ros-$ROS_DISTRO-control-toolbox ros-$ROS_DISTRO-robot-state-publisher ros-$ROS_DISTRO-joint-state-publisher-gui ros-$ROS_DISTRO-gazebo-ros2-control ros-$ROS_DISTRO-gazebo-ros-pkgs ros-$ROS_DISTRO-xacro

Compilation

Execute the following script in the project root directory to compile the entire project:

./build.sh

To compile specific packages individually, you can append the package names:

./build.sh package1 package2

To clean the build, use the following command. This will remove all compiled outputs and created symbolic links:

./build.sh -c  # or ./build.sh --clean

If simulation is not needed and you only want to run on the robot, you can compile using CMake while disabling ROS (the compiled executables will be in cmake_build/bin and libraries in cmake_build/lib):

./build.sh -m  # or ./build.sh --cmake

To use the Mujoco simulator

./build.sh -mj  # or ./build.sh --mujoco

For detailed usage instructions, you can check them via ./build.sh -h:

Usage: ./build.sh [OPTIONS] [PACKAGE_NAMES...]

Options:
  -c, --clean    Clean workspace (remove symlinks and build artifacts)
  -m, --cmake    Build using CMake (for hardware deployment only)
  -mj, --mujoco  Build with MuJoCo simulator support (CMake only)
  -h, --help     Show this help message

Examples:
  ./build.sh                    # Build all ROS packages
  ./build.sh package1 package2  # Build specific ROS packages
  ./build.sh -c                 # Clean all symlinks and build artifacts
  ./build.sh --clean package1   # Clean specific package and build artifacts
  ./build.sh -m                 # Build with CMake for hardware deployment
  ./build.sh -mj                # Build with CMake and MuJoCo simulator support

Tip

If catkin build reports the error "Unable to find either executable 'empy' or Python module 'em'", run catkin config -DPYTHON_EXECUTABLE=/usr/bin/python3 before catkin build.

Running

In the following text, <ROBOT>/<CONFIG> is used to represent different environments, such as go2/himloco and go2w/robot_lab.

Before running, copy the trained pt model file to rl_sar/src/rl_sar/policy/<ROBOT>/<CONFIG>, and configure the parameters in <ROBOT>/<CONFIG>/config.yaml and <ROBOT>/base.yaml.
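As a rough pre-flight check, the layout described above can be verified with a few lines of Python. This is only a sketch and not part of rl_sar: the function names `policy_layout` and `missing_files` are hypothetical, and the policy root directory is passed in as a parameter rather than assumed. It accepts either a `.pt` or an `.onnx` model file, since the project supports both libtorch and onnxruntime.

```python
from pathlib import Path


def policy_layout(policy_root, robot, config):
    """Return the paths that should exist for <ROBOT>/<CONFIG> (hypothetical helper)."""
    root = Path(policy_root)
    return {
        "base": root / robot / "base.yaml",
        "config": root / robot / config / "config.yaml",
        "model_dir": root / robot / config,
    }


def missing_files(policy_root, robot, config):
    """List everything that is still missing before the control program can run."""
    layout = policy_layout(policy_root, robot, config)
    missing = [
        str(path)
        for key, path in layout.items()
        if key != "model_dir" and not path.is_file()
    ]
    model_dir = layout["model_dir"]
    # Any exported model file counts: .pt for libtorch, .onnx for onnxruntime.
    has_model = model_dir.is_dir() and any(
        p.suffix in {".pt", ".onnx"} for p in model_dir.iterdir()
    )
    if not has_model:
        missing.append(f"{model_dir}/<POLICY>.pt or <POLICY>.onnx")
    return missing
```

For example, `missing_files("rl_sar/src/rl_sar/policy", "go2", "himloco")` returns an empty list once the model and both YAML files are in place.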

Simulation

Gazebo

Open a terminal and launch the Gazebo simulation environment:

# ROS1
source devel/setup.bash
roslaunch rl_sar gazebo.launch rname:=<ROBOT>

# ROS2
source install/setup.bash
ros2 launch rl_sar gazebo.launch.py rname:=<ROBOT>

Open a new terminal and launch the control program:

# ROS1
source devel/setup.bash
rosrun rl_sar rl_sim

# ROS2
source install/setup.bash
ros2 run rl_sar rl_sim

Tip

If you cannot see the robot after launching Gazebo on Ubuntu 22.04, the robot was initialized outside the field of view. Its position is reset automatically after rl_sim is launched. If the robot falls over while standing up, press R on the keyboard or RB+Y on the gamepad to reset it.

If Gazebo fails to open the first time you start it, download the model package:

git clone https://github.com/osrf/gazebo_models.git ~/.gazebo/models

Mujoco

./cmake_build/bin/rl_sim_mujoco <ROBOT> <SCENE>
# Example: ./cmake_build/bin/rl_sim_mujoco g1 scene_29dof

Control with Mobile Web (Experimental)

Install dependencies

sudo apt install ros-${ROS_DISTRO}-rosbridge-suite
sudo apt install ros-${ROS_DISTRO}-web-video-server

# If you are using a ROS2 version other than Humble, Jazzy, or Rolling, you need to build `web_video_server` from source
cd <your_ros2_workspace>/src
git clone https://github.com/RobotWebTools/web_video_server.git
cd <your_ros2_workspace>
colcon build --packages-select web_video_server

Run rosbridge and web_video_server on the robot:

# ROS1
roslaunch rosbridge_server rosbridge_websocket.launch
rosrun web_video_server web_video_server

# ROS2
ros2 launch rosbridge_server rosbridge_websocket_launch.xml
ros2 run web_video_server web_video_server

Visit http://robot.robotsfan.com/, fill in the IP address and port, check the settings page in the upper right corner, then connect to the robot. After entering the control page, rotate the screen to landscape and click the full-screen button in the upper left corner. You can then control the robot from your phone's browser.

Control with Gamepad or Keyboard

Gamepad Control	Keyboard Control	Description
Basic
A	Num0	Move the robot from its initial program pose to the `default_dof_pos` defined in `base.yaml` using position control interpolation
B	Num9	Move the robot from its current position to the initial program pose using position control interpolation
X	N	Toggle navigation mode (disables velocity commands, receives `cmd_vel` topic)
Y	N/A	N/A
Simulation
RB+Y	R	Reset Gazebo environment (stand up fallen robot)
RB+X	Enter	Toggle Gazebo run/stop (default: running state)
Motor
LB+A	M	N/A (Recommended for motor enable)
LB+B	K	N/A (Recommended for motor disable)
LB+X	P	N/A Motor passive mode (`kp=0, kd=8`)
LB+RB	N/A	N/A (Recommended for emergency stop)
Skill
RB+DPadUp	Num1	Basic Locomotion
RB+DPadDown	Num2	Skill 2
RB+DPadLeft	Num3	Skill 3
RB+DPadRight	Num4	Skill 4
LB+DPadUp	Num5	Skill 5
LB+DPadDown	Num6	Skill 6
LB+DPadLeft	Num7	Skill 7
LB+DPadRight	Num8	Skill 8
Movement
LY Axis	W/S	Forward/Backward movement (X-axis)
LX Axis	A/D	Left/Right movement (Y-axis)
RX Axis	Q/E	Yaw rotation
N/A (Release joystick)	Space	Reset all control commands to zero
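The A and B actions in the table move the robot by "position control interpolation", and motor passive mode sets `kp=0, kd=8`. The sketch below illustrates both ideas under two assumptions that the table does not state: the interpolation is linear in joint space, and the motors follow the common PD law `tau = kp * (q_des - q) + kd * (dq_des - dq)`. The function names are hypothetical and this is not the project's code.

```python
def interpolate_pose(start, target, alpha):
    """Blend two joint-position lists; alpha=0 gives start, alpha=1 gives target.

    Assumption: linear interpolation in joint space.
    """
    alpha = min(max(alpha, 0.0), 1.0)  # clamp so we never overshoot the target
    return [(1.0 - alpha) * s + alpha * t for s, t in zip(start, target)]


def pd_torque(kp, kd, q_des, q, dq_des, dq):
    """Assumed PD law for one joint: stiffness term plus damping term."""
    return kp * (q_des - q) + kd * (dq_des - dq)


# Halfway between the current pose and the target pose.
halfway = interpolate_pose([0.0, 1.0], [1.0, 3.0], 0.5)

# Passive mode (kp=0, kd=8): the position error is ignored and the torque
# only opposes joint velocity, so the joint behaves like a damper.
passive_tau = pd_torque(kp=0.0, kd=8.0, q_des=0.3, q=0.1, dq_des=0.0, dq=0.5)
```

With `kp=0` the commanded position has no effect, which is consistent with the table listing passive mode separately from the position-controlled A and B actions.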

Real Robots

Unitree A1

Unitree A1 can be connected using both wireless and wired methods:

Wireless: Connect to the Wi-Fi network broadcast by the robot, whose name starts with "Unitree". (Note: A wireless connection may lead to packet loss, disconnection, or even loss of control. Please ensure safety.)
Wired: Use an Ethernet cable to connect any port on the computer and the robot, configure the computer IP as 192.168.123.162, and the netmask as 255.255.255.0

Open a new terminal and start the control program

# ROS1
source devel/setup.bash
rosrun rl_sar rl_real_a1

# ROS2
source install/setup.bash
ros2 run rl_sar rl_real_a1

# CMake
./cmake_build/bin/rl_real_a1

Unitree Go2/Go2W/G1(29dofs)

Ethernet Connection

Connect one end of the Ethernet cable to the Go2/Go2W/G1(29dofs) robot and the other end to your computer. Then, enable USB Ethernet on the computer and configure it. The IP address of the onboard computer on the Go2 robot is 192.168.123.161, so the computer's USB Ethernet address should be set to the same network segment as the robot. For example, enter 192.168.123.222 in the "Address" field (you can replace 222 with another number).

Use the ifconfig command to find the name of the network interface for the 123 network segment, such as enxf8e43b808e06. In the following steps, replace <YOUR_NETWORK_INTERFACE> with the actual network interface name.
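Whether a chosen address is on the same network segment as the robot can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the helper name `same_segment` is hypothetical, and the `255.255.255.0` netmask is an assumption carried over from the A1 wired setup rather than something stated for the Go2.

```python
import ipaddress


def same_segment(host_ip, robot_ip="192.168.123.161", netmask="255.255.255.0"):
    """Return True if host_ip lies in the robot's subnet (hypothetical helper)."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{robot_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(host_ip) in network


ok = same_segment("192.168.123.222")   # address suggested in the text
wrong = same_segment("192.168.1.10")   # a typical home-router address
```

Here `ok` is true and `wrong` is false, so only the first address would be able to reach the robot directly over the Ethernet cable.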

Go2:

Open a new terminal and start the control program. If you are controlling a Go2W, append wheel to the command; otherwise, omit it.

# ROS1
source devel/setup.bash
rosrun rl_sar rl_real_go2 <YOUR_NETWORK_INTERFACE> [wheel]

# ROS2
source install/setup.bash
ros2 run rl_sar rl_real_go2 <YOUR_NETWORK_INTERFACE> [wheel]

# CMake
./cmake_build/bin/rl_real_go2 <YOUR_NETWORK_INTERFACE> [wheel]

G1(29dofs):

Turn on the robot and lift it up, press L2+R2 to enter the debugging mode, then open a new terminal and start the control program.

# ROS1
source devel/setup.bash
rosrun rl_sar rl_real_g1 <YOUR_NETWORK_INTERFACE>

# ROS2
source install/setup.bash
ros2 run rl_sar rl_real_g1 <YOUR_NETWORK_INTERFACE>

# CMake
./cmake_build/bin/rl_real_g1 <YOUR_NETWORK_INTERFACE>

Deploying on the Onboard Jetson

Connect your computer to the robot using the Ethernet cable and log into the Jetson onboard computer. The default password is 123:

ssh unitree@<JETSON_IP>

Connect a phone to the robot's USB port and enable USB tethering on the phone, then pull the code and compile it with ./build.sh -m. After successful compilation, run:

# Go2:
./cmake_build/bin/rl_real_go2 <YOUR_NETWORK_INTERFACE> [wheel]

# G1(29dofs):
./cmake_build/bin/rl_real_g1 <YOUR_NETWORK_INTERFACE>

Then you can unplug the phone and network cable, and control the robot using the remote controller.

Auto-Start on Boot

If you need to set up auto-start on boot, you can follow this process:

Create a service file

sudo touch /etc/systemd/system/rl_sar.service

Write the following content, assuming the rl_sar project is in the ~/rl_sar directory

[Unit]
Description=RL SAR Service
After=network.target

[Service]
Type=simple
User=unitree
WorkingDirectory=/home/unitree/rl_sar
ExecStart=/home/unitree/rl_sar/cmake_build/bin/rl_real_go2 eth0 wheel
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
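The unit above hard-codes the Go2W command line. If the same file is needed for another robot or network interface, it can be generated from a template. The sketch below is an illustration only: `render_unit` and its parameters are hypothetical names, and the template simply mirrors the unit shown above.

```python
UNIT_TEMPLATE = """[Unit]
Description=RL SAR Service
After=network.target

[Service]
Type=simple
User={user}
WorkingDirectory={project_dir}
ExecStart={project_dir}/cmake_build/bin/{binary} {args}
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
"""


def render_unit(user, project_dir, binary, args):
    """Fill in the service template (hypothetical helper, not part of rl_sar)."""
    return UNIT_TEMPLATE.format(
        user=user,
        project_dir=project_dir,
        binary=binary,
        args=" ".join(args),
    )


# Same settings as the unit file shown above (Go2W on eth0).
go2w_unit = render_unit("unitree", "/home/unitree/rl_sar", "rl_real_go2", ["eth0", "wheel"])
# G1 variant: different binary, no "wheel" argument.
g1_unit = render_unit("unitree", "/home/unitree/rl_sar", "rl_real_g1", ["eth0"])
```

The rendered text would still need to be written to /etc/systemd/system/rl_sar.service with root privileges, followed by the daemon-reload step.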

Reload the systemd configuration:

sudo systemctl daemon-reload

Enable auto-start on boot:

sudo systemctl enable rl_sar.service

Disable auto-start on boot:

sudo systemctl disable rl_sar.service

Start the service:

sudo systemctl start rl_sar.service

Stop the service:

sudo systemctl stop rl_sar.service

Restart the service:

sudo systemctl restart rl_sar.service

View service logs:

sudo journalctl -u rl_sar.service -f

After reboot, the robot first runs its built-in standing program. Once the rl_sar service starts, the robot automatically switches to damping mode, after which it can be controlled normally with the remote controller.

Deeprobotics Lite3

Deeprobotics Lite3 can be connected wirelessly. (A wired connection has not been tested. On some versions of Lite3, the wired Ethernet port may require additional installation.)

Connect to the Wi-Fi network broadcast by the robot, whose name starts with "Lite3". We strongly recommend testing communication with the Lite3 using Lite3_Motion_SDK before use. (Note: A wireless connection may lead to packet loss, disconnection, or even loss of control. Please ensure safety.)
Determine the IP address and port number of the Lite3, and modify lines 46-48 in rl_sar/src/rl_real_lite3.cpp.
Then update jy_exe/conf/network.toml on the Lite3 motion host to set the IP and port to those of the local machine running ROS2, enabling communication.

Caution

Recheck the joint mapping parameters in rl_sar/policy/himloco/config.yaml. The default joint mapping in the Sim2Sim configuration differs from the one used on the real robot. If it is not updated accordingly, the mismatch may lead to incorrect robot behavior and potential safety hazards.

Lite3 also supports control with the Deeprobotics Retroid gamepad. See Deeprobotics Gamepad.

Open a new terminal and start the control program

# ROS1
source devel/setup.bash
rosrun rl_sar rl_real_lite3

# ROS2
source install/setup.bash
ros2 run rl_sar rl_real_lite3

# CMake
./cmake_build/bin/rl_real_lite3

Train the actuator network

Take A1 as an example below

Uncomment #define CSV_LOGGER at the top of rl_real_a1.hpp. You can also modify the corresponding part of the simulation program to collect simulation data for testing the training process.
Run the control program, and the program will log all data in src/rl_sar/policy/<ROBOT>/motor.csv.
Stop the control program and start training the actuator network. Note that rl_sar/src/rl_sar/policy/ is omitted before the following paths.
```
rosrun rl_sar actuator_net.py --mode train --data a1/motor.csv --output a1/motor.pt
```

Verify the trained actuator network.

rosrun rl_sar actuator_net.py --mode play --data a1/motor.csv --output a1/motor.pt

Add Your Robot

The following uses <ROBOT>/<CONFIG> to represent your robot environment. You only need to create or modify the following files, and the names must exactly match those shown below. (You can refer to the corresponding files in go2w as examples.)

# your robot description
rl_sar/src/rl_sar_zoo/<ROBOT>_description/CMakeLists.txt
rl_sar/src/rl_sar_zoo/<ROBOT>_description/package.ros1.xml
rl_sar/src/rl_sar_zoo/<ROBOT>_description/package.ros2.xml
rl_sar/src/rl_sar_zoo/<ROBOT>_description/xacro/robot.xacro
rl_sar/src/rl_sar_zoo/<ROBOT>_description/xacro/gazebo.xacro
rl_sar/src/rl_sar_zoo/<ROBOT>_description/config/robot_control.yaml
rl_sar/src/rl_sar_zoo/<ROBOT>_description/config/robot_control_ros2.yaml

# your policy
policy/<ROBOT>/base.yaml  # This file must follow the physical robot's joint order
policy/<ROBOT>/<CONFIG>/config.yaml
policy/<ROBOT>/<CONFIG>/<POLICY>.pt  # for libtorch, note that exporting JIT is required
policy/<ROBOT>/<CONFIG>/<POLICY>.onnx  # for onnxruntime

# fsm for robot
src/rl_sar/fsm_robot/fsm_<ROBOT>.hpp
src/rl_sar/fsm_robot/fsm_all.hpp

# your real robot code
rl_sar/src/rl_sar/src/rl_real_<ROBOT>.cpp  # You can customize the forward() function as needed to adapt to your policy
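Because base.yaml must follow the physical robot's joint order while a trained policy may list joints in a different order, values usually have to be permuted between the two. The sketch below shows one way to derive such a permutation from joint names. It is not the project's implementation: the helper names and the four joint names are hypothetical and serve only to illustrate the idea.

```python
def build_index_map(policy_order, hardware_order):
    """For each hardware joint, return its index in the policy's joint list."""
    position = {name: i for i, name in enumerate(policy_order)}
    # Both lists must contain exactly the same joints, each exactly once.
    if len(position) != len(policy_order) or sorted(policy_order) != sorted(hardware_order):
        raise ValueError("policy and hardware joint lists do not match")
    return [position[name] for name in hardware_order]


def reorder(values, index_map):
    """Permute policy-ordered values into hardware order."""
    return [values[i] for i in index_map]


# Hypothetical joint names, for illustration only.
policy_order = ["FL_hip", "FR_hip", "FL_thigh", "FR_thigh"]
hardware_order = ["FR_hip", "FR_thigh", "FL_hip", "FL_thigh"]

index_map = build_index_map(policy_order, hardware_order)
# Policy output is in policy order; the motors expect hardware order.
hardware_targets = reorder([0.1, 0.2, 0.3, 0.4], index_map)
```

Deriving the permutation from names rather than writing the indices by hand makes a mismatched or missing joint fail loudly instead of silently driving the wrong motor.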

Contributing

We wholeheartedly welcome contributions from the community to make this framework mature and useful for everyone. Contributions may take the form of bug reports, feature requests, or code.

List of contributors

Citation

Please cite the following if you use this code or parts of it:

@software{fan-ziqi2024rl_sar,
  author = {fan-ziqi},
  title = {rl_sar: Simulation Verification and Physical Deployment of Robot Reinforcement Learning Algorithm.},
  url = {https://github.com/fan-ziqi/rl_sar},
  year = {2024}
}

Acknowledgements

The project uses some code from the following open-source code repositories:
