
Ki6an/SkyRL

 
 


SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning


News

  • [2025/05/06] 🎉 We released SkyRL-v0, the first open-source online RL training framework for multi-turn tool-use LLMs, optimized for long-horizon, real-environment tasks such as SWE-Bench!

Getting Started

This repository contains training code for the SkyRL-v0 release. Our implementation is a fork of VeRL.

Installation

The only prerequisite is having uv installed on your system. We use the uv + ray integration to manage dependencies in multi-node training.

Clone SkyRL-OpenHands

We use SkyRL-OpenHands to connect to our remote runtime server. Clone the repository and place it in the git root:

git clone https://github.com/NovaSky-AI/SkyRL-OpenHands

Installation dry run

You can dry-run your installation with the following command:

uv run --isolated --frozen pip show torch
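Before the dry run, it can help to confirm the two setup steps above. The following is a minimal sketch, not part of the repository: `preflight` is a hypothetical helper, and it assumes the clone keeps its default directory name, SkyRL-OpenHands, directly under the git root.

```python
import shutil
from pathlib import Path


def preflight(repo_root: Path) -> list[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # uv is the only stated prerequisite, so it must be discoverable on PATH.
    if shutil.which("uv") is None:
        problems.append("uv not found on PATH")
    # Assumption: the clone sits in the git root under its default name.
    if not (repo_root / "SkyRL-OpenHands").is_dir():
        problems.append(f"SkyRL-OpenHands not found in {repo_root}")
    return problems


if __name__ == "__main__":
    # Run from the git root of this repository.
    for problem in preflight(Path.cwd()):
        print("WARNING:", problem)
```

Run it from the git root. If it prints nothing, both checks passed.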

NOTE: With a CPU head node, you might encounter installation issues with torch-memory-saver. To fix this, install CUDA and make sure your CUDA libraries are linked in /usr/lib. For example:

sudo ln -s /usr/local/cuda-12.4/compat/libcuda.so /usr/lib/libcuda.so
sudo ln -s /usr/local/cuda-12.4/compat/libcuda.so.1 /usr/lib/libcuda.so.1
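To check whether these links are in place, a short script along these lines may be useful. It is a sketch only: `missing_cuda_libs` is a hypothetical helper, and the two library names are taken from the commands above.

```python
from pathlib import Path

# Library names taken from the symlink commands above.
REQUIRED_LIBS = ("libcuda.so", "libcuda.so.1")


def missing_cuda_libs(lib_dir: Path = Path("/usr/lib")) -> list[str]:
    """Return the required library names that do not resolve to a file in lib_dir."""
    # Path.exists() follows symlinks, so a dangling link is reported as missing.
    return [name for name in REQUIRED_LIBS if not (lib_dir / name).exists()]


if __name__ == "__main__":
    missing = missing_cuda_libs()
    if missing:
        print("Missing in /usr/lib:", ", ".join(missing))
    else:
        print("CUDA driver libraries are linked in /usr/lib")
```

Because the check follows symlinks, a link that points to a nonexistent CUDA path is also reported as missing.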

Scripts for reproduction

To reproduce our results for SkyRL-Agent-14B-v0, SkyRL-Agent-8B-v0, and SkyRL-Agent-7B-v0, see the scripts in examples/sky.

SWE-Bench-Verified Evaluation Results

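Model              | Base                 | Base Performance | Performance | Training Time
-------------------|----------------------|------------------|-------------|--------------
SkyRL-Agent-7B-v0  | OpenHands-7B-Agent   | 11%              | 14.6%       | 16hrs 8xH100
SkyRL-Agent-8B-v0  | Qwen3-8B no thinking | 3.6%             | 9.4%        | 27hrs 8xH200
SkyRL-Agent-14B-v0 | Qwen3-14B thinking   | 18%              | 21.6%       | 20hrs 8xH200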
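
The gains in the table above can be summarized with a little arithmetic. The sketch below computes each model's absolute improvement in percentage points and a rough GPU-hour figure. It assumes that an entry such as "16hrs 8xH100" means 16 hours of wall-clock time on 8 GPUs. The `summarize` helper is illustrative only.

```python
# (model, base score %, trained score %, hours, GPU count), copied from the table above.
RESULTS = [
    ("SkyRL-Agent-7B-v0", 11.0, 14.6, 16, 8),
    ("SkyRL-Agent-8B-v0", 3.6, 9.4, 27, 8),
    ("SkyRL-Agent-14B-v0", 18.0, 21.6, 20, 8),
]


def summarize(rows):
    """Return {model: (absolute gain in percentage points, GPU-hours)}."""
    return {
        model: (round(trained - base, 1), hours * gpus)
        for model, base, trained, hours, gpus in rows
    }


for model, (gain, gpu_hours) in summarize(RESULTS).items():
    print(f"{model}: +{gain} points, {gpu_hours} GPU-hours")
```

Under that assumption, the gains are 3.6, 5.8, and 3.6 points for 128, 216, and 160 GPU-hours respectively. The 7B run used H100s and the other two used H200s, so the GPU-hour figures are not directly comparable.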

Acknowledgement

This work was done at the Berkeley Sky Computing Lab, with generous compute support from Lambda Labs, Anyscale, and Databricks.
