

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Language: Python | Stars: 50,624 | Forks: 4,175 | Updated Jan 13, 2026

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Language: Python | Stars: 40,735 | Forks: 7,103 | Updated Jan 13, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Language: Python | Stars: 18,272 | Forks: 3,015 | Updated Jan 13, 2026

Train transformer language models with reinforcement learning.

Language: Python | Stars: 16,937 | Forks: 2,417 | Updated Jan 12, 2026

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Language: Python | Stars: 16,617 | Forks: 4,950 | Updated Aug 1, 2024

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Language: Python | Stars: 12,505 | Forks: 2,038 | Updated Dec 18, 2025

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Language: Python | Stars: 8,808 | Forks: 958 | Updated Jul 8, 2025

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Language: Python | Stars: 8,772 | Forks: 848 | Updated Jan 8, 2026

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.

Language: Python | Stars: 553 | Forks: 59 | Updated Sep 11, 2025