
Zephyr271828

Highlights

  • Pro

Organizations

@NYUSH-AIIG

Stars

RL

10 repositories

[ICML 2022] The official implementation of DWBC in "Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations"

Python, 35 stars, 2 forks, updated Jan 5, 2023

Train transformer language models with reinforcement learning.

Python, 16,145 stars, 2,271 forks, updated Nov 4, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python, 15,092 stars, 2,418 forks, updated Nov 4, 2025

Implementations and examples of common offline policy evaluation methods in Python.

Python, 224 stars, 25 forks, updated Feb 11, 2023

🙌 OpenHands: Code Less, Make More

Python, 64,680 stars, 7,863 forks, updated Nov 4, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python, 8,311 stars, 807 forks, updated Oct 31, 2025

Awesome List for Agentic RL

HTML, 531 stars, 16 forks, updated Oct 13, 2025

[NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Python, 177 stars, 19 forks, updated Jul 7, 2025

Resources for the Enigmata Project.

Python, 73 stars, 4 forks, updated Aug 13, 2025

slime is an LLM post-training framework for RL Scaling.

Python, 2,361 stars, 241 forks, updated Nov 3, 2025