A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward models and learning strategies across training, inference, and po…

60 2 Updated Jun 13, 2025

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

Python 4,995 487 Updated Jan 18, 2026

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,826 283 Updated Dec 23, 2025

rkinas / reasoning_models_how_to

This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest resea…

Python 125 11 Updated Jul 28, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,825 2,411 Updated Nov 24, 2025

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 3,378 427 Updated Jan 18, 2026

alibaba / ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2,664 204 Updated Jan 18, 2026

OpenManus / OpenManus-RL

A live stream development of RL tuning for LLM agents

Python 3,814 527 Updated Oct 8, 2025

Joseri Chen Joserii

LLM_RL

GAIR-NLP / LIMR

huggingface / trl

volcengine / verl

simplescaling / s1

princeton-nlp / LESS

Unakar / Logic-RL

deepseek-ai / DeepSeek-Math

yuanzhoulvpi2017 / vscode_debug_transformers

RLHFlow / Minimal-RL

PRIME-RL / TTRL