- National University of Singapore
- Singapore
- https://vanzll.github.io/
Frontier-CS Public
Forked from FrontierCS/Frontier-CS. A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science.
Python MIT License Updated Jan 23, 2026
verl-agent Public
Forked from langfengQ/verl-agent. verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. It is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training".
Python Apache License 2.0 Updated Jan 13, 2026
Awesome-Flow-RL-Papers Public
Forked from Tonghe-Zhang/Awesome-Flow-RL-Papers. A collection of papers and projects that train flow matching models/policies via RL.
MIT License Updated Dec 3, 2025
agent-lightning Public
Forked from microsoft/agent-lightning. The absolute trainer to light up AI agents.
Python MIT License Updated Nov 4, 2025
EBC Public
[ICML'25] Diversifying Policy Behaviors via Extrinsic Behavioral Curiosity
Awesome-Process-Reward-Models Public
Forked from RyanLiu112/Awesome-Process-Reward-Models. A comprehensive collection of process reward models.
Updated Oct 4, 2025
qdhf Public
Forked from ld-ing/qdhf. Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization (ICML 2024)
Python MIT License Updated Jul 4, 2025
Uni-RLHF-Platform Public
Forked from pickxiguapi/Uni-RLHF-Platform. Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR 2024)
Python MIT License Updated Nov 20, 2024