
thomlake

Python 962 101 Updated Dec 21, 2025
Python 202 15 Updated Oct 27, 2025

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,716 81 Updated Apr 18, 2025

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python 583 50 Updated Oct 31, 2025
Python 465 37 Updated Aug 28, 2025
Python 143 33 Updated Jul 23, 2025

A Framework for LLM-based Multi-Agent Reinforced Training and Inference

Python 377 42 Updated Nov 20, 2025

Low ReSource Reinforcement Learning with CPU Offloading Training Support

Python 78 7 Updated Dec 10, 2025

The absolute trainer to light up AI agents.

Python 9,769 790 Updated Dec 21, 2025

Our library for RL environments + evals

Python 3,654 453 Updated Dec 21, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,297 116 Updated Dec 11, 2025

A version of verl to support diverse tool use

Python 768 63 Updated Dec 10, 2025

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 8,051 643 Updated Dec 19, 2025

PyTorch Single Controller

Rust 929 120 Updated Dec 20, 2025

A collection of prompts to challenge the reasoning abilities of large language models in presence of misguiding information

Python 452 26 Updated Jul 31, 2025

Official Repo for Open-Reasoner-Zero

Python 2,084 119 Updated Jun 2, 2025

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Python 896 50 Updated Sep 30, 2025

A python module to repair invalid JSON from LLMs

Python 4,195 161 Updated Dec 17, 2025
Python 27 3 Updated Feb 26, 2024

Turns Data and AI algorithms into production-ready web applications in no time.

Python 18,967 1,961 Updated Dec 2, 2025

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Python 2,957 164 Updated Jul 9, 2025

AsyncIO serving for data science models

Python 24 Updated Dec 8, 2022

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.

TypeScript 3,618 370 Updated Dec 5, 2025

Hydra is a framework for elegantly configuring complex applications

Python 10,051 760 Updated Dec 11, 2025

Datasets, tools, and benchmarks for representation learning of code.

Jupyter Notebook 2,400 410 Updated Jan 31, 2022

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 154,114 31,504 Updated Dec 20, 2025

One hundred challenge problems for logical formalizations of commonsense psychology

27 1 Updated Oct 9, 2025

🔮 A refreshing functional take on deep learning, compatible with your favorite libraries

Python 2,884 288 Updated Dec 12, 2025

Documentation on how to access and use the Quick, Draw! Dataset.

6,613 1,024 Updated Mar 11, 2025