Starred repositories
- A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
- An easy-to-use, scalable, and high-performance RLHF framework based on Ray (PPO, GRPO, REINFORCE++, vLLM, dynamic sampling, and async agentic RL).
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
- Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025).
- Recursive-Open-Meta-Agent v0.1 (beta): a meta-agent framework for building high-performance multi-agent systems.
- τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment.
- Tongyi Deep Research, the leading open-source deep research agent.
- Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
- Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
- LOFT: A 1 Million+ Token Long-Context Benchmark.
- Simple retrieval from LLMs at various context lengths to measure accuracy (a toy sketch of the idea follows this list).
- Fine-tuning & reinforcement learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, and TTS models 2x faster with 70% less VRAM.
- An open-weights language model from Google DeepMind, based on Griffin.
- [ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI.
- Helpful tools and examples for working with flex-attention.
- A tiny library for coding with large language models.
- Evaluate your LLM's responses with Prometheus and GPT-4 💯
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers.
- 🚀 Efficient implementations of state-of-the-art linear attention models.
- Train transformer language models with reinforcement learning.
- tiktoken is a fast BPE tokeniser for use with OpenAI's models (a minimal usage sketch follows this list).
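For the context-length retrieval item above, here is a toy sketch of the needle-in-a-haystack idea, not the repository's actual code: plant a known fact ("the needle") at varying depths inside filler text of varying lengths, ask the model to retrieve it, and score each (context length, depth) cell. The `query_model` stub, needle text, and grid values are hypothetical placeholders; swap in a real LLM client to measure an actual model.

```python
# Toy needle-in-a-haystack sketch (hypothetical, not the repo's code):
# insert a known fact at a fractional depth in filler text, ask for it
# back, and record accuracy per (context length, depth) cell.

NEEDLE = "The best thing to do in San Francisco is eat a sandwich in Dolores Park."
QUESTION = "What is the best thing to do in San Francisco?"
FILLER = "The grass is green. The sky is blue. " * 4000  # haystack text

def query_model(prompt: str) -> str:
    """Placeholder for a real LLM call. This dummy 'model' just echoes the
    first sentence mentioning the topic so the script runs end to end."""
    for sentence in prompt.split("."):
        if "best thing to do in San Francisco" in sentence:
            return sentence.strip()
    return "I don't know."

def build_haystack(context_chars: int, depth: float) -> str:
    """Insert the needle at fractional `depth` into `context_chars` of filler."""
    haystack = FILLER[:context_chars]
    pos = int(len(haystack) * depth)
    return haystack[:pos] + " " + NEEDLE + " " + haystack[pos:]

def run_grid(context_sizes, depths):
    results = {}
    for size in context_sizes:
        for depth in depths:
            prompt = (build_haystack(size, depth)
                      + f"\n\nAnswer only from the text above. {QUESTION}")
            answer = query_model(prompt)
            # Crude scoring: did the key phrase make it into the answer?
            results[(size, depth)] = "Dolores Park" in answer
    return results

if __name__ == "__main__":
    print(run_grid(context_sizes=[2_000, 20_000], depths=[0.0, 0.5, 1.0]))
```

Substring-match scoring is deliberately crude; a fuller setup would use a stronger judge (e.g. an LLM grader), but the (length × depth) grid is the essential structure.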
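And for the last item, a minimal tiktoken usage sketch (assuming `pip install tiktoken`; `cl100k_base` is one of the library's published encodings):

```python
# Minimal tiktoken round-trip: encode text to BPE token ids and back.
import tiktoken

# Load an encoding by name, or look one up for a specific model.
enc = tiktoken.get_encoding("cl100k_base")

text = "tiktoken is a fast BPE tokeniser."
tokens = enc.encode(text)  # list[int] of token ids
print(f"{len(tokens)} tokens: {tokens}")

# Decoding the ids recovers the original string exactly.
assert enc.decode(tokens) == text

# encoding_for_model maps a model name to its tokeniser.
enc_4o = tiktoken.encoding_for_model("gpt-4o")
print(enc_4o.name)  # o200k_base
```

Counting tokens this way is the usual trick for budgeting a prompt against a model's context window before sending it.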