Starred repositories
Machine Learning Engineering Open Book
[NeurIPS 2025 D&B Spotlight] AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
SGLang is a fast serving framework for large language models and vision language models.
Generate-on-Graph: Treat LLM as both Agent and KG for Incomplete Knowledge Graph Question Answering. EMNLP 2024 Main
Forward-Looking Active REtrieval-augmented generation (FLARE)
[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memory that enables LLMs to continuously integrate knowledge across external documents. RAG + Knowledge Graphs + Personali…
AutoMQ is a diskless Kafka® on S3. 10x Cost-Effective. No Cross-AZ Traffic Cost. Autoscale in seconds. Single-digit ms latency. Multi-AZ Availability.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning
Library for Knowledge Intensive Language Tasks
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)
Search-R1: An Efficient, Scalable RL Training Framework for LLMs That Interleave Reasoning with Search Engine Calls, Based on veRL
[CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models
[ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]
Papers on LLM Reasoning and Retrieval-Augmented LLM Reasoning
[CIKM 2025] LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking
[KDD 2025] AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning
Fully open reproduction of DeepSeek-R1
Production-ready platform for agentic workflow development.