Stars
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
Recipes to scale inference-time compute of open models
Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Manage scalable open LLM inference endpoints in Slurm clusters
A beautiful, simple, clean, and responsive Jekyll theme for academics
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
FluidML is a lightweight framework for developing machine learning pipelines.
Evaluate your dialog model with 17 metrics! (see paper)
A list of semi to fully remote-friendly companies (jobs) in tech.
Shared repository for open-sourced projects from the Google AI Language team.
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Research code for pixel-based encoders of language (PIXEL)
Simple (but often Strong) Baselines for POMDPs in PyTorch, ICML 2022
ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
A Real-World Benchmark for Reinforcement Learning based Recommender System
MetaDict is a powerful dict subclass enabling (nested) attribute-style item access/assignment and IDE autocompletion support.
Python Implementation of Reinforcement Learning: An Introduction
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk