Stars
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
A construction kit for reinforcement learning environment management.
A multi-voice TTS system trained with an emphasis on quality
Bridge Megatron-Core to Hugging Face/Reinforcement Learning
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr…
A PyTorch native platform for training generative AI models
Renderer for the harmony response format to be used with gpt-oss
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
My learning notes and code for ML systems.
A Datacenter Scale Distributed Inference Serving Framework
DLRover: An Automatic Distributed Deep Learning System
NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / LLaMA Factory / veRL / Swift / Ultra…
Community maintained hardware plugin for vLLM on Ascend
verl: Volcano Engine Reinforcement Learning for LLMs
An educational resource to help anyone learn deep reinforcement learning.
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
Ring attention implementation with flash attention
FlashInfer: Kernel Library for LLM Serving
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A suite of image and video neural tokenizers
A high-throughput and memory-efficient inference and serving engine for LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
SGLang is a fast serving framework for large language models and vision language models.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)