Stars
Code Repository of Evaluating Quantized Large Language Models
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
slime is an LLM post-training framework for RL Scaling.
Algorithm powering the For You feed on X
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation Model
A framework for efficient model inference with omni-modality models
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
MiniMax-M2, a model built for Max coding & agentic workflows.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
🔥 Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.
Text-audio foundation model from Boson AI
Hackable and optimized Transformers building blocks, supporting a composable construction.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Seamless operability between C++11 and Python
FlashInfer: Kernel Library for LLM Serving
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
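百聆 (Bailing) is a GPT-4o-style voice conversation bot built on an ASR+LLM+TTS pipeline. It integrates strong large models such as DeepSeek R1, achieves latency as low as 800 ms, runs on low-spec machines such as Macs, and supports interruption.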
Machine Learning Engineering Open Book
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM