The Large Model Systems Organization develops large models and systems that are open, accessible, and scalable.
Latest Blog
See all posts
Higgs Audio v3 TTS on SGLang-Omni: Real-Time, Controllable Speech for Voice Agents
Today we are announcing end-to-end serving for Higgs Audio v3 TTS on SGLang-Omni. Higgs Audio v3 TTS is Boson AI's text-to-speech model for conversational voice agents: it generates natural and expres...

SGLang and Miles Add Day-0 Support for NVIDIA Nemotron 3 Ultra for Long-Running Autonomous Agents
We are excited to announce that SGLang and Miles support NVIDIA Nemotron 3 Ultra on Day 0. Agentic AI systems are moving from short prompt-response interactions to persistent workflows that plan, us...

Heterogeneous CPU + GPU EPD Disaggregation to Boost VLM Serving
TL;DR We enabled heterogeneous Encode-Prefill-Decode (EPD) disaggregation via Dynamo and SGLang for Vision-Language Models (VLMs). By offloading vision encoding tasks to CPUs (the easiest-getting CPU...
Projects
View all projects
Our Sponsors & Partners
Backed by leading companies and institutions advancing AI research.
Voltage Park, NVIDIA, Nebius, Google Cloud, AtlasCloud, a16z, AMD, InnoMatrix, Laude Institute, Hyperbolic, NovitaAI, Verda Cloud, Sky9, Kaggle, MBZUAI, Together, RunPod, Anyscale, HuggingFace




