The Large Model Systems Organization develops large models and systems
that are open, accessible, and scalable.
Latest Blog
Elastic EP in SGLang: Achieving Partial Failure Tolerance for DeepSeek MoE Deployments
To serve massive Mixture-of-Experts (MoE) models efficiently, deploying a "wide" Expert Parallelism (EP) strategy, often spanning 32 GPUs or more per inference instance, is not just an option; it is a necessity.

ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct™ GPUs
Reinforcement learning (RL) has rapidly become a core stage of modern foundation-model development. While large-scale pretraining remains essential, today's most capable models rely heavily on post-training.

SGLang Adds Day-0 Support for NVIDIA Nemotron 3 Super for Building High-Efficiency Multi-Agent Systems
We are excited to announce that SGLang supports NVIDIA Nemotron 3 Super on Day 0. Nemotron 3 Super is a leading open model in the Nemotron 3 family, built for running many collaborating agents together.
Projects
Our Sponsors & Partners
Backed by leading companies and institutions advancing AI research.
Voltage Park, NVIDIA, Nebius, Google Cloud, AtlasCloud, a16z, AMD, InnoMatrix, Laude Institute, Hyperbolic, NovitaAI, Verda Cloud, Sky9, Kaggle, MBZUAI, Together, RunPod, Anyscale, HuggingFace