- Tsinghua University → Beihang University
- Haidian, Beijing
- https://zhangmenghao.github.io/
- https://orcid.org/0000-0001-5274-5512
Stars
Secure and fast microVMs for serverless computing.
Venus Collective Communication Library, supported by SII and Infrawaves.
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)
Analyze computation-communication overlap in V3/R1.
Aims to implement dual-port and multi-QP solutions in the DeepEP IBRC transport.
yana-27 / BTDefense
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
DLRover: An Automatic Distributed Deep Learning System
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale
Lumina is a user-friendly tool to test the correctness and performance of hardware network stacks.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Large Language Model (LLM) Systems Paper List
PArametrized Recommendation and AI Model benchmark is a repository for the development of numerous uBenchmarks as well as end-to-end nets for evaluation of training and inference platforms.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech).
Zeta is a distributed platform for developing and deploying complex, elastic, and highly available multi-tenant network services.