ATC25 Colocating ML Inference and Training with Fast GPU Memory Handover
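Today yf shares an ATC25 paper from IPADS: Colocating ML Inference and Training with Fast GPU Memory Handover. Quick take: as usual for IPADS, this is a large engineering effort, spanning TVM + vLLM + NCCL + PyTorch. Everyone asked a lot of questions at the group meeting. https://ipads.se.sjtu.edu.cn/_media/publications/si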
- Paper Reading
- Haibin
- 2026-01-15