Jayden-Xu

Peiyu Xu (Jayden-Xu)

ML Infra

2 followers · 1 following

in/peiyu-xu-0b7813339

Pinned

SGLang-RadixMoE (Public)

Forked from sgl-project/sglang

RadixMoE extends SGLang's Radix Cache to store expert activation patterns alongside KV cache entries, enabling zero-compute routing prediction for Mixture-of-Experts models.

Python 1

vLLM-FlashMLA (Public)

Forked from vllm-project/vllm

FlashMLA is a high-performance kernel library optimized for Multi-Head Latent Attention (MLA) architectures and integrated into vLLM.

Python 1