Tags: ROCm/ATOM
ci(benchmark): run DP-attention up to c=1024 at util 0.85, rename suffix to -dpa (#986)

- models.json: DeepSeek-V4-Pro DP gets --gpu-memory-utilization 0.85 (dp-attention prefill MoE peak needs headroom; 0.9 OOMs at high concurrency, 0.85 verified stable through c=1024). Rename suffix -dp -> -dpa.
- nightly_params.json: add 512, 1024 to the concurrency sweep (both 1k/1k and 8k/1k).
- atom-benchmark.yaml: keep 512/1024 scoped to DP-attention only; exclude them for all other models (suffix "" and "-mtp3"), and update the existing low-concurrency DP excludes to the new -dpa suffix.
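The scoping rule for the new concurrency points can be illustrated with a small sketch. It is not the contents of atom-benchmark.yaml: the function names and the low-concurrency placeholder values are hypothetical, and only the values 512 and 1024 and the suffixes "", "-mtp3", and "-dpa" are taken from the commit message. The pre-existing low-concurrency DP excludes are not modeled, because the commit message does not say which values they cover.

```python
from itertools import product

# Taken from the commit message: the two new sweep points and the DP-attention suffix.
HIGH_CONCURRENCY = {512, 1024}
DPA_SUFFIX = "-dpa"


def is_excluded(suffix: str, concurrency: int) -> bool:
    """Return True when a (suffix, concurrency) cell should be skipped.

    Only the rule stated in the commit is modeled: 512 and 1024 run
    exclusively for the DP-attention variant.
    """
    return concurrency in HIGH_CONCURRENCY and suffix != DPA_SUFFIX


def build_matrix(suffixes, concurrencies):
    """Expand the full cross product, then drop the excluded cells."""
    return [
        (suffix, conc)
        for suffix, conc in product(suffixes, concurrencies)
        if not is_excluded(suffix, conc)
    ]


if __name__ == "__main__":
    suffixes = ["", "-mtp3", DPA_SUFFIX]
    # 64 and 256 are placeholder values for illustration, not the real sweep.
    concurrencies = [64, 256, 512, 1024]
    for suffix, conc in build_matrix(suffixes, concurrencies):
        print(f"suffix={suffix!r} concurrency={conc}")
```

Expanding the cross product first and filtering afterwards mirrors how a CI matrix with an exclude list behaves, which is why the rule is written as a predicate over single cells.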
[plugin][OOT Benchmark] Refine OOT benchmark(manual trigger) to cover key models (#409)

* [plugin][oot benchmark] refine the OOT benchmark workflow
* add model qwen3.5; change to manual trigger; align env and arguments; choice box default false
* set 4 GPU machine for Kimi-K2 TP4
* if the model has not been chosen, the gpu runner will not be dispatched
* change the config
* remove redundant env flag for gptoss
* add specific branch trigger OOT benchmark for acceptance test when upgrading vLLM
* change the oot benchmark behavior
* refine the docker remove logic and rebuild logic
* add Qwen3-Next into OOT benchmark
* refine the summary workflow
* change gptoss model to openai version
* direct use the model weight, which has been downloaded onto the machine
* change the model weight path
* make lint happy

Signed-off-by: zejunchen-zejun <[email protected]>