Tags: ROCm/ATOM

v0.1.3

ci(benchmark): run DP-attention up to c=1024 at util 0.85, rename suffix to -dpa (#986)

- models.json: DeepSeek-V4-Pro DP gets --gpu-memory-utilization 0.85
  (dp-attention prefill MoE peak needs headroom; 0.9 OOMs at high
  concurrency, 0.85 verified stable through c=1024). Rename suffix -dp -> -dpa.
- nightly_params.json: add 512, 1024 to the concurrency sweep (both 1k/1k
  and 8k/1k).
- atom-benchmark.yaml: keep 512/1024 scoped to DP-attention only — exclude
  them for all other models (suffix "" and "-mtp3"); update the existing
  low-concurrency DP excludes to the new -dpa suffix.
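
The scoping rule in the last bullet can be sketched as a small matrix filter. This is only an illustration of the idea, not the contents of atom-benchmark.yaml: the function name `build_matrix` and the low concurrency values are hypothetical, and only 512, 1024, and the suffixes "", "-mtp3", and "-dpa" come from the commit message. The existing low-concurrency DP excludes are not modeled because the message does not list them.

```python
from itertools import product

# Hypothetical sweep: 64 and 128 are placeholders; only 512 and 1024
# are named in the commit message.
CONCURRENCIES = [64, 128, 512, 1024]
SUFFIXES = ["", "-mtp3", "-dpa"]

HIGH_CONCURRENCY = {512, 1024}
DPA_SUFFIX = "-dpa"


def build_matrix(suffixes, concurrencies):
    """Return (suffix, concurrency) pairs, keeping 512/1024 for -dpa only."""
    return [
        (suffix, c)
        for suffix, c in product(suffixes, concurrencies)
        if c not in HIGH_CONCURRENCY or suffix == DPA_SUFFIX
    ]


matrix = build_matrix(SUFFIXES, CONCURRENCIES)
for suffix, c in matrix:
    print(f"suffix={suffix!r} concurrency={c}")
```

Under these assumptions, every suffix runs the low concurrencies, while 512 and 1024 appear only for the -dpa configuration.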

v0.1.2

[plugin][OOT Benchmark] Refine OOT benchmark(manual trigger) to cover key models (#409)

* [plugin][oot benchmark] refine the OOT benchmark workflow

Signed-off-by: zejunchen-zejun <[email protected]>

* add model qwen3.5
change to manual trigger
align env and arguments
choice box default false

Signed-off-by: zejunchen-zejun <[email protected]>

* set 4 GPU machine for Kimi-K2 TP4

Signed-off-by: zejunchen-zejun <[email protected]>

* if the model has not been chosen, the gpu runner
will not be dispatched

Signed-off-by: zejunchen-zejun <[email protected]>

* change the config

Signed-off-by: zejunchen-zejun <[email protected]>

* remove redundant env flag for gptoss

Signed-off-by: zejunchen-zejun <[email protected]>

* add specific branch trigger OOT benchmark
for acceptance test when upgrading vLLM

Signed-off-by: zejunchen-zejun <[email protected]>

* change the oot benchmark behavior

Signed-off-by: zejunchen-zejun <[email protected]>

* refine the docker remove logic and rebuild logic

Signed-off-by: zejunchen-zejun <[email protected]>

* add Qwen3-Next into OOT benchmark

Signed-off-by: zejunchen-zejun <[email protected]>

* refine the summary workflow

Signed-off-by: zejunchen-zejun <[email protected]>

* change gptoss model to openai version

Signed-off-by: zejunchen-zejun <[email protected]>

* directly use the model weights that have already been downloaded
onto the machine

Signed-off-by: zejunchen-zejun <[email protected]>

* change the model weight path

Signed-off-by: zejunchen-zejun <[email protected]>

* make lint happy

Signed-off-by: zejunchen-zejun <[email protected]>

---------

Signed-off-by: zejunchen-zejun <[email protected]>