omirosh · 2026-06-04T05:19:41Z

Summary

Adds FP8 fused-MoE tuning entries for GLM-4.7 running with expert parallelism (EP) = 4
plus the fused shared-expert (FSE) path (introduced in vllm-project/vllm#44313).

Changes

aiter/configs/model_configs/glm47_fp8_untuned_fmoe.csv: append 20 rows for the new expert=41, topk=10 block.
aiter/configs/model_configs/glm47_fp8_tuned_fmoe.csv: append the corresponding 20 tuned entries.

Test plan

Boot vLLM with GLM-4.7 EP=4 + FSE, verify the new tuned rows are
picked up via aiter.tuned_moe lookup and that the cached kernel
configs are used (no fallback warnings).
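
Before booting vLLM, a quick offline check can confirm that the tuned CSV actually contains the new block. The following is a minimal sketch, not part of this PR: the helper name `block_tokens` is hypothetical, the column names (`cu_num`, `token`, `model_dim`, `inter_dim`, `expert`, `topk`) are assumed from the parameters named in the description, and the sample rows are made up for illustration rather than copied from the real file.

```python
import csv
import io

# Made-up excerpt for illustration only. Column names are an assumption
# based on the parameters named in this PR; the real CSV has more columns.
SAMPLE_TUNED_CSV = """cu_num,token,model_dim,inter_dim,expert,topk
256,1,5120,1536,41,10
256,2,5120,1536,41,10
256,1,5120,1536,8,2
"""


def block_tokens(csv_text, expert, topk, cu_num=256):
    """Return the sorted token sizes present for one (cu_num, expert, topk) block."""
    reader = csv.DictReader(io.StringIO(csv_text))
    tokens = []
    for row in reader:
        key = (int(row["cu_num"]), int(row["expert"]), int(row["topk"]))
        if key == (cu_num, expert, topk):
            tokens.append(int(row["token"]))
    return sorted(tokens)


tokens = block_tokens(SAMPLE_TUNED_CSV, expert=41, topk=10)
print(tokens)  # [1, 2]
```

Against the real file, one would read the CSV text from disk and expect 20 token sizes for the expert=41, topk=10 block.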

Made with Cursor

Adds FP8 fused-MoE tuning entries for GLM-4.7 running with Expert Parallel=4 plus the fused shared-expert (FSE) path introduced in vllm-project/vllm#44313.

Each EP=4 rank carries 160/4 routed experts plus 1 fused shared expert (expert=41, topk=10) at cu_num=256, model_dim=5120, inter_dim=1536.

- aiter/configs/model_configs/glm47_fp8_untuned_fmoe.csv: append 20 token rows for the new expert=41, topk=10 block.
- aiter/configs/model_configs/glm47_fp8_tuned_fmoe.csv: append the corresponding 20 tuned entries.

Co-authored-by: Cursor <[email protected]>
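
The expert=41 value in the commit message follows from simple arithmetic: 160 routed experts split evenly across 4 EP ranks, plus 1 fused shared expert per rank. A small sketch of that calculation (the function name `experts_per_rank` is hypothetical and only illustrates the numbers quoted above):

```python
def experts_per_rank(routed_experts, ep_size, fused_shared_experts=1):
    """Experts seen by the fused-MoE kernel on one EP rank.

    Assumes routed experts are split evenly across ranks and that each
    rank additionally carries the fused shared expert(s).
    """
    if routed_experts % ep_size != 0:
        raise ValueError("routed experts must divide evenly across EP ranks")
    return routed_experts // ep_size + fused_shared_experts


print(experts_per_rank(160, 4))  # 41
```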

github-actions · 2026-06-04T05:20:06Z

🏷️ CI Guide

Runs automatically on every PR:

✅ Pre-checks (submodule verification, code formatting)
✅ Aiter op tests (gfx942 + gfx950)
✅ Triton tests on MI35X (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

Label	Tests
`ci:triton-300x`	Run an additional Triton test job on MI300X in PRs; main branch always runs both MI35X and MI300X
`ci:sglang`	SGLang integration tests: DeepSeek-R1-MXFP4 accuracy, Qwen 3.5 accuracy
`ci:atom`	ATOM benchmark: DeepSeek-R1-0528, GPT-OSS-120B
`ci:atom_full`	ATOM accuracy suite for PR and main models from ATOM `models_accuracy.json`
`ci:vllm`	vLLM benchmark: GPT-OSS-120B, DeepSeek-R1-0528, Kimi-K2.5
`ci:all`	All standard extended tests (excludes `ci:atom_full`)

Only add `ci:atom_full` for FlyDSL or Triton upgrades.
Add labels via the sidebar or `gh pr edit 3529 --add-label <label>`

Copilot

Pull request overview

Adds FP8 fused-MoE tuning coverage for GLM-4.7 in the model-specific config set. The PR targets the EP=4 + fused shared-expert (FSE) path by adding an expert=41, topk=10 shape block, with corresponding entries in both the untuned and tuned CSVs.

Changes:

Extended glm47_fp8_untuned_fmoe.csv with a new expert=41, topk=10 block covering token sizes from 1 through 32768 (20 rows).
Extended glm47_fp8_tuned_fmoe.csv with the corresponding 20 tuned entries (kernel selections + timings) for the same shapes.
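
The pairing between untuned and tuned rows described above can be checked mechanically. The sketch below is illustrative only: the helper names are hypothetical, the shape key uses column names assumed from the parameters named in this PR, and the sample rows are made up. A key for the real files would likely need further columns (for example dtype fields) that are not listed in this PR.

```python
import csv
import io

# Assumed shape-identifying columns; the real CSVs contain additional fields.
SHAPE_KEYS = ("cu_num", "token", "model_dim", "inter_dim", "expert", "topk")

# Made-up samples for illustration: the tuned side lacks the token=4 row.
SAMPLE_UNTUNED = """cu_num,token,model_dim,inter_dim,expert,topk
256,1,5120,1536,41,10
256,2,5120,1536,41,10
256,4,5120,1536,41,10
"""
SAMPLE_TUNED = """cu_num,token,model_dim,inter_dim,expert,topk
256,1,5120,1536,41,10
256,2,5120,1536,41,10
"""


def shape_set(csv_text):
    """Collect the set of shape tuples found in a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(int(row[k]) for k in SHAPE_KEYS) for row in reader}


def missing_tuned_shapes(untuned_text, tuned_text):
    """Return shapes listed as untuned that have no tuned counterpart."""
    return sorted(shape_set(untuned_text) - shape_set(tuned_text))


print(missing_tuned_shapes(SAMPLE_UNTUNED, SAMPLE_TUNED))
# [(256, 4, 5120, 1536, 41, 10)]
```

An empty result would indicate that every untuned shape has a tuned entry under this assumed key.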

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
aiter/configs/model_configs/glm47_fp8_untuned_fmoe.csv	Adds new untuned shape rows for `expert=41, topk=10` to drive tuning/lookup coverage.
aiter/configs/model_configs/glm47_fp8_tuned_fmoe.csv	Adds the matching tuned kernel configs for `expert=41, topk=10` shapes.

omirosh requested review from a team and Copilot June 4, 2026 05:19

Copilot started reviewing on behalf of omirosh June 4, 2026 05:19

Copilot AI reviewed Jun 4, 2026

[tune] GLM-4.7-FP8 FMOE configs for EP=4 + fused shared expert (MI355x)#3529

omirosh wants to merge 1 commit into ROCm:main from omirosh:glm47/fp8-tuned-fmoe-ep4-fse
