AITER Development Roadmap (2026 Q3) #3443

@sunway513

Modeled on the SGLang AMD roadmap (sgl-project/sglang#23494). Last updated 2026-05-30.

Contributions and feedback are welcome.

Legend: ✓ done · ▶ in progress · ○ planned. Each item links to a tracked PR or issue.

Focus

  • Model enablement velocity: Day-N kernels for the frontier MoE models driving inference demand — MiniMax-M2.5, GLM-5.x, GPT-OSS, DeepSeek-V4.
  • MXFP4 across the stack: MoE, GEMM, attention, and Sage attention on MXFP4 with assured accuracy.
  • MLA completeness: Close the remaining MLA feature gaps (FP8 KV cache, small head counts, speculative decode).
  • Build & architecture: CK-Free build, faster Python-binding compilation, and config loading without a rebuild, all to reduce time-to-kernel for agentic workflows.
  • Training kernels (new): First grouped-GEMM APIs for MoE training, beyond inference.

Feature and Performance Improvements (Q3 planned)


TODO: WIP.
