Fix segfault in moe-expert-reduce test in support mode and coverage #16936

sbera77 · 2025-11-02T12:45:16Z

This PR fixes a segmentation fault that occurs when running the test-backend-ops tool in support mode or with the --show-coverage flag. It also allows docs/ops.md to be updated for tracking #14909, which needs the results from support mode.

Root Cause

In this flow, gf (the ggml_cgraph) is never initialized; the build_graph method is simply called for each test case. The test_moe_expert_reduce test case calls ggml_build_forward_expand(gf, ...) inside its build_graph method, so with gf being a nullptr the call dereferences a null pointer and causes the segfault.

Solution

Wrap the ggml_build_forward_expand call in a gf null check.
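
A minimal sketch of the null check described above. It uses stand-in types instead of the real ggml structs so that it compiles on its own: `Tensor`, `Graph`, and `build_forward_expand` are hypothetical placeholders for `ggml_tensor`, `ggml_cgraph`, and `ggml_build_forward_expand`, and the body of `build_graph` is an assumption, not the code from this PR.

```cpp
#include <cassert>
#include <vector>

// Stand-in types for illustration only; these are NOT the real ggml structs.
struct Tensor { int id; };
struct Graph  { std::vector<const Tensor *> nodes; };

// Hypothetical stand-in for ggml_build_forward_expand(gf, tensor).
// It dereferences gf, so calling it with a null graph would crash.
void build_forward_expand(Graph * gf, const Tensor * t) {
    assert(t != nullptr);
    gf->nodes.push_back(t);
}

// Sketch of a build_graph that tolerates a missing graph:
// the graph is only expanded when the caller actually provided one.
const Tensor * build_graph(Graph * gf, const Tensor * out) {
    if (gf != nullptr) {
        build_forward_expand(gf, out);
    }
    return out;
}
```

With this guard, a caller that passes no graph still gets the output tensor back, while a caller that does pass one gets the node recorded as before.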

sbera77 · 2025-11-02T12:53:13Z

@am17an Please review

am17an

Thanks for the fix!

slaren

It would be better to filter out fusion cases in the supports test, but also initialize gf in eval_support.

sbera77 · 2025-11-02T14:41:56Z

Thanks @slaren, I incorporated your suggestions. Please let me know if this was the intended approach.

tests/test-backend-ops.cpp

sbera77 · 2025-11-02T16:59:10Z

@slaren Thank you for your guidance and feedback. Please have a look.

Fusion cases are now filtered out from both support mode and --show-coverage. This fixes the segfault (it also makes sense to check only individual ops there).
gf is now initialized in eval_support (though it is only used in fusion cases right now, which are filtered out, so could this be removed?).
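
The filtering step described above could look roughly like the following. This is only a sketch under stated assumptions: `TestCase`, its `fused` flag, `is_fusion_case`, and `filter_out_fusion_cases` are hypothetical names chosen for illustration, and the real test_case class in tests/test-backend-ops.cpp is considerably larger.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified test-case type; not the real test_case class.
struct TestCase {
    std::string name;
    bool fused;  // true when the case exercises a fused multi-op graph

    bool is_fusion_case() const { return fused; }
};

// Remove fusion cases in place so that only single-op cases remain.
// The relative order of the remaining cases is preserved.
void filter_out_fusion_cases(std::vector<std::unique_ptr<TestCase>> & cases) {
    cases.erase(
        std::remove_if(cases.begin(), cases.end(),
                       [](const std::unique_ptr<TestCase> & tc) {
                           return tc->is_fusion_case();
                       }),
        cases.end());
}
```

Filtering before the per-case support check means the check never sees a case whose build_graph needs a graph, which is why the separate gf initialization becomes a safety net rather than a requirement.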

tests: fix segfault in moe-expert-reduce test in support mode and --show-coverage

cf26680

sbera77 requested a review from slaren as a code owner November 2, 2025 12:45

github-actions bot added the testing Everything test related label Nov 2, 2025

am17an approved these changes Nov 2, 2025

slaren requested changes Nov 2, 2025

DajanaV mentioned this pull request Nov 2, 2025

UPSTREAM PR #16936: Fix segfault in moe-expert-reduce test in support mode and coverage auroralabs-loci/llama.cpp#42

Closed

sbera77 marked this pull request as draft November 2, 2025 14:13

tests: init gf and filter out fusion tests for support mode

e78ca2e

sbera77 marked this pull request as ready for review November 2, 2025 14:31

slaren reviewed Nov 2, 2025

tests/test-backend-ops.cpp (outdated, resolved)

sbera77 marked this pull request as draft November 2, 2025 15:17

tests: filter out fusion cases before calling eval_support

c3cb20b

sbera77 commented Nov 2, 2025

tests/test-backend-ops.cpp (outdated, resolved)

sbera77 marked this pull request as ready for review November 2, 2025 16:03

slaren reviewed Nov 2, 2025

tests/test-backend-ops.cpp (outdated, resolved)

sbera77 marked this pull request as draft November 2, 2025 16:28

tests: filter out fusion cases from show_test_coverage as well, fix lint

2eeb1c1

sbera77 marked this pull request as ready for review November 2, 2025 16:56

slaren approved these changes Nov 2, 2025

slaren merged commit a2054e3 into ggml-org:master Nov 2, 2025
68 of 72 checks passed
