Load all MoE experts during warmup and make warmup 1 token #198


Merged · 2 commits · Feb 10, 2025

Conversation

saood06 (Collaborator) commented Feb 9, 2025

The first commit is a port of ggml-org/llama.cpp#11571.

The second commit is based on what fairydreaming reported in ggml-org/llama.cpp#11733, and it also unifies warmup to always be one token.

This allows warmup to actually warm up an MoE model, since all experts are exercised.
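For reference, the core idea of the first commit is roughly the following (a minimal sketch; the names here mirror upstream llama.cpp conventions and are assumptions, not necessarily this repo's exact code):

```cpp
#include <cstdint>

// Sketch only: choose how many experts a forward pass should activate.
// During warmup we activate every expert so that all expert tensors are
// touched once (read from disk / faulted into memory); during normal
// decoding we keep the model's usual top-k routing.
static uint32_t experts_for_pass(bool warmup, uint32_t n_expert, uint32_t n_expert_used) {
    return warmup ? n_expert        // warmup: exercise all experts
                  : n_expert_used;  // normal decode: top-k experts only
}
```

In the real change this selection happens where the MoE graph is built, so the warmup decode routes through every expert instead of only the usual top-k.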

ikawrakow (Owner) left a comment

LGTM, but it does nothing on the single-socket computers I currently have available, so I'm relying on the comments in the linked PR and issue that this really improves things on NUMA systems.

saood06 (Collaborator, Author) commented Feb 10, 2025

> LGTM, but it does nothing on the single-socket computers I currently have available, so I'm relying on the comments in the linked PR and issue that this really improves things on NUMA systems.

The first commit should work on any system to help MoE loading (DeepSeek is the most noticeable because of its large size and expert count, but it should help all MoE models). Only the second commit is designed to benefit NUMA systems.
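To make the distinction concrete, the second commit's warmup roughly amounts to the following (a hedged sketch; these llama.cpp API calls exist, but their exact signatures vary between versions, so treat this as pseudocode rather than the repo's actual diff):

```cpp
// Sketch only: run warmup as a single-token decode so the memory-touch
// pattern matches token generation (the case that matters on NUMA systems),
// instead of a multi-token batch that goes through the prompt-processing path.
if (params.warmup) {
    llama_token tok = llama_token_bos(model);         // prefer BOS...
    if (tok < 0) {
        tok = llama_token_eos(model);                 // ...fall back to EOS
    }
    llama_decode(ctx, llama_batch_get_one(&tok, 1));  // exactly one token
    llama_kv_cache_clear(ctx);                        // discard warmup state
    llama_synchronize(ctx);
}
```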

ikawrakow merged commit a366a3d into main on Feb 10, 2025
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 3, 2025