Conversation

@slaren (Member) commented Jul 31, 2025

This is intended to be a simple, curated way to keep the MoE weights on the CPU. Internally, it just sets up the appropriate tensor overrides, but it should be easier to use.
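
As a sketch of the intended usage (the flag name `--cpu-moe` and the expert-tensor regex below are assumptions based on this PR's description, not verified against the merged code):

```sh
# Curated option added by this PR: keep the MoE expert weights on the CPU
./llama-cli -m model.gguf -ngl 99 --cpu-moe

# Roughly equivalent manual form using tensor overrides; the regex is an
# assumption based on the conventional blk.N.ffn_{up,down,gate}_exps names
./llama-cli -m model.gguf -ngl 99 --override-tensor "\.ffn_(up|down|gate)_exps=CPU"
```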

@slaren merged commit a06ed5f into master on Jul 31, 2025 (47 checks passed)
@slaren deleted the sl/moe-switch branch on Jul 31, 2025 at 18:15
@jacekpoplawski (Contributor)

Am I correct that this is on/off only? It would be better to have an option for the number of layers (similar to -ngl).

@slaren (Member, Author) commented Jul 31, 2025

I am not convinced that it would be worth it. The goal here is to have a very simple option that works well enough for most people. If you want to min-max, you can still use the --override-tensor option to customize it in any way you want.
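
For example, a hypothetical min-max setup that keeps only the experts of layers 0-19 on the CPU and offloads the rest (the layer range and regex are illustrative, not taken from this PR):

```sh
# Pin the expert tensors of layers 0-19 to the CPU; everything else
# follows -ngl as usual (layer range chosen purely for illustration)
./llama-cli -m model.gguf -ngl 99 \
  --override-tensor "blk\.(1?[0-9])\.ffn_(up|down|gate)_exps=CPU"
```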

@jacekpoplawski (Contributor)

Yes, I understand. And now I have an idea for my experiments :)
