Add support for Granite MoE Hybrid in model.py by including down projections for shared MLP and MoE experts

Conversation
Thanks! Does this actually work? I guess I never considered just adding those matrices blindly and letting …
Sure! Here is the full log; the model is uploaded to Hugging Face:
Yeah, same here. I wrote some code that more or less does the same as this PR.

[Abliteration parameters and performance tables omitted]

Give this a spin: https://huggingface.co/pszemraj/granite-4.0-h-1b-heretic

LFM2 models, like https://hf.co/LiquidAI/LFM2-2.6B, are proving a bit trickier FWIW; it seems to need more than simply adding the conv layers (I even tried expanding the ranges too).
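As an aside on the LFM2 point above: one way to see which weight matrices an LFM2 checkpoint actually exposes (beyond the short-conv layers) is to enumerate its leaf modules and group them by name. A minimal sketch, assuming a transformers release with LFM2 support; none of this is code from this PR or from model.py.

```python
# Enumerate candidate weight matrices in an LFM2 checkpoint to see which
# projections (beyond the short-conv layers) might need to be included.
# Illustrative only; module names are whatever the implementation exposes.
from collections import Counter

import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "LiquidAI/LFM2-2.6B", torch_dtype=torch.bfloat16
)

# Collapse per-layer repeats into one entry per trailing module-name pattern,
# counting how many layers contain each kind of linear/conv module.
suffixes = Counter()
for name, module in model.named_modules():
    if isinstance(module, (nn.Linear, nn.Conv1d)):
        suffixes[".".join(name.split(".")[-2:])] += 1

for suffix, count in sorted(suffixes.items()):
    print(f"{count:3d} x {suffix}")
```

The printed patterns can then be compared against what model.py already targets, to decide which extra projections (and layer ranges) are worth including.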
Oh oops, my bad. I realized this PR is for the transformer MoE version of granite-4.0; I did the mamba hybrid one(s) (as... that was the whole point) + LFM2. If you like, I can submit a separate PR?
Thank you guys, awesome project!)
Great! @p-e-w I'll take a look later today or tomorrow & fork/reconcile my changes vs latest main.
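As a point of reference, here is a minimal sketch of what the down projections named in the title correspond to: enumerating the shared-MLP and expert output projections of a Granite 4.0 MoE hybrid checkpoint with plain transformers. The model ID and the parameter-name patterns are assumptions based on the GraniteMoe* implementations and may need adjusting; this is not the actual change made to model.py.

```python
# Locate the "down projection" weights for the shared MLP and the stacked
# MoE experts in a Granite 4.0 hybrid MoE checkpoint. Name patterns below
# are assumptions; verify them against model.named_parameters().
import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "ibm-granite/granite-4.0-h-tiny"  # assumed example checkpoint

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Assumed parameter-name substrings:
#   "shared_mlp.output_linear"       -> shared MLP down projection per layer
#   "block_sparse_moe.output_linear" -> all experts' down projections, stacked
PATTERNS = ("shared_mlp.output_linear", "block_sparse_moe.output_linear")

targets = {
    name: param
    for name, param in model.named_parameters()
    if any(pattern in name for pattern in PATTERNS)
}

for name, param in sorted(targets.items()):
    print(f"{name}: {tuple(param.shape)}")
```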