model: support arch DbrxForCausalLM #6515
Merged
Commits (81, all authored by phymbert)
1d8de31 model: dbrx convert to gguf
ed582c1 llama: support dbrx
3e3d2d1 gguf-py: remove wrong clip -> clamp
3937100 model: dbrx, trust remote code
c0beb3c llama: add label for model 132B
0921033 model: dbrx fix python linter in convert-hf-to-gguf.py
e4f8ee4 llama: support dbrx fix norm type
a7f9a3e dbrx: minor
e3c1e81 convert: dbrx: fix mixed up and down expert tensors
0a35f58 convert: dbrx: fix mixed up and down expert tensors
c8e6f90 doc: dbrx: add the model as supported
916b918 convert: dbrx: fix remove wrong ATTN_OUT_NORM tensor, add output laye…
03da419 llama: dbrx: remove wrong attn output layer in model arch
76f266b scripts: get-wikitext-2 add unzip
9c7dedb llama: dbrx: no attention output layer
fe80898 model: dbrx: fix missing embedding tensor, mix with output layer
4f12a58 llama: dbrx: remove not existing condition on empty output layer
6985629 Merge remote-tracking branch 'origin/master' into hp/model/support-dbrx
7e7cd53 llama: dbrx: remove unnecessary optional tensor on FFN_GATE_EXPS
52c4033 llama: increase maximum experts allowed
06a59ab model: dbrx: convert add n_ff
305ac3b llama: dbrx: quantize fix n_attention_wv tensor name
b6522a9 model: dbrx: convert fix tokenizer
dccb012 llama: dbrx: quantize fix n_attention_wv tensor name
61be4b9 model: convert-hf-to-gguf.py add _set_vocab_tiktoken gpt2 backed on l…
1fb6d95 model: convert-hf-to-gguf.py fix classname conflict with qwen2
200ce21 model: dbrx: convert-hf-to-gguf.py fix fix ftype missing, fix tensor …
9e17dad model: dbrx: convert-hf-to-gguf.py add chat template
d7546fd llama: quantize: remove wrong look for tensor qkv name as it was badl…
3a9dc2e model: dbrx: convert-hf-to-gguf.py fix 'token_embd.weight' has wrong …
8154617 model: dbrx: convert-hf-to-gguf.py support python 3.8
2449ef4 llama: dbrx: no weight suffix in ffn_gate_exps, ffn_up_exps and ffn_d…
1bd9427 llama: quantize: remove wrong look for tensor qkv name as it was badl…
e9987c6 llama: dbrx: fix tensor qkv number of elements
d151d8f model: dbrx: convert reshape expert tensors to 3D
f062b83 model: dbrx: convert experts to f16
dbfd591 model: dbrx: fix tensor names mapping broken
7dd84b0 model: dbrx: fix expert reshape
c9bddbf model: dbrx: fix expert reshape
e2c9199 model: dbrx: fix again sic expert reshape
50b4373 model: dbrx: weird fix expert reshape
0ab1bae llama: dbrx: output norm dim
830e46d llama: dbrx: fix last normalization
2897aa6 llama: dbrx: revert
993f836 llama: dbrx: move norm2 after attention, fix build kv
b01b062 llama: dbrx: fix build kv att out
74e6d87 llama: dbrx: fix build kv att out tensor name
f8f97e7 llama: dbrx: hardcode nn.LayerNorm epsilon
71f9e47 llama: dbrx: Try another rope type
52c6276 llama: dbrx: fix k scale
8e22688 llama: dbrx: move norm epsilon to convert. Fix missing normalization.
35dce3e llama: dbrx: rename tensor to actual meaning. Fix normalization in gr…
506cc2e llama: dbrx: convert remove previous reverse
eb0847e llama: dbrx: load norm eps in hparams
81f308a llama: dbrx: fix experts tensor layout
21fb24a model: dbrx: convert-hf-to-gguf.py fix experts tensors shapes
f20c04f llama: factorize moe graph implementation between grok, mixtral and dbrx
48909ed model: dbrx convert permute experts directly torch, log shape
18a84fe llama: dbrx: fix experts 3D tensor layout (again)
9968952 llama: dbrx: fix experts 3D tensor layout (again)
e66f1e3 llama: dbrx: document changes, permute only FFN_DOWN_EXPS. Add a chec…
f30a73b llama: dbrx: rename layer_out_norm to attn_out_norm
ea8b58c llama: dbrx: first add the residuals and then do the norm
55943a2 model: dbrx: convert fix mixed ffn_gate_exps and ffn_down_exps
c7b9a2e llama: dbrx: fix ggml context of the attention outputs weight
ac82aa0 gguf-py: revert spaces
ac75fbd gguf-py: dbrx: reverse again the MOE tensors mapping:
e5631cf Merge remote-tracking branch 'origin/master' into hp/model/support-dbrx
6f813dc Merge remote-tracking branch 'origin/master' into hp/model/support-dbrx
74529e5 llama: dbrx: use the MOE naming convention for model type
06527c6 Merge remote-tracking branch 'origin/master' into hp/model/support-dbrx
fc89fee model: convert-hf-to-gguf.py remove tiktoken
bdc4efe Is silu activation function applied to MODEL_TENSOR.FFN_GATE_EXP here…
542585f Is silu activation function applied to MODEL_TENSOR.FFN_GATE_EXP here…
ecbfb1b Wrong input was being fed to moe layer. This needs to be corrected
647a11b eval-callback: also print last n elements of each dimension
03bdc36 minor spaces
8e6758f convert: update comment of MOE tensors mapping
f1256dc llama: rename build_moe to build_moe_ffn and fix grok is using gelu i…
e517585 convert-hf-to-gguf.py: fix python linter
9f77484 minor: fix indent in llama_build_graph
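A large share of the commits above (d151d8f, 7dd84b0, 48909ed, 21fb24a, 18a84fe) deal with reshaping the fused DBRX expert weights into the 3-D per-expert layout used for the GGUF MoE tensors. The snippet below is a minimal sketch of that kind of transform, not the code merged in this PR: the tensor names, the tiny shapes, and the assumption that only the down projection needs its last two axes swapped are all illustrative.

```python
# Minimal sketch (assumed shapes/names, not the merged convert-hf-to-gguf.py code):
# split a fused 2-D expert weight into a 3-D tensor with an explicit expert dimension.
import torch

n_expert, n_ff, n_embd = 4, 8, 6           # tiny illustrative sizes (DBRX itself uses 16 experts)

# Hypothetical HF-style fused tensor: all experts stacked along dim 0
fused_up = torch.randn(n_expert * n_ff, n_embd)

# One 3-D tensor per projection, e.g. what would become ffn_up_exps
up_exps = fused_up.reshape(n_expert, n_ff, n_embd)

# The down projection is oriented the other way per expert, so after the same
# split it may also need its last two axes permuted (assumption for illustration)
fused_down = torch.randn(n_expert * n_ff, n_embd)
down_exps = fused_down.reshape(n_expert, n_ff, n_embd).permute(0, 2, 1).contiguous()

print(up_exps.shape, down_exps.shape)       # torch.Size([4, 8, 6]) torch.Size([4, 6, 8])
```

Keeping the expert index as the leading dimension means the runtime can address one expert's weights as a contiguous 2-D slice, which is what the repeated "experts 3D tensor layout" fixes in the commit history are converging on.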