hparams : add n_embd_inp() to support extended embed #16928
Conversation
Tested this. It works correctly with a 1d cvector of size 5120, and for basic MTMD use cases. Thanks!
Hmm, please ignore what I said earlier. Indeed, I think there is currently a misunderstanding here. The … I would suggest calling it …
```diff
-const int n_embd = hparams.n_embd;
-ggml_tensor * b = ggml_new_tensor_4d(ctx, GGML_TYPE_F32, n_embd, w->ne[1], 1, 1);
+const int n_embd_inp = hparams.n_embd_inp();
+ggml_tensor * b = ggml_new_tensor_4d(ctx, GGML_TYPE_F32, n_embd_inp, w->ne[1], 1, 1);
```
I'm unsure if this is correct as well...
I think whether it's `n_embd_inp` or `n_embd` should not matter much here; at least for now, models that use these ops will have `n_embd_inp == n_embd`.
I tested the latest commit in this series, and it works successfully for both text and image processing with a cvector applied.
I will test again later today, but feel free to do so too before that.
Tested again on the latest commit. Seems to work great: MTMD and text both work, and the cvector is still applied as expected.
Required for proper handling of Qwen3-VL DeepStack embeds. May change more than currently necessary, for future use f.ex. in `llama-context` (or maybe even not enough), please review carefully!

Fixes #16908