Tags · orca-zhang/llama.cpp

b4908

fixed compilation warnings in ggml-sycl (ggml-org#12424)

b4859

llava : fix bug in minicpm-v code (ggml-org#11513)

* fix bug in minicpm-v code

* update readme of minicpm-v

b4778

vulkan: fix assertion when qy_needs_dequant (ggml-org#12068)

Looks like a copy/paste bug from qx_needs_dequant.
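
A hypothetical condensed illustration of the bug class (the condition names come from the commit title; the assertion body is an assumption, not the actual Vulkan backend code):

    // x-side check
    if (qx_needs_dequant) {
        GGML_ASSERT(qx_sz == x_sz);
    }
    // pasted from above unchanged: before the fix this still tested the
    // qx/x sizes instead of the qy/y sizes
    if (qy_needs_dequant) {
        GGML_ASSERT(qx_sz == x_sz);
    }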

b4773

server: support add_generation_prompt query param (ggml-org#12062)

b4771

llama : expose llama_model_n_head_kv in the API (ggml-org#11997)

It's useful to be able to have this from the library layer as it's a key
parameter of the model (e.g. to figure out how much KV cache memory is
needed).
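
As a sketch of that use case, the new accessor can drive a rough KV cache size estimate. This assumes the companion getters llama_model_n_layer, llama_model_n_head, and llama_model_n_embd, approximates the per-head dimension as n_embd / n_head, and assumes an f16 cache; it is an illustration, not library-provided code.

    #include "llama.h"

    // Rough f16 KV cache size: 2 tensors (K and V) * 2 bytes per element,
    // per layer, per context position, per KV head, per head dimension.
    size_t estimate_kv_cache_bytes(const struct llama_model * model, int n_ctx) {
        const int n_layer   = llama_model_n_layer(model);
        const int n_head_kv = llama_model_n_head_kv(model);
        const int head_dim  = llama_model_n_embd(model) / llama_model_n_head(model);
        return 2ull * 2ull * (size_t) n_layer * n_ctx * n_head_kv * head_dim;
    }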

b4770

metal : copy kernels for quant to F32/F16 conversions (ggml-org#12017)

metal: use dequantize_q templates

---------

Co-authored-by: Georgi Gerganov <[email protected]>
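
For context, this is the conversion such copy kernels implement. A minimal CPU-side sketch for Q4_0 → F32, using the standard ggml Q4_0 block layout (one f16 scale plus 16 bytes packing 32 4-bit quants); illustrative only, not the Metal kernel:

    #include <stdint.h>
    #include "ggml.h"

    #define QK4_0 32
    typedef struct {
        ggml_fp16_t d;              // per-block scale
        uint8_t     qs[QK4_0 / 2];  // 32 quants, two per byte
    } block_q4_0;

    // Dequantize one Q4_0 block into 32 floats: each nibble is an unsigned
    // 4-bit value with an implicit offset of 8, scaled by d.
    static void dequantize_block_q4_0(const block_q4_0 * b, float * y) {
        const float d = ggml_fp16_to_fp32(b->d);
        for (int j = 0; j < QK4_0 / 2; ++j) {
            y[j]             = ((b->qs[j] & 0x0F) - 8) * d; // low nibble
            y[j + QK4_0 / 2] = ((b->qs[j] >>   4) - 8) * d; // high nibble
        }
    }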

b4769

opencl: fix for small models (ggml-org#11950)

* opencl: fix small shape gemv, remove unused extensions

* opencl: fix `transpose_16`, `dump_tensor`, enforce subgroup size

* opencl: fix for token length < 4

* opencl: use wave size of 64 for all Adreno GPUs

---------

Co-authored-by: Shawn Gu <[email protected]>
Co-authored-by: Skyler Szot <[email protected]>

b4768

llava : Add Granite Vision Support (ggml-org#11794)

* Add super wip scripts for multimodal granite gguf

Signed-off-by: Alex-Brooks <[email protected]>

* Add example for converting mmgranite to gguf

Signed-off-by: Alex-Brooks <[email protected]>

* remove hardcoded path

Signed-off-by: Alex-Brooks <[email protected]>

* Add vision feature layer to gguf params

Signed-off-by: Alex-Brooks <[email protected]>

* Clean up llava surgery and remove name substitution hacks

Signed-off-by: Alex-Brooks <[email protected]>

* Add transformers llava next tensor name mapping

Signed-off-by: Alex-Brooks <[email protected]>

* Make siglip / openclip mutually exclusive

Signed-off-by: Alex-Brooks <[email protected]>

* Fix projector linear substitution

Signed-off-by: Alex-Brooks <[email protected]>

* Fix linear 2 substitution index

Signed-off-by: Alex-Brooks <[email protected]>

* Increase max flattened gridpoints to 64

Signed-off-by: Alex-Brooks <[email protected]>

* Fix hardcoded concat for multiple feature layers

Signed-off-by: Alex-Brooks <[email protected]>

* Pull vision feature layers out of gguf keys

Signed-off-by: Alex-Brooks <[email protected]>

* fix num gridpoints and use all layers

Signed-off-by: Alex-Brooks <[email protected]>

* Avoid dropping last image encoder layer in llava models

Signed-off-by: Alex-Brooks <[email protected]>

* Use 10 for max number of patches

Signed-off-by: Alex-Brooks <[email protected]>

* Standardize vision feature layers

Signed-off-by: Alex-Brooks <[email protected]>

* Cleanup logs

Signed-off-by: Alex-Brooks <[email protected]>

* Update comment for vision feature layer init

Signed-off-by: Alex-Brooks <[email protected]>

* Update notes for alternative to legacy llm conversion script

Signed-off-by: Alex-Brooks <[email protected]>

* Fix notes rendering

Signed-off-by: Alex-Brooks <[email protected]>

* Add v prefix to vision feature layer log

Signed-off-by: Alex-Brooks <[email protected]>

* Use current defaults for feature layer

Signed-off-by: Alex-Brooks <[email protected]>

* Use constant for max gridpoints / feat layers, style fixes

Signed-off-by: Alex-Brooks <[email protected]>

* clarify non-negative feature layers

Signed-off-by: Alex-Brooks <[email protected]>

* Remove CLIP_API from func signature

Signed-off-by: Alex-Brooks <[email protected]>

* Use MAX_IMAGE_FEATURE_LAYERS const in layer calc

Signed-off-by: Alex-Brooks <[email protected]>

* Clarify feature layers are non-negative ints and not uint

Signed-off-by: Alex-Brooks <[email protected]>

* Fix condition for reading feature layers

Signed-off-by: Alex-Brooks <[email protected]>

* pop last llava layer when feature layers are unset

Signed-off-by: Alex-Brooks <[email protected]>

* Fix unset vision layer 0

Signed-off-by: Alex-Brooks <[email protected]>

* Update examples/llava/clip.cpp

Co-authored-by: Xuan-Son Nguyen <[email protected]>

* Reenable assertion for out of bounds get_rows

Signed-off-by: Alex-Brooks <[email protected]>

* Use std vector for gridpoints and feature layers

Signed-off-by: Alex-Brooks <[email protected]>

* Calculate max feature layer at load time (see the sketch after this entry)

Signed-off-by: Alex-Brooks <[email protected]>

* Include base patch for granite vision allocation

Signed-off-by: Alex-Brooks <[email protected]>

* Fix trailing whitespace

Signed-off-by: Alex-Brooks <[email protected]>

* Add max num patches = 10 back for minicpmv

Signed-off-by: Alex-Brooks <[email protected]>

* Use unordered set to store feature layers

Co-authored-by: Xuan-Son Nguyen <[email protected]>
Signed-off-by: Alex-Brooks <[email protected]>

* Use max feature layer for postnorm

Signed-off-by: Alex-Brooks <[email protected]>

* Apply suggestions from code review

---------

Signed-off-by: Alex-Brooks <[email protected]>
Co-authored-by: Xuan-Son Nguyen <[email protected]>
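
The "use unordered set" and "calculate max feature layer at load time" steps above suggest a small pattern worth sketching; hypothetical condensed form with assumed names, not the actual clip.cpp code:

    #include <algorithm>
    #include <unordered_set>

    // Assumed shape: feature layers live in a set for O(1) membership checks,
    // and the max layer is computed once when the model is loaded.
    struct vision_hparams {
        std::unordered_set<int> feature_layers;
        int max_feature_layer = -1;
    };

    void finalize_feature_layers(vision_hparams & hp) {
        if (!hp.feature_layers.empty()) {
            hp.max_feature_layer = *std::max_element(
                hp.feature_layers.begin(), hp.feature_layers.end());
        }
    }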

b4767

[SYCL] Optimize mul_mat for Q4_0 on Intel GPU (ggml-org#12035)

* optimize performance by reordering Q4_0 data for Intel GPUs

* detect the hardware type, save the opt feature, and print it

* correct name

* optimize the graph once at graph-compute time, recording the opt status in tensor->extra; make CI pass

* add env variable GGML_SYCL_DISABLE_OPT for debug (see the sketch after this entry)

* use syclex::architecture to replace the custom hw define; update the guide for GGML_SYCL_DISABLE_OPT

* add performance data

* move getrows functions to separate files

* fix global variables

---------

Co-authored-by: arthw <[email protected]>
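
A hedged sketch of how a switch like GGML_SYCL_DISABLE_OPT is typically consumed (the helper name and exact semantics are assumptions, not the real ggml-sycl code):

    #include <cstdlib>
    #include <cstring>

    // Treat any value other than unset or "0" as "disable the reorder opt".
    static bool sycl_reorder_opt_disabled() {
        const char * v = std::getenv("GGML_SYCL_DISABLE_OPT");
        return v != nullptr && std::strcmp(v, "0") != 0;
    }

Running with GGML_SYCL_DISABLE_OPT=1 would then skip the reordered Q4_0 path, which is the debugging fallback the commit describes.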

b4765

SYCL: Fix GGML_SYCL_DEBUG macro (ggml-org#11995)
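
Debug macros of this shape are usually printf-style wrappers gated on a runtime flag; an assumed form for illustration (the gating variable g_ggml_sycl_debug is an assumption, not necessarily the macro this commit fixed):

    #include <cstdio>

    extern int g_ggml_sycl_debug;

    // Variadic forwarding to fprintf; reduces to a cheap branch when
    // debugging is off.
    #define GGML_SYCL_DEBUG(...) \
        do { if (g_ggml_sycl_debug) fprintf(stderr, __VA_ARGS__); } while (0)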