
Tags: dumpmemory/llama.cpp


b5585


Verified: This commit was created on GitHub.com and signed with GitHub’s verified signature.
CUDA: fix FTZ in FA for Gemma 3 (ggml-org#13991)

b5581

opencl: add `backend_synchronize` (ggml-org#13939)

* This is not needed in normal use, where the result is read
  using `tensor_get`, but it allows the perf mode of `test-backend-ops`
  to properly measure performance.

b5579

server : disable speculative decoding for SWA models (ggml-org#13970)

* server : use swa-full for draft context

ggml-ci

* server : disable speculative decoding for SWA models

b5574

cmake : Handle mixed-case 'Power' strings in POWER CPU detection (ggml-org#13966)

Some systems report the CPU implementation as "Power11" instead of "POWER11".
The existing CMake logic uses a case-sensitive regular expression to extract
the CPU generation, which fails when the casing doesn't exactly match "POWER".

This patch provides a fix by first converting the string to uppercase before applying the regex.

Signed-off-by: root <[email protected]>
Co-authored-by: root <[email protected]>
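
The uppercase-before-regex fix described above can be sketched in CMake. This is a hedged illustration, not the actual llama.cpp CMake code; `CPUINFO_IMPL` and the other variable names here are placeholders:

```cmake
# Hypothetical sketch of the fix: normalize the reported CPU implementation
# string to uppercase before matching, so that both "Power11" and "POWER11"
# resolve to generation 11.
set(CPUINFO_IMPL "Power11")  # placeholder; normally read from the system

string(TOUPPER "${CPUINFO_IMPL}" CPUINFO_IMPL_UPPER)
string(REGEX MATCH "POWER([0-9]+)" _power_match "${CPUINFO_IMPL_UPPER}")
if(_power_match)
    # CMAKE_MATCH_1 holds the captured generation number, e.g. "11"
    set(POWER_GENERATION "${CMAKE_MATCH_1}")
endif()
```

The alternative of making the regex itself case-insensitive would also work, but CMake's `string(REGEX …)` matching is case-sensitive, so normalizing the input first is the simpler change.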

b5568

sync : ggml

ggml-ci

b5558

threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (ggml-org#12995)

* threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling

We discussed adding a LOW priority for GGML threads in the original threadpool PR;
it can be useful in some cases to avoid contention.

Recent Windows ARM64 releases started parking (offlining) CPU cores
more aggressively, which results in suboptimal performance with n_threads > 4.
To deal with that, we now disable Power Throttling for our threads at NORMAL
and higher priorities.

Co-authored-by: Diego Devesa <[email protected]>

* threading: disable SetThreadInfo() calls for older Windows versions

* Update tools/llama-bench/llama-bench.cpp

Co-authored-by: Diego Devesa <[email protected]>

---------

Co-authored-by: Diego Devesa <[email protected]>

b5557

docs : Note about necessity of having libcurl installed for standard build. (ggml-org#13945)

Signed-off-by: Jiri Podivin <[email protected]>

b5555

llama : deprecate explicit kv_self defrag/update calls (ggml-org#13921)

ggml-ci

b5414

cmake: use the current build config for vulkan-shaders-gen (ggml-org#13595)

* fix: use the current build config for `vulkan-shaders-gen`

* fix: only pass a valid build type to `--config`

b5412

vulkan: move common FA code to flash_attn_base.comp (ggml-org#13556)

* vulkan: move common FA code to flash_attn_base.comp

* vulkan: move common FA index/stride setup code to flash_attn_base.comp

* build fix