Tags: catap/llama.cpp

b6979

Verified: this commit was created on GitHub.com and signed with GitHub's verified signature.
vulkan : refactor buffer handling in vk_op_f32 (ggml-org#16840)

* vulkan : refactor/simplify buffer handling in vk_op_* functions

* Combine UMA handling into ggml_vk_tensor_subbuffer

b6978

CUDA: fix should_use_mmvf for ne11 == 1 (ggml-org#17085)

* CUDA: fix should_use_mmvf for ne11 == 1

* Apply suggestion from @am17an

Co-authored-by: Aman Gupta <[email protected]>

b6977

bench : cache the llama_context state at computed depth (ggml-org#16944)

* bench : cache llama_context state at depth

* cont : handle failures to restore the old state

* cont : print information when the state is being reused

b6976

hparams : add n_embd_inp() to support extended embed (ggml-org#16928)

* add n_embd_full to support extended embed

* don't change output

* rename to n_embd_inp

* restore n_embd where applicable

b6975

kv-cache : pad the cache size to 256 for performance (ggml-org#17046)

* kv-cache : pad the size of the small SWA cache for performance

* context : pad the total context to 256

* cont : future-proof the swa pad

* server : adjust test params to new logic
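The padding described above amounts to rounding the requested cache/context size up to the next multiple of 256. A minimal sketch of that arithmetic (the helper name is hypothetical, not llama.cpp's actual code):

```python
def pad_to_multiple(n: int, pad: int = 256) -> int:
    # Round n up to the next multiple of pad, e.g. 1000 -> 1024 for pad=256.
    # Illustrates the padding idea from the commit above; llama.cpp's real
    # implementation lives in its C++ kv-cache/context code.
    return ((n + pad - 1) // pad) * pad
```

A size that is already a multiple of 256 is left unchanged, so the pad only ever grows the cache.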

b6974

Revert "ggml-cpu: detect correct cpu flags for arm64 (ggml-org#16229) (ggml-org#16239)" (ggml-org#17084)

This reverts commit 7c23f3f.

b6973

ggml-cpu: detect correct cpu flags for arm64 (ggml-org#16229) (ggml-org#16239)

When using GCC 9 or GCC 12 on the arm64 platform of Ubuntu 20.04,
the command "gcc -mcpu=native -E -v -" fails to detect the correct CPU flags,
which results in compilation failures for certain extended instructions.
The correct CPU flags can be obtained by using "gcc -march" instead.

Signed-off-by: lizhenneng <[email protected]>
Co-authored-by: lizhenneng <[email protected]>

b6972

server : print the samplers chain for each request (ggml-org#17070)

b6971

common: move download functions to download.(cpp|h) (ggml-org#17059)

* common: move download functions to download.(cpp|h)

* rm unused includes

* minor cleanup

Co-authored-by: Georgi Gerganov <[email protected]>

b6970

ggml-cpu : optimize RVV q2_k and q3_k kernels (ggml-org#16887)