
Tags: VJHack/llama.cpp

b5350

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
mtmd : Use RMS norm for InternVL 3 38B and 78B mmproj (ggml-org#13459)
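The change above swaps the mmproj normalization to RMS norm for the larger InternVL 3 models. As a reminder of what that operation computes (a minimal standalone sketch, not the actual ggml kernel): unlike LayerNorm, RMS norm skips mean subtraction and the bias term, scaling each element by the reciprocal root-mean-square of the vector.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal RMS norm sketch: y_i = x_i * w_i / sqrt(mean(x^2) + eps).
// No mean subtraction and no bias, in contrast to LayerNorm.
std::vector<float> rms_norm(const std::vector<float> &x,
                            const std::vector<float> &w,
                            float eps = 1e-6f) {
    float ss = 0.0f;
    for (float v : x) ss += v * v;                       // sum of squares
    const float scale = 1.0f / std::sqrt(ss / x.size() + eps);
    std::vector<float> y(x.size());
    for (size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * scale * w[i];                      // scale and apply weight
    }
    return y;
}
```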

b4696

HIP: Switch to std::vector in rocblas version check (ggml-org#11820)
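The pattern behind this change, sketched below: instead of reading a version string from a C API into a fixed-size char array (which risks truncation or overflow if the string grows), size a `std::vector<char>` to the length the API reports and read into that. The `get_version_len`/`get_version` functions here are hypothetical stand-ins for the rocBLAS calls, not the real API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-ins for a C API that reports a string length
// and then fills a caller-provided buffer (as rocBLAS does).
static size_t get_version_len() { return sizeof("4.2.0"); }
static void get_version(char *buf, size_t len) {
    std::snprintf(buf, len, "%s", "4.2.0");
}

// Size the buffer dynamically instead of using a fixed char array.
std::string query_version() {
    std::vector<char> buf(get_version_len());
    get_version(buf.data(), buf.size());
    return std::string(buf.data());
}
```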

b4476

server : (UI) Improve messages bubble shape in RTL (ggml-org#11220)

I had simply overlooked the message bubble's tail placement for RTL
text, since I use dark mode and the tail isn't visible there; this
fixes it.

b4457

llama: add support for QRWKV6 model architecture (ggml-org#11001)

* WIP: Add support for RWKV6Qwen2

Signed-off-by: Molly Sophia <[email protected]>

* RWKV: Some graph simplification

Signed-off-by: Molly Sophia <[email protected]>

* Add support for RWKV6Qwen2 with cpu and cuda GLA

Signed-off-by: Molly Sophia <[email protected]>

* RWKV6[QWEN2]: Concat lerp weights together to reduce cpu overhead

Signed-off-by: Molly Sophia <[email protected]>

* Fix some typos

Signed-off-by: Molly Sophia <[email protected]>

* code format changes

Signed-off-by: Molly Sophia <[email protected]>

* Fix wkv test & add gla test

Signed-off-by: Molly Sophia <[email protected]>

* Fix cuda warning

Signed-off-by: Molly Sophia <[email protected]>

* Update README.md

Signed-off-by: Molly Sophia <[email protected]>

* Update ggml/src/ggml-cuda/gla.cu

Co-authored-by: Georgi Gerganov <[email protected]>

* Fix fused lerp weights loading with RWKV6

Signed-off-by: Molly Sophia <[email protected]>

* better sanity check skipping for QRWKV6 in llama-quant

thanks @compilade

Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: compilade <[email protected]>

---------

Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: compilade <[email protected]>
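One step in the log above concatenates the lerp weights to reduce CPU overhead. The idea, sketched under illustrative names and shapes (not the actual RWKV6 graph): lerp(a, b, t) = a + t * (b - a), and rather than launching one small lerp op per weight tensor, the weights are stored contiguously so a single pass covers them all, amortizing per-op overhead.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fused elementwise lerp over concatenated weight buffers.
// out_i = a_i + t_i * (b_i - a_i); one loop replaces N small ops.
void lerp_fused(const std::vector<float> &a, const std::vector<float> &b,
                const std::vector<float> &t, std::vector<float> &out) {
    out.resize(a.size());
    for (size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] + t[i] * (b[i] - a[i]);
    }
}
```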

b4447

ci : use actions from ggml-org (ggml-org#11140)

b4444

sync : ggml

b4431

llama-run : fix context size (ggml-org#11094)

Set `n_ctx` equal to `n_batch` in the `Opt` class. The context size
is now a more reasonable 2048.

Signed-off-by: Eric Curtin <[email protected]>
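The fix described above amounts to defaulting the context size to the batch size. A minimal sketch, where the surrounding struct is a hypothetical stand-in for llama-run's `Opt` class (only the two field names come from the commit message):

```cpp
#include <cassert>

// Hypothetical stand-in for llama-run's Opt: the context size now
// defaults to the batch size instead of being left unset.
struct Opt {
    int n_batch = 2048;
    int n_ctx   = n_batch; // n_ctx follows n_batch
};
```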

b4311

common : add missing env var for speculative (ggml-org#10801)

b4306

Update README.md (ggml-org#10772)

b4295

CUDA: fix shared memory access condition for mmv (ggml-org#10740)