Tags: arthw/llama.cpp

b4937

Merge pull request #10 from arthw/fix_yaml

fix format

b4789

Merge pull request #8 from arthw/fix_q4_1

fix ut fault of Q4_1, Q5..

b4787

Merge pull request #7 from arthw/cherry_pick_20250224

Cherry pick 20250224

b4383

Merge pull request #6 from arthw/cherry-1220

Cherry 1220

b4137

Merge pull request #5 from arthw/cherry-1118

Cherry 1118

b3555

fix error

b3554

ggml-backend : fix async copy from CPU (ggml-org#8897)

* ggml-backend : fix async copy from CPU

* cuda : more reliable async copy, fix stream used when the devices are the same

b3517

[SYCL] Fixing wrong VDR iq4nl value (ggml-org#8812)

b3482

Merge pull request #2 from arthw/refactor_dev

Refactor device management and usage api

b3475

llama : add support for llama 3.1 rope scaling factors (ggml-org#8676)

* Add llama 3.1 rope scaling factors to llama conversion and inference

This commit generates the rope factors on conversion and adds them to the resulting model as a tensor. At inference time, these factors are passed to the `ggml_rope_ext` rope operation, improving results for context windows above 8192.

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <[email protected]>

* address comments

* address comments

* Update src/llama.cpp

Co-authored-by: compilade <[email protected]>

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <[email protected]>

---------

Co-authored-by: compilade <[email protected]>
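
The b3475 commit message above says rope factors are generated at conversion time and stored as a tensor. The sketch below illustrates how per-frequency factors of this kind could be computed in plain Python. It is not the code from `convert_hf_to_gguf.py`: the function name `llama3_rope_factors` is hypothetical, and the default values (head dimension 128, base 500000, scale factor 8, low/high frequency factors 1 and 4, original context length 8192) are assumptions chosen for illustration, not values taken from the commit.

```python
import math


def llama3_rope_factors(head_dim=128, base=500000.0, factor=8.0,
                        low_freq_factor=1.0, high_freq_factor=4.0,
                        old_context_len=8192):
    """Return one scaling factor per rotary frequency (head_dim // 2 values).

    Hypothetical helper: all defaults are illustrative assumptions.
    """
    # Wavelength thresholds separating the three frequency bands.
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor

    factors = []
    for i in range(0, head_dim, 2):
        freq = 1.0 / (base ** (i / head_dim))
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # High-frequency band: left unscaled.
            factors.append(1.0)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: scaled by the full factor.
            factors.append(factor)
        else:
            # Transition band: interpolate smoothly between 1 and factor.
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor)
            factors.append(1.0 / ((1.0 - smooth) / factor + smooth))
    return factors


if __name__ == "__main__":
    rope_factors = llama3_rope_factors()
    print(len(rope_factors), rope_factors[0], rope_factors[-1])
```

Under these assumptions, short-wavelength (high-frequency) components keep a factor of 1, long-wavelength components receive the full factor, and the band in between is interpolated, which is one plausible way to extend usable context beyond the original 8192 positions.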