Codestin Search App

b4937

Merge pull request #10 from arthw/fix_yaml

fix format

Mar 19, 2025
b634b4b
zip
tar.gz
Downloads

b4789

Merge pull request #8 from arthw/fix_q4_1

fix ut fault of Q4_1, Q5..

Feb 27, 2025
37a06a0
zip
tar.gz
Downloads

b4787

Merge pull request #7 from arthw/cherry_pick_20250224

Cherry pick 20250224

Feb 27, 2025
c69f491
zip
tar.gz
Downloads

b4383

Merge pull request #6 from arthw/cherry-1220

Cherry 1220

Dec 20, 2024
258e80f
zip
tar.gz
Downloads

b4137

Merge pull request #5 from arthw/cherry-1118

Cherry 1118

Nov 19, 2024
8dcc98f
zip
tar.gz
Downloads

b3555

fix error

Aug 7, 2024
75a3266
zip
tar.gz
Downloads

b3554

ggml-backend : fix async copy from CPU (ggml-org#8897)

* ggml-backend : fix async copy from CPU

* cuda : more reliable async copy, fix stream used when the devices are the same

Aug 7, 2024
9d73802
zip
tar.gz
Downloads

b3517

[SYCL] Fixing wrong VDR iq4nl value (ggml-org#8812)

Aug 2, 2024
11c713b
zip
tar.gz
Downloads

b3482

Merge pull request #2 from arthw/refactor_dev

Refactor device management and usage api

Aug 1, 2024
c16f01b
zip
tar.gz
Downloads

b3475

llama : add support for llama 3.1 rope scaling factors (ggml-org#8676)

* Add llama 3.1 rope scaling factors to llama conversion and inference

This commit generates the rope factors on conversion and adds them to the resulting model as a tensor. At inference time, these factors are passed to the `ggml_rope_ext` rope oepration, improving results for context windows above 8192

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <[email protected]>

* address comments

* address comments

* Update src/llama.cpp

Co-authored-by: compilade <[email protected]>

* Update convert_hf_to_gguf.py

Co-authored-by: compilade <[email protected]>

---------

Co-authored-by: compilade <[email protected]>

Jul 27, 2024
e661170
zip
tar.gz
Downloads

PreviousNext

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

b4937

b4789

b4787

b4383

b4137

b3555

b3554

b3517

b3482

b3475

Tags: arthw/llama.cpp