Tags: stevenkuang-tencent/llama.cpp

Tags

b6345

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
convert : remove redundant code (ggml-org#15708)

Signed-off-by: Jie Fu <[email protected]>

b6098

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
CANN: add support for ACL Graph (ggml-org#15065)

* feat(cann): add optional support for ACL Graph execution

This commit adds support for executing ggml computational graphs using
Huawei's ACL graph mode via the USE_CANN_GRAPH flag. The support can be
enabled at compile time using the CMake option:

    -DUSE_CANN_GRAPH=ON

By default, ACL graph execution is **disabled**, and the fallback path
uses node-by-node execution.

Key additions:
- CMake option USE_CANN_GRAPH to toggle graph mode
- Graph capture and execution logic using
- Tensor property matching to determine whether graph update is required
- Safe fallback and logging if the environment variable LLAMA_SET_ROWS
  is unset or invalid

This prepares the backend for performance improvements in repetitive graph
execution scenarios on Ascend devices.
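
The "tensor property matching" step listed above can be pictured with a minimal sketch. This is not the CANN backend's code: the struct `TensorProps`, its fields, and the function `graph_update_required` are hypothetical names chosen for illustration, and the set of properties compared is an assumption. The idea is that a captured graph can be replayed only if every node still has the same buffer address, operation, shape, and strides as when it was captured.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical snapshot of the properties that identify one graph node.
// The field choice is an assumption, not taken from the CANN backend.
struct TensorProps {
    const void*                data;  // device buffer address
    int                        op;    // operation code
    std::array<std::int64_t, 4> ne;   // number of elements per dimension
    std::array<std::size_t, 4>  nb;   // stride in bytes per dimension

    bool operator==(const TensorProps& other) const {
        return data == other.data && op == other.op &&
               ne == other.ne && nb == other.nb;
    }
};

// Returns true when the incoming graph differs from the captured one
// (different node count or any differing node), meaning the graph would
// have to be captured again rather than replayed.
bool graph_update_required(const std::vector<TensorProps>& captured,
                           const std::vector<TensorProps>& incoming) {
    return captured != incoming;
}
```

In a design like this, the comparison runs once per graph evaluation, so its cost is linear in the number of nodes and small next to re-capturing the graph.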

Signed-off-by: noemotiovon <[email protected]>

* Fix review comments

Signed-off-by: noemotiovon <[email protected]>

* rename USE_CANN_GRAPH to USE_ACL_GRAPH

Signed-off-by: noemotiovon <[email protected]>

* fix typo

Signed-off-by: noemotiovon <[email protected]>

---------

Signed-off-by: noemotiovon <[email protected]>

b5988

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
mtmd : fix 32-bit narrowing issue in export-lora and mtmd clip (ggml-org#14503)

* [fix] Fix 32-bit narrowing issue in export-lora and mtmd clip

* Update export-lora.cpp

* Update clip.cpp

* Update export-lora.cpp

* format: use space to replace tab
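
The commit message does not say where the narrowing occurred, so the following is only a sketch of the general problem class, not the change made in export-lora.cpp or clip.cpp. On a 32-bit target `std::size_t` is 32 bits wide, so converting a 64-bit signed count to it can lose data, and brace-initialization rejects such a conversion at compile time. The helper name `to_size` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: convert a 64-bit signed count to std::size_t,
// throwing instead of silently truncating when the value does not fit
// (negative values, or values above SIZE_MAX on a 32-bit target).
inline std::size_t to_size(std::int64_t value) {
    if (value < 0 ||
        static_cast<std::uint64_t>(value) >
            std::numeric_limits<std::size_t>::max()) {
        throw std::out_of_range("value does not fit in std::size_t");
    }
    return static_cast<std::size_t>(value);
}
```

An explicit conversion like this makes the intent visible and compiles the same way on 32-bit and 64-bit targets.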

b5977

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
docs: add libcurl-dev install hint for Linux distros (ggml-org#14801)

* docs: add libcurl-dev install hint for Linux distros

Signed-off-by: PouyaGhahramanian <[email protected]>

* Update docs/build.md

---------

Signed-off-by: PouyaGhahramanian <[email protected]>
Co-authored-by: Xuan-Son Nguyen <[email protected]>

b5952

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
kleidiai: add support for get_rows (ggml-org#14676)

* kleidiai: add support for get_rows

* apply fixes based on code review

* apply more fixes based on code review

b5929

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
graph : refactor context to not pass gf explicitly (ggml-org#14629)

ggml-ci

b5896

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
ggml : refactor llamafile_sgemm PPC code (ggml-org#14673)

Remove unnecessary templates from class definition and packing functions
Reduce deeply nested conditionals and if-else switching in the mnpack function
Replace repetitive code with inline functions in the packing functions

2 ~ 7% improvement in Q8 Model
15 ~ 50% improvement in Q4 Model

Signed-off-by: Shalini Salomi Bodapati <[email protected]>