Tags: stevenkuang-tencent/llama.cpp
convert : remove redundant code (ggml-org#15708)
  Signed-off-by: Jie Fu <[email protected]>
CANN: add support for ACL Graph (ggml-org#15065)
  * feat(cann): add optional support for ACL Graph execution
    This commit adds support for executing ggml computational graphs using Huawei's ACL graph mode via the USE_CANN_GRAPH flag. The support can be enabled at compile time using the CMake option: -DUSE_CANN_GRAPH=ON
    By default, ACL graph execution is **disabled**, and the fallback path uses node-by-node execution.
    Key additions:
    - CMake option to toggle graph mode
    - Graph capture and execution logic using
    - Tensor property matching to determine whether a graph update is required
    - Safe fallback and logging if the environment variable LLAMA_SET_ROWS is unset or invalid
    This prepares the backend for performance improvements in repetitive graph execution scenarios on Ascend devices.
    Signed-off-by: noemotiovon <[email protected]>
  * Fix review comments
    Signed-off-by: noemotiovon <[email protected]>
  * rename USE_CANN_GRAPH to USE_ACL_GRAPH
    Signed-off-by: noemotiovon <[email protected]>
  * fix typo
    Signed-off-by: noemotiovon <[email protected]>
  ---------
  Signed-off-by: noemotiovon <[email protected]>
docs: add libcurl-dev install hint for Linux distros (ggml-org#14801)
  * docs: add libcurl-dev install hint for Linux distros
    Signed-off-by: PouyaGhahramanian <[email protected]>
  * Update docs/build.md
  ---------
  Signed-off-by: PouyaGhahramanian <[email protected]>
  Co-authored-by: Xuan-Son Nguyen <[email protected]>
ggml : refactor llamafile_sgemm PPC code (ggml-org#14673)
  Remove unnecessary templates from the class definition and packing functions.
  Reduce deeply nested conditionals and if-else switching in the mnpack function.
  Replace repetitive code with inline functions in the packing functions.
  2 ~ 7% improvement in Q8 Model
  15 ~ 50% improvement in Q4 Model
  Signed-off-by: Shalini Salomi Bodapati <[email protected]>