
Conversation

@taronaeo
Collaborator

ref: #16664 (comment)

This PR introduces CPU feature detection for the s390x platform and enables dynamic backend loading when compiled with -DGGML_NATIVE=OFF -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON.

Tested release.yml and it appears to work as intended: https://github.com/ggml-org/llama.cpp/actions/runs/18814223900/job/53680143680.
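
For context on the underlying technique, here is a minimal, hypothetical sketch of runtime CPU feature detection on s390x via the Linux auxiliary vector. It is not taken from this PR, and the HWCAP_S390_VXRS / HWCAP_S390_NNPA bit values are assumptions to be verified against the kernel's s390 hwcap definitions.

// Hypothetical sketch: probe s390x CPU features from the ELF auxiliary vector.
// Not the implementation in this PR.
#include <stdio.h>
#include <sys/auxv.h>               // getauxval, AT_HWCAP

// Bit positions follow the Linux s390 hwcap definitions (assumption);
// verify against <asm/elf.h> on your system.
#ifndef HWCAP_S390_VXRS
#define HWCAP_S390_VXRS (1UL << 11) // vector facility (VX)
#endif
#ifndef HWCAP_S390_NNPA
#define HWCAP_S390_NNPA (1UL << 20) // neural-network-processing-assist
#endif

int main(void) {
    unsigned long hwcap = getauxval(AT_HWCAP);

    // A loader could use flags like these to pick the best libggml-cpu-z*.so variant.
    printf("VXRS: %s\n", (hwcap & HWCAP_S390_VXRS) ? "yes" : "no");
    printf("NNPA: %s\n", (hwcap & HWCAP_S390_NNPA) ? "yes" : "no");
    return 0;
}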

@taronaeo taronaeo requested a review from CISC October 26, 2025 06:45
@github-actions github-actions bot added the devops (improvements to build systems and github actions) and ggml (changes relating to the ggml tensor library for machine learning) labels Oct 26, 2025
Collaborator

@CISC CISC left a comment

LGTM, but let's wait for input from @slaren in case we overlooked something.

@CISC
Collaborator

CISC commented Oct 31, 2025

@taronaeo Will this also fix the Docker build?
https://github.com/ggml-org/llama.cpp/actions/runs/18962560883/job/54152829325

@taronaeo
Collaborator Author

@taronaeo Will this also fix the Docker build? https://github.com/ggml-org/llama.cpp/actions/runs/18962560883/job/54152829325

Ooh, looks like it broke for some reason. But yes, this PR + the previous PR (#16664) should fix this. Let me double check tomorrow when I have more time :)

* drop vxe feature
* add nnpa feature

Signed-off-by: Aaron Teo <[email protected]>
@taronaeo
Collaborator Author

taronaeo commented Nov 1, 2025

@rishiraj20 Can you help test this PR on AQLINUX1 and 2?

  1. Build this PR using:

$ cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DGGML_NATIVE=OFF -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON
$ cmake --build build --config Release -t llama-cli -j$(nproc)

  2. Check that there are 2 libggml-cpu-z*.so files built:

$ ls -la build/bin | grep libggml-cpu

-rwxr-xr-x.  1 root root 1167608 Nov  1 19:04 libggml-cpu-z15.so
-rwxr-xr-x.  1 root root 1167608 Nov  1 19:04 libggml-cpu-z16.so

  3. Run a test prompt and let me know which library is loaded via:

$ build/bin/llama-cli -m /opt/hf_models/granite-3.3-2b-instruct-be.Q4_K_M.gguf -no-cnv --seed 42 -n 50 -p "Write me a dog walking business idea 1. " 2>&1 | less

Please paste the first few lines of output from the top. It should print something like the following, and the prompt should run to completion without problems.

load_backend: loaded CPU backend from /opt/llama.cpp/build/bin/libggml-cpu-z16.so
build: 6819 (b62c93efc) with cc (GCC) 15.1.1 20250521 (Red Hat 15.1.1-2) for s390x-redhat-linux
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_loader: loaded meta data with 40 key-value pairs and 362 tensors from /opt/hf_models/granite-3.3-2b-instruct-be.Q4_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.

...
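
Aside from reading the load_backend log line, one can also enumerate what ggml registered after loading. The sketch below is illustrative only (not part of this PR) and assumes the ggml-backend registry API (ggml_backend_load_all, ggml_backend_reg_count, ggml_backend_reg_get, ggml_backend_reg_name); verify the names against ggml-backend.h in your checkout.

// Illustrative sketch: list the backends registered after dynamic loading.
#include <stdio.h>
#include "ggml-backend.h"

int main(void) {
    // With -DGGML_BACKEND_DL=ON this searches for and loads backend shared
    // libraries (e.g. libggml-cpu-z15.so / libggml-cpu-z16.so) at runtime.
    ggml_backend_load_all();

    size_t n = ggml_backend_reg_count();
    for (size_t i = 0; i < n; i++) {
        ggml_backend_reg_t reg = ggml_backend_reg_get(i);
        printf("registered backend: %s\n", ggml_backend_reg_name(reg));
    }
    return 0;
}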

@taronaeo
Collaborator Author

taronaeo commented Nov 1, 2025

None of the CI failures look related to this PR. Merging in a few hours unless the CI failures are related.

@CISC
Collaborator

CISC commented Nov 1, 2025

None of the CI failures look related to this PR. Merging in a few hours unless the CI failures are related.

Indeed unrelated, go ahead.

@taronaeo taronaeo merged commit d38d9f0 into ggml-org:master Nov 2, 2025
127 of 141 checks passed