
ggml-cpu: Build variant targeting Neoverse-V2 #14380

Open: wants to merge 4 commits into master
Conversation

ckastner (Collaborator)

As a first improvement on the recently added generic ARM support for GGML_CPU_ALL_VARIANTS, this builds a variant targeting Neoverse-V2 specifically (e.g., Graviton4 or NVIDIA Grace).

  • The cmake part needed little change. Feature processing is unchanged, but the target is a specific -mcpu= rather than a generic -march=.
  • It also defines a GGML_ARM_MCPU that is passed on to the scoring function.
  • The scoring function parses the part number from /proc/cpuinfo on Linux (Graviton4 is Linux-only, and I'd guess NVIDIA Grace is, too), and uses it in scoring.

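A minimal sketch of what such part-number detection could look like (the function name and field handling here are my own illustration, not the actual ggml code; on Arm Linux, /proc/cpuinfo exposes a "CPU part" field holding the MIDR part number, which is 0xd4f for Neoverse-V2):

```cpp
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scan cpuinfo-formatted text for the "CPU part" field
// and return the part number as an integer (e.g. 0xd4f for Neoverse-V2),
// or -1 if the field is absent.
static int parse_cpu_part(std::istream & in) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.rfind("CPU part", 0) == 0) {          // line starts with "CPU part"
            const auto pos = line.find(':');
            if (pos != std::string::npos) {
                // strtol with base 16 accepts the "0x" prefix and leading spaces
                return (int) std::strtol(line.c_str() + pos + 1, nullptr, 16);
            }
        }
    }
    return -1;
}
```

In the real scoring function this would be fed from `std::ifstream("/proc/cpuinfo")`; an `istream` parameter just keeps the parsing testable without the file.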
In the scoring function, I shifted features to the 9th bit and beyond. The idea is that features are more important than microarchitecture, platform, and so on, which can use bits 2-8 to rank themselves. So nuances like the microarchitecture of two variants become relevant in scoring only if the variants otherwise have equal features; otherwise, features win. I thought this might be a useful convention.
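The convention above could be sketched as follows (the names, feature set, and exact bit positions are illustrative assumptions, not the ggml source): features occupy bit 9 upward, so any feature difference dominates the microarchitecture rank held in the lower bits.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bit layout (assumed, not the actual ggml code):
// bits 2-8 rank microarchitecture/platform match, bits 9+ hold one bit per feature.
enum : uint64_t {
    SCORE_MCPU_MATCH = 1u << 2,    // e.g. -mcpu=neoverse-v2 matches the host part number
    FEAT_DOTPROD     = 1u << 9,
    FEAT_SVE         = 1u << 10,
    FEAT_SVE2        = 1u << 11,
};

static uint64_t score_variant(bool mcpu_match, bool dotprod, bool sve, bool sve2) {
    uint64_t s = 1;                // base score: the variant is runnable at all
    if (mcpu_match) s |= SCORE_MCPU_MATCH;
    if (dotprod)    s |= FEAT_DOTPROD;
    if (sve)        s |= FEAT_SVE;
    if (sve2)       s |= FEAT_SVE2;
    return s;
}
```

Because every feature bit outweighs all the microarchitecture bits combined, a variant with more features always scores higher, and the mcpu match only breaks ties between feature-equal variants.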

I tested this on Graviton4, where the neoverse-v2 variant indeed received a higher score than the armv8.6-a variant, which would also work on Neoverse-V2 since that core is armv8.6-a. neoverse-v2 is also what the GGML_NATIVE=ON build targets.

I did not see meaningful benchmark improvements over generic armv8.6-a, but I tested only a limited set of models, and only with 4 vCPUs. Some tests ran with a 2-3% improvement, but this wasn't always reproducible. I hope to get more AWS resources in July, when I can properly test this on a dedicated box.

In any case, I think this would at least serve as an easy-to-copy template for other variants where this might matter more.

This supersedes #14332.

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Jun 25, 2025
@slaren (Member)

slaren commented Jun 30, 2025

I am just wondering if it is worth adding this if it doesn't give any measurable improvement. I don't think it is likely that this will help meaningfully in any case, because most of the (performance-sensitive) code is already very low-level intrinsics and assembly.

@ckastner (Collaborator, Author)

With no improvement, I would tend to agree. That would actually be a nice result, as it would mean the current ALL_VARIANTS solution targeting a generic ARM arch is already good enough.

I was just surprised to see no improvement. Clearly nothing big was to be expected, but I did expect at least a marginal gain. Hence my plan to do more tests in July with dedicated hardware. I think it's quite possible that the vCPUs I was using masked any possible gains: 4 vCPUs (not cores) on a 96-core machine with who knows how many co-tenants.

Apart from that, there could be the benefit of having this code as a template for platforms where it might actually matter; minus the whitespace changes from un-nesting an if, this is still a small change.

(Speaking naively again, the reason I would have expected the compiler to gain at least something from knowing more details about the cores used is that otherwise, what would be the point of a Neoverse core over any other? Apart from where direct assembly use prohibits this, of course.)
