Summary
The ARMv8A target currently only includes single-state Keccak implementations (KeccakP-1600-plain-64bits-ua), while the ARMv7A target includes parallel implementations (KeccakP-1600-times2-ARMV7A-NEON).
It would be valuable to have a KeccakP-1600-times2 implementation for ARM64 (AArch64) using NEON intrinsics or assembly.
Use Case
Batch hashing of multiple independent messages is a common workload. On x86_64, the KeccakP-1600-times4 AVX2 implementation provides roughly 4x throughput for it.
ARM64 platforms (AWS Graviton, Ampere Altra, Apple Silicon) are increasingly popular, but currently lack parallel Keccak support in XKCP.
Technical Details
- ARM64 NEON provides 128-bit vectors (uint64x2_t), sufficient for 2-way parallel Keccak states
- The ARMv7A NEON times2 implementation exists but uses 32-bit ARM assembly syntax incompatible with AArch64
- A new implementation would need AArch64 assembly or NEON intrinsics
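
To make the 2-way idea concrete, here is a minimal sketch of just the theta step applied to two states at once with NEON intrinsics. It is not a full permutation and not XKCP code: the names (`lane2_t`, `l2_*`, `theta_times2`, `theta_times2_mismatches`) are hypothetical, and the interleaved layout (`state[2*i + s]` holds lane `i` of instance `s`) is an assumption made for illustration. A scalar fallback is included so the same file also builds on non-AArch64 hosts.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
/* One 128-bit vector holds the same lane of both Keccak states. */
typedef uint64x2_t lane2_t;
static inline lane2_t l2_load(const uint64_t *p) { return vld1q_u64(p); }
static inline void l2_store(uint64_t *p, lane2_t v) { vst1q_u64(p, v); }
static inline lane2_t l2_xor(lane2_t a, lane2_t b) { return veorq_u64(a, b); }
/* Rotate each 64-bit element left by 1. */
static inline lane2_t l2_rol1(lane2_t a) {
    return vorrq_u64(vshlq_n_u64(a, 1), vshrq_n_u64(a, 63));
}
#else
/* Portable stand-in so the sketch compiles without NEON. */
typedef struct { uint64_t v[2]; } lane2_t;
static inline lane2_t l2_load(const uint64_t *p) {
    lane2_t r = {{p[0], p[1]}}; return r;
}
static inline void l2_store(uint64_t *p, lane2_t a) { p[0] = a.v[0]; p[1] = a.v[1]; }
static inline lane2_t l2_xor(lane2_t a, lane2_t b) {
    lane2_t r = {{a.v[0] ^ b.v[0], a.v[1] ^ b.v[1]}}; return r;
}
static inline lane2_t l2_rol1(lane2_t a) {
    lane2_t r = {{(a.v[0] << 1) | (a.v[0] >> 63), (a.v[1] << 1) | (a.v[1] >> 63)}};
    return r;
}
#endif

/* Theta on two interleaved states (assumed layout: state[2*i + s]). */
void theta_times2(uint64_t state[50]) {
    lane2_t C[5], D[5];
    for (int x = 0; x < 5; x++) {            /* column parities */
        C[x] = l2_load(state + 2 * x);
        for (int y = 1; y < 5; y++)
            C[x] = l2_xor(C[x], l2_load(state + 2 * (x + 5 * y)));
    }
    for (int x = 0; x < 5; x++)              /* D[x] = C[x-1] ^ rol(C[x+1], 1) */
        D[x] = l2_xor(C[(x + 4) % 5], l2_rol1(C[(x + 1) % 5]));
    for (int i = 0; i < 25; i++)             /* lane i sits in column i % 5 */
        l2_store(state + 2 * i, l2_xor(l2_load(state + 2 * i), D[i % 5]));
}

/* Scalar reference theta on a single state, used only for checking. */
static void theta_scalar(uint64_t A[25]) {
    uint64_t C[5];
    for (int x = 0; x < 5; x++)
        C[x] = A[x] ^ A[x + 5] ^ A[x + 10] ^ A[x + 15] ^ A[x + 20];
    for (int x = 0; x < 5; x++) {
        uint64_t c1 = C[(x + 1) % 5];
        uint64_t D = C[(x + 4) % 5] ^ ((c1 << 1) | (c1 >> 63));
        for (int y = 0; y < 5; y++)
            A[x + 5 * y] ^= D;
    }
}

/* Returns the number of lanes where the 2-way and scalar results differ. */
int theta_times2_mismatches(void) {
    uint64_t a[25], b[25], inter[50];
    for (int i = 0; i < 25; i++) {
        a[i] = 0x0123456789ABCDEFULL * (uint64_t)(i + 1);
        b[i] = ~a[i] ^ (uint64_t)i;
        inter[2 * i] = a[i];
        inter[2 * i + 1] = b[i];
    }
    theta_times2(inter);
    theta_scalar(a);
    theta_scalar(b);
    int bad = 0;
    for (int i = 0; i < 25; i++)
        bad += (inter[2 * i] != a[i]) + (inter[2 * i + 1] != b[i]);
    return bad;
}
```

Calling `theta_times2_mismatches()` compares the 2-way result against the scalar reference on both states. A real implementation would also need rho, pi, chi, and iota, and would likely keep all 25 vectors in registers, since AArch64 has 32 128-bit NEON registers.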
Requested Addition
Add to the ARMv8A target:
- KeccakP-1600-times2 implementation using AArch64 NEON
- Corresponding KeccakP-1600-times2-SnP.h header
Thank you for maintaining XKCP!