Papers updated in last 183 days (1666 results)
SCA-GPT: A Generation-Planning-Tool Assisted LLM Agent for Fully Automated Side-Channel Analysis on Cryptosystems
Non-invasive security constitutes an essential component of hardware security, primarily involving side-channel analysis (SCA), with various international standards explicitly mandating rigorous testing. However, current SCA assessments rely on manual expert procedures, causing critical issues: inconsistent results due to expert variability, error-prone multi-step testing, and high costs with IP leakage risks for manufacturers lacking in-house expertise. Automated SCA tools that deliver consistent, expert-level evaluations are urgently needed. In recent years, large language models (LLMs) have been widely adopted in various fields owing to their emergent capabilities. In particular, LLM agents equipped with tool-usage capabilities have significantly expanded the potential of these models to interact with the physical world.
Motivated by these recent advances in LLM agents, we propose SCA-GPT, an end-to-end automated LLM agent framework tailored for SCA tasks. The framework integrates a domain-specific knowledge base with multiple SCA tools to enable retrieval-augmented generation for fully automated ISO/IEC 17825-compliant testing. As a core component of SCA-GPT, the expert knowledge base serves as the agent’s long-term memory, enabling precise retrieval and contextual reasoning during automated testing. We further present a domain-specific expert knowledge base construction approach and two complementary evaluation metrics.
Retrieval experiments validate the effectiveness of our knowledge base construction, achieving strong performance of 84.44% and 98.33% on two complementary retrieval quality metrics. We further evaluate the overall framework across three leading LLMs: DeepSeek V3.1, Kimi K2, and Qwen3 Coder. The evaluation uses datasets spanning six cryptographic algorithms (e.g., AES, DES, RSA, ECDSA) deployed on four hardware platforms, including smart cards, microcontrollers, and FPGAs. Results show that DeepSeek V3.1, Kimi K2, and Qwen3 Coder achieve accuracies of 83.8%, 77.8%, and 91.4%, respectively. The framework reduces evaluation time by 95.7% on average compared with manual procedures while maintaining equivalent assessment quality and automatically generating evaluation reports. Notably, SCA-GPT is the first advanced LLM agent specifically designed for SCA tasks.
SAT-Based Space Partitioning and Applications to Ascon-Hash256
We introduce an efficient SAT-based space partitioning technique that enables systematic exploration of large search spaces in cryptanalysis. The approach divides complex search spaces into manageable subsets through combinatorial necklace generation, allowing precise tracking of explored regions while maintaining search completeness.
We demonstrate the technique's effectiveness through extensive cryptanalysis of Ascon-Hash256. For differential-based collision attacks, we conduct an exhaustive search of 2-round collision trails, proving that no collision trail with weight less than 156 exists. Through detailed complexity analysis and parameter optimization, we present an improved 2-round collision attack with complexity $2^{61.79}$. We also discover new Semi-Free-Start (SFS) collision trails that enable practical attacks on both 3-round and 4-round Ascon-Hash256, notably improving the best known 4-round SFS trail from weight 295 to 250.
Furthermore, applying the technique to Meet-in-the-Middle structure search yields improved attacks on 3-round Ascon-Hash256. We reduce the collision attack complexity from $2^{116.74}$ to $2^{114.13}$ with memory complexity $2^{112}$ (improved from $2^{116}$), and the preimage attack complexity from $2^{162.80}$ to $2^{160.75}$ with memory complexity $2^{160}$ (improved from $2^{162}$).
On the Credibility of Deniable Communication in Court
Over time, cryptographically deniable systems have come to be associated in computer-science literature with the idea of "denying" evidence in court — specifically, with the ability to convincingly forge evidence in courtroom scenarios, and relatedly, an inability to authenticate evidence in such contexts. Indeed, in some cryptographic models, the ability to falsify mathematically implies the inability to authenticate. Evidentiary processes in courts, however, have been developed over centuries to account for the reality that evidence has always been forgeable, and rely on factors outside of cryptographic models to seek the truth "as well as possible" while acknowledging that all evidence is imperfect. We argue that deniability does not and need not change this paradigm.
Our analysis highlights a gap between technical deniability notions and their application to the real world. In realistic situations, there will essentially always be factors outside a cryptographic model that influence perceptions of a message's authenticity. We propose the broader concept of credibility to capture these factors. The credibility of a system is determined by (1) a threshold of quality that a forgery must pass to be "believable" as an original communication, which varies based on sociotechnical context and threat model, (2) the ease of creating a forgery that passes this threshold, which is also context- and threat-model-dependent, and (3) default system retention policy and retention settings. All three aspects are important for designing secure communication systems for real-world threat models, and some aspects of (2) and (3) may be incorporated directly into technical system design. We hope that our model of credibility will facilitate system design and deployment that addresses threats that are not and cannot be captured by purely technical definitions and existing cryptographic models, and support more nuanced discourse on the strengths and limitations of cryptographic guarantees within specific legal and sociotechnical contexts.
MacaKey: Full-State Keyed Sponge Meets the Summation-Truncation Hybrid
The keyed sponge construction has benefited from various efficiency advancements over time, most notably leading to the possibility to absorb over the entire state, as in the full-state keyed sponge. However, squeezing has always remained limited to blocks smaller than the permutation size, as security is determined by the capacity $c$, the size of the non-squeezed state. In this work, we present MacaKey, an improved version of the full-state keyed sponge that not only absorbs over the entire state but also squeezes over the entire state. The scheme combines ideas of the full-state keyed sponge with those of the summation-truncation hybrid of Gunsing and Mennink. We demonstrate that, with no sacrifice in generic security and using only $c$ bits of extra storage, MacaKey can significantly boost performance, particularly in scenarios requiring large amounts of output. For example, using the 320-bit Ascon permutation with a 256-bit capacity, MacaKey outputs five times as many bits as the full-state keyed sponge.
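To make the rate/capacity split concrete, the following toy Python sketch shows the squeezing loop of a plain keyed sponge, the baseline that MacaKey improves on: with a $b$-bit permutation and capacity $c$, only $r = b - c$ bits leave the state per permutation call. The permutation and key handling here are hypothetical stand-ins (not Ascon, and not the actual MacaKey construction).

```python
# Toy (insecure) keyed sponge: illustrates why a plain keyed sponge squeezes
# only r = b - c bits per permutation call. Parameters mirror the example in
# the abstract: b = 320-bit state, c = 256-bit capacity, so r = 64.
B, C = 320, 256
R = B - C

def toy_perm(state: int) -> int:
    # hypothetical stand-in for a b-bit permutation (NOT cryptographic);
    # x -> a*x + 1 mod 2^B with odd a is a bijection on B-bit values
    return (state * 0x9E3779B97F4A7C15 + 1) % (1 << B)

def plain_keyed_sponge_squeeze(key: int, out_bits: int) -> int:
    state, out, got = key, 0, 0
    while got < out_bits:
        state = toy_perm(state)
        out = (out << R) | (state >> C)  # expose only the r-bit rate part
        got += R
    return out

# Four permutation calls for 256 output bits; a full-state squeeze such as
# MacaKey's targets all B bits per call (five times as much here).
print(hex(plain_keyed_sponge_squeeze(0xDEADBEEF, 256)))
```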
On the BUFF Security of ECDSA with Key Recovery
In the usual syntax of digital signatures, the verification algorithm takes a verification key in addition to a signature and a message, whereas in ECDSA with key recovery, which is used in Ethereum, no verification key is input to the verification algorithm. Instead, a verification key is recovered from a signature and a message. In this paper, we explore BUFF security of ECDSA with key recovery (KR-ECDSA), where BUFF stands for Beyond UnForgeability Features (Cremers et al., IEEE S&P 2021). As a result, we show that KR-ECDSA provides BUFF security, except for weak non-resignability (wNR). It is particularly noteworthy that the KR-ECDSA verification algorithm takes an Ethereum address addr as input. This address is defined as the rightmost 160 bits of the Keccak-256 hash of the corresponding ECDSA verification key. Crucially, the algorithm verifies that the hash of the recovered verification key matches addr. Our security analysis shows that this procedure, i.e., checking that the hash of the recovered verification key equals the address, is mandatory for BUFF security. We also discuss whether wNR is mandatory in Ethereum or not. To clarify which part is mandatory to provide BUFF security in KR-ECDSA, we show that the original ECDSA does not provide any BUFF security. As a by-product of the analysis, we show that one of our BUFF attacks also works against Aumayr et al.'s ECDSA-based adaptor signature scheme (ASIACRYPT 2021) and Qin et al.'s blind adaptor signature scheme (IEEE S&P 2023), which is based on Aumayr et al.'s scheme. We emphasize that the attack is positioned outside of their security models.
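The address check described above is simple to state in code. The sketch below assumes pycryptodome for Keccak-256 and a hypothetical recover_pubkey helper standing in for ECDSA public-key recovery (as provided by secp256k1 libraries); it illustrates only the check, not the paper's analysis.

```python
# Sketch of the KR-ECDSA verification step discussed above: recover a public
# key from (msg, sig), then check that the rightmost 160 bits (20 bytes) of
# its Keccak-256 hash equal the Ethereum address addr.
from Crypto.Hash import keccak  # pycryptodome

def eth_address(pubkey_bytes: bytes) -> bytes:
    """Rightmost 20 bytes of Keccak-256 of the 64-byte X||Y public key."""
    h = keccak.new(digest_bits=256)
    h.update(pubkey_bytes)
    return h.digest()[-20:]

def kr_ecdsa_verify(addr: bytes, msg: bytes, sig: bytes, recover_pubkey) -> bool:
    pk = recover_pubkey(msg, sig)  # hypothetical recovery helper
    # the mandatory check: hash of the recovered key must match the address
    return pk is not None and eth_address(pk) == addr
```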
LightCROSS: A Secure and Memory Optimized Post-Quantum Digital Signature CROSS
Digital signature schemes derived from non-interactive zero-knowledge (NIZK) proofs are rapidly gaining prominence within post-quantum cryptography. CROSS is a promising new code-based post-quantum digital signature scheme based on the NIZK framework. It is currently in the second round of NIST's additional call for standardization of post-quantum digital signatures. However, CROSS's reference implementation has a substantially large memory footprint. This makes its deployment on resource-constrained platforms prohibitively difficult.
In particular, we identified that several mechanisms within the zero-knowledge proof generation, such as the Merkle tree and GGM tree structures and the commitment generation process, are among the most memory-intensive operations. We propose several novel algorithms and implementation strategies to reduce the memory requirements of these components. Beyond these, we also propose several memory optimization techniques, such as just-in-time hashing and execution flow analysis. As a result, our implementation reduces the memory footprint of Key Generation, Signature Generation, and Verification of the CROSS reference code by as much as 95%, 92%, and 85%, respectively. This results in a suite of implementations in which all variants are under 128 kB (for all security levels of KeyGen/Sign/Verify) and six variants are under 32 kB. Our memory optimization techniques are not specific to CROSS and can be applied to other NIZK-based signature schemes.
Regarding efficiency, matrix multiplications are crucial to the performance of CROSS. We show how the Digital Signal Processing (DSP) instructions on ARM Cortex-M4, specifically packing and multiplying, can be utilized to efficiently implement matrix operations over finite fields. The DSP optimizations combined with the memory reductions improve the efficiency of CROSS by up to 32% and 33% in Signature Generation and Verification respectively.
Efficient Polynomial Evaluation over Structured Space and Application to Polynomial Method
It is well-known that evaluating a Boolean polynomial $f$ of any degree $d$ in $n$ variables over the full space $\mathbb F_2^n$ takes $n\cdot 2^n$ bit operations and $2^n$ bits of memory with the standard Möbius transform. When $d$ is relatively small, Bouillaguet et al. proposed at CHES 2010 the fast exhaustive search (FES) algorithm. In this algorithm, by using Gray code to enumerate all elements in $\mathbb F_2^n$, evaluating $f$ on all inputs in $\mathbb F_2^n$ takes $\big(\sum_{i=0}^{d}\binom{n}{i}\big)^2+d\cdot 2^n=\binom{n}{\leq d}^2+d\cdot 2^n$ bit operations and $\binom{n}{\leq d}$ bits of memory. The term $\binom{n}{\leq d}^2$ represents the cost of the initialization phase. This problem has received new attention in recent years: it was studied by Dinur at EUROCRYPT 2021, by Furue and Takagi at PQCrypto 2023, and by Bouillaguet at TOMS 2024. All these algorithms work on the full space and have a similar additional phase, such as the initialization phase in the FES algorithm, which takes much more than $\binom{n}{\leq d}$ bit operations. In this work, we propose a simple yet efficient algorithm to evaluate $f$ over the structured space $P_{n_s}^{w_s}\times \cdots \times P_{n_1}^{w_1}\subseteq \mathbb F_2^n$, where $\sum_{i=1}^{s}n_i=n$ and $P_{n_i}^{w_i}$ denotes the set of $n_i$-bit binary strings with Hamming weight not larger than $w_i$. Our algorithm is inspired by the FES algorithm and Furue-Takagi's algorithm. However, our algorithm can work on a more general space and is distinguished by an efficient additional phase, which simply reads all coefficients of $f$ and thus takes only $\binom{n}{\leq d}$ bit operations. For complexity, our algorithm takes $\binom{n}{\leq d}+d\cdot \prod_{i=1}^{s}\binom{n_i}{\leq w_i}$ bit operations and consumes $2\cdot \binom{n}{\leq d}$ bits of memory. For applications, we prove that it is either infeasible or nontrivial to adapt the FES algorithm with monotone Gray code, which partially answers a question raised by Dinur at EUROCRYPT 2021. Moreover, our algorithm provides a proven method to solve a critical step in Dinur's algorithm for the polynomial method, without affecting its time complexity. In particular, we also address the open problem proposed at TOMS 2024 and improve the polynomial evaluation algorithms even over the full space.
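For reference, the baseline mentioned in the first sentence is easy to state in code: the following minimal Python sketch evaluates a Boolean polynomial on all of $\mathbb F_2^n$ via the standard Möbius (zeta) transform, turning the $2^n$ ANF coefficients into the truth table in place with $n\cdot 2^{n-1}$ XORs. The example polynomial and monomial ordering are our own illustrative choices.

```python
# Mobius transform baseline: ANF coefficients -> truth table over F_2^n.
# Coefficients are indexed by monomial bitmask (bit i set in the index means
# variable x_i occurs in the monomial). The transform is an involution.
def mobius_transform(coeffs, n):
    t = coeffs[:]  # list of 0/1 values of length 2**n
    for i in range(n):
        step = 1 << i
        for j in range(1 << n):
            if j & step:
                t[j] ^= t[j ^ step]  # XOR in the subset missing bit i
    return t

# Example: f(x0, x1) = x0*x1 + x1 has ANF coefficient vector [0, 0, 1, 1]
# (monomial masks 0b00=1, 0b01=x0, 0b10=x1, 0b11=x0*x1).
print(mobius_transform([0, 0, 1, 1], 2))  # -> [0, 0, 1, 0], the truth table
```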
Shred-to-Shine Metamorphosis of (Distributed) Polynomial Commitments
Succinct non-interactive arguments of knowledge (SNARKs) rely on polynomial commitment schemes (PCSs) to verify polynomial evaluations succinctly. High-performance multilinear PCSs (MLPCSs) from linear codes reduce prover cost, and distributed MLPCSs reduce it further by parallelizing commitment and opening across provers. Employing a fast Reed--Solomon interactive oracle proof of proximity (FRI), we propose PIPFRI, an MLPCS that combines the linear-time proving of linear-time-encodable-code PCSs with the compact proofs and fast verification of Reed--Solomon (RS) PCSs. Reducing fast Fourier transform and hash overhead, PIPFRI is 10× faster to prove than the RS-based DeepFold (USENIX Security '25) while keeping competitive proof size and verifier time. Measured against Orion (CRYPTO '22) from linear-time-encodable codes, PIPFRI proves 3.5× faster and reduces proof size and verifier time by 15×. As a linearly scalable distributed variant, we propose DEPIPFRI, which adds accountability and distributes a single polynomial across provers, enabling the first code-based distributed SNARK for general circuits. Notably, compared with DeVirgo (CCS '22), which lacks accountability and supports only multiple independent polynomials, DEPIPFRI improves prover time by 25× and inter-prover communication by 7×. We identify shred-to-shine as the key insight: partitioning a polynomial into independently handled fragments while maintaining proof size and verifier time. Hitting the pairing regime, this insight yields a group-based MLPCS with a 16× shorter structured reference string (SRS) and a 10× faster opening time than a multilinear variant of Kate--Zaverucha--Goldberg (TCC '13).
Fully Distributed Multi-Point Functions for PCGs and Beyond
We introduce new Distributed Multi-Point Function (DMPF) constructions that make multi-point sharing as practical as the classic single-point (DPF) case. Our main construction, Reverse Cuckoo, replaces the "theoretical" cuckoo insertions approach to DMPFs with an MPC-friendly linear solver that circumvents the concrete inefficiencies. Combined with our new sparse DPF construction, we obtain the first fully distributed and efficient DMPF key generation that avoids trusted dealers and integrates cleanly with standard two-party MPC.
Applied to pseudorandom correlation generators (PCGs), our DMPFs remove the dominant "sum of $t$ DPFs" bottleneck. In Ring-LPN and Stationary-LPN pipelines (Crypto 2020, 2025), this translates to an order of magnitude more Beaver triples per second with an order of magnitude less communication compared to the status quo by Keller et al. (Eurocrypt 2018). The gains persist across fields and rings ($\mathbb{F}_{p^k}$, $\mathbb{Z}_{2^k}$ for $k\geq 1$) and are complementary to existing PCG frameworks: our constructions drop in as a black-box replacement for their sparse multi-point steps, accelerating all PCGs that rely on such encodings.
We provide a complete protocol suite (deduplication, hashing, linear solver, sparse DPF instantiation) with a semi-honest security proof via a straight-line simulator that reveals only hash descriptors and aborts with negligible (cuckoo-style) probability. A prototype implementation validates the asymptotics with strong concrete performance improvements.
End-to-End Encrypted Git Services
Git services, such as GitHub, have been widely used to manage projects and enable collaborations among multiple entities. Just as in messaging and cloud storage, where end-to-end security has been gaining increased attention, such a level of security is also demanded for Git services. Content in the repositories (and the data/code supply chain facilitated by Git services) can be highly valuable, whereas the threat of system breaches has become routine nowadays. However, existing studies of Git security to date (mostly open source projects) suffer in two ways: they provide only very weak security, and they have a large overhead.
In this paper, we initiate the needed study of efficient end-to-end encrypted Git services. Specifically, we formally define the syntax and critical security properties, and then propose two constructions that provably meet those properties. Moreover, our constructions have the important property of platform-compatibility: they are compatible with current Git servers and preserve all basic Git operations, and thus can be directly tested and deployed on top of existing platforms. Furthermore, the overhead we achieve is only proportional to the actual difference caused by each edit, instead of the whole file (or even the whole repository) as is the case with existing works. We implemented both constructions and tested them directly on several public GitHub repositories. Our evaluations show (1) the effectiveness of platform-compatibility, and (2) the significant efficiency improvement we achieve (while provably providing much stronger security than prior ad hoc treatments).
mmCipher: Batching Post-Quantum Public Key Encryption Made Bandwidth-Optimal
In applications such as secure group communication and broadcasting, it is important to $\mathit{efficiently}$ deliver multiple messages to different recipients at once. To this end, multi-message multi-recipient Public Key Encryption (mmPKE) enables the batch encryption of multiple messages for multiple independent recipients in one go, significantly reducing costs, particularly bandwidth, compared to the trivial solution of encrypting each message individually. This capability is especially desirable in the post-quantum setting, where the ciphertext length is typically significantly larger than the corresponding plaintext. However, almost all prior works on mmPKE are limited to quantum-vulnerable traditional assumptions.
In this work, we propose the $\mathit{first}$ CPA-secure mmPKE and multi-message multi-recipient Key Encapsulation Mechanism (mmKEM) from the $\mathit{standard}$ Module Learning with Errors (MLWE) lattice assumption, named $\mathsf{mmCipher}\text{-}\mathsf{PKE}$ and $\mathsf{mmCipher}\text{-}\mathsf{KEM}$, respectively. This resolves a long-standing open problem posed by Bellare et al. (PKC '03). Our design proceeds in two steps: (i) we introduce a novel generic construction of mmPKE from a new PKE variant, $\mathit{extended\ reproducible\ PKE}$ (XR-PKE), which enables the reproduction of ciphertexts through additional hints; (ii) we instantiate a lattice-based XR-PKE using a new technique that precisely estimates the impact of such hints on ciphertext security while also establishing suitable parameters. We believe both to be of independent interest. As a bonus contribution, we explore generic constructions of $\mathit{adaptively\ secure}$ mmPKE, resisting adaptive corruption and chosen-ciphertext attacks.
We also provide an efficient implementation and thorough evaluation of the practical performance of our $\mathsf{mmCipher}$. The results demonstrate substantial bandwidth and computational savings over the state-of-the-art. For example, for $1024$ recipients, our $\mathsf{mmCipher}\text{-}\mathsf{KEM}$ achieves a $23$--$45\times$ reduction in bandwidth overhead, with ciphertexts only $4$--$9\%$ larger than the plaintexts ($\mathit{near~optimal~bandwidth}$), while also offering a $3$--$5\times$ reduction in computational cost.
TSM+ and OTSM - Correct Application of Time Sharing Masking in Round-Based Designs
Among the countermeasures against side-channel analysis attacks, masking offers formal security guarantees and composability, yet remains challenging to implement efficiently in hardware due to physical defaults like glitches and transitions. Low-latency masking techniques aim to mitigate the performance penalties but can inadvertently compromise security in certain architectural contexts. In particular, the recently proposed Time Sharing Masking (TSM) technique enables single-cycle masked implementations with composability under the SNI and PINI notions but fails to satisfy stronger composability guarantees required in iterative designs, i.e., OPINI. In this work, we show that TSM-based constructions can exhibit first-order leakage when used in single-register feedback architecture, such as round-based implementations of ciphers. To address this, we propose two new masking schemes: TSM+, a more efficient variant of TSM satisfying only PINI (but not SNI), and OTSM, a construction satisfying OPINI, enabling secure round-based designs.
Our improved round-based masked implementations of PRINCE and AES ensure security in latency-critical applications under both glitch- and transition-extended probing models, at the cost of slightly higher area consumption.
HybridPlonk: SubLogarithmic Linear Time SNARKs from Improved Sum-Check
We present HybridPlonk: the first SNARK that simultaneously achieves a linear-time prover, sublogarithmic proof size, provable security in the random oracle model, and an updatable setup. As a core technical contribution (possibly of independent interest), we reduce the communication complexity of the classical sumcheck protocol for multivariate polynomials from logarithmic to sublogarithmic, while retaining linear prover complexity. For degree-$d$ multivariate polynomials in $\mu$ variables that can be decomposed into $\ell$ multilinear polynomials, our protocol achieves $O(\ell + d\log\log n)$ communication and $O(n)$ prover cost for $n = 2^\mu$. Our protocol leverages recently proposed multilinear polynomial commitment schemes (PCS) with linear-time prover and constant proof size.
Multivariate sumcheck is a key ingredient in the design of several prover-efficient SNARKs, such as HyperPlonk (Eurocrypt'23), Spartan (Crypto'20), Hyrax (S&P'18), Libra (Crypto'19), Gemini (Eurocrypt'22), and Virgo (S&P'20). All of these SNARKs incur $\Omega(\log n)$ proof size, with the smallest concrete proof sizes ranging from $5$ KB to $10$ KB for circuits of size $2^{20}$-$2^{30}$.
We compile a variant of the HyperPlonk multilinear PIOP with our improved sumcheck to realize HybridPlonk. HybridPlonk achieves an $O(n)$ prover, $O(\log\log n)$ proof size, and an $O(\log n)$ verifier, while avoiding proof recursion and non-black-box use of cryptographic primitives. We implement HybridPlonk to show that it is efficient in practice, and we compare its performance with several state-of-the-art prover-efficient SNARKs. For circuits of size up to $2^{30}$, HybridPlonk achieves a proof size of $\approx 2.2$ KB, which is $2.5$-$4\times$ smaller than the most compact prover-efficient SNARKs, while retaining comparable prover costs.
Automatically Detecting Compromised Secrets: Foundations, Design Principles, and Applications
We develop foundations and several constructions for security protocols that can automatically detect, without false positives, if a secret (such as a key or password) has been compromised. Such constructions can be used, e.g., to automatically shut down compromised services, or to automatically revoke compromised secrets to minimize the effects of compromise. Our threat model includes malicious agents, (temporarily or permanently) compromised agents, and clones.
Previous works have studied domain-specific partial solutions to this problem. For example, Google's Certificate Transparency aims to provide infrastructure to detect the compromise of a certificate authority's signing key, logs have been used for detecting endpoint compromise, and protocols have been proposed to detect cloned RFID/smart cards. Contrary to these existing approaches, for which the designs are interwoven with domain-specific considerations and which usually do not enable fully automatic response (i.e., they need human assessment), our approach shows where automatic action is possible. Our results unify, provide design rationales, and suggest improvements for the existing domain-specific solutions.
Based on our analysis, we construct several mechanisms for the detection of compromised secrets. Our mechanisms enable automatic response, such as revoking keys or shutting down services, thereby substantially limiting the impact of a compromise.
In several case studies, we show how our mechanisms can be used to substantially increase the security guarantees of a wide range of systems, such as web logins, payment systems, or electronic door locks. For example, we propose and formally verify an improved version of Cloudflare's Keyless SSL protocol that enables key compromise detection.
PICS: Private Intersection over Committed (and reusable) Sets
Private Set Intersection (PSI) enables two parties to compute the intersection of their private sets without revealing any additional information. While maliciously secure PSI protocols prevent many attacks, adversaries can still exploit them by using inconsistent inputs across multiple sessions. This limitation stems from the definition of malicious security in secure multiparty computation, but is particularly problematic in PSI because: (1) real-world applications---such as Apple’s PSI protocol for CSAM detection and private contact discovery in messaging apps---often require multiple PSI executions over consistent inputs, and (2) the PSI functionality makes it relatively easy for adversaries to infer additional information.
We propose Private Intersection over Committed Sets (PICS), a new framework that enforces input consistency across multiple sessions via committed sets. Building on the state-of-the-art maliciously secure PSI framework (i.e., VOLE-PSI [EUROCRYPT 2021]), we present an efficient instantiation of PICS using lightweight cryptographic tools. Our protocol achieves strong receiver-side input consistency (i.e., the receiver uses the exact committed set) and weak sender-side input consistency (i.e., the sender cannot inject new elements into the committed set but can potentially use a subset of it). We implement our protocol to demonstrate concrete efficiency. Compared to VOLE-PSI, our communication overhead is a small constant between $1.57\times$ and $2.04\times$ for set sizes between $2^{16}$ and $2^{24}$, and the total end-to-end running time overhead is $1.22$-$1.98\times$ across various network settings.
$\textbf{Note:}$ The previous version of this paper had a soundness issue in the way we checked the consistency of the sender’s input. This revised draft presents a much simpler and cleaner approach to ensuring input consistency for the sender.
IND-CCA Lattice Threshold KEM under 30 KiB
At Asiacrypt'25, Lapiha and Prest proposed a lattice-based IND-CCA threshold key-encapsulation mechanism (TKEM) obtained from a threshold identity-based encryption (TIBE) scheme and a signature scheme. Their construction relies on a variant of the Boneh-Canetti-Halevi-Katz (BCHK) transform, instantiated with a lattice-based TIBE. However, it suffers from large ciphertexts of 540 KiB for $\kappa = 128$ bits of security.
We present substantial improvements to their TIBE, resulting in the first concretely efficient lattice-based IND-CCA TKEM, with ciphertexts just under 30 KiB for a threshold $T = 32$, $Q = 2^{45}$ queries, and the same $\kappa$.
Our design simplifies the original framework by leveraging the power of random oracles already present in their construction. We further enhance efficiency by adopting approximate computations where appropriate and by replacing module-NTRU trapdoors with NTRU trapdoors, achieving a remarkable eighteenfold reduction in ciphertext size. Finally, leveraging recent developments in secret sharing, we ensure the verifiability of key-extraction shares even in the presence of malicious parties.
One-Step Schnorr Threshold Identification
Threshold cryptographic primitives have not been widely adopted in real-world distributed systems (i.e., beyond the closed committee model), presumably due to state-synchronization overhead and complex certification processes for the shareholders. These are both aspects of their over-reliance on infrastructure, a rather strong assumption that is usually glossed over in their design. In this work, we propose $\textsf{OSST}$, a Schnorr-based real-time threshold identification protocol that achieves non-interactivity and non-reliance on public shares by means of direct proof interpolation. Given a Shamir $(n, t)$-shared secret $x$, the proposed scheme allows any $t^* \ge t$ (but no fewer) shareholders to prove over designated communication channels that their secret keys interpolate to $x$ without revealing any information beyond that. Provers do not engage in distributed computations, sending their packets to the verifier asynchronously; conversely, verifiers need only know the combined public key $y \equiv g ^ x$, without needing to pre-validate and register the individual member identities. The protocol is intended for use in permissionless or unmanaged meshes that lack both overlay networks and persistent trust infrastructure, a use case space that has been tacitly neglected as "niche" by the current mainstream. No auditable multi-key setup is required beyond distributing $x$ according to Shamir's secret sharing (or an equivalent distributed key generation scheme) and correctly advertising its public counterpart; in particular, the protocol is intended to be secure against impersonation attacks without relying on the consistency of any advertised shares. We provide evidence that this holds by giving a formal security proof in the random oracle model under the one-more discrete-logarithm ($\textsf{OMDL}$) hardness assumption.
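The interpolation identity underlying such schemes is worth spelling out: Shamir shares $x_i$ of $x$ recombine at zero via Lagrange coefficients $\lambda_i$, so $\sum_i \lambda_i x_i = x$ and hence $\prod_i (g^{x_i})^{\lambda_i} = g^x$. The following minimal Python sketch checks the scalar identity in a toy prime field (a real deployment would work in an elliptic-curve group); all parameters are illustrative.

```python
# Lagrange interpolation at zero for Shamir (n, t)-shares, toy prime field.
p = 2**127 - 1  # a Mersenne prime, standing in for a group order

def lagrange_at_zero(indices, p):
    """lambda_i = prod_{j != i} j / (j - i) mod p, for interpolation at 0."""
    lams = {}
    for i in indices:
        num, den = 1, 1
        for j in indices:
            if j != i:
                num = num * j % p
                den = den * (j - i) % p
        lams[i] = num * pow(den, -1, p) % p
    return lams

# Shamir-share x with threshold t = 3: f(z) = x + a1*z + a2*z^2 mod p
x, a1, a2 = 123456789, 987654321, 555555555
shares = {i: (x + a1 * i + a2 * i * i) % p for i in (1, 2, 5)}
lams = lagrange_at_zero(shares.keys(), p)
assert sum(lams[i] * shares[i] for i in shares) % p == x  # recombines to x
```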
Hardware-Friendly Robust Threshold ECDSA in an Asymmetric Model
We propose Asymmetric Robust Threshold ECDSA (ART-ECDSA), a robust and hardware-friendly threshold ECDSA protocol designed for asymmetric settings where one participant is a resource-constrained hardware device. The scheme achieves full robustness and cheater identification while minimizing the computational and communication burden on the hardware signer. Our design leverages Castagnos–Laguillaumie (CL) homomorphic encryption to replace Paillier-based operations and remove costly range proofs, yielding compact ciphertexts and simple zero-knowledge proofs. All heavy multiparty computations, including multiplicative-to-additive (MtA) conversions and distributed randomness generation, are offloaded to online cosigners, allowing the hardware party to remain lightweight. ART-ECDSA provides an efficient asymmetric signing protocol with formal security proofs in the UC framework, achieving both robustness and hardware efficiency within a single design.
Our implementation on an ARM Cortex-M7 microcontroller (400 MHz, 3 MB Flash, 2 MB SRAM) shows that the hardware party performs only lightweight computation (50 ms in presigning and ≤ 10 s in signing) and transmits about 300 bytes and 3 KB in the respective phases, which easily fits within the bandwidth limits of BLE and NFC. These results demonstrate that ART-ECDSA is practical for cold-storage and embedded hardware environments without compromising security.
Noisette: Certifying Differential Privacy Mechanisms Efficiently
Differential privacy (DP) has emerged as a rigorous framework for privacy-preserving data analysis, with widespread deployment in industry and government. Yet existing implementations typically assume that the party applying the mechanism can be trusted to sample noise correctly. This trust assumption is overly optimistic: a malicious party may deviate from the protocol to gain accuracy or avoid scrutiny, thereby undermining users’ privacy guarantees.
In this paper, we introduce Noisette, a family of efficient protocols for certifying DP noise sampling across both discrete and continuous settings. We design a protocol that supports any discrete distribution through certifiable lookup table evaluation, and introduce a staircase-based optimization that greatly improves efficiency without compromising privacy or utility. We further extend this framework to continuous mechanisms, providing the first efficient protocol for certifiable continuous noise sampling.
We demonstrate the practicality of our protocols through concrete DP applications, including mean estimation and federated learning. Our protocols outperform the prior state-of-the-art by up to $64\times$ in runtime and $24\times$ in communication, while preserving the same accuracy as uncertified DP mechanisms. These results establish Noisette as the first efficient, scalable, and general-purpose solution for certifiable DP noise sampling, making certified privacy guarantees practical in high-stakes applications.
Learning from Leakage: Database Reconstruction from Just a Few Multidimensional Range Queries
Searchable Encryption (SE) has shown a lot of promise towards enabling secure and efficient queries over encrypted data. In order to achieve this efficiency, SE inevitably leaks some information, and a big open question is how dangerous this leakage is. While prior reconstruction attacks have demonstrated effectiveness in one-dimensional settings, extending them to high-dimensional datasets remains challenging. Existing methods either demand excessive query information (e.g., an attacker that has observed all possible responses) or produce low-quality reconstructions in sparse databases. In this work, we present REMIN, a new leakage-abuse attack against SE schemes in multi-dimensional settings, based on access and search pattern leakage from range queries. Our approach leverages unsupervised representation learning to transform query co-occurrence frequencies into geometric signals, allowing the attacker to infer relative spatial relationships between records. This enables accurate and scalable reconstruction of high-dimensional datasets under minimal leakage. We begin with a passive adversary that persistently observes all encrypted queries and responses, and later extend our analysis to a more active attacker capable of poisoning the dataset. Furthermore, we introduce REMIN-P, a practical variant of the attack that incorporates a poisoning strategy. By injecting a small number of auxiliary anchor points, REMIN-P significantly improves reconstruction quality, particularly in sparse or boundary regions. We evaluate our attacks extensively on both synthetic and real-world structured datasets. Compared to state-of-the-art reconstruction attacks, our reconstruction attack achieves up to a 50% reduction in mean squared error (MSE), all while maintaining fast and scalable runtime. Our poisoning attack can further reduce MSE by an additional 50% on average, depending on the poisoning strategy.
Stealth and Beyond: Attribute-Driven Accountability in Bitcoin Transactions
Bitcoin enables decentralized, pseudonymous transactions, but balancing privacy with accountability remains a challenge. This paper introduces a novel dual accountability mechanism that enforces both sender and recipient compliance in Bitcoin transactions. Senders are restricted to spending Unspent Transaction Outputs (UTXOs) that meet specific criteria, while recipients must satisfy legal and ethical requirements before receiving funds. We enhance stealth addresses by integrating compliance attributes, preserving privacy while ensuring policy adherence. Our solution introduces a new cryptographic primitive, Identity-Based Matchmaking Stealth Signatures (IB-MSS), which supports streamlined auditing. Our approach is fully compatible with existing Bitcoin infrastructure and does not require changes to the core protocol, preserving both privacy and decentralization while enabling transaction auditing and compliance.
Streaming Function Secret Sharing and Its Applications
Collecting statistics from users of software and online services is crucial to improve service quality, yet obtaining such insights while preserving individual privacy remains a challenge. Function secret sharing (FSS) is a promising tool for this problem. However, FSS-based solutions still face several challenges for streaming analytics, where messages are continuously sent, and secure computation tasks are repeatedly performed over incoming messages.
We introduce a new cryptographic primitive called streaming function secret sharing (SFSS), a new variant of FSS that is particularly suitable for secure computation over streaming messages. We formalize SFSS and propose concrete constructions, including SFSS for point functions, predicate functions, and feasibility results for generic functions. SFSS powers several promising applications in a simple and modular fashion, including conditional transciphering, policy-hiding aggregation, and attribute-hiding aggregation. In particular, our SFSS formalization and constructions identify security flaws and efficiency bottlenecks in existing solutions, and SFSS-powered solutions achieve the expected security goal with asymptotically and concretely better efficiency and/or enhanced functionality.
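As background for what FSS-style primitives provide, the naive baseline they compress is easy to write down: additively share the entire truth table of a point function $f_{\alpha,\beta}$ between two servers, each answering queries by local lookup. The sketch below shows this baseline with toy parameters of our choosing; actual (S)FSS keys are exponentially smaller, which is the entire point of the primitive.

```python
# Naive two-party "FSS" for a point function f_{alpha,beta}: share the full
# truth table additively. Each server's table alone looks uniformly random,
# and the shares sum to beta at alpha and to 0 elsewhere.
import secrets

def share_point_function(n, alpha, beta, mod=2**32):
    table0 = [secrets.randbelow(mod) for _ in range(2**n)]
    table1 = [(-t) % mod for t in table0]
    table1[alpha] = (table1[alpha] + beta) % mod  # embed the single point
    return table0, table1  # key size 2^n words each; FSS shrinks this

n, alpha, beta = 8, 37, 5
k0, k1 = share_point_function(n, alpha, beta)
for x in (36, 37, 38):
    print(x, (k0[x] + k1[x]) % 2**32)  # -> 0, 5, 0
```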
Optimized Implementation of ML-KEM on ARMv9-A with SVE2 and SME
As quantum computing continues to advance, traditional public-key cryptosystems face increasing vulnerability, necessitating a global transition toward post-quantum cryptography (PQC). A primary challenge for both cryptographers and system architects is the efficient integration of PQC into high-performance computing platforms. ARM, a dominant processor architecture, has recently introduced ARMv9-A to accelerate modern workloads such as artificial intelligence and cloud computing. Leveraging its Scalable Vector Extension 2 (SVE2) and Scalable Matrix Extension (SME), ARMv9-A provides sophisticated hardware support for high-performance computing. This architectural evolution motivates the need for efficient implementations of PQC schemes on the new architecture. In this work, we present a highly optimized implementation of ML-KEM, the post-quantum key encapsulation mechanism (KEM) standardized by NIST as FIPS 203, on the ARMv9-A architecture. We redesign the polynomial computation pipeline to achieve deep alignment with the vector and matrix execution units. Our optimizations encompass refined modular arithmetic and highly vectorized polynomial operations. Specifically, we propose two NTT variants tailored to the architectural features of SVE2 and SME: the vector-based NTT (VecNTT) and the matrix-based NTT (MatNTT), which effectively utilize layer fusion and optimized data access patterns. Experimental results on the Apple M4 Pro processor demonstrate that VecNTT and MatNTT achieve performance improvements of up to $7.18\times$ and $7.77\times$, respectively, compared to the reference implementation. Furthermore, the matrix-vector polynomial multiplication, which is the primary computational bottleneck of ML-KEM, is accelerated by up to $5.27\times$. Our full ML-KEM implementation achieves a 52.47% to 60.09% speedup in key encapsulation across all security levels. To the best of our knowledge, this is the first work to implement and evaluate ML-KEM leveraging SVE2 and SME on real ARMv9-A hardware, providing a practical foundation for future PQC deployments on next-generation ARM platforms.
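As a reference point for the transform being vectorized above, here is a deliberately naive number-theoretic transform in Python with toy parameters of our choosing ($q = 257$, $n = 16$; 3 is a primitive root mod 257). ML-KEM itself uses $q = 3329$ with a seven-layer incomplete NTT, which this quadratic sketch does not replicate.

```python
# Naive O(n^2) NTT and inverse over Z_q, as a readable reference for the
# vectorized VecNTT/MatNTT kernels discussed above.
q, n = 257, 16
root = pow(3, (q - 1) // n, q)  # primitive n-th root of unity mod q

def naive_ntt(a):
    """Evaluate polynomial a at the n powers of root."""
    return [sum(a[j] * pow(root, j * k, q) for j in range(n)) % q
            for k in range(n)]

def naive_intt(A):
    inv_n, inv_root = pow(n, -1, q), pow(root, -1, q)
    return [inv_n * sum(A[k] * pow(inv_root, j * k, q) for k in range(n)) % q
            for j in range(n)]

a = list(range(n))
assert naive_intt(naive_ntt(a)) == a  # round-trip sanity check
```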
Integrity from Algebraic Manipulation Detection in Trusted-Repeater QKD Networks
Quantum Key Distribution (QKD) allows secure communication without relying on computational assumptions, but can currently only be deployed over relatively short distances due to hardware constraints. To extend QKD over long distances, networks of trusted repeater nodes can be used, wherein QKD is executed between neighbouring nodes and messages between non-neighbouring nodes are forwarded using a relay protocol. Although these networks are being deployed worldwide, no protocol exists that provides provable guarantees of integrity against manipulation from both external adversaries and corrupted intermediates. In this work, we present the first protocol that provably provides both confidentiality and integrity. Our protocol combines an existing cryptographic technique, Algebraic Manipulation Detection (AMD) codes, with multi-path relaying over trusted repeater networks. The protocol achieves information-theoretic security (ITS) in detecting manipulation, which we prove formally through a sequence of games.
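For readers unfamiliar with AMD codes, the single-block construction of Cramer et al. (EUROCRYPT 2008) is short enough to sketch: encode $s \in \mathbb{F}_p$ as $(s, x, x^3 + s\cdot x)$ for random $x$, so that any fixed additive offset applied to the codeword passes verification with probability at most about $2/p$. The Python sketch below uses a toy field size of our choosing and is not the paper's relay protocol.

```python
# Single-block AMD code of Cramer et al.: E(s) = (s, x, x^3 + s*x) over F_p.
# An oblivious manipulator adding a fixed offset (ds, dx, df) passes
# verification only if a nonzero polynomial of degree <= 2 vanishes at the
# random x, i.e., with probability <= 2/p.
import secrets

p = 2**61 - 1  # toy Mersenne-prime field

def amd_encode(s):
    x = secrets.randbelow(p)
    return (s, x, (pow(x, 3, p) + s * x) % p)

def amd_verify(s, x, f):
    return (pow(x, 3, p) + s * x) % p == f

c = amd_encode(42)
assert amd_verify(*c)
tampered = ((c[0] + 7) % p, c[1], c[2])  # additive tampering of the message
assert not amd_verify(*tampered)         # caught, except with prob. ~1/p
```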
Side-Channel and Fault Injection Attacks on VOLEitH Signature Schemes: A Case Study of Masked FAEST
Ongoing efforts to transition to post-quantum public-key cryptosystems have created the need for algorithms with a variety of performance characteristics and security assumptions.
Among the candidates in NIST's post-quantum standardisation process for additional digital signatures is FAEST, a Vector Oblivious Linear Evaluation in-the-Head (VOLEitH)-based scheme, whose security relies on the one-wayness of the Advanced Encryption Standard (AES).
The VOLEitH paradigm enables competitive performance and signature sizes under conservative security assumptions.
However, since it was introduced recently, in 2023, its resistance to physical attacks has not yet been analysed. In this paper, we present the first security analysis of VOLEitH-based signature schemes in the context of side-channel and fault injection attacks. We demonstrate four practical attacks on a masked implementation of FAEST on ARM Cortex-M4 capable of recovering the full secret key with high probability (greater than 0.87) from a single signature. These attacks exploit vulnerabilities of components specific to VOLEitH schemes and FAEST, such as the parallel all-but-one vector commitments, the VOLE generation, and the AES proof generation. Finally, we propose countermeasures to mitigate these attacks and enhance the physical security of VOLEitH-based signature schemes.
SNARGs for NP via Fiat--Shamir in the Plain Model
We consider constructions of succinct non-interactive arguments (SNARGs) for NP in the standard model. Specifically, we revisit the seminal Micali transformation (applying Fiat-Shamir to Kilian's protocol), which has traditionally only been analyzed in the random oracle model.
We show that the Micali framework can be successfully instantiated in the standard model by leveraging a new interaction between two primitives: a PCP satisfying a property we term shadow soundness, and a vector commitment scheme satisfying function statistical binding.
We prove a general theorem stating that any language admitting a suitable shadow PCP combined with a compatible vector commitment yields a secure SNARG. We instantiate this paradigm using sub-exponential indistinguishability obfuscation (iO) and sub-exponential learning with errors (LWE) to obtain a SNARG for all of NP.
Our result serves as the first concrete validation of the Micali blueprint, and in particular of the Fiat-Shamir transformation, in the standard model. As a corollary, we refute "universal" attacks on the Micali framework by demonstrating that there exist concrete instantiations of the underlying components for which the transformation is sound.
MALeak: Blind Side-Channel Key Recovery Exploiting Modular Addition Leakage in ARX-based Block Ciphers
Side-channel analysis (SCA) is a powerful attack that can recover secret keys by exploiting physical leakages emitted during cryptographic computations. However, most existing approaches assume that an attacker knows the plaintext or ciphertext corresponding to each observed leakage trace. In realistic adversarial settings, the input data corresponding to each leakage trace may be unknown or unavailable. To address this limitation, blind side-channel analysis (blind SCA) aims to recover secret keys using only side-channel traces, without access to plaintext or ciphertext information. Despite this goal, prior blind-SCA studies have largely focused on S-box-induced nonlinearity, leaving other nonlinear operations less explored. In this paper, we present the first systematic formulation of a blind SCA scenario targeting modular addition, the core nonlinear operation in ARX-based block ciphers. We define the analysis point using a generalized nonlinear function that integrates both the secret key and the modular addition operation. We then observe the feasibility of key recovery through simulation and evaluate robustness under various noise conditions. Building on this formulation, we instantiate the generalized model for concrete ARX-based block ciphers. In particular, we adapt it to the round function structures of HIGHT and SPECK, and derive practical blind SCA procedures tailored to each cipher. Finally, we evaluate our approach in both simulation and real-world settings, using power consumption traces collected from an ARM Cortex-M4 MCU (STM32F415) for the real-world experiments. Our results demonstrate that, even without plaintext or ciphertext information, the proposed approach can meaningfully reduce key candidates and achieve successful key recovery for ARX-based block ciphers.
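To fix intuitions about the setting, the snippet below simulates the kind of traces such an attacker works with: Hamming-weight leakage of a 16-bit modular addition plus Gaussian noise, with the inputs $x$ never revealed. The word size, key, and noise level are illustrative choices, not the paper's experimental parameters.

```python
# Toy blind-SCA leakage simulation: the attacker observes only noisy
# HW((x + k) mod 2^16) values, never the inputs x themselves.
import numpy as np

rng = np.random.default_rng(0)
k = 0xBEEF                                  # fixed unknown 16-bit key
x = rng.integers(0, 2**16, size=100_000)    # secret, unobserved inputs
hw = np.array([bin(int(v)).count("1") for v in (x + k) % 2**16])
traces = hw + rng.normal(0.0, 1.0, size=hw.shape)  # what the attacker sees
print(f"mean={traces.mean():.3f}, std={traces.std():.3f}")
```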
Complete Characterization of Randomness Extraction from DAG-Correlated Sources
We introduce the SHEDAG (Somewhere Honest Entropic sources over Directed Acyclic Graphs) source model, a general model for multi-block randomness sources with causal correlations.
A SHEDAG source is defined over a directed acyclic graph (DAG) $G$ whose nodes output $n$-bit blocks. Blocks output by honest nodes are independent (by default uniformly random, more generally having high min-entropy), while blocks output by corrupted nodes are arbitrary functions of their causal views (all predecessors in $G$).
We tightly characterize the conditions under which randomness extraction from SHEDAG sources is possible.
$\textbf{Zero-error extraction:}$ We show that perfect extraction from SHEDAG sources with $t$ corruptions is possible if and only if $G$ contains an "unrelated set" (an antichain under reachability) of size at least $t+1$; a toy sketch of the natural XOR extractor over such a set appears after this abstract. Conversely, if every unrelated set has size at most $t$, we show that no function can output a perfectly uniform bit. We also provide a polynomial-time algorithm to find a maximum unrelated set, thus efficiently identifying the largest corruption threshold $t$ allowing perfect extraction.
$\textbf{Negligible-error extraction:}$
We identify a quantity that we call "resilience" of a DAG $G$, denoted $\text{res}(G)$, that characterizes the possibility of randomness extraction with negligible error (in the block length).
We show that negligible-error extraction is impossible whenever $t>\text{res}(G)$, and, to complement this, for every $t\leq \text{res}(G)$ we construct explicit extractors with polynomial output length and negligible error.
Our results generalize prior online source models studied by (Aggarwal, Obremski, Ribeiro, Siniscalchi, Visconti, Eurocrypt 2020) and (Chattopadhyay, Gurumukhani, Ringach, FOCS 2024), which correspond to the special case of a SHEDAG source whose DAG $G$ is a path.
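As promised above, here is a minimal Python sketch of the natural XOR extractor over an unrelated set: if the set is an antichain under reachability and contains at least one honest node, the honest block is independent of the causal views of the rest, so the XOR of the blocks is uniform. The toy DAG and block values are our own, and the paper's actual zero-error construction may differ in its details.

```python
# XOR extraction over an "unrelated set" of a toy DAG.
def reaches(adj, u, v):
    """DFS reachability u -> v in a DAG given as an adjacency dict."""
    stack, seen = [u], set()
    while stack:
        w = stack.pop()
        if w == v:
            return True
        if w not in seen:
            seen.add(w)
            stack.extend(adj.get(w, ()))
    return False

def is_unrelated(adj, nodes):
    return all(u == v or not reaches(adj, u, v) for u in nodes for v in nodes)

adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}  # toy DAG
blocks = {"b": 0b1011, "c": 0b0110}              # 4-bit blocks of b and c
assert is_unrelated(adj, {"b", "c"})             # b, c form an antichain
out = 0
for node in ("b", "c"):
    out ^= blocks[node]                          # extractor output
print(bin(out))                                  # 0b1101
```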
EWEMrl: A White-Box Secure Cipher with Longevity
We propose the first updatable white-box secure cipher, EWEMrl (Extended
WEM with longevity against non-adaptive read-only adversaries), and its natural
extension, EWEMxl (Extended WEM with longevity against executable adversaries),
both based on WEM (White-box Even-Mansour), and both achieving longevity against
non-adaptive read-only malware. The notion of longevity, introduced by Koike et
al., addresses continuous code leakage and is stronger than incompressibility. While
Yoroi claimed longevity, but was broken by Isobe and Todo. Given the prevalence
of continuous leakage, developing such ciphers is crucial in white-box cryptography.
Precisely, we have the following.
• We first present EWEMr (Extended WEM against non-adaptive read-only adver-
saries), a generalization of WEM (White-box Even-Mansour). WEM is the first
(and possibly only) white-box cipher based on Even-Mansour (EM), replacing its
key addition layer with a secret Sbox. EWEMr achieves a high space-hardness
bound in the non-adaptive model, with a new generic proof strategy, but does
not provide longevity. Instead, it serves as the base for EWEMrl.
• We also present EWEMx (Extended WEM against executable adversaries), which
uses EWEMr as subroutines and achieves a high space-hardness bound in the
stronger adaptive model. While EWEMx does not achieve longevity, it is the
base design for EWEMxl.
• We next propose EWEMrl, that achieves longevity against non-adaptive read-only
malware. None of the existing ciphers, such as SPNbox and SPACE, are designed
for longevity. We show that EWEMrl ensures (against non-adaptive read-only
adversaries) (1) longevity, (2) high space-hardness in both known-space and
chosen-space settings, and (3) security against hybrid code-lifting attacks.
• Finally, we introduce EWEMxl, a natural extension of EWEMrl with a structure
similar to EWEMx. EWEMxl achieves (2) and (3) in the stronger adaptive model
while maintaining (1) in the same non-adaptive and read-only setting.
In summary, our proposals EWEMrl and EWEMxl provide longevity against non-
adaptive read-only malware while ensuring security confidence in the black-box
setting.
Round-Optimal Pairing-Free Blind Signatures
We present the first practical, round-optimal blind signatures in pairing-free groups.
We build on the Fischlin paradigm (EUROCRYPT 2007) where a first signature is computed on a commitment to the message and the final signature is a zero-knowledge proof of the first signature.
We use the Nyberg-Rueppel signature scheme (CCS 1993) as the basis; it is a well-studied scheme whose verification equation is sufficiently algebraic to allow efficient proofs that do not need to make non-black-box use of a random oracle.
Our construction offers flexibility for trade-offs between underlying assumptions and supports issuance of signatures on vectors of attributes, making it suitable for use in anonymous credential systems.
As a building block, we show how existing NIZKs can be modified to allow for straight-line extraction.
We implement variants of our construction to demonstrate its practicality, varying the choice of elliptic curve and the proof system used to compute the NIZK.
With conservative parameters (NIST P-256 and SHA-256) and targeting short proofs, signatures are 1349 bytes long, and on a typical laptop they can be generated in under 500 ms and verified in under 100 ms.
On the Impossibility of Round-Optimal Pairing-Free Blind Signatures in the ROM
Blind signatures play a central role in cryptographic protocols for privacy-preserving authentication and have attracted substantial attention in both theory and practice. A major line of research, dating back to the 1990s, has focused on constructing blind signatures from pairing-free groups. However, all known constructions in this setting require at least three moves of interaction between the signer and the user. These schemes treat the underlying group as a black box and rely on the random oracle in their security proofs. While computationally efficient, they suffer from the drawback that the signer must maintain state during a signing session. In contrast, round-optimal solutions are known under other assumptions and structures (e.g., RSA, lattices, and pairings), or via generic transformations such as Fischlin's method (CRYPTO '06), which employ non-black-box techniques.
This paper investigates whether the three-round barrier for pairing-free groups is inherent. We provide the first negative evidence by proving that, in a model combining the Random Oracle Model (ROM) with Maurer’s Generic Group Model, no blind signature scheme can be secure if it signs sufficiently long messages while making at most a logarithmic number of random oracle queries. Our lower-bound techniques are novel in that they address the interaction of both models (generic groups and random oracles) simultaneously.
The Billion Dollar Merkle Tree
The Plonky3 Merkle tree implementation has become one of the most widely deployed Merkle tree constructions due to its high efficiency, and—through its integration into numerous succinct-argument systems—it currently helps secure an estimated \$4 billion in assets. Somewhat paradoxically, however, the underlying 2-to-1 compression function is not collision-resistant, nor even one-way, which at first glance appears to undermine the security of the entire Merkle tree. The prevailing ad-hoc countermeasure is to pre-hash data before using them as leaves in this otherwise insecure Merkle tree.
In this work, we provide the first rigorous security analysis of this Merkle tree design and show that the Plonky3 approach is, in fact, sound. Concretely, we show (strong) position-binding and extractability.
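The leaf pre-hashing countermeasure analysed above is simple to illustrate. In the sketch below, SHA-256 stands in for Plonky3's permutation-based 2-to-1 compression (an assumption for readability, since the point is the structure, not the hash), and the number of leaves is assumed to be a power of two.

```python
# Merkle root with leaf pre-hashing: raw data is hashed once before entering
# the tree, and a 2-to-1 compression is used for internal nodes.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def compress(left: bytes, right: bytes) -> bytes:
    # stand-in for the 2-to-1 compression function; here simply the hash
    # of the concatenation
    return h(left + right)

def merkle_root(raw_leaves):
    level = [h(leaf) for leaf in raw_leaves]  # the pre-hashing step
    while len(level) > 1:
        level = [compress(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

print(merkle_root([b"tx0", b"tx1", b"tx2", b"tx3"]).hex())
```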
BLISK: Boolean circuit Logic Integrated into the Single Key
This paper introduces BLISK, a framework that compiles a monotone Boolean authorization policy into a single signature verification key, enabling only authorized signer subsets to produce standard constant-size aggregated signatures. BLISK combines (1) $n$-of-$n$ multisignatures to realize conjunctions, (2) key agreement protocols to realize disjunctions, and (3) verifiable group operations (for instance, based on the 0-ART framework). BLISK avoids distributed key generation (allowing users to reuse their long-term keys), supports publicly verifiable policy compilation, and enables non-interactive key rotation.
Improving ML-KEM and ML-DSA on OpenTitan - Efficient Multiplication Vector Instructions for OTBN
This work improves upon the instruction set extension proposed in the paper "Towards ML-KEM and ML-DSA on OpenTitan", OTBNTW for short, for OpenTitan's big number coprocessor OTBN. OTBNTW introduces a dedicated vector instruction for prime-field Montgomery multiplication, with a high multi-cycle latency and a relatively low utilization of the underlying integer multiplication unit. The design targets the post-quantum cryptographic schemes ML-KEM and ML-DSA, which rely on 12-bit and 23-bit prime-field arithmetic, respectively. We improve the efficiency of the Montgomery multiplication by fully exploiting existing integer multiplication resources, and we move modular multiplication from hardware back to software by providing more powerful and versatile integer-multiplication vector instructions. This enables us not only to reduce the overall computational overhead through lazy reduction in software but also to improve performance in other functions beyond finite-field arithmetic. We provide two variants of our instruction set extension, each offering a different trade-off between resource usage and performance. For ML-KEM and ML-DSA, we achieve a speedup of up to 17% in cycle count, with an ASIC area increase of up to 6% and an FPGA resource usage increase of up to 4% more LUTs, 20% more CARRY4, and 1% more FFs, with the same number of DSPs, compared to OTBNTW. Overall, we significantly reduce the ASIC time-area product if the designs are clocked at their individual maximum frequencies, and at least match that of OTBNTW if the designs are clocked at the same frequency.
Augmenting BBS with Conventional Signatures
Anonymous credential schemes such as BBS face a significant deployment barrier: currently available secure hardware such as HSMs required for eIDAS Level of Assurance High does not yet support BBS signatures or pairing-friendly curves. We address this challenge by augmenting BBS credentials with a conventional signature (such as ECDSA), where the issuer additionally signs part of the BBS signature using a conventional signature private key that can be secured in widely available HSMs. While disclosing the extra signature breaks unlinkability, we argue this is acceptable for high-assurance use cases where disclosed attributes already uniquely identify the user. For use cases not requiring this additional security, the conventional signature can be omitted to preserve BBS unlinkability. We prove that augmented BBS credentials are existentially unforgeable under chosen message attacks, with security depending solely on the conventional signature private key rather than the BBS private key. This approach provides a practical migration path to full BBS deployment while (apart from unlinkability) maintaining several key BBS advantages.
2PC Memory-Manipulating Programs with Constant Overhead
General-purpose secure multiparty computation (MPC) remains bottlenecked in large part by a lack of efficient techniques for handling memory access. We demonstrate a remarkably simple and efficient 2PC instantiation of random access memory (RAM), based on distributed point functions (DPFs, Gilboa and Ishai, Eurocrypt'14). Our semi-honest 2PC protocol can be achieved from oblivious transfer (OT) and a black-box pseudorandom generator (PRG).
For a memory that stores large enough data words, our 2PC RAM incurs constant communication overhead per access. Like prior works that leverage DPFs to achieve memory access, our work incurs linear computation per access, but our per-access communication is lean.
Our 2PC RAM is built on top of an obliviousness-friendly model of computation called the single access machine model (SAM, Appan et al., CCS'24). In the SAM model, each memory slot can be read at most once. We present a simple 2PC SAM protocol, where each single-access memory operation incurs at most $2w + O(\lambda \lg n)$ bits of communication, where $w$ is the word size, $n$ is the number of memory words, and $\lambda$ is a security parameter. Of this cost, only $2w + 2\lg n$ bits are incurred in the online phase.
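As a concrete illustration, with parameter values of our choosing: for word size $w = 128$ bits, $n = 2^{20}$ memory words, and security parameter $\lambda = 128$, the online phase costs only $2w + 2\lg n = 256 + 40 = 296$ bits per single-access operation, independent of $\lambda$.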
Our RAM operations are (non-cryptographically) compiled to SAM operations. At most a logarithmic number of SAM operations are needed per RAM operation; if word size is large, even fewer SAM operations are required. Alternatively, there are now many oblivious algorithms that compile directly to SAM more efficiently than via a compilation to RAM, and our 2PC SAM can instantiate these algorithms. As one example, we can use our 2PC SAM to implement privacy-preserving graph traversal (DFS or BFS) over a secret-shared size-$n$ graph while revealing nothing beyond the runtime of the SAM program. Our construction achieves online communication $O(n \lg n)$ bits, asymptotically matching the number of bits touched in a corresponding cleartext graph traversal.
Key-Updatable Identity-Based Signature Schemes
Identity-based signature (IBS) schemes eliminate the need for certificate management, thereby reducing communication and computational overhead. A major challenge, however, is the efficient update or revocation of compromised keys, as existing approaches such as revocation lists or periodic key renewal incur significant network costs in dynamic settings. We address this challenge by introducing a symmetric element that enables key updates in IBS schemes through a single multicast message. Our approach achieves logarithmic network overhead in the number of keys, with constant computation and memory costs. We further propose a general framework that transforms any IBS scheme into a key-updatable IBS scheme ($\mathsf{KUSS}$), and formalize the associated security requirements, including token security, forward security, and post-compromise security. The versatility of our framework is demonstrated through five instantiations based on Schnorr-type, pairing-based, and isogeny-based IBS, and we provide a detailed security analysis.
Device-Bound Anonymous Credentials With(out) Trusted Hardware
Anonymous credentials enable unlinkable and privacy-preserving user authentication. To ensure non-transferability of credentials among corrupt users, they can additionally be device-bound. Therein, a credential is tied to a key protected by a secure element (SE), usually a hardware component, and any presentation of the credential requires a fresh contribution of the SE. Interestingly, despite being a fundamental aspect of user credentials, device binding for anonymous credentials is relatively unstudied. Existing constructions either require multiple calls to the SE, or need the SE to keep a credential-specific state -- violating core design principles of shielded SEs. Further, constructions that are compatible with the most mature credential scheme BBS rely on the honesty of the SE for privacy, which is hard to vet given that SEs are black-box components.
In this work, we thoroughly study and solve the problem of device-bound anonymous credentials (DBACs). We model DBACs to ensure the unforgeability and non-transferability of credentials, and to guarantee user privacy at the same time. Our definitions cover a range of SE trust levels, including the case of a subverted or fully corrupted SE. We also define blind DBACs, in which the SE learns nothing about the credential presentations it helped compute. This targets the design of a remote, cloud-based SE which is a deployment model considered for the EU Digital Identity (EUDI) wallet to address the fact that most user phones are not equipped with a sufficiently secure SE. Finally, we present three simple and round-optimal constructions for device binding of BBS credentials, and prove their security in the AGM+ROM and privacy unconditionally. The SE therein is extremely lightweight: it only has to compute a BLS or Schnorr signature in a single call. We also give the BLS-based construction in a blind variant, yielding the first protocol that enables privacy-preserving device binding for anonymous credentials when being used with a remote SE.
Beyond-Birthday-Bound Security with HCTR2: Cascaded Construction and Tweak-based Key Derivation
The block cipher (BC) mode for realizing a variable-input-length strong tweakable pseudorandom permutation (VIL-STPRP), also known as the accordion mode, is a rapidly growing research field driven by NIST's standardization project, which considers AES as a primitive. Widely used VIL-STPRP modes, such as HCTR2, have birthday-bound security and provide only 64-bit security with AES. To provide higher security, NIST is considering two directions: to develop new modes with beyond-birthday-bound (BBB) security and to use Rijndael-256-256 with HCTR2. This paper pursues the first direction while maintaining compatibility with HCTR2. In particular, we provide two solutions to achieve BBB security for two different approaches: (i) general cases without any conditions on the tweak and (ii) under the condition that the same tweak is not repeated too often, as adopted in bbb-ddd-AES, recently presented at Eurocrypt 2025. For the first approach, we propose a new mode, CHCTR, that iterates HCTR2 with two independent keys, which achieves $2n/3$-bit security in the multi-user (mu) setting and satisfies NIST's requirements. For the second approach, we prove mu security of HCTR2, which allows us to apply the tweak-based key derivation (TwKD) to HCTR2 in a provable manner. When the number of BC calls processed by a single tweak is upper-bounded by $2^{n/3}$, HCTR2-TwKD achieves $2n/3$-bit mu security. By benchmarking optimized software implementations, we show that CHCTR with AES-256 outperforms HCTR2 with Rijndael-256-256 on all twelve processor models examined. Similarly, HCTR2-TwKD outperforms bbb-ddd-AES in general cases, and it is even comparable to bbb-ddd-AES rigorously optimized for tweak-repeating use cases using precomputation.
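As background on the tweak-based key derivation idea, here is a generic Python sketch of deriving a per-tweak key from a master key; HMAC-SHA256 stands in for the PRF and this is not the paper's exact TwKD construction for HCTR2.

```python
# Generic sketch of per-tweak key derivation (illustrative only; HMAC-SHA256
# is our stand-in PRF, not the paper's TwKD for HCTR2).
import hmac
import hashlib

def derive_tweak_key(master_key: bytes, tweak: bytes) -> bytes:
    """K_T = PRF(K, T): each distinct tweak yields an independent-looking key."""
    return hmac.new(master_key, tweak, hashlib.sha256).digest()

# The derived key K_T would then drive the VIL-STPRP for all data under tweak
# T; the security analysis caps the BC calls per tweak (2^{n/3} in the paper).
k_t = derive_tweak_key(b"\x00" * 32, b"sector-0042")
```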
Combined Indistinguishability Analysis - Verifying random probing leakage under random faults
Cryptographic hardware implementations are vulnerable to combined physical implementation attacks, integrating Side-Channel Analysis and Fault-Injection Analysis to compromise their security. Although theoretically sound countermeasures exist, their practical application is often complicated and error-prone, making automated security verification a necessity. Various tools have been developed to address this need, using different approaches to formally verify security, but they are limited in their ability to analyze complex hardware circuits in the context of Combined Analysis and advanced probabilistic adversary models.
In this work, we introduce a novel verification method that assesses the security of complex hardware circuits in the context of random probing with random faults, a scenario that more closely reflects real-world combined attack scenarios. Our approach centers around symbolic fault simulation and the derivation of a fault-enhanced leakage function using the Fourier-Hadamard Transform, enabling the computation of tight leakage probabilities for arbitrary circuits and providing a more accurate and comprehensive security analysis. By integrating our method into the INDIANA security verification framework, we extended its capabilities to analyze the leakage behavior of circuits in the presence of random faults, demonstrating the practicality of our approach.
The results of our evaluation highlight the versatility and scalability of our approach, which can efficiently compute leakage probabilities under various fault scenarios for large-scale attacks, e.g., for a masked round of the PRESENT cipher. Notably, our method can complete most experiments in less than an hour, demonstrating a significant improvement over existing estimation-based tools. This achievement confirms the potential of our approach to provide a more comprehensive and practically useful security assessment of hardware circuits, and marks an important step forward for the development of secure hardware systems.
Tag-Friendly Lattice Sampler and Applications
The NIST lattice-based cryptographic standards are set to be widely adopted, offering solutions to the most common cryptographic needs, namely key establishment and authentication (signatures). This has shifted attention to more advanced primitives such as threshold cryptography, as well as privacy-enhancing technologies, where the transition is expected to be more complex. This is particularly true in the context of post-quantum anonymous authentication, where existing mechanisms may not match the performance requirements of industrial applications. An important avenue for improving performance is the lattice sampler, which sits at the center of these mechanisms. Despite recent progress, prior samplers have neglected one component: the tag. The latter is not only necessary for security, but also impacts the efficiency of the subsequent constructions if not handled properly.
In this paper, we introduce a new sampler with enhanced tag management that retains the main features of current samplers and can thus be used as a plug-in replacement. It offers a sampling quality independent of the tag, allowing it to produce preimages that are both smaller and faster to generate than those from the very recent sampler of Jeudy and Sanders (Asiacrypt '25). Far from being anecdotal, plugging it into several advanced authentication mechanisms yields size improvements of up to 30%, while being 35% faster.
Argo MAC: Garbling with Elliptic Curve MACs
Off-chain cryptography enables more expressive smart contracts for Bitcoin. Recent work, including BitVM, uses SNARKs to prove arbitrary computation and garbled circuits to verifiably move proof verification off-chain. We define a new garbling primitive, Argo MAC, that enables over $1000\times$ more efficient garbled SNARK verifiers. Argo MAC efficiently translates an encoding of the bit decomposition of a curve point into a homomorphic MAC of that point. These homomorphic MACs enable much more efficient garbling. In subsequent work, we will describe how to use Argo MAC to construct garbled SNARK verifiers for pairing-based SNARKs.
Rank Syndrome Decoding Estimator - An Asymptotic and Concrete Analysis
The Rank Syndrome Decoding (RSD) problem forms the foundation of many post-quantum cryptographic schemes. Its inherent hardness, with best known algorithms for common parameter regimes running in time exponential in $n^2$ (for $n$ being the code length), enables compact parameter choices and efficient constructions. Several RSD-based submissions to the first NIST PQC process in 2017 were, however, invalidated by algebraic attacks, raising fundamental concerns about the security of RSD-based designs.
In this work, we revisit the parameters of prominent rank-based constructions and analyze the rationales that guided their selection, as well as their security against modern attacks. We provide a unified complexity analysis of all major RSD algorithms, including combinatorial, algebraic, and hybrid approaches, under a common cost model. All estimates are made publicly available through a dedicated open source module.
Furthermore, we present the first asymptotic analysis of these algorithms, yielding deep insights into the relations between different procedures. We show that all studied algorithms converge to one of three distinct asymptotic runtime exponents.
We then provide an asymptotic baseline in terms of the worst-case decoding exponent. In particular, we find that for an extension degree equal to the code length, the best known algorithms achieve a complexity of $2^{0.1481n^2 + o(n^2)}$, attained simultaneously by algebraic and combinatorial approaches. Overall, our results reinforce confidence in the RSD assumption and the design rationales of modern RSD-based schemes such as RYDE.
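To give a feel for the scale of this exponent (the parameter value is ours, for illustration): already at code length $n = 30$ with extension degree equal to $n$, the exponent evaluates to $0.1481 \cdot 30^2 \approx 133$, i.e., on the order of $2^{133}$ operations even before accounting for the $o(n^2)$ term.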
HYPERSHIELD: Protecting the Hypercube MPC-in-the-Head Framework Against Differential Probing Adversaries without Masking
Post-quantum secure digital signatures based on the MPC-in-the-Head (MPCitH) paradigm, a zero-knowledge (ZK) proof-based construction, are becoming increasingly popular due to their small public key size. However, the development of techniques for protecting MPCitH-based schemes against side-channel attacks remains slow, despite them being critical for real-world deployment.
In this work, we adapt the Hypercube-MPCitH framework, exploiting its native use of additive secret sharing to enable inherent protection against first- and higher-order differential power analysis (DPA). We first perform a sensitivity analysis of the Hypercube Syndrome Decoding in the Head (SDitH) digital signature scheme with respect to both simple and differential power analysis. Based on this insight into its side-channel sensitivity, we then propose a tweak to the signature scheme that increases its inherent resistance against DPA by design, eliminating the need to explicitly mask large parts of the signing procedure. More specifically, this is achieved through the novel (k+1)-Hypercube ZK Protocol: the proposed tweak increases the number of hidden shares an adversary must probe to recover the secret key from one to k+1, thus achieving inherent masking order k. Typically, increasing the number of hidden shares degrades the soundness of the zero-knowledge proof and consequently increases the signature size to a point where the scheme becomes of limited practical interest. To address this, we propose a technique that selects the hidden shares in a more structured and optimal fashion by exploiting the GGM tree structure in the Hypercube-MPCitH framework. As a result, the number of revealed seeds is reduced, resulting in a smaller signature size even compared to the original hypercube protocol.
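For context on the tree mechanism involved, the following minimal GGM seed tree in Python illustrates why the placement of hidden leaves governs how many seeds must be revealed; SHA-256 as the length-doubling PRG is our illustrative choice, not the scheme's.

```python
# Minimal GGM seed tree (illustrative; SHA-256 as length-doubling PRG is our
# choice). Revealing the co-path of a hidden leaf lets a verifier recompute
# every leaf except that one; more hidden leaves mean more revealed seeds
# unless their placement is chosen carefully.
import hashlib

def prg(seed: bytes) -> tuple[bytes, bytes]:
    h = hashlib.sha256(seed).digest()
    return hashlib.sha256(h + b"L").digest(), hashlib.sha256(h + b"R").digest()

def leaves(seed: bytes, depth: int) -> list[bytes]:
    level = [seed]
    for _ in range(depth):
        level = [child for s in level for child in prg(s)]
    return level

def copath(seed: bytes, depth: int, hidden: int) -> list[bytes]:
    """Seeds to reveal so every leaf except `hidden` can be recomputed."""
    out, node = [], seed
    for bit in format(hidden, f"0{depth}b"):
        left, right = prg(node)
        out.append(right if bit == "0" else left)   # reveal the sibling
        node = left if bit == "0" else right        # descend towards `hidden`
    return out

root = b"\x00" * 32
all_leaves = leaves(root, depth=4)           # 16 leaves
revealed = copath(root, depth=4, hidden=5)   # only 4 seeds revealed
```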
Finally, we implement and benchmark the proposed Hypercube-SDitH signature scheme, comparing it against the cost of traditional masking. We propose different parameter sets that explore a trade-off between computational overhead and signature size. For 3rd-order protection, our tweaked signature scheme only incurs a 35-50% overhead in computational cost, compared to an estimated overhead of 300% for a fully masked implementation, while the overhead in signature size stays relatively low (52%). Overall, we demonstrate that the proposed (k+1)-Hypercube ZK Protocol can be used to construct efficient, DPA-resistant MPCitH-based digital signatures.
Blind Adaptor Signatures, Revisited: Stronger Security Definitions and Their Construction toward Practical Applications
Although both blind signatures and adaptor signatures have individually attracted attention, there has been little research on combining these primitives. To the best of our knowledge, the only existing scheme is that of Qin et al. (S\&P 2023), and it does not consider practical security notions, namely full extractability, unlinkability, and pre-verify soundness, especially against adversaries with rich attack interfaces.
In this paper, we propose the first blind adaptor signature scheme that satisfies the above security definitions. We first formalize the security of a blind adaptor signature scheme, prove a relationship between our security definitions and the existing ones, and identify several gaps in the existing schemes as the underlying technical problem. Our main idea for overcoming this problem is to leverage relations that support random self-reducibility instead of the additional random numbers used in blind signatures. Such a construction can embed relations into the signature components by re-randomizing them with the relations, and hence satisfies all of the above security definitions. We then introduce new proof techniques that establish full extractability by leveraging unlinkability. We also discuss applications of the proposed scheme.
Computing \(2^a\)-isogenies in Legendre Form
We introduce a method for efficiently computing $2^a$-isogenies in Legendre form with applications in post-quantum cryptography. An example of a secure application is the Charles-Goren-Lauter (CGL) hash function, which recently saw significant improvement in complexity by Doliskani et al.
The majority of work on isogeny computation uses elliptic curves in Montgomery form; this includes the original work on SIDH by Jao, De Feo and Plût and the state-of-the-art implementation of SIKE. Elliptic curves in twisted Edwards form have also been used due to their efficient elliptic curve arithmetic, and complete Edwards curves have been used for the added security they provide against side-channel attacks. As far as we know, elliptic curves in Legendre form have not yet been explored for isogeny-based cryptography. Legendre form has the benefit of a very simple defining equation and the simplest possible representation of the $2$-torsion subgroup. In this work, we develop a new framework for constructing $2^a$-isogenies using elliptic curves in Legendre form, and in doing so we optimize Legendre curve arithmetic and $2$-isogeny computations on Legendre curves by avoiding any square-root computations.
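For intuition (these are standard facts about the Legendre model, not claims beyond what the abstract states): a Legendre curve is $E_\lambda : y^2 = x(x-1)(x-\lambda)$, and since the right-hand side is already factored, the full 2-torsion can be read off directly as $E_\lambda[2] = \{\mathcal{O}, (0,0), (1,0), (\lambda,0)\}$; kernels of $2$-isogenies are thus visible in the defining equation itself.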
FABS: Fast Attribute-Based Signatures
Attribute-based signatures (ABS) provide fine-grained control over who can generate digital signatures and have many real-world applications. This paper presents a pair of fast ABS schemes: one for Key-Policy ABS (KP-ABS) and another for Signature-Policy ABS (SP-ABS). Both schemes support expressive policies using Monotone Span Programs (MSP), and offer practical features such as large universe, arbitrary attributes, and adaptive security. Most notably, we provide the first implementation of MSP-based ABS schemes and demonstrate that our schemes achieve the best-known asymptotic and concrete performance in this domain. Asymptotically, key generation, signing and verification time scale linearly with the number of attributes; verification requires only two pairing operations. In concrete terms, for 100 attributes, our KP-ABS scheme performs key generation, signing, and verification in 0.16s, 0.10s, and 0.13s, respectively; our SP-ABS scheme achieves times of 0.082s, 0.26s, and 0.21s for the same operations.
SoK: Outsourced Private Set Intersection
Private set intersection (PSI) protocols are an essential privacy-enhancing technology for many real-world use cases, ranging from mobile contact discovery to fraud detection. However, PSI executed directly between input parties can result in unreasonable performance overhead. This motivates the study of outsourced PSI, where clients delegate the heavy PSI operations to an untrusted (cloud) server.
In this SoK, we introduce a framework of 12 distinct properties that characterize outsourced PSI protocols based on security, functionality, and efficiency. By analyzing 20 protocols through this framework, we provide a valuable resource and an interactive tool for researchers and practitioners to select the most suitable protocols for their specific requirements. Finally, we discuss research gaps between trends in regular PSI and the current state of outsourced PSI, identifying promising avenues for future work.
On the Estonian Internet Voting System, IVXV, SoK and Suggestions
The Estonian i-voting experience is probably the richest to analyze: the country has been a pioneer in digitizing both the government and the private sector since 2001, followed by online internet voting (i-voting) in 2005. However, there are still complaints, criticisms, and remarks to consider about the IVXV system. In this paper, we introduce a systematization of knowledge (SoK) of the Estonian IVXV i-voting system and propose some additional security enhancements. The presented SoK discusses applications implemented by election observers in the 2023 and 2024 elections which, to our knowledge, have never been mentioned or analyzed in academia before. We also point to a previously unnoticed automated formal verification analysis of IVXV, in which the researchers discovered a privacy attack that we show to be extendable to possible large-scale encrypted vote copying. In addition, we identify and analyze recent fixes and improvements in the June 2024 version used in the European Parliament elections, connecting them to their academic sources. Finally, we discuss the current system status, propose our own suggestions for some remaining vulnerabilities, and raise the inevitable question of the approaching quantum threat.
A Study of Blockchain Consensus Protocols
When Nakamoto invented Bitcoin, the first generation of cryptocurrencies followed it in applying the PoW (Proof of Work) consensus mechanism; due to its excessive energy consumption and heavy carbon footprint, new mechanisms evolved, such as Proof of Space and PoS (Proof of Stake), each with many variants. Furthermore, the emergence of blockchain applications and kinds beyond cryptocurrencies called for consensus mechanisms optimized to fit the requirements of each application or blockchain kind; examples range from IoT (Internet of Things) blockchains for sustainability applications, which often use variants of the BFT (Byzantine Fault Tolerance) algorithm, to the consensus needed to relay transactions and/or assets between different blockchains in interoperability solutions. Previous studies concentrated on surveying and/or proposing blockchain consensus rules, on a specific consensus issue such as attacks or randomization, or on deriving theoretical results. Starting from a discussion of the most important theoretical results, this paper gathers and organizes the significant existing material about consensus in the blockchain world, explaining design challenges, trade-offs, and research areas. We realize that the topic could fill a complete textbook, so we summarize the basic concepts and support them with tables and appendices. We then highlight case examples from interoperability solutions to show how flexible and wide the design space is to fit both general- and special-purpose systems. The aim is to provide researchers with a comprehensive overview of the topic, along with links to go deeper into every detail.
Uniform Sharing in Multiple Stages: NullFresh for Arbitrary Functions
In the field of hardware masking, threshold implementations are a well-known technique that provides glitch-resistant power-analysis security. While they guarantee probing security, finding a uniform sharing without additional randomness is difficult, making them challenging to apply to certain functions and, consequently, making it impossible to develop a tool that can straightforwardly generate the masked circuit. Additionally, this approach forces designers to use at least three shares in the underlying masking, which can make the design more costly. Other schemes, like DOM, can work with two shares but often require fresh randomness. To address these issues, Shahmirzadi and Moradi introduced the NullFresh masking technique at CHES 2021. This method allows for uniform sharing with no additional randomness, using the minimal number of shares. However, as with the original threshold implementations, it is not always straightforward to find a NullFresh masking for arbitrary functions.
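To make the uniformity obstacle concrete, the following exhaustive Python check of the textbook three-share TI AND gadget (a standard example of ours, not NullFresh itself) confirms correctness and non-completeness while exhibiting a non-uniform output sharing.

```python
# Exhaustive GF(2) check of the classic three-share TI AND gadget: each output
# share omits one input share index (non-completeness) and the shares XOR to
# a*b (correctness), yet the output sharing is not uniform.
from collections import Counter
from itertools import product

def ti_and(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    c1 = (a2 & b2) ^ (a2 & b3) ^ (a3 & b2)   # never touches share 1
    c2 = (a3 & b3) ^ (a3 & b1) ^ (a1 & b3)   # never touches share 2
    c3 = (a1 & b1) ^ (a1 & b2) ^ (a2 & b1)   # never touches share 3
    return c1, c2, c3

shares = list(product((0, 1), repeat=3))
for a_val, b_val in product((0, 1), repeat=2):
    hist = Counter()
    for a in shares:
        for b in shares:
            if a[0] ^ a[1] ^ a[2] != a_val or b[0] ^ b[1] ^ b[2] != b_val:
                continue
            c = ti_and(a, b)
            assert c[0] ^ c[1] ^ c[2] == a_val & b_val   # correctness holds
            hist[c] += 1
    # Uniformity would require every valid output sharing to occur equally
    # often; the printed counts show this fails.
    print(f"a={a_val} b={b_val}: counts {sorted(hist.values())}")
```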
In this work, we introduce an automated technique to provide masking for arbitrary functions, ensuring first-order security. This technique is applicable to functions where the number of output bits does not exceed the number of input bits. While this technique introduces additional register stages (resulting in higher latency and area) compared to existing methods, it addresses the automation challenges of threshold implementations, which have remained an open problem since their inception. We present the masking technique, along with proofs of glitch-extended probing security, and demonstrate its application to several ciphers, including PRINCE, MIDORI, SKINNY, KECCAK, and AES. The masked designs were verified using SILVER and PROLEAD, and tested on an FPGA through TVLA.
Quantum Voting Protocol from Classical Assumptions
Quantum voting applies quantum mechanics to the design of voting schemes. Existing quantum voting protocols mainly use quantum entangled states; however, they rarely consider repeated voting or vote tampering by malicious voters, and hybrid quantum voting protocols have not been discussed. In this paper, we use EFI pairs (efficiently samplable, statistically far but computationally indistinguishable pairs of quantum states) instead of quantum entangled states to address these shortcomings, and propose a new quantum voting protocol. Our protocol is structured to prevent repeated voting by any voter and to prevent the leakage of voters' voting information. Its security can ultimately be reduced to a classical assumption, i.e., BQP $\neq$ QMA. Combined with quantum key distribution (QKD), we further optimize the protocol to prevent malicious adversaries from interfering with the final voting results. Moreover, we use the extended noisy trapdoor claw-free function (ENTCF) family to construct the first hybrid quantum voting protocol, which allows a classical voter to interact with a quantum center over a classical channel to complete the voting process.
Efficient Polynomial Multiplication for HQC on ARM Cortex-M4
HQC, a code-based Key Encapsulation Mechanism (KEM) recently selected for NIST post-quantum cryptography standardization, utilizes polynomial multiplication over $\mathbb{F}_2[x]$, the efficiency of which determines the overall performance of the scheme on embedded targets. On the ARM Cortex-M4, where carry-less multiplication instructions are unavailable, prior works have focused on the Frobenius Additive FFT (FAFFT) and radix-16 methods. However, for polynomial lengths $n=2^{k}+r$ with small $r$, such as those in HQC-1 and -3, applying FAFFT requires a transform length of $2^{k+2} \approx 4n$, introducing substantial padding overhead.
In this paper, we propose three optimized approaches to enhance HQC performance on the Cortex-M4: (i) Hybrid FAFFT-CRT, (ii) Hybrid FAFFT-Karatsuba, and (iii) a radix-16 method. To reduce the padding overhead of FAFFT, we use the Hybrid FAFFT-CRT and FAFFT-Karatsuba methods. The Hybrid FAFFT-CRT method maps the polynomial ring to a product of a FAFFT-friendly ring and a small-degree polynomial ring, and the Hybrid FAFFT-Karatsuba method utilizes a 2-way split to align subproducts with smaller power-of-two FAFFT lengths. We further optimize the core FAFFT butterfly operation by shortening XOR sequences and improving register scheduling. For the radix-16 approach, we apply it to HQC for the first time and develop an operation-count cost model to identify optimal Karatsuba and Toom-Cook combinations.
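As background on the building block, here is a minimal 2-way Karatsuba for binary polynomials in Python, with coefficients packed into integers and XOR serving as carry-less addition; the split threshold is arbitrary, and the real implementation works on 32-bit limbs with hand-scheduled Cortex-M4 code.

```python
# Illustrative 2-way Karatsuba over F_2[x], with polynomials packed into
# Python ints (XOR = coefficient-wise addition in characteristic 2).
def clmul(a: int, b: int) -> int:
    """Schoolbook carry-less multiplication in F_2[x]."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def karatsuba(a: int, b: int, bits: int) -> int:
    """Multiply two <bits>-coefficient binary polynomials via 2-way splits."""
    if bits <= 32:                      # base-case threshold (ours)
        return clmul(a, b)
    half = bits // 2
    mask = (1 << half) - 1
    a0, a1 = a & mask, a >> half
    b0, b1 = b & mask, b >> half
    lo = karatsuba(a0, b0, half)
    hi = karatsuba(a1, b1, half)
    mid = karatsuba(a0 ^ a1, b0 ^ b1, half) ^ lo ^ hi   # cross term
    return lo ^ (mid << half) ^ (hi << (2 * half))

a, b = 0xDEADBEEFCAFEF00D, 0x0123456789ABCDEF
assert karatsuba(a, b, 64) == clmul(a, b)
```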
On a NUCLEO-L4R5ZI board with a Cortex-M4 microcontroller, all of these methods are more efficient than the state-of-the-art for HQC-1 and -3. The Hybrid FAFFT-CRT method reduces polynomial multiplication cycles by 33.4% and 27.2% for HQC-1 and -3, respectively. This leads to cycle reductions for key generation, encapsulation, and decapsulation by 23.7%, 24.2%, and 21.9% for HQC-1, and 18.0%, 18.6%, and 17.7% for HQC-3. For HQC-5, our optimized butterfly FAFFT provides a consistent speedup of 1.4 - 1.6% across all KEM operations.
Revisiting PQ WireGuard: A Comprehensive Security Analysis With a New Design Using Reinforced KEMs
WireGuard is a VPN based on the Noise protocol, known for its high performance, small code base, and unique security features. Recently, Hülsing et al. (IEEE S&P'21) presented post-quantum (PQ) WireGuard, replacing the Diffie-Hellman (DH) key exchange underlying the Noise protocol with key-encapsulation mechanisms (KEMs). Since WireGuard requires the handshake message to fit in one UDP packet of size roughly 1200 B, they combined Classic McEliece and a modified variant of Saber. However, as Classic McEliece public keys are notoriously large, this comes at the cost of severely increasing the server's memory requirement. This hinders deployment, especially in environments with constraints on memory (allocation), such as kernel-level implementations.
In this work, we revisit PQ WireGuard and improve it on three fronts: design, (computational) security, and efficiency. As KEMs are semantically, but not syntactically, the same as DH key exchange, many design choices were made that are, in hindsight, ad hoc, an issue further amplified by the recent findings on binding issues with PQ KEMs (Cremers et al., CCS'24). We redesign PQ WireGuard to address these issues and prove it secure in a new computational model, fixing and capturing new security features that were not modeled by Hülsing et al. We further propose the 'reinforced KEM' (RKEM) as a natural building block for key exchange protocols, enabling a PQ WireGuard construction where the server no longer needs to store Classic McEliece keys, reducing public-key memory by 190 to 390×. In essence, we construct an RKEM named 'Rebar' that compresses two ML-KEM-like ciphertexts, which may be of independent interest.
Breaking the Myth of MPCitH Inefficiency: Optimizing MQOM for Embedded Platforms
Signature schemes based on the MPC-in-the-Head (MPCitH) paradigm play an important role in enabling cryptosystems founded on a wide diversity of hardness assumptions. While the design of such schemes is currently stabilizing, providing efficient implementations on embedded devices remains a critical challenge, as MPCitH frameworks are known to manipulate large data structures and to rely heavily on symmetric primitives.
In this work, we present a highly optimized implementation of the NIST candidate MQOM (version 2) targeting embedded microcontrollers. Our implementation significantly outperforms existing MPCitH implementations on such platforms, both in terms of memory footprint and execution time. In particular, for the L1 parameter set, we can achieve an SRAM usage below 10 KB, including the key and signature buffers, while preserving practical signing and verification performance.
We also provide the first memory-friendly implementation of the one-tree technique, which is used to reduce signature sizes in several MPCitH-based schemes. This enables a comparative analysis of the implementation costs of correlated trees versus the one-tree technique. We then demonstrate how streaming and precomputation techniques can further mitigate the impact of the running time and the signature size.
Formalizing Privacy in Decentralized Identity: A Provably Secure Framework with Minimal Disclosure
This paper presents a formal framework for enhancing privacy in decentralized identity (DID) systems, resolving the inherent conflict between blockchain verifiability and the principle of minimal data disclosure. At its core, we introduce a provably secure cryptographic protocol that leverages attribute commitments on-chain and zero-knowledge proofs for off-chain validation. This approach allows users to demonstrably prove the validity of predicates about their attributes without revealing the underlying sensitive values.
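As a rough illustration of the commitment ingredient, here is a generic Pedersen-style attribute commitment in Python with toy parameters of our choosing; it is a stand-in for the paper's concrete scheme, not a description of it.

```python
# Pedersen-style attribute commitment sketch. Toy group of our choosing:
# p = 23 is a safe prime (p = 2q + 1, q = 11); g = 4 and h = 9 are quadratic
# residues mod p, hence of order q. Real deployments use a large group.
import secrets

p, q, g, h = 23, 11, 4, 9

def commit(m, r=None):
    """Pedersen commitment C = g^m * h^r mod p (hiding thanks to random r)."""
    r = secrets.randbelow(q) if r is None else r
    return pow(g, m % q, p) * pow(h, r, p) % p, r

def open_check(c, m, r):
    return c == pow(g, m % q, p) * pow(h, r, p) % p

c, r = commit(7)                      # commit to attribute value 7
assert open_check(c, 7, r)            # opens correctly
c1, r1 = commit(3)
c2, r2 = commit(4)
# Additive homomorphism: the product commits to the sum of the attributes,
# which is what predicate proofs over committed attributes exploit.
assert open_check(c1 * c2 % p, 7, (r1 + r2) % q)
```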
We formally define the security and privacy requirements for such a system, including consistency, attribute-based indistinguishability, and predicate-based indistinguishability, within a semi-honest adversarial model. We then construct a concrete scheme that realizes these properties under standard cryptographic assumptions. The proposed architecture is designed for full backward compatibility with W3C DID standards, ensuring practical deployability. Security analysis provides rigorous, provable guarantees, while performance evaluation confirms the efficiency of the core cryptographic operations, supporting its use in resource-constrained environments. This work establishes a foundational and analyzable basis for building decentralized identity systems where both accountability and user privacy are essential.
GPV Preimage Sampling with Weak Smoothness and Its Applications to Lattice Signatures
The lattice trapdoor associated with Ajtai's function is the cornerstone of many lattice-based cryptosystems. The current provably secure trapdoor framework, known as the GPV framework, uses a strong smoothness condition, i.e. $\epsilon \ll \frac{1}{n^2}$ for the smoothing parameter $\eta_{\epsilon}(\mathbb{Z}^{n})$, to ensure the correctness of the security reduction. In this work, we investigate the feasibility of weak smoothness, e.g. $\epsilon = O(\frac{1}{n})$ or even $O(1)$, in the GPV framework and present several positive results. First, we provide a theoretical security proof for GPV with weak smoothness under a new assumption. Then, we present Gaussian samplers that are compatible with the weak smoothness condition. As direct applications, we present several practical GPV signature instantiations based on weak smoothness. Our first instantiation is a variant of Falcon, called Falcon$^{ws}$, achieving smaller sizes and higher security: its public keys are $21\%$ to $28\%$ smaller, and its signatures $23.5\%$ to $29\%$ smaller, than Falcon's. We also showcase an NTRU-based GPV signature scheme that employs the Peikert sampler with weak smoothness; this offers a simple implementation, though at a noticeably lower security level. Nevertheless, at the NIST-3 security level, our scheme achieves a $49\%$ reduction in size compared to Dilithium-3. We also derive a weak smoothness variant of the Antrag signature scheme, called Antrag$^{ws}$, along with a floating-point-free version, which offers an attractive trade-off among portability, efficiency and security. Compared to Antrag, Antrag$^{ws}$ achieves reductions of $16.5\%$ to $21.8\%$ in signature size and $14.3\%$ to $21.4\%$ in public key size. Furthermore, we adapt the compact gadget framework to the weak smoothness setting, removing the need for the new assumption and improving the concrete parameters of gadget-based signatures, e.g. Hufu.
\textsc{Npir}: High-Rate PIR for Databases with Moderate-Size Records
Private information retrieval (PIR) is a widely used technique in privacy-preserving applications that enables users to retrieve records from a database without revealing any information about their queries. This study focuses on a type of PIR with a high ratio between the size of the record retrieved by the client and the size of the server's response. Although significant progress has been made in high-rate PIR in recent years, the computational overhead on the server side remains rather high. This results in low server throughput, particularly for applications involving databases with moderate-size records (i.e., tens of kilobytes), such as private advertising systems.
In this paper, we present \textsc{Npir}, a high-rate single-server PIR that is based on NTRU encoding and outperforms the state-of-the-art Spiral (Menon \& Wu, S\&P 2022) and NTRUPIR (Xia \& Wang, EuroS\&P 2024) in terms of server throughput for databases with moderate-size records. Specifically, for databases ranging from 1 GB to 32 GB with 32 KB records, the server throughput of \textsc{Npir} is 1.50 to 2.84 times greater than that of Spiral and 1.77 to 2.55 times greater than that of NTRUPIR.
To improve server throughput without compromising the high-rate feature, we propose a novel tool called NTRU packing, which compresses the constant terms of the underlying polynomials of multiple NTRU encodings into a single NTRU encoding, thereby reducing the size of the server's response. Furthermore, \textsc{Npir} naturally supports batch processing for moderate-size records and can easily handle retrieval of records of varying sizes.
Lether: Practical Post-Quantum Account-Based Private Blockchain Payments
We introduce Lether, the first practical account-based private block-chain payment protocol based on post-quantum lattice assumptions, following the paradigm of Anonymous Zether (FC '19, IEEE S&P '21). The main challenge in building such a protocol from lattices lies in the absence of core building blocks: unbounded-level additively-homomorphic multi-message multi-recipient public key encryption (mmPKE), and event-oriented linkable ring signatures with support for multiple tags (events). To address these issues, we propose a verifiable refreshable additively-homomorphic mmPKE scheme and a plug-and-play event-oriented linkable tag scheme from lattices. We believe both to be of independent interest.
To achieve unbounded-level homomorphic evaluation in the lattice-based setting without relying on heavy techniques such as bootstrapping or large moduli (e.g., over 60 bits) in fully homomorphic encryption (FHE), we introduce a lightweight and blockchain-friendly mechanism called refresh. Namely, each user is required to verifiably refresh their account after a certain number of transactions. With our tailored parameter settings, the amortized per-refresh costs of communication and computation are only about 1.3% and 1.5%, respectively, of the cost of a transaction.
We also optimize the implementations of the LNP22 lattice-based zero-knowledge proof system (Crypto '22) in the LaZer library (CCS '24) to support efficient batching of various proof components. Overall, for a typical transaction, the total communication cost is about 68 KB, with the associated zero-knowledge proof accounting for about 51 KB of this total. Proof generation and verification each take a fraction of a second on a standard PC.
As an additional contribution, we formalize new definitions for Anonymous Zether-like protocols that more accurately capture real-world blockchain settings. These definitions are generic and are expected to benefit the broader development of account-based private blockchain payment protocols, beyond just lattice settings.
From $\textsf{TS-SUF-2}$ to $\textsf{TS-SUF-4}$: Practical Security Enhancements for $\textsf{FROST2}$ Threshold Signatures
Threshold signature schemes play a vital role in securing digital assets within blockchain and distributed systems. $\textsf{FROST2}$ stands out as a practical threshold Schnorr signature scheme, noted for its efficiency and compatibility with standard verification processes. However, under the one-more discrete logarithm assumption, with static corruption and centralized key generation, $\textsf{FROST2}$ was shown by Bellare et al. (CRYPTO 2022) to achieve only $\textsf{TS-SUF-2}$ security, a consequence of its vulnerability to $\textsf{TS-UF-3}$ attacks.
In this paper, we address this security limitation by presenting two enhanced variants of $\textsf{FROST2}$: $\textsf{FROST2}\texttt{+}$ and $\textsf{FROST2}\texttt{#}$, both achieving the $\textsf{TS-SUF-4}$ security level under the same computational assumptions as the original $\textsf{FROST2}$.
The first variant, $\textsf{FROST2}\texttt{+}$, strengthens $\textsf{FROST2}$ by integrating additional pre-processing token verifications that help mitigate $\textsf{TS-UF-3}$ and $\textsf{TS-UF-4}$ vulnerabilities while maintaining practical efficiency.
We show that $\textsf{FROST2}\texttt{+}$ can achieve $\textsf{TS-SUF-4}$ security not only under the same conditions as the original $\textsf{FROST2}$ analysis, but also when initialized with a distributed key generation protocol such as $\textsf{PedPoP}$.
Building on these improvements, we identify optimization opportunities that lead to our second variant, $\textsf{FROST2}\texttt{#}$, which achieves $\textsf{TS-SUF-4}$ security with enhanced computational efficiency by eliminating redundant calculations.
Our benchmark shows that the performance of $\textsf{FROST2}\texttt{+}$ is comparable to $\textsf{FROST2}$ while $\textsf{FROST2}\texttt{#}$ is at least 3 times faster than $\textsf{FROST2}$.
Breaking the KAZ Suite: Practical Key Recovery Attacks on MySEAL 2.0’s Post-Quantum Candidates
Targeting the suite of four cryptographic schemes under review in Malaysia's MySEAL 2.0 initiative, we present practical key recovery attacks that break three of them: the KAZ-KA key agreement scheme, the KAZ-KEM key encapsulation mechanism, and the KAZ-SIGN v2.0 digital signature. All three schemes operate over $\mathbb{Z}_N$ where $N$ is a primorial, i.e., the product of consecutive small primes. This design choice makes the group order $\varphi(N)$ extremely smooth, enabling efficient attacks. For KAZ-KA and KAZ-KEM, we recover the private key by enumerating candidates modulo each small prime factor and solving discrete logarithms in small groups. For KAZ-SIGN v2.0, we exploit the linear structure of signatures to formulate a hidden number problem instance, which we solve using lattice reduction with only two signatures. Our attacks, executed on a MacBook, recover the secret keys in under one second for all recommended security levels (128, 192, and 256 bits), demonstrating that these schemes are fundamentally insecure.
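To illustrate why a smooth group order is fatal, here is a compact Pohlig-Hellman-style recovery in Python over a toy prime of our choosing; the actual attacks work modulo each small prime factor of the primorial and recombine via the CRT, just at much larger scale.

```python
# Pohlig-Hellman discrete log over a toy prime (ours): p = 2521, so
# p - 1 = 2520 = 2^3 * 3^2 * 5 * 7 is fully smooth, mirroring in miniature
# the smoothness of phi(N) when N is a primorial.
from functools import reduce

p = 2521
factors = {2: 3, 3: 2, 5: 1, 7: 1}

def crt(residues, moduli):
    """Combine x = r_i (mod m_i) for pairwise-coprime moduli m_i."""
    M = reduce(lambda x, y: x * y, moduli)
    return sum(r * (M // m) * pow(M // m, -1, m)
               for r, m in zip(residues, moduli)) % M

def pohlig_hellman(g, h):
    residues, moduli = [], []
    for qp, e in factors.items():
        qe = qp ** e
        g_i = pow(g, (p - 1) // qe, p)   # project into the order-q^e subgroup
        h_i = pow(h, (p - 1) // qe, p)
        # Each subgroup has at most 9 elements: brute force is instant.
        x_i = next(x for x in range(qe) if pow(g_i, x, p) == h_i)
        residues.append(x_i)
        moduli.append(qe)
    return crt(residues, moduli)

g, secret = 17, 1999
h = pow(g, secret, p)
assert pow(g, pohlig_hellman(g, h), p) == h   # log recovered instantly
```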
BEAST-MEV: Batched Threshold Encryption with Silent Setup for MEV prevention
Threshold encrypted mempools protect the privacy of transactions up until the point their inclusion on chain is confirmed. They are a promising approach to protection against front-running attacks on decentralized blockchains.
Recent works have introduced two key properties that an encryption scheme must satisfy in order to scale to large-scale decentralized blockchains such as Ethereum:
Silent Setup [Garg-Kolonelos-Policharla-Wang, CRYPTO'24] demands that a threshold encryption scheme require no interaction during the setup phase and rely only on the existence of a Public Key Infrastructure. Batched Decryption [Choudhuri-Garg-Piet-Policharla, USENIX'24] demands that an entire block containing $B$ encrypted transactions can be decrypted using communication that is independent of (or sublinear in) $B$, without compromising the privacy of transactions that have not yet been confirmed.
While existing constructions achieve either Silent Setup or Batched Decryption independently, a truly decentralized and scalable encrypted mempool requires both properties to be satisfied simultaneously. In this work, we present the first ``Batched Threshold Encryption scheme with Silent Setup'' built using bilinear pairings. We provide formal definitions for the primitive, and prove security in the Generic Group Model. We provide several optimizations and implement our scheme to evaluate its performance. Our experiments demonstrate its efficiency for deployment in blockchain systems.
zkRNN: Zero-Knowledge Proofs for Recurrent Neural Network Inference
Neural networks have achieved remarkable success across a wide range of domains, including applications involving sequential data such as natural language processing and time-series prediction. However, in many real-world deployments, it is essential to ensure the integrity of the inference process, namely that the output of a model is correctly computed, without revealing the model's data. While prior work has introduced zero-knowledge proof (ZKP) schemes for convolutional and feedforward neural networks, these do not extend naturally to recurrent architectures due to the challenges introduced by temporal dependencies and weight sharing.
In this paper, we propose zkRNN, a novel ZKP system for recurrent neural networks (RNNs), enabling the prover to demonstrate that a model's output is correctly computed over a sequential input without revealing any information about the model parameters. Our approach builds upon the GKR protocol and a recursive sum-check framework introduced in prior work, adapting them to handle the recurrent structure of RNNs. We design a circuit representation that encodes hidden-state transitions, unrolls computation across time steps, and shares weights in a manner compatible with sum-check-based verification. Our protocol achieves verifier time and proof size that are polylogarithmic in the size of the final iteration circuit and independent of the sequence length. The evaluation results demonstrate practical proof generation and succinct, sequence-length-independent verification, with second-scale proving and millisecond-scale verification.
Analyze Your Leakage! Security Analysis of Encryption Schemes for Substring Search
Searchable symmetric encryption (SSE) enables queries over symmetrically encrypted databases. To achieve practical efficiency, SSE schemes incur a certain amount of leakage; however, this leads to the possibility of leakage cryptanalysis, i.e., cryptanalytic attacks that exploit the leakage from the target SSE scheme to subvert its data and query privacy guarantees. Leakage cryptanalysis has been widely studied in the context of SSE schemes supporting either keyword queries or range queries, often with devastating consequences. However, little or no attention has been paid to cryptanalysing substring SSE schemes, i.e., SSE schemes supporting arbitrary substring queries over encrypted data. This is despite their relevance to many real-world applications, e.g., in the context of securely querying outsourced genomic databases.
In this paper, we present the first leakage cryptanalysis of three prominent substring-SSE schemes due to Chase and Shen (PoPETS '15), Faber et al. (ESORICS '15), and Hahn et al. (SIGMOD 2018). We propose novel, inference-based query reconstruction attacks on each of these schemes that exploit their respective leakage profiles. We implement our attacks and experimentally validate their success rates and efficiency over real-world datasets. Our attacks achieve high query reconstruction rates with practical efficiency, and scale smoothly to large datasets. Notably, our attack success rate is highest against the scheme due to Hahn et al. (SIGMOD 2018), which is the most recent of the three schemes. This demonstrates that, as far as substring SSE is concerned, newer schemes are not necessarily "more secure".
To the best of our knowledge, ours are the first and only query reconstruction attacks on (and the first systematic leakage cryptanalysis of) any substring-SSE scheme to date. Query reconstruction against substring SSE schemes is inherently harder to achieve than against traditional SSE schemes for keyword search, as the query space for substring SSE is significantly larger. In fact, while the vast majority of known attacks against SSE schemes for keyword search restrict the query space to the most-frequent keywords, our attacks achieve decent query recovery rates without any such restriction on the queried substrings.
Our work carries a wider message for designers of SSE schemes: it is not sufficient to merely identify the leakage profile of a new scheme; rather, it is incumbent on designers to cryptanalyze that leakage profile and show that it is not consequential in the face of state-of-the-art attacks. The novel attack techniques in our paper provide a starting point for such analysis of substring SSE schemes.
Scalable Distributed Key Generation for Blockchains
Distributed key generation (DKG) is a foundational building block for designing efficient threshold cryptosystems, which are crucial components of blockchain ecosystems. Existing DKG protocols address the problem in a standalone setting, focusing on establishing the final DKG public key and individual secret keys among the participating parties. This work focuses on DKG primitives for use over blockchain, where the final DKG public key must be available on-chain, enabling on-chain smart contracts to seamlessly execute threshold cryptographic verifications. We observe that existing standalone DKG designs do {\em not} sufficiently exploit the presence of blockchain, leaving substantial scope for improvement in performance.
In this work, we design the first discrete-log-based DKG protocol tailored for use over blockchain, leveraging the blockchain's built-in consensus mechanism to realize DKG efficiently. Interestingly, the use of blockchains enables us to solve DKG while tolerating up to one-half Byzantine faults even in non-synchronous settings. Our protocol is asynchronous, allowing it to operate independently of the network's timing assumptions, with the exact network model depending on the destination blockchain.
Our solution further utilizes an associated random beacon to select smaller committees and achieves a DKG protocol with sub-cubic communication complexity, sub-quadratic computation complexity, and minimal on-chain storage. Notably, our protocol employs a single invocation of consensus and can terminate in just eleven communication rounds in the good case when deployed on an optimal latency partially synchronous blockchain. Our experiments show that our protocol terminates faster than state-of-the-art standalone protocols, with similar bandwidth overhead for committee members and significantly reduced bandwidth for other parties. Additionally, our protocol benefits from higher CPU resources—when deployed on machines with $32$ vCPUs, it completes in approximately $6.5$ seconds in the optimistic case, even for larger systems with $256$ nodes.
Uncertainty Estimation in Neural Network-enabled Side-channel Analysis and Links to Explainability
Side-channel analysis (SCA) has emerged as a critical field in securing hardware implementations against potential vulnerabilities. With the advent of artificial intelligence (AI), neural network (NN)-based approaches have proven to be among the most useful techniques for profiled SCA. Despite the success of NN-assisted SCA, a critical challenge remains, namely understanding predictive uncertainty. NNs are often uncertain in their predictions, leading to incorrect key guesses with high probabilities, corresponding to a higher rank associated with the correct key. This uncertainty stems from multiple factors, including measurement errors, randomness in physical quantities, and variability in NN training. Understanding whether this uncertainty arises from inherent data characteristics or can be mitigated through better training is crucial. Additionally, if data uncertainty dominates, identifying the specific trace features responsible for misclassification becomes essential.
We propose a novel approach to estimating uncertainty in NN-based SCA by leveraging Renyi entropy, which offers a generalized framework for capturing various forms of uncertainty. This metric allows us to quantify uncertainty in NN predictions and explain its impact on key recovery. We decompose uncertainty into epistemic (model-related) and aleatoric (data-related) components. Given the challenge of estimating probability distributions in high-dimensional spaces, we use matrix-based Renyi α-entropy and α-divergence to better approximate leakage distributions, addressing the limitations of KL divergence in SCA. We also explore the sources of uncertainty, e.g., resynchronization, randomized keys, as well as hyperparameters related to NN training. To identify which time instances (features in traces) contribute most to uncertainty, we also integrate SHAP explanations with our framework, overcoming the limitations of conventional sensitivity analysis. Lastly, we show that predictive uncertainty strongly correlates with standard SCA metrics like rank, offering a complementary measure for evaluating attack complexity. Our theoretical findings are backed by extensive experiments on available datasets and NN models.
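As a sketch of the estimator family the paper builds on, the following Python snippet computes a matrix-based Renyi $\alpha$-entropy of a batch of traces; the Gaussian kernel, its width, and $\alpha = 2$ are our illustrative choices.

```python
# Matrix-based Renyi alpha-entropy sketch: S_a = log2(sum_i lambda_i^a)/(1-a),
# where lambda_i are eigenvalues of the trace-normalized Gram matrix.
# Kernel choice (Gaussian), its width, and alpha = 2 are illustrative.
import numpy as np

def renyi_entropy(X: np.ndarray, alpha: float = 2.0, sigma: float = 1.0) -> float:
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2 * sigma ** 2))   # Gaussian kernel Gram matrix
    A = K / np.trace(K)                  # eigenvalues now sum to 1
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]               # discard numerical zeros
    return float(np.log2(np.sum(lam ** alpha)) / (1 - alpha))

rng = np.random.default_rng(0)
tight = renyi_entropy(rng.normal(0.0, 0.1, (64, 8)))  # concentrated batch
wide = renyi_entropy(rng.normal(0.0, 2.0, (64, 8)))   # spread-out batch
assert tight < wide   # more spread in the data => higher estimated uncertainty
```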
Multi-Party Distributed Point Functions with Polylogarithmic Key Size from Invariants of Matrices
Distributed point functions (DPFs), introduced in 2014, are a widely used primitive in secure computation for a wide variety of applications. However, until recently, constructions of DPFs with polylogarithmic key size were known only for the two-party setting; multi-party schemes have key sizes exponential in the number of parties or in the domain size.
We generalize the efficient tree-based two-party DPF approach and obtain a polylogarithmic-size DPF for any number of parties. We use a technique in which we maintain an invariant for off-path leaves: a secret-shared vector is mapped to secret submodules by public matrices.
We show, using a technique by Shamir, that these vectors are hard to compute over $\mathbb{Z}_{pq}$ if factoring is hard. Our scheme is a secure DPF under two new assumptions related to the Generic Group Model and Linear Code Equivalence.
The output of our scheme lies in the exponent of a group where Diffie-Hellman-type problems are hard, which limits the usability of the scheme. Still, it is the first multi-party DPF that generalizes the tree-based two-party approach. Our scheme is the first in which the key is polylogarithmic in the domain size and independent of the number of parties, and in which key generation and evaluation can be computed efficiently independently of the number of parties.
Correction-Based Fault Attack Against Randomized MAYO
This paper introduces a novel fault injection attack targeting the randomized version of the MAYO post-quantum signature scheme. While prior attacks on MAYO either relied on deterministic signing modes or specific memory assumptions, our attack succeeds without such constraints. By exploiting the inherent structural properties of MAYO signatures, we combine targeted fault injections with signature correction techniques to extract partial information about the secret oil space. By systematically accumulating such partial information across multiple fault-induced signatures and utilizing linear dependencies among oil vectors, we present an efficient method for achieving full secret key recovery. The attack requires only one fault injection per oil coefficient, repeated a small number of times (i.e., 8, 17, 10, or 12 for the different MAYO versions, respectively). We demonstrate the targeted fault injection attack on a MAYO implementation on an ARM Cortex-M4 processor via clock glitching, establishing the feasibility of the attack in practice. Our approach is validated through simulations, and a detailed computational cost analysis is provided. Additionally, we demonstrate the ineffectiveness of some previously proposed countermeasures against our attack, thereby highlighting the urgent need for more robust protection mechanisms for multivariate post-quantum signature schemes such as MAYO.
Reed–Muller Encoding Leakage Enables Single-Trace Message Recovery in HQC
HQC is a code-based key-encapsulation mechanism standardized by NIST, whose decapsulation follows a Fujisaki--Okamoto (FO) transform and therefore re-executes encryption-side encoding during deterministic re-encryption. In this paper, we show that this design choice exposes a critical leakage point in the \emph{Reed--Muller (RM) encoding} routine, present across the NIST-submitted implementations, the HQC team's official codebase, and the PQClean implementations.
We demonstrate the practical impact of this leakage on a ChipWhisperer CW308 UFO board with an STM32F303 (Cortex-M4) target. Using a total of 5{,}000 power traces for profiling and evaluation, we recover the full 128-bit encapsulation message from a \emph{single} decapsulation trace with up to 96.9\% success. In comparison, the current state of the art for single-trace HQC message recovery based on \emph{soft-analytical side-channel attacks} (SASCA) reports profiling on the order of 500{,}000 traces; our approach therefore reduces the required profiling budget by two orders of magnitude while achieving comparable single-trace capability.
Beyond session-key compromise, we show that direct recovery of the decrypted message can serve as an oracle primitive that substantially lowers the cost of oracle instantiation in prior HQC secret-key recovery frameworks. While prior oracle instantiations typically map leakage to a discrete set of task-specific labels, our approach recovers the decrypted message itself, and thus applies uniformly over the full message space (i.e., arbitrary $m'$ values). Concretely, we reduce the profiling cost required to instantiate a \emph{decryption success/failure} oracle, multi-value plaintext-checking, and full-decryption oracles by approximately 90.3\%, 84.83\%, and 26.7\%, respectively.
Unlocking the True Potential of Decryption Failure Oracles: A Hybrid Adaptive-LDPC Attack on ML-KEM Using Imperfect Oracles
Side-channel attacks exploiting Plaintext-Checking (PC) and Decryption Failure (DF) oracles are a pressing threat to deployed post-quantum cryptography. These oracles can be instantiated from tangible leakage sources like timing, power, and microarchitectural behaviors, making them a practical concern for leading schemes based on lattices, codes, and isogenies. In this paper, we revisit chosen-ciphertext side-channel attacks that leverage the DF oracle on ML-KEM. While DF oracles are often considered inefficient compared to their binary PC counterparts in lattice-based schemes, we demonstrate that their full potential has been largely unrealized.
We introduce a novel attack framework that combines adaptive query generation with belief propagation for Low-Density Parity-Check (LDPC) codes. Our methodology crafts carefully balanced parity checks over multiple secret coefficients, maximizing the Shannon information extracted from each oracle query, even in the presence of significant noise. This approach dramatically reduces the number of queries required for full key recovery, achieving near-optimal efficiency by approaching the theoretical Shannon information bound. For ML-KEM-768 with an oracle accuracy of 95%, our attack requires only 2950 queries (within a factor of 1.35 of the Shannon lower bound), establishing that a well-designed DF attack can surpass the efficiency of state-of-the-art binary PC attacks.
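As a sanity check on the 2950-query figure, the Shannon accounting can be reproduced in a few lines. The sketch below is ours, not the paper's code; it assumes the ML-KEM-768 secret coefficients follow the centered binomial distribution with $\eta = 2$ (as specified in FIPS 203) and models the imperfect oracle as a binary symmetric channel.

    from math import log2

    def entropy(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    # CBD(eta = 2) over {-2, ..., 2}: probabilities 1/16, 4/16, 6/16, 4/16, 1/16
    coeff_entropy = entropy([1/16, 4/16, 6/16, 4/16, 1/16])    # ~2.03 bits

    secret_entropy = 768 * coeff_entropy       # 3 x 256 coefficients, ~1560 bits

    accuracy = 0.95                            # oracle accuracy from the abstract
    capacity = 1 - entropy([accuracy, 1 - accuracy])   # BSC capacity, ~0.71 bits/query

    bound = secret_entropy / capacity          # ~2186 queries
    print(f"Shannon bound ~{bound:.0f}; ratio for 2950 queries = {2950 / bound:.2f}")

Running this yields a bound of roughly 2186 queries and a ratio of 1.35, consistent with the numbers quoted above.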
To validate the practical impact of our findings, we apply our framework to the recent GoFetch attack, showing significant gains in this real-world, microarchitectural side-channel scenario. Our method reduces the required measurement traces by over an order of magnitude and eliminates the need for computationally expensive post-processing, enabling a full key recovery on higher-security schemes previously considered intractable.
PQCUARK: A Scalar RISC-V ISA Extension for ML-KEM and ML-DSA
Recent advances in quantum computing pose a threat to the security of digital communications, as large-scale quantum machines can break commonly used cryptographic algorithms, such as RSA and ECC. To mitigate this risk, post-quantum cryptography (PQC) schemes are being standardized, with recent NIST recommendations selecting two lattice-based algorithms: ML-KEM for key encapsulation and ML-DSA for digital signatures. Two computationally intensive kernels dominate the execution of these schemes: the Number-Theoretic Transform (NTT) for polynomial multiplication and the Keccak-f1600 permutation function for polynomial sampling and hashing. This paper presents PQCUARK, a scalar RISC-V ISA extension that accelerates these key operations. PQCUARK integrates two novel accelerators within the core pipeline: (i) a packed SIMD butterfly unit capable of performing NTT butterfly operations on 2×32-bit or 4×16-bit polynomial coefficients, and (ii) a permutation engine that delivers two Keccak rounds per cycle, hosting a private state and a direct interface to the core Load Store Unit, eliminating the need for a custom register file interface. We have integrated PQCUARK into an RV64 core and deployed it on an FPGA. Experimental results demonstrate that PQCUARK provides up to 10.1× speedup over the NIST baselines and 2.3× over the optimized software, and it outperforms similar state-of-the-art approaches by 1.4–12.3× in performance. ASIC synthesis in GF22-FDSOI technology shows a moderate core area increase of 8% at 1.2 GHz, with the PQCUARK units being outside the critical path.
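For readers unfamiliar with the accelerated kernel, the butterfly that PQCUARK's packed SIMD unit evaluates is the standard Cooley–Tukey step sketched below. This scalar Python sketch is ours; the modulus $q = 3329$ (ML-KEM's) is chosen purely for illustration and says nothing about the hardware datapath.

    q = 3329  # illustrative modulus; the unit supports 16- and 32-bit coefficient lanes

    def ct_butterfly(a, b, omega):
        """One Cooley-Tukey butterfly: (a, b) -> (a + w*b, a - w*b) mod q."""
        t = (omega * b) % q
        return (a + t) % q, (a - t) % q

The SIMD unit applies this same operation to two 32-bit or four 16-bit coefficient pairs per instruction, which is what removes the butterfly from the scalar instruction stream.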
qFALL – Rapid Prototyping of Lattice-based Cryptography
We introduce qFALL, an open-source library for rapid prototyping of lattice-based cryptography written in Rust. qFALL is designed to bridge the gap between theory and practice by offering a modular architecture that provides a theory-affine, flexible, high-level interface for mathematics and common algorithms in lattice-based constructions with representative runtime performance. This enables researchers to rapidly assemble minimal working prototypes that are easily auditable, modifiable, and allow users to assess algorithmic trade-offs as well as the viability of their constructions early in the development cycle. Furthermore, the library supports an incremental optimization workflow, allowing users to replace bottlenecks with optimized modules to evolve the codebase toward a fully optimized implementation. We demonstrate that qFALL allows for efficient assembly of auditable cryptographic constructions that approximate the performance of optimized implementations and serve as a reusable resource to the scientific community.
Revisiting Polynomial NTRU for FHE: Amortized Bootstrapping with Sparse Keys
Fully homomorphic encryption (FHE) enables computation over encrypted data, playing a fundamental role in privacy-preserving machine learning. Recent work has demonstrated that FHE schemes can be constructed under the NTRU assumption, leveraging the benefits of compact ciphertexts. However, existing NTRU-based FHE constructions rely on matrix representations, which break the polynomial ring structure and prevent the direct adoption of modern amortized bootstrapping techniques.
In this work, we revisit NTRU-based FHE by reformulating the matrix-based construction into a standard polynomial-ring setting. We show that the NTRU decryption operation can be decomposed into a set of inner products compatible with FHEW-style accumulators, while preserving the polynomial structure required for further optimization. Building on this formulation, we adapt a recent amortized bootstrapping approach based on monomial-by-polynomial multiplication to the NTRU setting using sparse secret keys.
The resulting scheme combines compact ciphertexts with efficient amortized bootstrapping, reducing both computational cost and bootstrapping key size when the secret key has low Hamming weight. A proof-of-concept Python implementation validates the correctness of the proposed scheme and confirms the expected reduction in bootstrapping cost under conservative security parameters.
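To make the inner-product decomposition concrete, recall the standard identity in a negacyclic ring such as $R_q = \mathbb{Z}_q[X]/(X^N+1)$ (our illustration; the paper's exact formulation may differ, and cyclic rings $X^N-1$ behave analogously with the signs dropped): each coefficient of a product $f \cdot c$ is a signed inner product between $f$ and a rotation of $c$,

$(f \cdot c)_k = \sum_{i=0}^{k} f_i\, c_{k-i} \;-\; \sum_{i=k+1}^{N-1} f_i\, c_{N+k-i}.$

Inner products of this shape are precisely what FHEW-style accumulators can absorb homomorphically, which is what lets the monomial-by-polynomial amortized bootstrapping carry over to the NTRU setting.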
Qurrency: a quantum-secure, private, and auditable platform for digital assets
Central bank digital currencies (CBDCs) and other related digital asset platforms have the potential to revolutionize the financial world. While these platforms have been deployed in test environments by virtually all large financial institutions, including central banks, there are still several limitations of these systems that prevent widespread adoption. These include (i) privacy, (ii) security against quantum adversaries, and (iii) auditability. In this work, we undertake (to our knowledge) the first formal study of these systems.
While there have been many digital asset platforms implemented, we do not know of any formal model for a fundamentally UTXO-based digital asset platform/CBDC. Our first contribution is a formal modeling of a UTXO-based private digital asset system that meets our requirements listed above. This model is loosely based upon the open source software that we found came the closest to meeting our requirements, Linux Foundation Decentralized Trust (LFDT) Zeto. In the course of our formal modeling, we helped to improve the security of Zeto. We then provide an efficient construction of such a system, which we call Qurrency. Qurrency is an efficient UTXO-based privacy-preserving token system that includes an auditing mechanism and is secure against "harvest now, decrypt later" attacks, which is critically important for several central banks, including the Bank of Brazil. We implemented our construction to show that it is practically efficient and can be used on any EVM-based blockchain system with ease.
BABE: Verifying Proofs on Bitcoin Made 1000x Cheaper
Endowing Bitcoin with the ability to verify succinct proofs has been a longstanding problem with important applications such as scaling Bitcoin and allowing the Bitcoin asset to be used in other blockchains trustlessly. It is a challenging problem due to the lack of expressiveness in the Bitcoin scripting language and the small Bitcoin block space. BitVM2 is the state-of-the-art verification protocol for Bitcoin used in several mainnets and testnets, but it suffers from very high on-chain Bitcoin transaction fees in the unhappy path (over $14,000 in a recent experiment). Recent research BitVM3 dramatically reduces this on-chain cost by using a garbled SNARK verifier circuit to shift most of the verification off-chain, but each garbled circuit is 42 GiB in size, so the off-chain storage and setup costs are huge. This paper introduces BABE, a new proof verification protocol on Bitcoin, which preserves BitVM3's savings of on-chain costs but reduces its off-chain storage and setup costs by three orders of magnitude. BABE uses a witness encryption scheme for linear pairing relations to verify Groth16 proofs. Since Groth16 verification involves non-linear pairings, this witness encryption scheme is augmented with a secure two-party computation protocol implemented using a very efficient garbled circuit for scalar multiplication on elliptic curves. The design of this garbled circuit builds on a recent work, Argo MAC, which gives an efficient garbling scheme to compute homomorphic MACs on such curves.
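For context, the pairing relation at issue is the standard Groth16 verification equation, recalled here in common notation (the proof is $(A, B, C)$, the $a_i$ are public inputs, and $\alpha, \beta, \gamma, \delta, \mathrm{IC}_i$ come from the verification key):

$e(A, B) = e(\alpha, \beta) \cdot e\big(\textstyle\sum_i a_i\, \mathrm{IC}_i,\ \gamma\big) \cdot e(C, \delta).$

The right-hand pairings are linear in the proof element $C$ (their other arguments are fixed or public), so they fit a witness encryption for linear pairing relations; the left-hand $e(A, B)$ pairs two proof elements together, and that is the non-linearity the abstract says is handled by the two-party computation with a garbled scalar-multiplication circuit.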
AVX2 Implementation of QR-UOV for Modern x86 Processors
QR-UOV is a multivariate signature scheme selected as one of the candidates in the second round of the NIST PQC Additional Digital Signatures process. This paper presents software acceleration methods for QR-UOV optimized for modern x86 architectures. Unlike other multivariate candidates, QR-UOV operates over small odd prime-power extension fields such as $\mathrm{GF}(31^3)$ and $\mathrm{GF}(127^3)$. This property allows direct utilization of hardware multipliers for field arithmetic, offering a distinctive advantage for high-performance implementations. Yet how to implement QR-UOV efficiently on modern CPUs based on this property has so far remained unclear. Our implementation benefits from two proposed optimizations: (1) reducing the computational overhead of the QR-UOV algorithm through algorithm-level optimization, and (2) leveraging advanced SIMD instruction set extensions (e.g., AVX2, AVX512) to accelerate main operations such as matrix multiplication. Our implementation achieves substantial speedups over the Round 2 reference: for the parameter set $(q,\ell)=(127,3)$ at NIST security level I, it delivers a $5.1\times$ improvement in key generation, $3.6\times$ in signature generation, and $5.7\times$ in signature verification. These results demonstrate that QR-UOV achieves performance comparable to, or higher than, that of UOV implementations, particularly at higher security levels.
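A minimal sketch (ours, not the QR-UOV codebase) of the extension-field arithmetic in question: elements of $\mathrm{GF}(31^3)$ are degree-$<3$ polynomials over $\mathrm{GF}(31)$, so multiplication is plain integer multiply-accumulate followed by a cheap reduction, which is exactly the shape that maps well to hardware multipliers and SIMD lanes. The modulus polynomial $x^3 - x - 1$ used here is irreducible over $\mathrm{GF}(31)$ but chosen for illustration; the scheme's specification fixes the actual polynomial.

    q = 31

    def gf_q3_mul(a, b):
        """Multiply a, b in GF(31^3); inputs are coefficient lists [a0, a1, a2]."""
        # Schoolbook product: pure integer multiply-accumulate (SIMD-friendly).
        c = [0] * 5
        for i in range(3):
            for j in range(3):
                c[i + j] += a[i] * b[j]
        # Reduce modulo x^3 - x - 1: x^3 = x + 1 and x^4 = x^2 + x.
        c[0] += c[3]; c[1] += c[3]
        c[1] += c[4]; c[2] += c[4]
        return [ci % q for ci in c[:3]]

    print(gf_q3_mul([1, 2, 3], [4, 5, 6]))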
Rejection Matters: Efficient Non-Profiling Side-Channel Attack on ML-DSA via Exploiting Public Templates
ML-DSA (formerly CRYSTALS-Dilithium), NIST’s primary post-quantum signature standard, is increasingly deployed as the post-quantum transition progresses. Yet when implementations of ML-DSA are deployed in practice, their physical security remains underexplored. In this work, we reveal a new attack surface against ML-DSA by exploiting leakage from both rejected signing trials and the final accepted signing trial. We present, to the best of our knowledge, the first side-channel attack that simultaneously leverages leakage from both kinds of trials without relying on clone devices. Unlike traditional Secret-based Template Attacks, which require profiling the leakage of the sensitive intermediates on a clone device, our PTA (Public-based Template Attack) builds leakage templates solely from publicly available data on the target device itself. With the challenge $c$ known, we then perform CPA on the sensitive intermediates using traces from both rejected and accepted signing trials, quadrupling (on average) the exploitable leakage per signing request for ML-DSA-44. Experimental results on power traces from an ARM Cortex-M4 board show that challenges $c$ are fully recovered with only 96 traces, after which key recovery succeeds in around 300 traces, a factor of 10× fewer than prior art. We highlight that our attack applies across all three ML-DSA variants with different security levels.
Moreover, our attack works straightforwardly in the hedged (non-deterministic) mode of ML-DSA, demonstrating that the hedging offers no SCA protection in this scenario.
Policy-based Access Tokens: Privacy-Preserving Verification for Digital Identity
Passports, driving licences, and other government-issued identity documents are frequently used to prove attributes about an individual, such as their date of birth or home address. Traditional paper-based approaches are being transitioned to digital identities, which are becoming increasingly important for online interactions and transactions, allowing individuals to prove their identity without needing to present physical documents. However, existing solutions suffer from cumbersome primitives, for example, the European Commission is actively experimenting with Zero-Knowledge proof based solutions for the EU’s Digital Identity Wallet, or lack of functionality such as the UK’s right-to-work share codes.
In this paper, we present a new cryptographic primitive, Policy-Based Access Tokens, that allows for lightweight verification of user attributes through a service (such as a government office). We propose two variants of the scheme: PAT-I offers token unforgeability, such that malicious parties cannot verify personal data without a valid token. This is then extended in PAT-II to allow for distributed delegation to a set of proxies, offering fine-grained revocation. We consider stronger security properties that prevent proxies from colluding, whilst providing anonymity against the service provider. We give generic constructions of our schemes, prove their security in the standard model, and provide instantiations based on bilinear pairings. Finally, we provide a proof-of-concept implementation which demonstrates that our protocols are efficient, with token verification taking ≈100 ms.
(Fine-Grained) Unbounded Inner-Product Functional Encryption from LWE
Inner-product functional encryption (IPFE), introduced by Abdalla-Bourse-De Caro-Pointcheval (PKC'15), is a public-key primitive that allows decrypting an encrypted vector $\mathbf{x}$ with a secret key associated to a vector $\mathbf{y}$ such that only their inner product $\langle\mathbf{x},\mathbf{y}\rangle$ is revealed. The initial definition and constructions all required the length of such vectors to be bounded at setup and, therefore, fixed in the public parameters.
In order to overcome this drawback, Dufour-Sans-Pointcheval (ACNS'19) and Tomida-Takashima (AC'18) introduced the notion of unbounded IPFE, where the length of vectors does not need to be fixed during the setup phase, and gave constructions from pairing-based assumptions.
In this paper, we make progress and provide the first unbounded IPFE constructions that i) are based on the Learning With Errors (LWE) assumption and proven secure in the standard model, ii) achieve adaptive security, iii) provide fine-grained access control, i.e., are identity- and attribute-based, and iv) rely only on black-box access to cryptographic and lattice algorithms. Hence, our constructions are also plausibly post-quantum secure.
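For reference, the bounded DDH-based IPFE of Abdalla et al. cited above is compact enough to recall in full (from the PKC'15 paper; group $\mathbb{G} = \langle g \rangle$ of prime order $p$), which makes explicit what the unbounded notion relaxes, namely that $n$ is fixed at setup:

$\mathsf{msk} = (s_1,\dots,s_n) \in \mathbb{Z}_p^n, \qquad \mathsf{mpk} = (h_1,\dots,h_n) = (g^{s_1},\dots,g^{s_n}),$
$\mathsf{Enc}(\mathbf{x}) = \big(g^{r},\ (h_i^{\,r}\, g^{x_i})_{i \le n}\big), \qquad \mathsf{sk}_{\mathbf{y}} = \langle \mathbf{s}, \mathbf{y} \rangle,$
$\mathsf{Dec}:\ \prod_i \big(h_i^{\,r}\, g^{x_i}\big)^{y_i} \Big/ \big(g^{r}\big)^{\mathsf{sk}_{\mathbf{y}}} = g^{\langle \mathbf{x}, \mathbf{y} \rangle},$

after which $\langle \mathbf{x}, \mathbf{y} \rangle$ is recovered by a small discrete logarithm, so the inner product must lie in a polynomially bounded range.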
Optimizing the Post Quantum Signature Scheme CROSS for Resource Constrained Devices
Post-quantum cryptosystems are currently attracting significant research effort due to continuous improvements in quantum computing technologies, which led the National Institute of Standards and Technology (NIST) to open standardization competitions to foster proposals and public scrutiny of cryptosystems and digital signatures. Whilst NIST has, after four selection rounds, chosen three digital signature algorithms, it has also opened a new selection process, as the chosen candidates either rely solely on lattice-based computationally hard problems or have unsatisfactory performance figures. In this work, we propose two optimized implementations of the Codes and Restricted Objects Signature Scheme (CROSS) targeting the Cortex-M4 platform. One implementation targets the minimal possible stack size, while the other trades some memory space for performance, using DSP instructions for performance-critical arithmetic operations. We show that all parameter sets fit within at most 24 kB of stack, a reduction by a factor of 15 to 45 with respect to the reference implementation. The memory footprint of our implementation, taking the size of the signature into account as well, is less than 128 kB. We additionally outline different stack reduction options which allow for a fine-grained trade-off between the memory footprint and performance of the algorithm. Notably, we also show that our memory optimizations alone have no significant performance impact on the signature verification of CROSS, while we even achieve a speed-up factor of up to 1.7 when combining the stack and speed optimizations.
$L$ for the Price of One: On the Benefits of Using more than $t+1$ Parties in Threshold Signing
In threshold ECDSA a committee of $N$ parties holds---say, Shamir---shares of degree $t$ of a secret key, where typically $N\gg t$ for operational purposes (e.g., redundancy to prevent losing the key). At signing time, $t+1$ parties can execute a protocol to produce a signature on a given message without leaking anything about the secret key. In this work we show that if we instead use $n=t+2(\ell-1)+1$ parties for signing, we can compute $\ell$ signatures without increasing the communication costs per party at all, essentially obtaining $\ell\times$ more signatures almost for free in the dishonest-majority setting.
Our result is achieved by making use of packed secret-sharing to distribute multiple secrets with no communication penalty. This introduces several challenges not present in the non-packed domain, which leads us to introduce two primitives that may be of independent interest: we show how to prove that a sharing contains small elements efficiently, and its use in distributing consistent sharings of the same secret modulo two different integers. We also show how to generate degree-$2$ preprocessing material with constant communication via an adaptation of the virtual parties idea by Bracha from 1987.
We compare the communication of our protocol for signing $\ell$ messages against the state of the art in $(t+1)$-party ECDSA signing by Doerner et al. (S&P '24), which needs to be repeated $\ell$ times. Our results show that, for appropriate regimes of $(t,n,\ell)$, our protocol can achieve 5x less communication (and even larger factors) than theirs while adding only a few extra parties to the computation.
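A minimal sketch (our toy code, illustrative field and parameters) of packed Shamir sharing, the tool behind getting $\ell$ signatures for the price of one: a single polynomial of degree $t+\ell-1$ hides $\ell$ secrets at fixed evaluation points, so each party still receives exactly one field element per sharing.

    import random

    p = 2**61 - 1  # toy prime field

    def interpolate(points, x):
        """Evaluate at x the unique polynomial through `points`, over GF(p)."""
        total = 0
        for i, (xi, yi) in enumerate(points):
            num = den = 1
            for j, (xj, _) in enumerate(points):
                if i != j:
                    num = num * ((x - xj) % p) % p
                    den = den * ((xi - xj) % p) % p
            total = (total + yi * num * pow(den, -1, p)) % p
        return total

    def packed_share(secrets, t, n):
        """Share len(secrets) values among n parties with privacy threshold t."""
        l = len(secrets)
        # Pin the polynomial at l secret points (-1, ..., -l) and t random
        # points, giving degree t + l - 1; hand out evaluations at 1, ..., n.
        anchors = [(p - j, s) for j, s in enumerate(secrets, start=1)]
        anchors += [(n + 1 + k, random.randrange(p)) for k in range(t)]
        return [interpolate(anchors, i) for i in range(1, n + 1)]

    shares = packed_share([11, 22, 33], t=2, n=8)
    # Any t + l = 5 shares determine the polynomial, hence all l secrets:
    pts = list(zip(range(1, 6), shares[:5]))
    print([interpolate(pts, p - j) for j in (1, 2, 3)])  # -> [11, 22, 33]

Reconstruction needs $t+\ell$ shares rather than $t+1$, and operating on products of packed sharings raises the degree further, which is consistent with the protocol's use of $n = t + 2(\ell-1) + 1$ signers.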
qedb: Expressive and Modular Verifiable Databases (without SNARKs)
Verifiable Databases (VDBs) allow clients to outsource data storage without trusting the provider: a client holding only a short digest can verify any query response using a compact server-provided proof. Given the ubiquity of both databases and outsourced storage, VDBs address a fundamental need.
Our work advances the state of the art in VDB design. Our main contribution is $\mathsf{qedb}$, a simple and performant construction for SQL queries based on bilinear pairings. Like some prior VDB schemes, $\mathsf{qedb}$ leverages features specific to the database setting; however, it differs from such approaches in its technical blueprint, the breadth of supported queries, and performance. Notably, it is the first scheme of its kind with proof size independent of database size and without quadratic scaling for storage or preprocessing. Compared to VDB solutions based on general-purpose proofs, $\mathsf{qedb}$ offers stronger tradeoffs in at least one of the following: provable security, proof size and verification time, or system complexity and maintainability (over an order of magnitude fewer lines of code).
As additional contributions, we provide both an implementation of $\mathsf{qedb}$ and new theoretical foundations for VDB design—a new framework modeling $\textit{idealized}$ protocols for verifiable databases, which future works can use in a plug-and-play manner. Through our modular approach we can get more provably secure instantiations of $\mathsf{qedb}$ $\textit{for free}$, including a post-quantum one from lattices.
New Upper and Lower Bounds for Perfectly Secure MPC
We consider perfectly secure MPC for $n$ players and $t$ malicious corruptions. We ask whether requiring only security with abort (rather than guaranteed output delivery, GOD) can help to achieve protocols with better resilience, communication complexity, or round complexity. We show that for resilience and communication complexity, abort security does not help: one still needs $3t< n$ for a synchronous network and $4t< n$ in the asynchronous case, and, in both cases, a communication overhead of $O(n)$ bits per gate is necessary.
When $O(n)$ overhead is inevitable, one can ask whether it can be pushed to the preprocessing phase so that the online phase achieves $O(1)$ overhead. This was recently achieved in the synchronous setting, in fact with the GOD guarantee. We show the same result in the asynchronous setting. This was previously open, since the main standard approach to getting constant overhead in a synchronous online phase fails in the asynchronous setting. In particular, this shows that we do not need to settle for abort security to get an asynchronous perfectly secure protocol with overheads of $O(n)$ and $O(1)$.
Lastly, in the synchronous setting, we show that perfectly secure MPC with abort requires only 2 rounds, in contrast to protocols with GOD, which require 4 rounds.
E2E-AKMA: An End-to-End Secure and Privacy-Enhancing AKMA Protocol Against the Anchor Function Compromise
The Authentication and Key Management for Applications (AKMA) system represents a recently developed protocol established by 3GPP, which is anticipated to become a pivotal component of the 5G standards. AKMA enables application service providers to delegate user authentication processes to mobile network operators, thereby eliminating the need for these providers to store and manage authentication-related data themselves. This delegation enhances the efficiency of authentication procedures but simultaneously introduces certain security and privacy challenges that warrant thorough analysis and mitigation.
The 5G AKMA service is facilitated by the AKMA Anchor Function (AAnF), which may operate outside the boundaries of the 5G core network. A compromise of the AAnF could potentially allow malicious actors to exploit vulnerabilities, enabling them to monitor user login activities or gain unauthorized access to sensitive communication content. Furthermore, the exposure of the Subscription Permanent Identifier (SUPI) to external Application Functions poses substantial privacy risks, as the SUPI could be utilized to correlate a user's real-world identity with their online activities, thereby undermining user privacy.
To mitigate these vulnerabilities, we propose a novel protocol named E2E-AKMA, which facilitates the establishment of a session key between the User Equipment (UE) and the Application Function (AF) with end-to-end security, even in scenarios where the AAnF has been compromised. Furthermore, the protocol ensures that no entity, aside from the 5G core network, can link account activities to the user's actual identity. This architecture preserves the advantages of the existing AKMA scheme, such as eliminating the need for complex dynamic secret data management and avoiding reliance on specialized hardware (apart from standard SIM cards). Experimental evaluations reveal that the E2E-AKMA framework incurs an overhead of approximately 9.4\% in comparison to the original 5G AKMA scheme, which indicates its potential efficiency and practicality for deployment.
Heli: Heavy-Light Private Aggregation
This paper presents Heli, a system that lets a pair of servers collect aggregate statistics about private client-held data without learning anything more about any individual client's data. Like prior systems, Heli protects client privacy against a malicious server, protects correctness against misbehaving clients, and supports common statistical functions: average, variance, and more. Heli's innovation is that only one of the servers (the "heavy server") needs to do per-run work proportional to the number of clients; the other server (the "light server") does work sublinear in the number of clients, after a one-time setup phase. As a result, a computationally limited party, such as a low-budget non-profit, could potentially serve as the second server for a Heli deployment with millions of clients.
Heli relies on a new cryptographic primitive, aggregation-only encryption, that allows computing certain restricted functions on many clients' encrypted data. In a deployment with ten million clients, in which the servers privately compute the sum of 32 client-held 1-bit integers, Heli's heavy server does 240,000 core-s of work and the light server does 7 core-ms of work. Compared with prior work, the heavy server does 38$\times$ more computation, but the light server does 120,000$\times$ less.
$\mathsf{Cougar}$: Cubic Root Verifier Inner Product Argument under Discrete Logarithm Assumption
An inner product argument (IPA) is a cryptographic proof system that serves as a fundamental building block for various applications, such as zero knowledge proofs and verifiable computation. Bulletproofs (IEEE S&P 2018), a well-known IPA under the discrete logarithm (DL) assumption, features a short, logarithmically-sized proof, making it suitable for blockchain applications. However, its major drawback is the linear verifier cost ($O(N)$), which presents a significant bottleneck in settings like verifiable computation. To address this, recent advancements have successfully reduced the verification complexity to square-root order ($O(\sqrt{N})$) under the same assumption (e.g., Asiacrypt 2022, IEEE TIFS).
In this work, we propose $\textsf{Cougar}$, a novel IPA that breaks this square-root barrier to achieve an unprecedented cubic-root verifier complexity ($O(\sqrt[3]{N})$), while strictly maintaining the compact logarithmic proof size ($O(\log N)$) characteristic of Bulletproofs. To achieve this, $\textsf{Cougar}$ introduces a generalized two-tier commitment framework combined with a \textit{disjoint interpolation} strategy for efficient consistency checks.
We implemented $\textsf{Cougar}$ in Rust and performed comprehensive benchmarks against Bulletproofs and $\textsf{Leopard}$ (IEEE TIFS). Our evaluation demonstrates that while $\textsf{Cougar}$ incurs a moderate increase in prover overhead, its verification time scales significantly better for large instances. Concretely, for a witness size of $N = 2^{20}$, $\textsf{Cougar}$ achieves a $50\times$ verification speed-up over Bulletproofs and exhibits a superior asymptotic growth rate compared to existing sublinear IPAs.
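For background, the logarithmic proof size that Bulletproofs, $\textsf{Leopard}$, and $\textsf{Cougar}$ share comes from a recursive halving step resting on a simple identity: writing $\mathbf{a} = (\mathbf{a}_L \| \mathbf{a}_R)$ and $\mathbf{b} = (\mathbf{b}_L \| \mathbf{b}_R)$ and folding with a verifier challenge $x$,

$\big\langle \mathbf{a}_L + x\,\mathbf{a}_R,\ \mathbf{b}_L + x^{-1}\,\mathbf{b}_R \big\rangle = \langle \mathbf{a}, \mathbf{b} \rangle + x\,\langle \mathbf{a}_R, \mathbf{b}_L \rangle + x^{-1}\,\langle \mathbf{a}_L, \mathbf{b}_R \rangle,$

so each round halves the vector length and $\log_2 N$ rounds suffice. The schemes differ in how much work the verifier must do around this recursion, which is where the square-root and cubic-root verifier bounds come from.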
Single-server Stateful PIR with Verifiability and Balanced Efficiency
Recent stateful private information retrieval (PIR) schemes have significantly improved amortized computation and amortized communication while aiming to keep client storage minimal. However, all the schemes in the literature still suffer from a poor tradeoff between client storage and computation.
We present BALANCED-PIR, a stateful PIR scheme that effectively balances computation and client storage. For a database of a million entries, each of 8 bytes, our scheme requires 0.2 MB of client storage, 0.2 ms of amortized computation, and 11.14 KB of amortized communication. Compared with the state-of-the-art scheme using a similar storage setting, our scheme is almost 9x better in amortized computation and 40x better in offline computation.
Verifiable private information retrieval has been gaining more attention recently. However, all existing schemes require linear amortized computation and huge client storage. We present Verifiable BALANCED-PIR, a verifiable stateful PIR scheme with sublinear amortized computation and small client storage. In fact, our Verifiable BALANCED-PIR adds modest computation, communication, and storage costs on top of BALANCED-PIR. Compared with the state-of-the-art verifiable scheme, the client storage of our scheme is 100x smaller, the amortized computation is 15x less, and the amortized communication is 2.5x better.
Practical SNARGs for Matrix Multiplications over Encrypted Data
Fully Homomorphic Encryption (FHE) enables computations to be performed directly on encrypted data, without ever requiring decryption. This capability is particularly crucial for privacy-preserving outsourced computation in sensitive fields such as healthcare and finance. While FHE ensures data confidentiality under the honest-but-curious adversarial model, achieving full malicious security, encompassing both integrity and privacy, requires an additional layer of verifiability.
To address this, a growing body of research has explored combining FHE with techniques from verifiable computation, leading to the notion of verifiable FHE (vFHE). However, the integration of these two paradigms often results in substantial computational overhead, making existing approaches largely impractical for real-world deployment.
In this work, rather than targeting general-purpose verifiable FHE, we design a novel and practical verifiable homomorphic encryption scheme tailored for an important and widely used operation: matrix–vector multiplication. We provide an open-source implementation and our experimental results demonstrate that the proposed scheme achieves high efficiency, making it ready for practical adoption.
Zero Knowledge (About) Encryption: A Comparative Security Analysis of Three Cloud-based Password Managers
The paper is currently under embargo, and will be released mid-February 2026.
Timed Commitments and Timed Encryption: Generic Constructions and Instantiations from Isogenies
Introduced by Boneh and Naor (CRYPTO 2000), timed commitments are a versatile primitive that has found numerous applications in e-voting, contract signing, and auctions. At TCC 2020, Katz, Loss and Xu showed that non-interactive timed commitments (NITC) can be generically built from timed public key encryption (TPKE). Unfortunately, almost all constructions for either primitive rely on classical, i.e., non-post-quantum, assumptions or require inefficient building blocks like indistinguishability obfuscation or fully homomorphic encryption.
In this work, we propose generic constructions for non-interactive timed commitments and timed encryption, assuming only efficient building blocks like verifiable random functions, trapdoor delay functions and NIZK proof systems. Both our NITC (called LEIBNITC) and our TPKE (called NYTPKE) can be instantiated from isogenies, making them post-quantum secure. The instantiation of LEIBNITC with isogenies is very efficient and yields commitments of size 2328 bits, representing one of the most efficient timed commitments in the literature.
Bridging Keyword PIR and Index PIR via MPHF and Batch PIR
This paper presents a Keyword Private Information Retrieval (Keyword PIR) scheme that achieves constant-factor online computation and communication overhead compared to the underlying Index PIR, bridging the gap between Keyword PIR and Index PIR and enabling efficient, privacy-preserving queries over diverse databases. We introduce a new Key-Value Store (KVS) instantiated via a Minimal Perfect Hash Function, referred to as MPHF-KVS, in which each keyword query requires only a single index query. We then develop a generic Batch PIR framework that converts Index PIR into Keyword PIR using KVS encoding.
In particular, when the KVS is instantiated using a Binary Fuse Filter (BFF-KVS), Keyword PIR can be reduced to Batch PIR. Leveraging the updatable hint structure of PIR with side information, we propose a novel Rewind \& Skip technique that enables the execution of multiple queries within a single round.
In MPHF-KVS, the online computation and communication costs are at most $2\times$ those of Index PIR. In our Batch PIR with BFF-KVS, building upon three recent PIR schemes with sublinear server-side online computation and communication cost and without extra hint storage, our approach inherits their advantages and achieves keyword query costs of less than $7\times$ the cost of an index query, while still maintaining sublinear online complexity.
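A toy illustration (ours; the paper's MPHF is a real construction, not this exponential salt search) of the MPHF-KVS reduction: once every keyword maps to a unique index in $[0, n)$, a keyword query becomes exactly one index-PIR query.

    import hashlib
    from itertools import count

    def build_mphf(keys):
        """Brute-force a salt making h(salt, k) mod n bijective -- toy only."""
        n = len(keys)
        for salt in count():
            idx = {k: int.from_bytes(
                       hashlib.sha256(f"{salt}:{k}".encode()).digest(), "big") % n
                   for k in keys}
            if len(set(idx.values())) == n:   # minimal (size n) and perfect
                return salt, idx

    kv = {"alice": 7, "bob": 42, "carol": 99}
    salt, idx = build_mphf(kv)
    table = [0] * len(kv)
    for k, v in kv.items():
        table[idx[k]] = v                     # server-side encoded database

    # Client side: keyword -> index locally, then any Index PIR over `table`.
    i = int.from_bytes(hashlib.sha256(f"{salt}:alice".encode()).digest(), "big") % len(kv)
    print(table[i])                           # -> 7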
BEANIE – A 32-bit Cipher for Cryptographic Mitigations against Software Attacks
In modern CPU architectures, various security features to mitigate software attacks can be found. Examples of such features are logical isolation, memory tagging, or shadow stacks. Basing such features on cryptographic isolation instead of logical checks can have many advantages, such as lower memory overhead and more robustness against misconfiguration or low-cost physical attacks. The disadvantage of this approach, however, is that the cipher that has to be introduced has a severe impact on system performance, either in terms of additional cycles or a decrease of the maximum achievable frequency. Moreover, as of today, there is no suitable low-latency cipher design available for encrypting 32-bit words, the word size common in microcontrollers. In this paper, we propose a 32-bit tweakable block cipher tailored to memory encryption for microcontroller units. We optimize this cipher for low latency, which we achieve by a careful selection of components for the round function and by leveraging an attack scenario similar to the one used to analyze the cipher SCARF. To mitigate some attack vectors introduced by this attack scenario, we deploy a complex tweak-key schedule. Due to the shortage of suitable 32-bit designs, we compare our design to various low-latency ciphers with different block sizes. Our hardware implementation shows competitive latency numbers.
Preimage-type Attacks for Reduced Ascon-Hash: Application to Ed25519
Hash functions and extendable output functions are some of the most fundamental building blocks in cryptography. They are often used to build commitment schemes where a committer binds themselves to some value that is also hidden from the verifier until the opening is sent. Such commitment schemes are commonly used to build signature schemes, e.g., Ed25519 via Schnorr signatures, or non-interactive zero-knowledge proofs. We specifically analyze the binding security when Ascon-Hash256 or Ascon-XOF128 is used inside of Ed25519, which is closely related to finding second preimages. While there is ample prior work on Ascon-XOF128 and Ascon-Hash256, none of it applies in this setting either because it analyzes short outputs of 64 or 128 bits or because the complexity is above the security claim and generic attack of 128 bits. We show how to exploit the setting of finding a forgery for Ed25519. We find that this setting is quite challenging due to the large 320-bit internal state combined with the 128-bit security level. We propose a second-preimage attack for 1-round Ascon-Hash256 with a complexity of $2^{64}$ Gaussian eliminations and a random-prefix-preimage attack (also known as Nostradamus attack) for 1-round Ascon-Hash256, for the Ed25519 setting, with complexity $2^{29.7}$ Gaussian eliminations.
Euston: Efficient and User-Friendly Secure Transformer Inference with Non-Interactivity
Secure TransFormer Inference (STFI) frameworks have been proposed to address privacy concerns over user inputs and model parameters in Transformer-based LLMs. While most existing solutions rely on interactive protocols that incur substantial user-server communication overhead, non-interactive STFI variants have recently emerged to eliminate such dependencies. Nevertheless, state-of-the-art non-interactive STFI frameworks still suffer from critical limitations: (i) large ciphertext sizes and multiple rotations, alongside heavy user-side overhead, in Homomorphic Matrix Multiplication (HMM); and (ii) high approximation costs and depth consumption in Homomorphic Nonlinear Evaluations (HNE).
To address these limitations, we present Euston, an efficient and user-friendly STFI with non-interactivity. By combining RNS-CKKS fully homomorphic encryption with optimized methods, Euston achieves unprecedented efficiency in the offline/online inference paradigm. The key innovations are twofold. (i) For linear operations, we adopt Singular Value Decomposition (SVD) with our novel batched HMMs to minimize ciphertext size and reduce rotation counts, simultaneously lowering user-side computational, communication, and storage overhead. (ii) For nonlinear operations, we employ column(diagonal)-packed ciphertext matrix formats to eliminate costly rotations and depth regulation strategies to reduce depth consumption in non-interactive HNEs, which not only avoids user-server communication but also accelerates inference performance. In comparison with the state-of-the-art approach (NEXUS, NDSS 2025), Euston achieves up to 3100× lower preprocessing costs for the user and 8.8× higher system-wide inference performance, specifically delivering a 90× speedup for HMM and a 165.7× speedup for HNE. Our results demonstrate that Euston establishes new efficiency frontiers for user-friendly STFI deployment across cloud and edge environments.
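The abstract does not spell out how SVD is applied, but the generic reason a factorization helps homomorphic matrix multiplication is worth recalling (our reading, not a claim about Euston's exact design): writing a weight matrix $W \in \mathbb{R}^{d \times d}$ as $W = U \Sigma V^{\top}$ and keeping the top $r$ singular values gives

$W\mathbf{x} \approx U_r \big( \Sigma_r ( V_r^{\top} \mathbf{x} ) \big),$

which replaces one $d \times d$ encrypted product with two thin $d \times r$ products and a diagonal scaling, shrinking ciphertext operands and rotation counts whenever $r \ll d$.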
Threshold ECDSA in Two Rounds
We propose the first two-round multi-party signing protocol for the Elliptic Curve Digital Signature Algorithm (ECDSA) in the threshold-optimal setting, reducing the number of rounds by one compared to the state of the art (Doerner et al., S&P '24). We also resolve the security issue of presigning pointed out by Groth and Shoup (Eurocrypt '22), evading a security loss that increases with the number of pre-released, unused presignatures, for the first time among threshold-optimal schemes.
Our construction builds on Non-Interactive Multiplication (NIM), a notion proposed by Boyle et al. (PKC '25), which allows parties to evaluate multiplications on secret-shared values in one round. In particular, we use the construction of Abram et al. (Eurocrypt '24) instantiated with class groups. The setup is minimal and transparent, consisting of only two class-group generators. The signing protocol is efficient in bandwidth, with a message size of 1.9 KiB at 128-bit security, and has competitive computational performance.
An Open-Source Framework for Efficient Side-Channel Analysis on Cryptographic Implementations
Side-channel attacks are increasingly recognized as a significant threat to hardware roots of trust. As a result, cryptographic module designers must ensure that their modules are resilient to such attacks before deployment. However, efficient evaluation of side-channel vulnerabilities in cryptographic implementations remains challenging. This paper introduces an open-source framework integrating FPGA designs, power measurement tools, and high-performance side-channel analysis libraries to streamline the evaluation process. The framework provides design templates for two widely used FPGA boards in the side-channel analysis community, enabling Shell-Role architecture, a modern FPGA design pattern. This shell abstraction allows designers to focus on developing cryptographic modules while utilizing standardized software tools for hardware control and power trace acquisition. Additionally, the framework includes acceleration plugins for ChipWhisperer, the leading open-source side-channel analysis platform, to enhance the performance of correlation power analysis (CPA) attacks. These plugins exploit modern many-core processors and graphics processing units (GPUs) to speed up analysis significantly. To showcase the capabilities of the proposed framework, we conducted multiple case studies and highlighted significant findings that advance side-channel research. Furthermore, we compare our CPA plugins with existing tools and show that our plugins achieve up to 8.60x speedup over the state-of-the-art CPA tools.
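To make concrete what the acceleration plugins compute, here is a vectorized sketch (ours, not the framework's code) of the CPA kernel: the Pearson correlation between measured traces and a hypothetical Hamming-weight leakage model, evaluated for all 256 guesses of one AES key byte at once.

    import numpy as np

    def cpa_best_guess(traces, plaintexts, sbox):
        """traces: (n_traces, n_samples) floats; plaintexts: (n_traces,) uint8;
        sbox: length-256 uint8 array. Returns the best-correlating key guess."""
        hw = np.unpackbits(np.arange(256, dtype=np.uint8)[:, None], axis=1).sum(1)
        # Hypothetical leakage for every key guess: shape (256, n_traces).
        hyp = hw[sbox[plaintexts[None, :] ^ np.arange(256, dtype=np.uint8)[:, None]]]
        h = hyp - hyp.mean(axis=1, keepdims=True)
        t = traces - traces.mean(axis=0, keepdims=True)
        # Pearson correlation per (guess, sample) pair: shape (256, n_samples).
        corr = (h @ t) / np.sqrt((h**2).sum(1)[:, None] * (t**2).sum(0)[None, :])
        return int(np.abs(corr).max(axis=1).argmax())

The single matrix product `h @ t` dominates the cost and parallelizes trivially, which is why many-core and GPU back ends pay off so directly for CPA.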
Ajax: Fast Threshold Fully Homomorphic Encryption without Noise Flooding
Threshold fully homomorphic encryption (ThFHE) enables multiple parties to perform arbitrary computation over encrypted data, while the secret key is distributed across the parties. The main task of designing ThFHE is to construct threshold key-generation and decryption protocols for FHE schemes. Among existing FHE schemes, FHEW-like cryptosystems enjoy the advantage of fast bootstrapping and small parameters.
However, known ThFHE solutions use the ``noise-flooding'' technique to realize threshold decryption, which requires either large parameters or switching to a scheme with large parameters via bootstrapping, leading to a slow decryption process. Moreover, for key generation, existing ThFHE schemes either assume generic MPC or a trusted setup, or incur noise growth that is linear in the number $n$ of parties.
In this paper, we propose a fast ThFHE scheme Ajax, by designing threshold key-generation and decryption protocols for FHEW-like cryptosystems. In particular, for threshold decryption, we eliminate the need for noise flooding, and instead present a new technique called ``mask-then-open'' based on random double sharings over different rings, while keeping the advantage of small parameters.
For threshold key generation, we show a simple approach to reduce the noise growth from $n$ times to $\max(0.038n, 2)$ times in the honest-majority setting, where at most $t=\lfloor (n-1)/2 \rfloor$ parties are corrupted. Our end-to-end implementation reports running times of 17.6 s and 0.9 ms (resp., 91.9 s and 4.4 ms) for generating a set of keys and decrypting a single ciphertext, respectively, for $n=3$ (resp., $n=21$) parties over a network with 1 Gbps bandwidth and 1 ms ping time. Compared to the state-of-the-art implementation, our protocol improves the end-to-end performance of the threshold decryption protocol by factors of $5.7\times$ to $283.6\times$ across different network latencies, from $t=1$ to $t=13$. Our approaches can also be applied to other types of FHE schemes like BGV, BFV, and CKKS.
Prover – Toward More Efficient Formal Verification of Masking in Probing Model
In recent years, formal verification has emerged as a crucial method for assessing the security of masked implementations against side-channel attacks, owing to its remarkable versatility and high degree of automation. However, formal verification still faces technical bottlenecks in balancing accuracy and efficiency, thereby limiting its scalability. Earlier tools like maskVerif and CocoAlma are very efficient, but they face accuracy issues when verifying schemes that utilize properties of Boolean functions. Later, SILVER addressed the accuracy issue, albeit at the cost of significantly reduced speed and scalability compared to maskVerif. Consequently, there is a pressing need for formal verification tools that are both efficient and accurate, for designing secure schemes and evaluating implementations. This paper’s primary contribution lies in proposing several approaches to develop a more efficient and scalable formal verification tool called Prover, built upon SILVER. Firstly, inspired by the auxiliary data structures proposed by Eldib et al. and the optimistic sampling rule of maskVerif, we introduce two reduction rules aimed at diminishing the size of observable sets and secret sets in statistical independence checks. These rules substantially decrease, or even eliminate, the need for repeated computation of probability distributions using Reduced Ordered Binary Decision Diagrams (ROBDDs), a time-intensive procedure in verification. We subsequently integrate one of these reduction rules into the uniformity check to mitigate its complexity. Secondly, we identify that variable ordering significantly impacts efficiency and optimize it for constructing ROBDDs, resulting in much smaller representations of the investigated functions. Lastly, we present the algorithm of Prover, which efficiently verifies the security and uniformity of masked implementations in the probing model, with or without the presence of glitches. Experimental results demonstrate that Prover offers a better balance between efficiency and accuracy than other state-of-the-art tools (IronMask, CocoAlma, maskVerif, and SILVER). In our experiments, we also found an S-box that can only be verified by Prover: IronMask cannot verify S-boxes, both CocoAlma and maskVerif suffer from false positives, and SILVER runs out of time during verification.
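As a concrete reference point for what a statistical independence check verifies, the brute-force version below (our toy code; exponential in the number of bits, which is precisely the cost that ROBDDs and the reduction rules avoid) declares a set of observables probing-secure iff their joint distribution over the masking randomness is identical for every value of the secrets.

    from itertools import product
    from collections import Counter

    def independent(observables, n_secret, n_rand):
        """observables: functions (secret_bits, rand_bits) -> bit."""
        dists = []
        for s in product((0, 1), repeat=n_secret):
            c = Counter(tuple(f(s, r) for f in observables)
                        for r in product((0, 1), repeat=n_rand))
            dists.append(c)
        return all(d == dists[0] for d in dists)

    # Toy first-order masking x = x0 XOR x1:
    x0 = lambda s, r: r[0]
    x1 = lambda s, r: s[0] ^ r[0]
    print(independent([x0], 1, 1))       # True: one share alone leaks nothing
    print(independent([x0, x1], 1, 1))   # False: both shares jointly reveal x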
MIFA: An MILP-based Framework for Improving Differential Fault Attacks
At ASIACRYPT 2021, Baksi et al. introduced DEFAULT, a block cipher designed to algorithmically resist Differential Fault Attack (DFA), claiming 64-bit DFA security regardless of the number of injected faults. At EUROCRYPT 2022, Nageler et al. demonstrated that DEFAULT's claimed DFA resistance can be broken by applying an information-combining technique. More recently, at ASIACRYPT 2024, Jana et al. improved DFA by searching for differential trails with a single solution. They showed that, for DEFAULT with a simple key schedule, injecting five faults at the fifth-to-last round reduces the key space to one, and for BAKSHEESH, injecting twelve faults at the third-to-last round achieves the same result.
In this paper, we propose a new DFA framework that utilizes a Mixed-Integer Linear Programming (MILP) solver. This framework makes it possible to attack deeper rounds than previously achieved, reducing the number of fault injections required for key recovery. Furthermore, we present a method to determine the most efficient fault injection bit positions by systematically analyzing the input differences from all possible single bit-flip faults, thereby further reducing the required number of faults. This systematic analysis has the significant advantage of allowing us to theoretically calculate the required number of faults. Applying our framework, for DEFAULT, injecting three faults at the sixth-to-last round and two faults at the seventh- and eighth-to-last rounds reduces the key space to one.
A Scalable Coercion-resistant Voting Scheme for Blockchain Decision-making
Typically, a decentralized collaborative blockchain decision-making mechanism is realized by remote voting. To date, a number of blockchain voting schemes have been proposed; however, to the best of our knowledge, none of them achieves coercion-resistance. In particular, for most blockchain voting schemes, the randomness used by the voting client can be viewed as a witness/proof of the actual vote, which enables improper behaviors such as coercion and vote-buying. Unfortunately, existing coercion-resistant voting schemes cannot be directly adopted in the blockchain context. In this work, we design the first scalable coercion-resistant blockchain voting scheme that supports private weighted votes and 1-layer liquid democracy as introduced by Zhang et al. (NDSS '19). Its overall complexity is $O(n)$, where $n$ is the number of voters. Moreover, the ballot size is reduced from Zhang et al.'s $\Theta(m)$ to $\Theta(1)$, where $m$ is the number of experts and/or candidates. We formally prove that our scheme achieves ballot privacy, verifiability, and coercion-resistance. We implement a prototype of the scheme, and the evaluation results show that our scheme's tally procedure is more than 6x faster than VoteAgain (USENIX '20) in an election with over 50,000 voters and an extra-ballot rate of over 50\%.