Note: this was brought up in PQClean/PQClean#601 while looking at the PQClean code and then directed here, so I'm writing this issue to see if there's interest in a PR.
FIPS 203 adds a few input validation checks to ML-KEM that are missing from the Kyber specification and also seem to be missing from this code. As repositories such as PQClean use this code as their upstream for ML-KEM, you end up with "non-FIPS ML-KEM" if the code misses the validation checks.
The length checks are essentially done for free by how the code is written, but two checks seem to be missing:
- Modulus check: ensure that re-encoding the decoded $\hat{t}$ reproduces the input bytes (this ensures all coefficients are in the canonical range).
- Hash check: hash the public key and ensure it matches the hash of the public key stored in the secret key.
If these checks are thought to be useful, then I think it wouldn't be too much work to add them, and I'm happy to open a PR for this.
I'll sketch the two changes below and see what the owners of this code think. The general idea is to return 0 when all checks pass and 1 on failure (originally I returned -1, which seemed more natural, but then I saw that verify() returns 1 on mismatch, so I tried to follow this convention).
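For readers unfamiliar with the 0/1 convention, here is a minimal, self-contained sketch of a branch-free byte comparison that returns 0 on equality and 1 on mismatch. The name ct_compare is hypothetical and only used for illustration; it is meant to mirror the behaviour described for verify(), not to replace it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper following the 0/1 convention described above:
 * returns 0 if the two buffers are equal and 1 otherwise. */
static int ct_compare(const uint8_t *a, const uint8_t *b, size_t len)
{
  uint8_t diff = 0;
  size_t i;

  assert(a != NULL && b != NULL);

  /* Accumulate all byte differences without branching on the data. */
  for (i = 0; i < len; i++)
    diff |= (uint8_t)(a[i] ^ b[i]);

  /* Map any nonzero diff to 1 and zero to 0. */
  return (int)((-(uint64_t)diff) >> 63);
}
```

With this convention a caller can forward the result directly as an error code: 0 means "check passed", anything else means "reject the input".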
Modulus Check
I think the easiest place to put the modulus check is in unpack_pk() itself, which would do the necessary decoding check and return 0 on success and 1 on failure. This means taking the polyvec *pk vector after decoding, re-encoding it and doing a byte comparison.
static int unpack_pk(polyvec *pk,
                     uint8_t seed[KYBER_SYMBYTES],
                     const uint8_t packedpk[KYBER_INDCPA_PUBLICKEYBYTES])
{
  uint8_t modulus_check[KYBER_POLYVECBYTES];

  polyvec_frombytes(pk, packedpk);
  memcpy(seed, packedpk+KYBER_POLYVECBYTES, KYBER_SYMBYTES);

  // Perform the modulus check. polyvec_frombytes() only unpacks the 12-bit
  // values, so reduce mod q first; otherwise out-of-range coefficients would
  // re-encode to the same bytes and the comparison would always succeed.
  polyvec_reduce(pk);
  polyvec_tobytes(modulus_check, pk);

  // verify() returns 0 if modulus_check == packedpk[..KYBER_POLYVECBYTES], else 1
  return verify(modulus_check, packedpk, KYBER_POLYVECBYTES);
}
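To make the decode, re-encode, compare idea concrete outside of the library, here is a small standalone sketch operating directly on 12-bit packed coefficients. Everything in it (the name modulus_check_sketch, the Q macro, working on a raw byte buffer rather than a polyvec) is an assumption for illustration and not part of the existing code. Note that the sketch reduces each decoded value modulo q before re-encoding: a decoder that only masks out 12 bits would otherwise round-trip out-of-range values unchanged and the comparison would never fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define Q 3329 /* ML-KEM modulus */

/* Hypothetical helper: checks that every 12-bit value packed into `in`
 * lies in [0, Q). `len` must be a multiple of 3 (two coefficients per
 * three bytes). Returns 0 if all values are canonical, 1 otherwise. */
static int modulus_check_sketch(const uint8_t *in, size_t len)
{
  uint8_t diff = 0;
  size_t i;

  assert(len % 3 == 0);

  for (i = 0; i < len; i += 3) {
    uint8_t out[3];

    /* Decode two 12-bit little-endian values from three bytes. */
    uint16_t c0 = (uint16_t)((in[i] | ((uint16_t)in[i + 1] << 8)) & 0xFFF);
    uint16_t c1 = (uint16_t)(((in[i + 1] >> 4) | ((uint16_t)in[i + 2] << 4)) & 0xFFF);

    /* Reduce mod Q. The public key is public data, so a plain `%` is
     * acceptable in this sketch. */
    c0 %= Q;
    c1 %= Q;

    /* Re-encode the reduced values. */
    out[0] = (uint8_t)c0;
    out[1] = (uint8_t)((c0 >> 8) | (c1 << 4));
    out[2] = (uint8_t)(c1 >> 4);

    /* Any byte that changed means some input value was >= Q. */
    diff |= (uint8_t)((out[0] ^ in[i]) | (out[1] ^ in[i + 1]) | (out[2] ^ in[i + 2]));
  }

  return (int)((-(uint64_t)diff) >> 63);
}
```

For example, the bytes {0x00, 0x0D, 0x00} encode the pair (3328, 0) and pass, while {0x01, 0x0D, 0x00} encode (3329, 0) and are rejected, since 3329 is not a canonical representative modulo q.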
This then means that indcpa_enc() returns int rather than void:
int indcpa_enc(uint8_t c[KYBER_INDCPA_BYTES],
               const uint8_t m[KYBER_INDCPA_MSGBYTES],
               const uint8_t pk[KYBER_INDCPA_PUBLICKEYBYTES],
               const uint8_t coins[KYBER_SYMBYTES])
{
  // SNIP
  int modulus_check;
  // 0 if the public key encoding is canonical, 1 otherwise
  modulus_check = unpack_pk(&pkpv, seed, pk);
  // SNIP
  return modulus_check;
}
The only change to crypto_kem_enc() is then that it returns the result of crypto_kem_enc_derand() instead of 0:
int crypto_kem_enc(uint8_t *ct,
                   uint8_t *ss,
                   const uint8_t *pk)
{
  uint8_t coins[KYBER_SYMBYTES];
  randombytes(coins, KYBER_SYMBYTES);
  return crypto_kem_enc_derand(ct, ss, pk, coins);
}
Hash Check
I think for the hash check, we can do everything within crypto_kem_dec(). It basically means performing one additional hash and byte comparison, and we can do this at effectively any point in the code (we could also add a perform_hash_check() helper if this is deemed cleaner):
int crypto_kem_dec(uint8_t *ss,
                   const uint8_t *ct,
                   const uint8_t *sk)
{
  // SNIP
  // pk is assumed to point at the public key stored inside sk (set up in the elided code)
  uint8_t hash_check[KYBER_SYMBYTES];
  hash_h(hash_check, pk, KYBER_INDCPA_PUBLICKEYBYTES);
  // verify() returns 0 if h(ek_indcpa) == the hash stored after pk in sk, else 1
  return verify(hash_check, pk + KYBER_INDCPA_PUBLICKEYBYTES, KYBER_SYMBYTES);
}
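As a rough illustration of where the hash check reads from, here is a standalone sketch that assumes the secret key is laid out as sk_indcpa || pk || H(pk) || z and uses the K = 3 sizes. All names are hypothetical, and the hash is passed in as a callback because the sketch does not ship SHA3-256; toy_hash is a deliberately non-cryptographic stand-in so that the example runs on its own. In the real code hash_h() would take its place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes for the K = 3 parameter set, assumed for illustration. */
#define K              3
#define SYMBYTES       32
#define POLYVECBYTES   (K * 384)
#define INDCPA_SKBYTES POLYVECBYTES
#define INDCPA_PKBYTES (POLYVECBYTES + SYMBYTES)
#define SKBYTES        (INDCPA_SKBYTES + INDCPA_PKBYTES + 2 * SYMBYTES)

/* Hypothetical callback type standing in for hash_h(). */
typedef void (*hash32_fn)(uint8_t out[SYMBYTES], const uint8_t *in, size_t inlen);

/* NOT a cryptographic hash: only here so the sketch is runnable. */
static void toy_hash(uint8_t out[SYMBYTES], const uint8_t *in, size_t inlen)
{
  size_t i;
  for (i = 0; i < SYMBYTES; i++)
    out[i] = (uint8_t)i;
  for (i = 0; i < inlen; i++)
    out[i % SYMBYTES] ^= (uint8_t)(in[i] + i);
}

/* Returns 0 if h(pk) matches the hash stored in sk, 1 otherwise. */
static int hash_check_sketch(const uint8_t sk[SKBYTES], hash32_fn h)
{
  const uint8_t *pk = sk + INDCPA_SKBYTES;     /* embedded public key */
  const uint8_t *stored = pk + INDCPA_PKBYTES; /* stored H(pk) */
  uint8_t computed[SYMBYTES];
  uint8_t diff = 0;
  size_t i;

  assert(sk != NULL && h != NULL);

  h(computed, pk, INDCPA_PKBYTES);
  for (i = 0; i < SYMBYTES; i++)
    diff |= (uint8_t)(computed[i] ^ stored[i]);

  return (int)((-(uint64_t)diff) >> 63);
}
```

With these sizes SKBYTES works out to 2400 bytes, and flipping any byte of the embedded public key (or of the stored hash) makes the check return 1.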