Missing validation for ML-KEM #107

@GiacomoPope

Note: this was brought up in PQClean/PQClean#601 while looking at the PQClean code and then directed here, so I'm writing this issue to see if there's interest in a PR.

FIPS 203 adds a few input validation checks to ML-KEM that are absent from the Kyber specification and also seem to be missing from this code. As repositories such as PQClean use this code upstream for ML-KEM, you end up with "non-FIPS ML-KEM" if the code misses the validation checks.

The length checks are essentially done for free by how the code is written, but two other checks seem to be missing:

  1. Modulus check: ensure that re-encoding the decoded $\hat{t}$ reproduces the input bytes (this ensures all coefficients are in the canonical range).
  2. Hash check: hash the public key embedded in the secret key and ensure it matches the public key hash stored in the secret key.

If these checks are thought to be useful, then I don't think it would be too much work to add them, and I'm happy to open a PR for this.

I'll sketch the two changes below and see what the owners of this code think. The general idea is to return 0 when all checks pass and 1 on failure (originally I returned -1, which seemed more natural, but then I saw that verify() returns 1 on mismatch, so I followed that convention).
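To make the return convention concrete, here is a minimal sketch of a byte comparison that returns 0 on equality and 1 on mismatch without branching on the data. The name ct_compare() is hypothetical and only stands in for verify(); this is an illustration of the convention, not a copy of the code in this repository.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for verify(): returns 0 if the two buffers are
 * equal and 1 otherwise, without branching on the buffer contents. */
static int ct_compare(const uint8_t *a, const uint8_t *b, size_t len)
{
  uint8_t diff = 0;
  size_t i;

  /* Accumulate every differing bit; diff stays 0 only if all bytes match. */
  for (i = 0; i < len; i++)
    diff |= a[i] ^ b[i];

  /* Collapse any non-zero diff to exactly 1. */
  return (int)((-(uint64_t)diff) >> 63);
}
```

With this convention, the result of a check can be returned directly as the function's status code, which is what the sketches in this issue rely on.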

Modulus Check

I think the easiest place to put the modulus check is in unpack_pk() itself, which would do the necessary decoding check and return 0 on success and 1 on failure. This means taking the polyvec *pk vector after decoding, re-encoding it and doing a byte comparison.

static int unpack_pk(polyvec *pk,
                      uint8_t seed[KYBER_SYMBYTES],
                      const uint8_t packedpk[KYBER_INDCPA_PUBLICKEYBYTES])
{
  polyvec_frombytes(pk, packedpk);
  memcpy(seed, packedpk+KYBER_POLYVECBYTES, KYBER_SYMBYTES);

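  // Perform the modulus check: re-encode the decoded vector and compare it
  // with the input bytes. Note that this only catches coefficients >= q if
  // polyvec_tobytes() writes fully reduced values.
  uint8_t modulus_check[KYBER_POLYVECBYTES];
  polyvec_tobytes(modulus_check, pk);

  // verify() returns 0 if modulus_check == packedpk[..KYBER_POLYVECBYTES], else 1
  return verify(modulus_check, packedpk, KYBER_POLYVECBYTES);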
}
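The decode, re-encode, compare round trip is equivalent to checking that every 12-bit field in the packed polynomial vector is below q = 3329. As a standalone illustration, here is a minimal sketch that performs that range check directly on the bytes, so it does not depend on how polyvec_tobytes() reduces its input. The names modulus_check_bytes() and MLKEM_Q are hypothetical and not part of this code base, and the sketch assumes the usual packing of two 12-bit coefficients into three bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define MLKEM_Q 3329 /* ML-KEM modulus q */

/* Hypothetical helper: scans `len` bytes of packed 12-bit coefficients
 * (three bytes hold two coefficients) and returns 0 if every coefficient
 * is < q, 1 otherwise. `len` is assumed to be a multiple of 3. */
static int modulus_check_bytes(const uint8_t *packed, size_t len)
{
  size_t i;
  int bad = 0;

  for (i = 0; i + 2 < len; i += 3) {
    /* Low 8 bits from byte 0, high 4 bits from the low nibble of byte 1. */
    uint16_t c0 = (uint16_t)(packed[i] | ((uint16_t)(packed[i + 1] & 0x0F) << 8));
    /* Low 4 bits from the high nibble of byte 1, high 8 bits from byte 2. */
    uint16_t c1 = (uint16_t)((packed[i + 1] >> 4) | ((uint16_t)packed[i + 2] << 4));

    /* The public key is public data, so no secret-dependent timing arises here. */
    bad |= (c0 >= MLKEM_Q) | (c1 >= MLKEM_Q);
  }
  return bad;
}
```

For example, the bytes {0x00, 0x0D, 0x00} encode the coefficients 3328 and 0 and pass the check, while {0x01, 0x0D, 0x00} encode 3329 and 0 and fail it.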

This then means indcpa_enc() has to return int rather than void:

int indcpa_enc(uint8_t c[KYBER_INDCPA_BYTES],
                const uint8_t m[KYBER_INDCPA_MSGBYTES],
                const uint8_t pk[KYBER_INDCPA_PUBLICKEYBYTES],
                const uint8_t coins[KYBER_SYMBYTES])
{
  // SNIP
  int modulus_check;
  modulus_check = unpack_pk(&pkpv, seed, pk);

  // SNIP
  return modulus_check;
}

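The only change to crypto_kem_enc() is that it returns the result of crypto_kem_enc_derand() instead of 0 (which presumably means crypto_kem_enc_derand() has to pass on the return value of indcpa_enc() in turn):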

int crypto_kem_enc(uint8_t *ct,
                   uint8_t *ss,
                   const uint8_t *pk)
{
  uint8_t coins[KYBER_SYMBYTES];
  randombytes(coins, KYBER_SYMBYTES);
  return crypto_kem_enc_derand(ct, ss, pk, coins);
}

Hash Check

For the hash check, I think we can do everything within crypto_kem_dec(). It amounts to one additional hash and one byte comparison, which can be done at essentially any point in the function (we could also add a perform_hash_check() helper if that is deemed cleaner).

int crypto_kem_dec(uint8_t *ss,
                   const uint8_t *ct,
                   const uint8_t *sk)
{
  // SNIP

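  // pk is assumed to point at the public key embedded in sk
  // (set up in the snipped code above)
  uint8_t hash_check[KYBER_SYMBYTES];
  hash_h(hash_check, pk, KYBER_INDCPA_PUBLICKEYBYTES);

  // verify() returns 0 if h(ek_indcpa) == sk_hash, else 1
  return verify(hash_check, pk + KYBER_INDCPA_PUBLICKEYBYTES, KYBER_SYMBYTES);
}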
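To show the pointer arithmetic behind the hash check in isolation, here is a minimal self-contained sketch. It assumes the secret key is laid out as the IND-CPA secret key, followed by the public key, followed by the stored hash of the public key. Everything named TOY_* below, the hash_fn type, and toy_hash() are hypothetical and exist only so the example compiles on its own; toy_hash() is not SHA3-256 and not cryptographic, and the real sizes come from the library's parameter definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy sizes for illustration only (assumptions, not the real parameters). */
#define TOY_INDCPA_SKBYTES 8
#define TOY_PKBYTES        8
#define TOY_SYMBYTES       4

/* Placeholder for hash_h(): any function with this shape. */
typedef void (*hash_fn)(uint8_t out[TOY_SYMBYTES], const uint8_t *in, size_t inlen);

/* Toy hash used only to exercise the layout logic; NOT cryptographic. */
static void toy_hash(uint8_t out[TOY_SYMBYTES], const uint8_t *in, size_t inlen)
{
  size_t i;
  memset(out, 0, TOY_SYMBYTES);
  for (i = 0; i < inlen; i++)
    out[i % TOY_SYMBYTES] ^= (uint8_t)(in[i] + i);
}

/* Returns 0 if the hash stored in sk matches the hash of the embedded
 * public key, 1 otherwise. */
static int hash_check(const uint8_t *sk, hash_fn h)
{
  const uint8_t *pk = sk + TOY_INDCPA_SKBYTES; /* embedded public key */
  const uint8_t *stored = pk + TOY_PKBYTES;    /* stored hash of pk */
  uint8_t computed[TOY_SYMBYTES];
  uint8_t diff = 0;
  size_t i;

  h(computed, pk, TOY_PKBYTES);

  /* Accumulate differences over all bytes instead of exiting early. */
  for (i = 0; i < TOY_SYMBYTES; i++)
    diff |= computed[i] ^ stored[i];
  return diff != 0;
}
```

A key whose stored hash was produced by the same hash function passes, and flipping any bit of the embedded public key makes the check return 1.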
