Conversation

@kevinlewi (Contributor) commented Jul 22, 2025

(Depends on #460, which should be merged first)

The goal of this change is to improve the overall efficiency of the auditor function (verify_consecutive_append_only()) which is used to verify the validity of an append-only proof between two epochs. Note that this should be completely backwards-compatible with the previous way of verifying (and generating) audit proofs -- no actual correctness logic has changed.

As an example of the improvement, here is a before-and-after comparison of auditing epoch 714064, a particularly large proof (291 MB). Both runs were on my laptop: the previous code took 2 minutes and 9 GB of RAM to audit, while the new code takes 21 seconds and only 760 MB.

Old behavior:

$ cargo run -p examples --release -- whatsapp-kt-auditor -e 714064
[00:00:17] Successfully downloaded proof for epoch 714064. (291.6 MB)
[00:02:01] Audit proof for epoch 714064 has verified successfully!
$ top
...
45321  akd-examples 280.4     02:19.56 11/3   1     55     **8959M+** 0B     7146M+ 45321 1201
...

New behavior:

$ cargo run -p examples --release -- whatsapp-kt-auditor -e 714064
[00:00:18] Successfully downloaded proof for epoch 714064. (291.6 MB)
[00:00:21] Audit proof for epoch 714064 has verified successfully!
$ top
...
44251  akd-examples 297.8     00:38.32 11/3   1     56     **760M+**  0B     397M-  44251 1201
...

How it works

The way auditing works is by creating two new Azks instances from scratch, building each one up by inserting nodes (the first one from the proof's unchanged nodes and the second one from the proof's unchanged + inserted nodes), and then checking that the root hash of each Azks tree matches the expected root hash (the start hash or the end hash). Note that during this process, we are building the tree just for the purpose of computing the root hash, and nothing else -- the trees are discarded afterwards.
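The flow above can be sketched as follows. This is a minimal, self-contained illustration, not the akd API: a toy fold over sorted (label, value) pairs stands in for the real Merkle-style root hash, and all names (`root_hash`, `verify_consecutive_append_only`'s simplified signature) are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for a tree's root hash: fold sorted (label, value) pairs.
// The real code builds an Azks and calls get_root_hash.
fn root_hash(mut nodes: Vec<(u64, u64)>) -> u64 {
    nodes.sort();
    let mut h = DefaultHasher::new();
    nodes.hash(&mut h);
    h.finish()
}

// Sketch of the auditor: rebuild a "tree" per node set, compare roots.
fn verify_consecutive_append_only(
    unchanged: &[(u64, u64)],
    inserted: &[(u64, u64)],
    start_hash: u64,
    end_hash: u64,
) -> Result<(), String> {
    // Tree 1: the unchanged nodes alone must reproduce the start root.
    if root_hash(unchanged.to_vec()) != start_hash {
        return Err("start hash mismatch".into());
    }
    // Tree 2: unchanged + inserted nodes must reproduce the end root.
    let mut all = unchanged.to_vec();
    all.extend_from_slice(inserted);
    if root_hash(all) != end_hash {
        return Err("end hash mismatch".into());
    }
    // Both throwaway trees are dropped here; only the comparison mattered.
    Ok(())
}

fn main() {
    let unchanged = vec![(1, 10), (2, 20)];
    let inserted = vec![(3, 30)];
    let start = root_hash(unchanged.clone());
    let mut all = unchanged.clone();
    all.extend_from_slice(&inserted);
    let end = root_hash(all);
    assert!(verify_consecutive_append_only(&unchanged, &inserted, start, end).is_ok());
    // A tampered end hash fails verification.
    assert!(verify_consecutive_append_only(&unchanged, &inserted, start, end ^ 1).is_err());
}
```

The key point the sketch preserves is that both trees exist only long enough to yield a root hash for comparison.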

In the previous behavior, we would use an AsyncInMemoryDatabase to host all of the storage for the nodes that we insert into the tree (along with the resulting intermediary tree nodes). This means that the Azks instance would keep all of the nodes that have been inserted into the tree in memory at all times, up until the final root hash is computed.

In the new behavior, we still use an AsyncInMemoryDatabase, but we enable a flag which more aggressively removes nodes that no longer need to be kept in memory in order to compute the root hash. In particular, whenever we attempt to add a parent node to storage, we remove the left and right children from storage, if they exist. This takes advantage of the fact that batch insertion works on a level-by-level basis, computing the children hashes before computing their parent hash. So, once a parent hash has been computed, we no longer need the intermediary computations (corresponding to the child hashes).
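The eviction idea can be sketched with a plain `HashMap` standing in for the storage layer. This is a hypothetical miniature, not akd's `AsyncInMemoryDatabase`: the `Store` type, field names, and the toy hash combiner are all invented for illustration, but the mechanism is the same one described above — storing a parent removes its two children, which is safe because batch insertion finishes hashing a level before moving up.

```rust
use std::collections::HashMap;

// Hypothetical in-memory store keyed by node label.
struct Store {
    nodes: HashMap<String, u64>,
    remove_child_nodes_on_insertion: bool,
}

impl Store {
    // Compute and store a parent's hash from its children's hashes,
    // then (optionally) evict the children: once the parent hash exists,
    // the child hashes are dead intermediary state.
    fn insert_parent(&mut self, label: &str, left: &str, right: &str) {
        let l = self.nodes.get(left).copied().unwrap_or(0);
        let r = self.nodes.get(right).copied().unwrap_or(0);
        // Toy combiner in place of a real hash function.
        let parent = l.wrapping_mul(31).wrapping_add(r);
        self.nodes.insert(label.to_string(), parent);
        if self.remove_child_nodes_on_insertion {
            self.nodes.remove(left);
            self.nodes.remove(right);
        }
    }
}

fn main() {
    let mut store = Store {
        nodes: HashMap::new(),
        remove_child_nodes_on_insertion: true,
    };
    store.nodes.insert("leaf_a".into(), 7);
    store.nodes.insert("leaf_b".into(), 9);
    store.insert_parent("root", "leaf_a", "leaf_b");
    // Only the root remains: peak memory tracks roughly one tree level
    // instead of the whole tree.
    assert_eq!(store.nodes.len(), 1);
}
```

With eviction enabled, peak memory is bounded by the widest level still being processed rather than by the full set of intermediary nodes, which matches the ~9 GB to ~760 MB drop reported above.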

@facebook-github-bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Jul 22, 2025
@codecov-commenter commented Jul 22, 2025

Codecov Report

❌ Patch coverage is 90.32258% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.03%. Comparing base (3ce5335) to head (5697d89).
⚠️ Report is 28 commits behind head on main.

Files with missing lines     | Patch %  | Lines
akd/src/storage/memory.rs    | 86.20%   | 4 Missing ⚠️
akd/src/errors.rs            | 0.00%    | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #470      +/-   ##
==========================================
+ Coverage   88.61%   89.03%   +0.41%     
==========================================
  Files          39       38       -1     
  Lines        9109     7632    -1477     
==========================================
- Hits         8072     6795    -1277     
+ Misses       1037      837     -200     


@kevinlewi kevinlewi marked this pull request as ready for review July 22, 2025 23:05
@kevinlewi kevinlewi requested review from cryo28 and dillongeorge July 22, 2025 23:05
Comment on lines 70 to 91
let manager1 = StorageManager::new_no_cache(
AsyncInMemoryDatabase::new_with_remove_child_nodes_on_insertion(),
);
let mut azks1 = Azks::new::<TC, _>(&manager1).await?;
azks1
.batch_insert_nodes::<TC, _>(
&manager1,
proof.unchanged_nodes.clone(),
InsertMode::Auditor,
AzksParallelismConfig::default(),
)
.await?;
let computed_start_root_hash: Digest = azks1.get_root_hash::<TC, _>(&manager1).await?;
if computed_start_root_hash != start_hash {
return Err(AkdError::AzksErr(AzksError::VerifyAppendOnlyProof(
format!(
"Start hash {} does not match computed root hash {}",
hex::encode(start_hash),
hex::encode(computed_start_root_hash)
),
)));
}
A reviewer (Contributor) commented:

It seems like we're mostly doing the same thing in the StorageManager instances we're creating here (i.e., manager1 and manager2). That is:

  1. Creating the per-level cache + an azks
  2. Inserting some set of nodes into the azks
  3. Asserting the resultant root hash is equal to an expected root hash

With the above in mind, do you think it might make sense to define a helper-like function which captures the common logic? E.g.

async fn verify_append_only_hash<TC: Configuration>(
    nodes: Vec<AzksElement>,
    expected_hash: Digest,
    latest_epoch: Option<u64>,
) -> Result<(), AkdError> {
    let manager = StorageManager::new_no_cache(
        AsyncInMemoryDatabase::new_with_remove_child_nodes_on_insertion(),
    );
    let mut azks = Azks::new::<TC, _>(&manager).await?;
    if let Some(epoch) = latest_epoch {
        azks.latest_epoch = epoch;
    }
    azks.batch_insert_nodes::<TC, _>(
        &manager,
        nodes,
        InsertMode::Auditor,
        AzksParallelismConfig::default(),
    )
    .await?;
    let computed_root_hash: Digest = azks.get_root_hash::<TC, _>(&manager).await?;
    if computed_root_hash != expected_hash {
        return Err(AkdError::AzksErr(AzksError::VerifyAppendOnlyProof(
            format!(
                "Expected hash {} does not match computed root hash {}",
                hex::encode(expected_hash),
                hex::encode(computed_root_hash)
            ),
        )));
    }
    Ok(())
}

If we do something like that, then I think this function essentially becomes:

pub async fn verify_consecutive_append_only<TC: Configuration>(
    proof: &SingleAppendOnlyProof,
    start_hash: Digest,
    end_hash: Digest,
    end_epoch: u64,
) -> Result<(), AkdError> {
    verify_append_only_hash::<TC>(proof.unchanged_nodes.clone(), start_hash, None).await?;
    // Note: Vec::extend mutates in place and returns (), so collect the
    // combined node set in a mutable binding first.
    let mut unchanged_with_inserted_nodes = proof.unchanged_nodes.clone();
    unchanged_with_inserted_nodes.extend(proof.inserted.iter().map(|x| {
        let mut y = *x;
        y.value = AzksValue(TC::hash_leaf_with_commitment(x.value, end_epoch).0);
        y
    }));
    verify_append_only_hash::<TC>(
        unchanged_with_inserted_nodes,
        end_hash,
        Some(end_epoch - 1),
    )
    .await
}

Note: I didn't run, nor did I format, any of the code above. It's just meant to reflect an idea to reduce some duplication, but please feel free to ignore if you prefer what's here.

/// technique takes advantage of the way batch insertion of nodes into the tree works,
/// since we always process all of the children of a particular subtree before processing
/// the root of that subtree.
remove_child_nodes_on_insertion: bool,
A reviewer (Contributor) commented:

I'm generally not the biggest fan of using a boolean to differentiate behavior; I'd prefer something like a distinct type to reflect that this in-memory store doesn't retain everything, but that would be a bigger rework than what we have now.

As such, I think what we have here is sufficient, since we need to influence the inner workings of batch_set, and something like a newtype isn't necessarily going to make that easy given that we're not calling anything before or after the existing functionality. Additionally, you've commented this really well, so it's pretty clear, and the associated constructor function is super clear 👍

@kevinlewi kevinlewi closed this Aug 4, 2025
@kevinlewi kevinlewi deleted the auditing_improvements branch August 4, 2025 21:51
@kevinlewi kevinlewi restored the auditing_improvements branch August 4, 2025 21:54
@kevinlewi kevinlewi reopened this Aug 4, 2025
@kevinlewi kevinlewi force-pushed the auditing_improvements branch from d7577cf to 59be8b4 Compare August 5, 2025 00:47
@kevinlewi kevinlewi force-pushed the auditing_improvements branch from 59be8b4 to 5697d89 Compare August 5, 2025 00:50
@kevinlewi kevinlewi merged commit 9b2f0bc into facebook:main Aug 5, 2025
15 checks passed
