Conversation

@kevinlewi (Contributor) commented Jul 22, 2025

(Depends on #460, which should be merged first)

The goal of this change is to improve the overall efficiency of the auditor function (verify_consecutive_append_only()) which is used to verify the validity of an append-only proof between two epochs. Note that this should be completely backwards-compatible with the previous way of verifying (and generating) audit proofs -- no actual correctness logic has changed.

As an example of the improvement, here is a before-and-after comparison of auditing epoch 714064, a particularly large proof (291 MB). Both runs were on my laptop: the previous code took 2 minutes and 9 GB of RAM to audit, while the new code takes 21 seconds and only 760 MB.

Old behavior:

$ cargo run -p examples --release -- whatsapp-kt-auditor -e 714064
[00:00:17] Successfully downloaded proof for epoch 714064. (291.6 MB)
[00:02:01] Audit proof for epoch 714064 has verified successfully!
$ top
...
45321  akd-examples 280.4     02:19.56 11/3   1     55     **8959M+** 0B     7146M+ 45321 1201
...

New behavior:

$ cargo run -p examples --release -- whatsapp-kt-auditor -e 714064
[00:00:18] Successfully downloaded proof for epoch 714064. (291.6 MB)
[00:00:21] Audit proof for epoch 714064 has verified successfully!
$ top
...
44251  akd-examples 297.8     00:38.32 11/3   1     56     **760M+**  0B     397M-  44251 1201
...

How it works

The way auditing works is by creating two new Azks instances from scratch, building each one up by inserting nodes (the first one from the proof's unchanged nodes and the second one from the proof's unchanged + inserted nodes), and then checking that the root hash of each Azks tree matches the expected root hash (the start hash or the end hash). Note that during this process, we are building the tree just for the purpose of computing the root hash, and nothing else -- the trees are discarded afterwards.
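The flow above can be sketched as follows. This is a minimal, self-contained illustration, not the akd API: a toy fold over sorted (label, value) pairs stands in for the real Merkle-style root hash, and all names (`root_hash`, `verify_consecutive_append_only`'s simplified signature) are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for a tree's root hash: fold sorted (label, value) pairs.
// The real code builds an Azks and calls get_root_hash.
fn root_hash(mut nodes: Vec<(u64, u64)>) -> u64 {
    nodes.sort();
    let mut h = DefaultHasher::new();
    nodes.hash(&mut h);
    h.finish()
}

// Sketch of the auditor: rebuild a "tree" per node set, compare roots.
fn verify_consecutive_append_only(
    unchanged: &[(u64, u64)],
    inserted: &[(u64, u64)],
    start_hash: u64,
    end_hash: u64,
) -> Result<(), String> {
    // Tree 1: the unchanged nodes alone must reproduce the start root.
    if root_hash(unchanged.to_vec()) != start_hash {
        return Err("start hash mismatch".into());
    }
    // Tree 2: unchanged + inserted nodes must reproduce the end root.
    let mut all = unchanged.to_vec();
    all.extend_from_slice(inserted);
    if root_hash(all) != end_hash {
        return Err("end hash mismatch".into());
    }
    // Both throwaway trees are dropped here; only the comparison mattered.
    Ok(())
}

fn main() {
    let unchanged = vec![(1, 10), (2, 20)];
    let inserted = vec![(3, 30)];
    let start = root_hash(unchanged.clone());
    let mut all = unchanged.clone();
    all.extend_from_slice(&inserted);
    let end = root_hash(all);
    assert!(verify_consecutive_append_only(&unchanged, &inserted, start, end).is_ok());
    // A tampered end hash fails verification.
    assert!(verify_consecutive_append_only(&unchanged, &inserted, start, end ^ 1).is_err());
}
```

The key point the sketch preserves is that both trees exist only long enough to yield a root hash for comparison.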

In the previous behavior, we would use an AsyncInMemoryDatabase to host all of the storage for the nodes that we insert into the tree (along with the resulting intermediary tree nodes). This means that the Azks instance would keep all of the nodes that have been inserted into the tree in memory at all times, up until the final root hash is computed.

In the new behavior, we still use an AsyncInMemoryDatabase, but we enable a flag which more aggressively removes nodes that no longer need to be kept in memory in order to compute the root hash. In particular, whenever we attempt to add a parent node to storage, we remove the left and right children from storage, if they exist. This takes advantage of the fact that batch insertion works on a level-by-level basis, computing the children hashes before computing their parent hash. So, once a parent hash has been computed, we no longer need the intermediary computations (corresponding to the child hashes).
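The eviction idea can be sketched with a plain `HashMap` standing in for the storage layer. This is a hypothetical miniature, not akd's `AsyncInMemoryDatabase`: the `Store` type, field names, and the toy hash combiner are all invented for illustration, but the mechanism is the same one described above — storing a parent removes its two children, which is safe because batch insertion finishes hashing a level before moving up.

```rust
use std::collections::HashMap;

// Hypothetical in-memory store keyed by node label.
struct Store {
    nodes: HashMap<String, u64>,
    remove_child_nodes_on_insertion: bool,
}

impl Store {
    // Compute and store a parent's hash from its children's hashes,
    // then (optionally) evict the children: once the parent hash exists,
    // the child hashes are dead intermediary state.
    fn insert_parent(&mut self, label: &str, left: &str, right: &str) {
        let l = self.nodes.get(left).copied().unwrap_or(0);
        let r = self.nodes.get(right).copied().unwrap_or(0);
        // Toy combiner in place of a real hash function.
        let parent = l.wrapping_mul(31).wrapping_add(r);
        self.nodes.insert(label.to_string(), parent);
        if self.remove_child_nodes_on_insertion {
            self.nodes.remove(left);
            self.nodes.remove(right);
        }
    }
}

fn main() {
    let mut store = Store {
        nodes: HashMap::new(),
        remove_child_nodes_on_insertion: true,
    };
    store.nodes.insert("leaf_a".into(), 7);
    store.nodes.insert("leaf_b".into(), 9);
    store.insert_parent("root", "leaf_a", "leaf_b");
    // Only the root remains: peak memory tracks roughly one tree level
    // instead of the whole tree.
    assert_eq!(store.nodes.len(), 1);
}
```

With eviction enabled, peak memory is bounded by the widest level still being processed rather than by the full set of intermediary nodes, which matches the ~9 GB to ~760 MB drop reported above.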

@facebook-github-bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Jul 22, 2025
@codecov-commenter commented Jul 22, 2025

Codecov Report

❌ Patch coverage is 90.32258% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.03%. Comparing base (3ce5335) to head (5697d89).
⚠️ Report is 28 commits behind head on main.

Files with missing lines     | Patch %  | Lines
akd/src/storage/memory.rs    | 86.20%   | 4 Missing ⚠️
akd/src/errors.rs            | 0.00%    | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #470      +/-   ##
==========================================
+ Coverage   88.61%   89.03%   +0.41%     
==========================================
  Files          39       38       -1     
  Lines        9109     7632    -1477     
==========================================
- Hits         8072     6795    -1277     
+ Misses       1037      837     -200     


@kevinlewi kevinlewi marked this pull request as ready for review July 22, 2025 23:05
@kevinlewi kevinlewi requested review from cryo28 and dillongeorge July 22, 2025 23:05
Comment on lines 70 to 91
let manager1 = StorageManager::new_no_cache(
AsyncInMemoryDatabase::new_with_remove_child_nodes_on_insertion(),
);
let mut azks1 = Azks::new::<TC, _>(&manager1).await?;
azks1
.batch_insert_nodes::<TC, _>(
&manager1,
proof.unchanged_nodes.clone(),
InsertMode::Auditor,
AzksParallelismConfig::default(),
)
.await?;
let computed_start_root_hash: Digest = azks1.get_root_hash::<TC, _>(&manager1).await?;
if computed_start_root_hash != start_hash {
return Err(AkdError::AzksErr(AzksError::VerifyAppendOnlyProof(
format!(
"Start hash {} does not match computed root hash {}",
hex::encode(start_hash),
hex::encode(computed_start_root_hash)
),
)));
}
A reviewer (Contributor) commented:

It seems like we're mostly doing the same thing in the StorageManager instances we're creating here (i.e., manager1 and manager2). That is:

  1. Creating the per-level cache + an azks
  2. Inserting some set of nodes into the azks
  3. Asserting the resultant root hash is equal to an expected root hash

With the above in mind, do you think it might make sense to define a helper-like function which captures the common logic? E.g.

async fn verify_append_only_hash<TC: Configuration>(
    nodes: Vec<AzksElement>,
    expected_hash: Digest,
    latest_epoch: Option<u64>,
) -> Result<(), AkdError> {
    let manager = StorageManager::new_no_cache(
        AsyncInMemoryDatabase::new_with_remove_child_nodes_on_insertion(),
    );
    let mut azks = Azks::new::<TC, _>(&manager).await?;
    if let Some(epoch) = latest_epoch {
        azks.latest_epoch = epoch;
    }
    azks.batch_insert_nodes::<TC, _>(
        &manager,
        nodes,
        InsertMode::Auditor,
        AzksParallelismConfig::default(),
    )
    .await?;
    let computed_root_hash: Digest = azks.get_root_hash::<TC, _>(&manager).await?;
    if computed_root_hash != expected_hash {
        return Err(AkdError::AzksErr(AzksError::VerifyAppendOnlyProof(
            format!(
                "Expected hash {} does not match computed root hash {}",
                hex::encode(expected_hash),
                hex::encode(computed_root_hash)
            ),
        )));
    }
    Ok(())
}

If we do something like that, then I think this function essentially becomes:

pub async fn verify_consecutive_append_only<TC: Configuration>(
    proof: &SingleAppendOnlyProof,
    start_hash: Digest,
    end_hash: Digest,
    end_epoch: u64,
) -> Result<(), AkdError> {
    verify_append_only_hash::<TC>(proof.unchanged_nodes.clone(), start_hash, None).await?;
    // Note: Vec::extend mutates in place and returns (), so collect the
    // combined node set in a mutable binding first.
    let mut unchanged_with_inserted_nodes = proof.unchanged_nodes.clone();
    unchanged_with_inserted_nodes.extend(proof.inserted.iter().map(|x| {
        let mut y = *x;
        y.value = AzksValue(TC::hash_leaf_with_commitment(x.value, end_epoch).0);
        y
    }));
    verify_append_only_hash::<TC>(
        unchanged_with_inserted_nodes,
        end_hash,
        Some(end_epoch - 1),
    )
    .await
}

Note: I didn't run, nor did I format, any of the code above. It's just meant to reflect an idea to reduce some duplication, but please feel free to ignore if you prefer what's here.

/// technique takes advantage of the way batch insertion of nodes into the tree works,
/// since we always process all of the children of a particular subtree before processing
/// the root of that subtree.
remove_child_nodes_on_insertion: bool,
A reviewer (Contributor) commented:

I'm generally not the biggest fan of using a boolean to differentiate behavior; I'd prefer something like a distinct type to reflect that this in-memory store doesn't retain everything, but that would be a bigger rework than what we have now.

As such, I think what we have here is sufficient, since we need to influence the inner workings of batch_set, and something like a newtype isn't necessarily going to make that easy given that we're not calling anything before or after the existing functionality. Additionally, you've commented this really well, so it's pretty clear, and the associated constructor function is super clear 👍

@kevinlewi kevinlewi closed this Aug 4, 2025
@kevinlewi kevinlewi deleted the auditing_improvements branch August 4, 2025 21:51
@kevinlewi kevinlewi restored the auditing_improvements branch August 4, 2025 21:54
@kevinlewi kevinlewi reopened this Aug 4, 2025
@kevinlewi kevinlewi force-pushed the auditing_improvements branch from d7577cf to 59be8b4 Compare August 5, 2025 00:47
@kevinlewi kevinlewi force-pushed the auditing_improvements branch from 59be8b4 to 5697d89 Compare August 5, 2025 00:50
@kevinlewi kevinlewi merged commit 9b2f0bc into facebook:main Aug 5, 2025
15 checks passed
