Verify snapshot integrity with checksum in shard snapshot transfer (Fixes #3372) #8765

Open. Artur-Sulej wants to merge 1 commit into qdrant:dev from Artur-Sulej:fix-3372

Verify snapshot integrity with checksum in shard snapshot transfer

Fixes #3372

Problem

During shard snapshot transfer, the .snapshot file received by the remote node was restored without any integrity check. A corrupted or truncated transfer would silently produce a broken shard with no indication of failure.

The checksum infrastructure was already fully in place — SnapshotDescription includes a SHA256 checksum computed at snapshot creation time, and the receiver-side recover_shard_snapshot function already has checksum verification logic (used by user-facing snapshot recovery). The checksum simply wasn't being passed through the transfer call chain.

Solution

Thread the snapshot checksum from the sender through to the receiver:

  1. lib/collection/src/shards/remote_shard.rs — Added a checksum: Option<String> parameter to recover_shard_snapshot_from_url and forwarded it into the RecoverShardSnapshotRequest gRPC message (the checksum field already exists in the proto definition).

  2. lib/collection/src/shards/transfer/snapshot.rs — On the sender side, captured snapshot_description.checksum (populated during snapshot creation) and passed it to recover_shard_snapshot_from_url.
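The two steps above can be sketched roughly as follows. This is a simplified illustration, not the actual Qdrant code: `SnapshotInfo`, `RecoverRequest`, and `build_recover_request` are hypothetical stand-ins for `SnapshotDescription`, `RecoverShardSnapshotRequest`, and the sender-side call into `recover_shard_snapshot_from_url`, and the URL layout is assumed purely for the example.

```rust
/// Hypothetical stand-in for the snapshot metadata (not the real
/// `SnapshotDescription`); only the fields relevant to this sketch.
#[derive(Debug, Clone)]
pub struct SnapshotInfo {
    pub name: String,
    /// Hex digest computed when the snapshot was created, if available.
    pub checksum: Option<String>,
}

/// Hypothetical stand-in for the gRPC recovery request (not the real
/// `RecoverShardSnapshotRequest`).
#[derive(Debug, Clone, PartialEq)]
pub struct RecoverRequest {
    pub snapshot_url: String,
    /// `None` means the receiver skips verification.
    pub checksum: Option<String>,
}

/// Sender side: build the recovery request and forward the checksum
/// unchanged instead of dropping it.
pub fn build_recover_request(base_url: &str, snapshot: &SnapshotInfo) -> RecoverRequest {
    RecoverRequest {
        // Assumed URL layout: "<base>/<snapshot name>".
        snapshot_url: format!("{}/{}", base_url.trim_end_matches('/'), snapshot.name),
        checksum: snapshot.checksum.clone(),
    }
}
```

Keeping the field as `Option<String>` means callers without a pre-computed digest can still pass `None` and get the previous behavior.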

The receiver already handles a non-None checksum in recover_shard_snapshot: it computes the SHA256 of the downloaded file and compares it, returning a bad_input error on mismatch. No changes were needed on the receiver side.
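The receiver-side behavior described above amounts to an optional comparison. A minimal sketch, assuming the SHA256 digest of the downloaded file has already been computed as a hex string: `verify_checksum` is a hypothetical helper, the `String` error stands in for the `bad_input` error, and the case-insensitive comparison is an assumption of this example rather than something stated in the PR.

```rust
/// Compare an optional expected digest against the digest computed from
/// the received snapshot file. Hypothetical helper, not the Qdrant API.
pub fn verify_checksum(expected: Option<&str>, actual_hex: &str) -> Result<(), String> {
    match expected {
        // No checksum supplied by the sender: nothing to verify.
        None => Ok(()),
        // Assumption: hex digests are compared case-insensitively.
        Some(want) if want.eq_ignore_ascii_case(actual_hex) => Ok(()),
        // Mismatch: reject the snapshot instead of restoring it.
        Some(want) => Err(format!(
            "snapshot checksum mismatch: expected {}, got {}",
            want, actual_hex
        )),
    }
}
```

With this shape, a truncated or corrupted download fails before restoration whenever the sender supplied a checksum.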

Scope

  • Non-streaming path (nodes < v1.12): Fully covered. The snapshot file is created locally, its checksum is stored in SnapshotDescription, and that checksum is now forwarded to the receiver for verification.
  • Streaming path (nodes ≥ v1.12): Unchanged — no snapshot file is created on the sender side, so no pre-computed checksum is available. snapshot_checksum stays None and behavior is identical to before.
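The difference between the two paths can be pictured as a small selection function. This is only an illustration of the scope described above: `TransferPath` and `checksum_to_forward` are hypothetical names and do not appear in the change itself.

```rust
/// Hypothetical model of the two transfer paths described above.
pub enum TransferPath {
    /// A snapshot file was created on the sender, so a digest may be known.
    File { checksum: Option<String> },
    /// Snapshot data is streamed; no file exists, so no digest is available.
    Streaming,
}

/// Decide which checksum, if any, is forwarded to the receiver.
pub fn checksum_to_forward(path: &TransferPath) -> Option<String> {
    match path {
        TransferPath::File { checksum } => checksum.clone(),
        // Streaming behaves exactly as before: no verification is requested.
        TransferPath::Streaming => None,
    }
}
```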

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --workspace --all-features command?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

During shard snapshot transfer, the received .snapshot file was restored
without verifying its checksum. A corrupted transfer could silently
produce a broken shard.

- Add `checksum` parameter to `recover_shard_snapshot_from_url`
- Capture `snapshot_description.checksum` on the sender side and forward
  it in the gRPC `RecoverShardSnapshotRequest` to the receiver
- The receiver already verifies the checksum in `recover_shard_snapshot`
  when a non-None value is provided

Fixes qdrant#3372

qdrant deleted a comment from coderabbitai[bot] on Apr 22, 2026