Verify snapshot integrity with checksum in shard snapshot transfer #3372

@timvisee

Depends on #3371.

Is your feature request related to a problem? Please describe.
In #2840 we added checksums for snapshot files. That implementation is somewhat limited, however, and needs further integration before we can make full use of it.

Since Qdrant 1.7, different shard transfer methods are supported. Snapshot transfer was added to make these transfers more capable by utilizing snapshots.

One problem with this approach is that we have no integrity checks for the actual snapshot files. If such a file becomes corrupted, Qdrant will happily restore it, possibly resulting in a broken shard.

Describe the solution you'd like
When a shard snapshot transfer happens, we should check the integrity of the snapshot file by verifying the attached checksum. Since #2840, the checksum is attached to the SnapshotDescription object.

#3371 will implement a checksum field in the snapshot recovery endpoints. We'll have to wait for that to land before we can use it in the snapshot transfer process.

The right approach is probably to pass the checksum along in the recovery call here:

    log::trace!("Transferring and recovering shard {shard_id} snapshot on peer {remote_peer_id}");
    remote_shard
        .recover_shard_snapshot_from_url(
            collection_name,
            shard_id,
            &shard_download_url,
            SnapshotPriority::ShardTransfer,
        )
        .await
        .map_err(|err| {
            CollectionError::service_error(format!(
                "Failed to recover shard snapshot on remote: {err}"
            ))
        })?;

If checksum verification on the remote node fails, we should clean up the snapshot file and return with an error. Cleaning up on the remote is probably already handled with #3371.
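To make the verify-then-clean-up behaviour concrete, here is a rough, dependency-free sketch of what the receiving side could do. It is not Qdrant code: `file_checksum` and `verify_snapshot_or_cleanup` are hypothetical names, and the hash is a 64-bit FNV-1a stand-in chosen only so the example builds with the standard library. The real check would use whatever digest is attached to SnapshotDescription.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

/// Stand-in checksum: 64-bit FNV-1a over the file contents, as lowercase hex.
/// Assumption: used here only to avoid external crates; it is not the
/// digest Qdrant stores for snapshots.
fn file_checksum(path: &Path) -> io::Result<String> {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;

    let mut file = File::open(path)?;
    let mut hash = OFFSET_BASIS;
    let mut buf = [0u8; 8192];
    loop {
        // Stream the file in chunks so large snapshots are not loaded at once.
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        for &byte in &buf[..n] {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(PRIME);
        }
    }
    Ok(format!("{hash:016x}"))
}

/// Hypothetical helper: verify a downloaded snapshot against the expected
/// checksum. On mismatch, delete the file and return an error so the
/// corrupted snapshot can never be restored.
fn verify_snapshot_or_cleanup(path: &Path, expected: &str) -> io::Result<()> {
    let actual = file_checksum(path)?;
    if actual.eq_ignore_ascii_case(expected) {
        return Ok(());
    }
    // Clean up the corrupted file before reporting the failure.
    fs::remove_file(path)?;
    Err(io::Error::new(
        io::ErrorKind::InvalidData,
        format!("snapshot checksum mismatch: expected {expected}, got {actual}"),
    ))
}
```

In the transfer flow, a call like `verify_snapshot_or_cleanup(&snapshot_path, &checksum)` would run on the remote after the download and before recovery starts, and its error would be surfaced back to the node that initiated the transfer.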

Additional context
There's other work to be done to properly integrate checksums, but that will be handled in different issues/PRs.
