Skip slot cache optimization for AOF client to prevent key duplication and data corruption #3004

Merged: murphyjacob4 merged 1 commit into valkey-io:unstable from AdityaTeltia:fix-duplicate-keys-loading-AOF on Jan 10, 2026
AdityaTeltia (Contributor) commented Jan 4, 2026

When loading AOF in cluster mode, keys inside a MULTI/EXEC block could be
inserted into wrong hash slots, causing key duplication and data corruption.

The root cause was the slot caching optimization in getKeySlot(). This
optimization reuses a cached slot value to avoid recalculating the hash
for every key operation. However, when replaying AOF, a transaction may
contain commands affecting keys in different slots. The cached slot from
a previous command (e.g., SET k1) would incorrectly be used for subsequent
commands in the transaction (e.g., SET k0), causing k0 to be stored in k1's
slot.
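
The failure mode described above can be reproduced in a few lines. The following is a minimal Python model, not Valkey's C code: `replay`, `trust_cache`, and the dict-based store are hypothetical names chosen for illustration, the slot function ignores hash tags for brevity, and the cache is assumed to simply retain the slot used by the previous command.

```python
import binascii

NUM_SLOTS = 16384  # number of hash slots in cluster mode


def key_hash_slot(key: bytes) -> int:
    # CRC16 (XMODEM variant) of the key, modulo the slot count.
    # Simplification: hash tags ("{...}") are ignored in this sketch.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


def replay(commands, trust_cache: bool):
    """Apply (key, value) SET commands to a toy slot -> {key: value} store.

    trust_cache=True models the buggy path: once a slot has been cached,
    it is reused for later keys. trust_cache=False recomputes every time.
    """
    store = {}
    cached_slot = None  # assumption: stale slot left by the previous command
    for key, value in commands:
        if trust_cache and cached_slot is not None:
            slot = cached_slot
        else:
            slot = key_hash_slot(key)
        store.setdefault(slot, {})[key] = value
        cached_slot = slot
    return store


transaction = [(b"k1", b"a"), (b"k0", b"b")]  # MULTI; SET k1 a; SET k0 b; EXEC
buggy = replay(transaction, trust_cache=True)
fixed = replay(transaction, trust_cache=False)

# In the buggy replay k0 sits in k1's slot; in the fixed replay it does not.
print(b"k0" in buggy[key_hash_slot(b"k1")])  # True
print(b"k0" in fixed[key_hash_slot(b"k1")])  # False
```

In this model, a later write to k0 that computes the slot correctly would land in k0's own slot and leave two copies of the key, which is one way the reported duplication could arise.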

The existing code already skipped this optimization for replicated clients
(commands from the primary) by checking isReplicatedClient(). This change
extends the skip to the AOF client by checking mustObeyClient() instead,
which covers both replicated clients and the AOF client.
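
The effect of swapping the predicate can be sketched as a small truth table. This is a Python model under stated assumptions, not the code in src/db.c: the `Client` fields and the two `may_use_cached_slot_*` helpers are hypothetical, and the real getKeySlot() may check further conditions that are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Client:
    replicated: bool = False  # commands arrive from the primary
    aof: bool = False         # internal client that replays the AOF


def is_replicated_client(c: Client) -> bool:
    return c.replicated


def must_obey_client(c: Client) -> bool:
    # Per the description: covers replicated clients and the AOF client.
    return c.replicated or c.aof


def may_use_cached_slot_before(c: Client) -> bool:
    # Old condition: only replicated clients bypass the cache.
    return not is_replicated_client(c)


def may_use_cached_slot_after(c: Client) -> bool:
    # New condition: the AOF client bypasses the cache as well.
    return not must_obey_client(c)


for name, c in [("normal", Client()),
                ("replicated", Client(replicated=True)),
                ("aof", Client(aof=True))]:
    print(name, may_use_cached_slot_before(c), may_use_cached_slot_after(c))
```

Only the AOF row changes between the two conditions: it moves from using the cached slot to recomputing the slot for every key.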

Fixes #2995, introduced in #1949.

codecov bot commented Jan 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.15%. Comparing base (263d9ea) to head (26c7680).
⚠️ Report is 13 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3004      +/-   ##
============================================
- Coverage     74.27%   74.15%   -0.12%     
============================================
  Files           129      129              
  Lines         70896    70896              
============================================
- Hits          52656    52573      -83     
- Misses        18240    18323      +83     
Files with missing lines Coverage Δ
src/db.c 94.01% <100.00%> (ø)

... and 26 files with indirect coverage changes

enjoy-binbin changed the title from "fix: skip slot cache optimization for AOF client to prevent key duplication (#2995)" to "Skip slot cache optimization for AOF client to prevent key duplication and data corruption" on Jan 5, 2026

enjoy-binbin (Member) left a comment:

LGTM, thanks for the fix.

enjoy-binbin added the release-notes label ("This issue should get a line item in the release notes") on Jan 6, 2026

murphyjacob4 (Contributor) commented:

For the purpose of understanding the impact, I think the only way to get a cross-slot MULTI/EXEC in the AOF is:

  1. Load an AOF from a cluster mode disabled instance (where cross-slot transactions are permitted)
  2. Load an AOF with content generated from a script or module that executes a cross-slot transaction

But yeah definitely a bug. Good find!
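
To make "cross-slot" concrete, here is a minimal Python sketch of how a set of keys can be checked for spanning more than one hash slot. It assumes the standard cluster key hashing (CRC16 XMODEM modulo 16384, with `{...}` hash tags); `is_cross_slot` is a hypothetical helper for illustration, not a Valkey function.

```python
import binascii


def key_hash_slot(key: bytes) -> int:
    # If the key contains a non-empty "{tag}", only the tag is hashed,
    # so keys sharing a tag always map to the same slot.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % 16384


def is_cross_slot(keys) -> bool:
    # A transaction is cross-slot when its keys map to more than one slot.
    return len({key_hash_slot(k) for k in keys}) > 1


print(is_cross_slot([b"k0", b"k1"]))
print(is_cross_slot([b"{user1000}.following", b"{user1000}.followers"]))
```

The first call reports a cross-slot pair; the second does not, because both keys hash only the shared `user1000` tag.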

murphyjacob4 (Contributor) commented:

> I think the only way to get a cross-slot MULTI/EXEC in the AOF is

Sorry, read through the issue again. It looks like the issue is also occurring with a same-slot MULTI/EXEC. It is basically using the slot of the previous command execution. So yeah - not just cross-slot MULTI/EXEC

murphyjacob4 merged commit de00054 into valkey-io:unstable on Jan 10, 2026
58 checks passed
github-project-automation bot moved this to "To be backported" in Valkey 9.0 on Jan 10, 2026
zuiderkwast pushed a commit to zuiderkwast/placeholderkv that referenced this pull request Jan 29, 2026
zuiderkwast pushed a commit to zuiderkwast/placeholderkv that referenced this pull request Jan 30, 2026
zuiderkwast moved this from "To be backported" to "9.0.2 WIP" in Valkey 9.0 on Jan 30, 2026
zuiderkwast pushed a commit that referenced this pull request Feb 3, 2026
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
Labels

release-notes: This issue should get a line item in the release notes

Projects

Status: 9.0.2

Development

Successfully merging this pull request may close these issues.

[BUG] Valkey cluster is duplicating keys when loading AOF

3 participants