Fix flaky cluster tests by accepting either retry limit error (maxAttempts or maxTotalRetriesDuration) #4399

Merged: ggivo merged 2 commits into master from topic/ggivo/fix-flaky-test-cluster-deadline-exceede on Jan 14, 2026

Conversation

ggivo (Collaborator) commented Jan 13, 2026

Fix flaky cluster tests by accepting either retry limit error

Problem

The tests expected "No more cluster attempts left" but intermittently got
"Cluster retry deadline exceeded" because the retry backoff uses randomized jitter.

Root Cause

On the final retry attempt, the backoff can sleep for anywhere from 0 to
millisLeft (the entire remaining time). Depending on the random jitter:

  • Low jitter → attempt executes → attempts exhausted
  • High jitter → deadline exhausted before attempt

This makes it non-deterministic which limit is reached first.
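
The race described above can be sketched with a simplified model. This is not the Jedis retry code: the class name `FinalRetryRace`, the method `limitReachedFirst`, and the fixed `attemptCostMillis` are assumptions made for illustration. Only `millisLeft` and the two error messages come from this PR.

```java
import java.util.concurrent.ThreadLocalRandom;

// Simplified, hypothetical model of the final retry; not the Jedis implementation.
class FinalRetryRace {
    static final String ATTEMPTS = "No more cluster attempts left.";
    static final String DEADLINE = "Cluster retry deadline exceeded.";

    // Returns the error that would surface for a given backoff sleep.
    // Assumption: the last attempt always fails and takes attemptCostMillis.
    static String limitReachedFirst(long millisLeft, long sleepMillis, long attemptCostMillis) {
        long elapsed = sleepMillis + attemptCostMillis;
        // Past the deadline: the deadline limit trips. Otherwise attempts run out.
        return elapsed > millisLeft ? DEADLINE : ATTEMPTS;
    }

    public static void main(String[] args) {
        long millisLeft = 100;   // assumed time remaining before the deadline
        long attemptCost = 20;   // assumed duration of the failing attempt
        int deadlineCount = 0;
        int attemptsCount = 0;
        for (int i = 0; i < 1000; i++) {
            // Jitter: sleep is drawn uniformly from 0..millisLeft inclusive.
            long sleep = ThreadLocalRandom.current().nextLong(millisLeft + 1);
            if (DEADLINE.equals(limitReachedFirst(millisLeft, sleep, attemptCost))) {
                deadlineCount++;
            } else {
                attemptsCount++;
            }
        }
        System.out.println("deadline first: " + deadlineCount
                + ", attempts first: " + attemptsCount);
    }
}
```

Under these assumptions both outcomes show up across runs, which is why a test that pins a single message can pass or fail on identical code.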

Solution

Updated the assertions to accept either error message using the Hamcrest
anyOf matcher, making the tests resilient to backoff randomness.
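
The accept-either check can be illustrated without dependencies. The PR uses Hamcrest's `anyOf`; the sketch below swaps in plain Java so that it runs on a bare JDK, and the class name `EitherRetryLimitCheck`, the helper `isRetryLimitMessage`, and the sample exception are hypothetical. The Hamcrest form in the comment assumes `containsString` sub-matchers, which this page does not confirm.

```java
import java.util.Arrays;
import java.util.List;

// Dependency-free stand-in for the accept-either assertion; names are hypothetical.
class EitherRetryLimitCheck {
    // The two messages named in this PR.
    static final List<String> ACCEPTED = Arrays.asList(
            "No more cluster attempts left.",
            "Cluster retry deadline exceeded.");

    // True when the message contains either accepted retry-limit text.
    static boolean isRetryLimitMessage(String message) {
        return message != null && ACCEPTED.stream().anyMatch(message::contains);
    }

    public static void main(String[] args) {
        // Stand-in for the exception a failing cluster call would throw.
        RuntimeException e = new IllegalStateException("Cluster retry deadline exceeded.");

        // Assumed Hamcrest equivalent:
        //   assertThat(e.getMessage(), anyOf(
        //       containsString("No more cluster attempts left."),
        //       containsString("Cluster retry deadline exceeded.")));
        if (!isRetryLimitMessage(e.getMessage())) {
            throw new AssertionError("unexpected message: " + e.getMessage());
        }
        System.out.println("accepted: " + e.getMessage());
    }
}
```

Accepting both messages keeps the assertion meaningful, since any unrelated failure is still rejected, while removing the dependence on which limit happens to trip first.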

Affected tests

  • SSLACLRedisClusterClientTest
  • SSLOptionsRedisClusterClientTest
  • SSLRedisClusterClientTest

ggivo marked this pull request as draft January 13, 2026 12:04
Tests expecting "No more cluster attempts left" sometimes got "Cluster
retry deadline exceeded" due to randomized backoff jitter on the final
attempt.

On the last retry, the backoff can sleep anywhere from 0 to millisLeft (the
entire remaining time). High jitter exhausts the deadline first and low jitter
exhausts the attempts first, so which error occurs is non-deterministic.

Fixed by updating assertions to accept either error message using
Hamcrest anyOf matcher.
ggivo force-pushed the topic/ggivo/fix-flaky-test-cluster-deadline-exceede branch from e4f3a62 to 1d77ff1 January 13, 2026 12:23
github-actions (bot) commented Jan 13, 2026

Test Results

   285 files  ±0     285 suites  ±0   11m 51s ⏱️ -26s
10 469 tests ±0  10 210 ✅  - 201  259 💤 +201  0 ❌ ±0 
 2 774 runs  ±0   2 770 ✅ ±  0    4 💤 ±  0  0 ❌ ±0 

Results for commit 721269b. ± Comparison against base commit 9413149.

This pull request skips 201 tests.
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[1] ‑ testMsetexNx_parametrized(String, MSetExParams)[1]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[1] ‑ testMsetexNx_parametrized(String, MSetExParams)[2]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[1] ‑ testMsetexNx_parametrized(String, MSetExParams)[3]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[1] ‑ testMsetexNx_parametrized(String, MSetExParams)[4]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[1] ‑ testMsetexNx_parametrized(String, MSetExParams)[5]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[2] ‑ testMsetexNx_parametrized(String, MSetExParams)[1]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[2] ‑ testMsetexNx_parametrized(String, MSetExParams)[2]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[2] ‑ testMsetexNx_parametrized(String, MSetExParams)[3]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[2] ‑ testMsetexNx_parametrized(String, MSetExParams)[4]
redis.clients.jedis.commands.commandobjects.CommandObjectsStringCommandsTest[2] ‑ testMsetexNx_parametrized(String, MSetExParams)[5]
…

♻️ This comment has been updated with latest results.

Copilot AI (Contributor) left a comment

Pull request overview

This pull request fixes flaky cluster tests by updating test assertions to accept either of two possible error messages that can occur due to non-deterministic backoff behavior during cluster retries.

Changes:

  • Updated test assertions to use Hamcrest's anyOf matcher to accept either "No more cluster attempts left." or "Cluster retry deadline exceeded." error messages
  • Added Hamcrest imports to support the new assertion pattern
  • Applied the fix consistently across three SSL cluster test classes

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File and description:

  • SSLRedisClusterClientTest.java: Updated assertions in connectToNodesFailsWithSSLParametersAndNoHostMapping and connectWithCustomHostNameVerifier tests to accept either error message; added Hamcrest imports
  • SSLOptionsRedisClusterClientTest.java: Updated assertions in connectToNodesFailsWithSSLParametersAndNoHostMapping and connectWithCustomHostNameVerifier tests to accept either error message; added Hamcrest imports
  • SSLACLRedisClusterClientTest.java: Updated assertions in connectToNodesFailsWithSSLParametersAndNoHostMapping and connectWithCustomHostNameVerifier tests to accept either error message; added Hamcrest imports


ggivo marked this pull request as ready for review January 13, 2026 13:52
ggivo changed the title from "Fix flaky cluster tests by ensuring maxAttempts limit is reached before deadline" to "Fix flaky cluster tests by accepting either retry limit error (maxAttempts or maxTotalRetriesDuration)" Jan 13, 2026
ggivo merged commit af6454d into master Jan 14, 2026
18 of 19 checks passed
ggivo deleted the topic/ggivo/fix-flaky-test-cluster-deadline-exceede branch January 16, 2026 06:20