
barshaul · 2021-10-28T10:20:02Z

When using phpredis with RedisCluster, a failover on the cluster does not remove the old primary from the cached list of masters.
As a result, if a user pings every node returned by _masters() after a failover, an exception is thrown for the stale primary.

foreach ($obj_cluster->_masters() as $arr_master) {
    $obj_cluster->ping($arr_master);
}
 
Array PHP Fatal error:  Uncaught RedisClusterException: Unable to send command at the specified node in /home/ec2-user/clusterTest2/test.php:31
Stack trace:
#0 /home/ec2-user/clusterTest2/test.php(31): RedisCluster->ping(Array)
#1 {main}
  thrown in /home/ec2-user/clusterTest2/test.php on line 31
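
As a minimal sketch of how the stale entry can be located without aborting the script, the loop above can catch the exception per node. The helper name find_unreachable_masters is hypothetical and not part of phpredis; the sketch assumes RedisClusterException derives from \Exception and uses only the _masters() and ping() calls already shown.

```php
<?php
// Hypothetical helper (not part of phpredis): pings every cached master and
// collects the nodes that fail, instead of letting the first exception abort.
function find_unreachable_masters($obj_cluster): array
{
    $arr_unreachable = [];
    foreach ($obj_cluster->_masters() as $arr_master) {
        try {
            $obj_cluster->ping($arr_master);
        } catch (\Exception $e) {
            // Assumption: RedisClusterException is a subclass of \Exception.
            $arr_unreachable[] = $arr_master;
        }
    }
    return $arr_unreachable;
}
```

Called with a RedisCluster instance after the failover, this would be expected to return the old primary's [host, port] pair while the stale entry is still cached.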

This PR adds support for remapping the cluster's keyspace when a failover occurs.
In the current implementation, when we get a MOVED error, we check whether the redirected address belongs to an existing primary; if it doesn't, we create a new node and add it to the cluster's primaries. After a failover, this leaves a stale primary in the _masters array.
With this fix, a check detects the failover: if the redirected node was a replica of the master that currently owns the slot, a failover has occurred. In that case the cluster's topology has changed, so we call cluster_map_keyspace() to reinitialize the cluster's node cache.
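
The check described above can be modeled in a few lines of PHP. This is only an illustrative sketch: the actual change lives in the extension's C code, and the function name and both array layouts below are assumptions made for the example.

```php
<?php
// Simplified model of the failover check; names and data layout are hypothetical.
// $arr_slot_owner: slot number => "host:port" of the cached primary for that slot
// $arr_replicas:   "host:port" of a primary => list of "host:port" replicas
function moved_indicates_failover(
    array $arr_slot_owner,
    array $arr_replicas,
    int $i_slot,
    string $str_target
): bool {
    if (!isset($arr_slot_owner[$i_slot])) {
        return false; // no cached owner to compare against
    }
    $str_owner = $arr_slot_owner[$i_slot];

    // A MOVED redirect to a replica of the cached owner means that replica
    // was promoted, so the whole keyspace should be remapped.
    return in_array($str_target, $arr_replicas[$str_owner] ?? [], true);
}
```

In this model, a true result corresponds to the case where the PR calls cluster_map_keyspace(); a false result corresponds to the existing behavior of reusing a known primary or adding a new node.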

I ran the following test scenario:

Created a cluster with 2 shards = [[primary=6379], [primary=6378,replica=6377]]
Created a RedisCluster instance and printed its masters:

foreach ($obj_cluster->_masters() as $arr_master) {
    print_r($arr_master);
}

Output:

Array
(
    [0] => 127.0.0.1
    [1] => 6379
)
Array
(
    [0] => 127.0.0.1
    [1] => 6378
)

Killed primary 6378 and executed CLUSTER FAILOVER TAKEOVER on replica 6377, so that replica 6377 became the new primary of this shard.
Ran a GET command on a key whose slot belonged to the failed primary (to trigger the MOVED error).
Checked the _masters array: after the old primary (6378) was killed, the array was updated with the newly promoted primary (6377), but the old primary was not removed.

Array
(
    [0] => 127.0.0.1
    [1] => 6379
)

Array
(
    [0] => 127.0.0.1
    [1] => 6377
)

Array
(
    [0] => 127.0.0.1
    [1] => 6378
)

With the changes in this PR, the _masters array is updated correctly after the failover:

Array
(
    [0] => 127.0.0.1
    [1] => 6379
)

Array
(
    [0] => 127.0.0.1
    [1] => 6377
)
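
To reproduce the GET step of the scenario, it helps to pick a key whose slot is served by the failed primary. Redis Cluster maps a key to a slot as CRC16(key) mod 16384, using the XMODEM variant of CRC16 and hashing only the hash tag when one is present. The sketch below computes that slot in plain PHP; the function names crc16_xmodem and key_slot are hypothetical, and the result can be cross-checked against the server with the CLUSTER KEYSLOT command.

```php
<?php
// CRC16, XMODEM variant (polynomial 0x1021, initial value 0), as used by Redis Cluster.
function crc16_xmodem(string $str_data): int
{
    $i_crc = 0;
    $i_len = strlen($str_data);
    for ($i = 0; $i < $i_len; $i++) {
        $i_crc ^= ord($str_data[$i]) << 8;
        for ($j = 0; $j < 8; $j++) {
            // Shift left; apply the polynomial when the top bit was set.
            $i_crc = ($i_crc & 0x8000) ? (($i_crc << 1) ^ 0x1021) : ($i_crc << 1);
            $i_crc &= 0xFFFF; // keep the value within 16 bits
        }
    }
    return $i_crc;
}

// Hypothetical helper: returns the cluster slot (0..16383) for a key.
function key_slot(string $str_key): int
{
    // Hash tag rule: if the key contains "{...}" with a non-empty body,
    // only the text between the first "{" and the next "}" is hashed.
    $i_open = strpos($str_key, '{');
    if ($i_open !== false) {
        $i_close = strpos($str_key, '}', $i_open + 1);
        if ($i_close !== false && $i_close > $i_open + 1) {
            $str_key = substr($str_key, $i_open + 1, $i_close - $i_open - 1);
        }
    }
    return crc16_xmodem($str_key) % 16384;
}
```

Comparing key_slot($key) against the slot ranges reported for port 6378 before the failover gives a key that should trigger the MOVED redirect afterwards.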

Added support for remapping the cluster's keyspace on a failover

12bf3fd

michael-grunder self-requested a review October 31, 2021 18:44

michael-grunder self-assigned this Oct 31, 2021

michael-grunder removed their request for review October 31, 2021 18:55

michael-grunder merged commit bce6929 into phpredis:develop Nov 9, 2021

Added support for remapping the cluster's keyspace on a failover #2025
michael-grunder merged 1 commit into phpredis:develop from barshaul:develop
