
Conversation

@saterus (Contributor) commented Dec 15, 2021

Adds a shard unassign command to gazctl. This allows manually triggering the reallocation of a shard to a different consumer process. This provides a quick way to restart a failed shard or to try to relieve the load on a hot node.

I tried to follow the conventions I saw elsewhere in the codebase. Let me know if I've done anything strange or put responsibilities in the wrong place.



@jgraettinger (Contributor) left a comment

Looking great, a few comments below. Thanks

Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @saterus)


cmd/gazctl/gazctlcmd/gazctl.go, line 74 at r1 (raw file):

}

type unassignConfig struct {

nit: I think this can be inlined into cmdShardsUnassign (or you could re-use pruneConfig?). Other structs in here were all extracted because they're used in multiple places (typically for a journals vs shards flavor of an operation).

For now, or later: what about a boolean Failed which further restricts the selection to primary shards with a status of FAILED, and leaves others alone?
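
Roughly the shape in mind, sketched with the go-flags struct tags gazctl already uses (the field names and tag text here are assumptions, not the final code):

```go
package gazctlcmd

// Sketch only: an inlined command config carrying the suggested --failed flag.
type cmdShardsUnassign struct {
	Selector string `long:"selector" short:"l" description:"Label selector of shards to unassign"`
	Failed   bool   `long:"failed" description:"Only unassign shards whose primary assignment has status FAILED"`
}
```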


cmd/gazctl/gazctlcmd/shards_unassign.go, line 52 at r1 (raw file):

	if len(shards) > 0 {
		log.Infof("Successfully unassigned %v shards: %v", len(shards), shardIds(shards))

supernit: perhaps log a line for each shard? That can also include more information, like its primary status, and its number of route members.
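
As a sketch of what per-shard logging could look like with logrus (the listing-entry fields here are simplified stand-ins for values taken from the pre-unassign List response):

```go
package gazctlcmd

import (
	log "github.com/sirupsen/logrus"
)

// unassignedShard stands in for a pre-unassign listing entry; the real command
// would read these values from the consumer List RPC response.
type unassignedShard struct {
	ID            string
	PrimaryStatus string
	RouteMembers  int
}

// logUnassigned emits one line per shard rather than a single aggregate count,
// including the prior primary status and route size.
func logUnassigned(shards []unassignedShard) {
	for _, s := range shards {
		log.WithFields(log.Fields{
			"id":           s.ID,
			"priorStatus":  s.PrimaryStatus,
			"routeMembers": s.RouteMembers,
		}).Info("unassigned shard")
	}
}
```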


consumer/shard_api.go, line 197 at r1 (raw file):

		return resp, err
	}

You'll need to RLock the KeySpace mutex here, to guard access to Resolver.state
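
The pattern being asked for, sketched with a plain sync.RWMutex (in gazette the guarding mutex lives on the KeySpace; the type and field names below are placeholders):

```go
package consumer

import "sync"

// guardedState is a stand-in for the allocator state reachable through
// Resolver; the real code takes the KeySpace's read lock instead.
type guardedState struct {
	mu          sync.RWMutex
	assignments map[string]int64 // placeholder: assignment key -> mod revision
}

// readAssignments takes the read lock before inspecting shared state, and
// releases it via defer once the snapshot is taken.
func (s *guardedState) readAssignments() map[string]int64 {
	s.mu.RLock()
	defer s.mu.RUnlock()

	var out = make(map[string]int64, len(s.assignments))
	for k, v := range s.assignments {
		out[k] = v
	}
	return out
}
```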


consumer/shard_api.go, line 204 at r1 (raw file):

	}

	etcdResp, err := srv.Etcd.KV.Delete(ctx, allocator.AssignmentKey(state.KS, primaryAssignment))

If the primary shard has status < FAILED, then this should further check that:

  • The number of assignments matches the desired replication.
  • All shard assignments are consistent.

These are sanity checks which help preserve the required replication invariants of the shard.
If the primary is FAILED, we can of course be more willing to remove it.

See:

Any other potential sanity checks you're aware of?
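
A sketch of how those two checks might read before issuing the delete; the types below are simplified stand-ins rather than the allocator's own:

```go
package consumer

import "fmt"

// assignment is a simplified stand-in for one allocator assignment of a shard.
type assignment struct {
	MemberZone string
	Consistent bool // whether this replica's status is consistent with the primary
}

// checkUnassignPreconditions mirrors the suggested sanity checks: unless the
// primary is FAILED, require full replication and consistent assignments
// before removing the primary assignment.
func checkUnassignPreconditions(asns []assignment, desiredReplication int, primaryFailed bool) error {
	if primaryFailed {
		return nil // be more willing to remove a failed primary.
	}
	if len(asns) != desiredReplication {
		return fmt.Errorf("shard has %d assignments; expected %d", len(asns), desiredReplication)
	}
	for i, a := range asns {
		if !a.Consistent {
			return fmt.Errorf("assignment %d (zone %s) is not consistent", i, a.MemberZone)
		}
	}
	return nil
}
```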


consumer/protocol/protocol.proto, line 447 at r1 (raw file):

message UnassignRequest {
  // Header may be attached by a proxying consumer peer.

I think you can drop Header, since this isn't a proxied request (any consumer member can serve it).


consumer/protocol/protocol.proto, line 450 at r1 (raw file):

  protocol.Header header = 1;
  // Shard to unassign.
  string shard = 2 [ (gogoproto.casttype) = "ShardID" ];

Should this be a repeated string, allowing a batch of shards to be unassigned in a single RPC?
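
If it does become repeated, a client can batch shard IDs into one request. A sketch of the intended usage, with the message shape treated as an assumption rather than the generated protobuf type:

```go
package main

import "fmt"

// unassignRequest sketches the batched shape under discussion: a repeated
// shard field, plus the only-failed restriction from the other thread.
type unassignRequest struct {
	Shards     []string
	OnlyFailed bool
}

func main() {
	var req = unassignRequest{
		Shards:     []string{"shard-000", "shard-001", "shard-002"},
		OnlyFailed: true,
	}
	fmt.Printf("unassigning %d shards (onlyFailed=%v)\n", len(req.Shards), req.OnlyFailed)
}
```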


consumer/protocol/rpc_extensions.go, line 157 at r1 (raw file):

// Validate returns an error if the UnassignRequest is not well-formed.
func (m *UnassignRequest) Validate() error {
	if err := m.Shard.Validate(); err != nil {

Test in https://github.com/gazette/core/blob/master/consumer/protocol/rpc_extensions_test.go ?

Yes, it's very pedantic :) However the exercise has caught bugs of mine in the past -- here, I'm noting that Header isn't validated (though I suggested its removal anyway, above).
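
A sketch of the kind of case the test could cover (the Shards field name and exact assertions are assumptions; the real test should follow the conventions already used in rpc_extensions_test.go):

```go
package protocol

import (
	"testing"

	"github.com/stretchr/testify/require"
)

// TestUnassignRequestValidationCases sketches the suggested coverage: a
// well-formed request passes, and an invalid shard ID is rejected.
func TestUnassignRequestValidationCases(t *testing.T) {
	var req = UnassignRequest{Shards: []ShardID{"a-valid-shard"}}
	require.NoError(t, req.Validate())

	req.Shards = append(req.Shards, "") // an empty ShardID is not well-formed.
	require.Error(t, req.Validate())
}
```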

@saterus (Contributor, Author) commented Dec 16, 2021


consumer/protocol/protocol.proto, line 450 at r1 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

Should this be a repeated string, allowing a batch of shards to be unassigned in a single RPC?

Sure thing. It raises the question of error handling though. Would you expect the assignment removals to be transactional? Or if we encounter an error with one of many, should we just bail?

@jgraettinger (Contributor) left a comment

Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @saterus)


consumer/shard_api.go, line 204 at r1 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

If the primary shard has status < FAILED, then this should further check that:

  • The number of assignments matches the desired replication.
  • All shard assignments are consistent.

These are sanity checks which help preserve the required replication invariants of the shard.
If the primary is FAILED, we can of course be more willing to remove it.

See:

Any other potential sanity checks you're aware of?

Oh, one more I have: check that the modification revision of the assignment hasn't changed out from under you within the Etcd operation. I think this could actually matter if there were raced invocations of gazctl shards unassign, especially if we only want to remove FAILED shards. It's certainly possible, even likely, that a removed assignment would be re-assigned to the same consumer process -- in which case we might inadvertently remove it twice.

The journal & shards Apply RPCs thread through an ExpectedModRevision which is provided by the client, and informed by an initial list RPC, for this reason. Not sure that makes sense here (the List RPC doesn't return the ModRevision of assignments), but UnassignRequest could represent the desire to remove only FAILED shards, and this would need to guard against the possibility of raced invocations.

Let me know if you want to talk it out in person.


consumer/protocol/protocol.proto, line 450 at r1 (raw file):

Previously, saterus (Alex Burkhart) wrote…

Sure thing. It raises the question of error handling though. Would you expect the assignment removals to be transactional? Or if we encounter an error with one of many, should we just bail?

Transactional. I'd approach it by building up a single Etcd transaction which composes invariant checks and desired deletions.
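
A sketch of that composition with the etcd clientv3 API: the comparisons (including the mod-revision guard from the thread above) go in If, the deletions go in Then, and the whole batch commits or fails as a unit. Key names and the surrounding types are placeholders:

```go
package consumer

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3" // or go.etcd.io/etcd/clientv3, per the client version in use
)

// assignmentToRemove pairs an assignment's Etcd key with the mod revision
// observed while holding the KeySpace read lock.
type assignmentToRemove struct {
	Key         string
	ModRevision int64
}

// unassignTxn composes invariant checks and deletions into one transaction:
// every assignment must still be at the revision we observed, or nothing at
// all is deleted.
func unassignTxn(ctx context.Context, kv clientv3.KV, removals []assignmentToRemove) error {
	var cmps []clientv3.Cmp
	var ops []clientv3.Op
	for _, r := range removals {
		cmps = append(cmps, clientv3.Compare(clientv3.ModRevision(r.Key), "=", r.ModRevision))
		ops = append(ops, clientv3.OpDelete(r.Key))
	}
	resp, err := kv.Txn(ctx).If(cmps...).Then(ops...).Commit()
	if err != nil {
		return err
	} else if !resp.Succeeded {
		return fmt.Errorf("assignments changed since they were read; retry the unassign")
	}
	return nil
}
```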

@saterus (Contributor, Author) left a comment

Reviewable status: 2 of 11 files reviewed, 7 unresolved discussions (waiting on @jgraettinger and @saterus)


cmd/gazctl/gazctlcmd/shards_unassign.go, line 52 at r1 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

supernit: perhaps log a line for each shard? That can also include more information, like its primary status, and its number of route members.

I've taken a stab at this one. If we use the old listShard results, they'll have the previous state of the shard, which isn't useful. If we query immediately, the primary is empty because the allocator hasn't had a chance to run yet. Presumably, this still shows useful info about the other replicas, though.

On the other hand, we could sleep for a couple of seconds until the allocator has had a chance to run; then the primary will have been reassigned. This avoids needing to run shards list immediately after running shards unassign. Presumably I could do something fancy like watching for the reassignment to take place and waiting for it, but I'm not sure that's a good tactic or worth the trouble.

What do you think?


consumer/shard_api.go, line 197 at r1 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

You'll need to RLock the KeySpace mutex here, to guard access to Resolver.state

I had this originally, but I didn't end up accessing the state with the "delete all the assignments" solution I'd come up with. Forgot to add it back when I narrowed it down to the primary. 😬


consumer/shard_api.go, line 204 at r1 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

If the primary shard has status < FAILED, then this should further check that:

  • The number of assignments matches the desired replication.
  • All shard assignments are consistent.

These are sanity checks which help preserve the required replication invariants of the shard.
If the primary is FAILED, we can of course be more willing to remove it.

See:

Any other potential sanity checks you're aware of?

Good call. Can an inconsistent shard reach a consistent state if the primary fails?


cmd/gazctl/gazctlcmd/gazctl.go, line 74 at r1 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

nit: I think this can be inlined into cmdShardsUnassign (or you could re-use pruneConfig?). Other structs in here were all extracted because they're used in multiple places (typically for a journals vs shards flavor of an operation).

For now, or later: what about a boolean Failed which further restricts the selection to primary shards with a status of FAILED, and leaves others alone?

Yep, it can be. I put this one here since it looked like the other configs were defined here as some sort of easy reference. I'll inline it.

Heh, I had --failed and then removed it. I'll add it back.

@saterus (Contributor, Author) commented Dec 20, 2021

Thanks for the review. I believe I've addressed everything you spotted in the first round. I think it's ready for another look.

Submit multiple shard ids in a single RPC, rather than submitting
multiple RPCs.
@saterus force-pushed the alex/add-shard-unassign-cmd branch from 82c512d to a985cb5 on December 20, 2021 at 17:49
@jgraettinger (Contributor) left a comment

LGTM % comments

Reviewed 8 of 9 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @saterus)


cmd/gazctl/gazctlcmd/shards_unassign.go, line 52 at r1 (raw file):

Previously, saterus (Alex Burkhart) wrote…

I've taken a stab at this one. If we use the old listShard results, they'll have the previous state of the shard, which isn't useful. If we query immediately, the primary is empty because the allocator hasn't had a chance to run yet. Presumably, this still shows useful info about the other replicas, though.

On the other hand, we could sleep for a couple of seconds until the allocator has had a chance to run; then the primary will have been reassigned. This avoids needing to run shards list immediately after running shards unassign. Presumably I could do something fancy like watching for the reassignment to take place and waiting for it, but I'm not sure that's a good tactic or worth the trouble.

What do you think?

Neat! This seems useful, but the sleep does feel like a smell, and the subcommand is probably doing too much since the user could also do a follow-up list with the same selector to get this. The prior primary & status might be more interesting to log since that state is no longer query-able after the unassign. Again, we're in supernit territory here though.


consumer/shard_api.go, line 204 at r1 (raw file):

Previously, saterus (Alex Burkhart) wrote…

Good call. Can an inconsistent shard reach a consistent state if the primary fails?

Yep: https://github.com/gazette/core/blob/v0.89.0/consumer/key_space.go#L68

A failed primary assignment is always consistent, and a failed replica is consistent if and only if the primary has also failed.
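
That rule, restated as a small predicate (a sketch only; the real check lives in consumer/key_space.go at the link above):

```go
package consumer

// failedAssignmentIsConsistent applies only to an assignment whose own status
// is FAILED: a failed primary is always consistent, while a failed replica is
// consistent iff the primary has also failed. Other combinations are elided.
func failedAssignmentIsConsistent(isPrimary, primaryFailed bool) bool {
	if isPrimary {
		return true
	}
	return primaryFailed
}
```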


consumer/shard_api.go, line 207 at r3 (raw file):

			return resp, err
		} else if req.OnlyFailed && primaryAssignment.AssignmentValue.(*pc.ReplicaStatus).Code != pc.ReplicaStatus_FAILED {
			return resp, nil

This should be a `continue`, right? We're not skipping all shards, just this one?
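
The shape being suggested, sketched as a filtering loop (names are placeholders for the handler's locals):

```go
package consumer

// shardCandidate is a placeholder for a resolved shard and its primary status.
type shardCandidate struct {
	ID            string
	PrimaryFailed bool
}

// selectForUnassign skips a non-failed shard with `continue` when onlyFailed
// is set, rather than returning early and abandoning the rest of the batch.
func selectForUnassign(candidates []shardCandidate, onlyFailed bool) []string {
	var out []string
	for _, c := range candidates {
		if onlyFailed && !c.PrimaryFailed {
			continue // skip just this shard; keep processing the others.
		}
		out = append(out, c.ID)
	}
	return out
}
```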

@saterus (Contributor, Author) left a comment

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @jgraettinger and @saterus)


cmd/gazctl/gazctlcmd/shards_unassign.go, line 52 at r1 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

Neat! This seems useful, but the sleep does feel like a smell, and the subcommand is probably doing too much since the user could also do a follow-up list with the same selector to get this. The prior primary & status might be more interesting to log since that state is no longer query-able after the unassign. Again, we're in supernit territory here though.

I'll swap it out for the prior assignments. That's easy and doesn't require the sleep.


consumer/shard_api.go, line 207 at r3 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

This should be a `continue`, right? We're not skipping all shards, just this one?

Yep, good catch. I'll add a test for this case.

Alex Burkhart added 3 commits December 21, 2021 10:59
This test helper change caused several failures in tests I did not
touch. I believe this is caused by real behavior being triggered when
the shard has a "PRIMARY" replica, but the tests do not expect those
effects and thus fail.

So rather than setting the replica status for every test using the
`allocateShard` helper, we'll just be very explicit and set the status
in the places where it needs to be set.
The user can always run `shards list` immediately to see the new status.
@saterus (Contributor, Author) commented Dec 21, 2021

@jgraettinger I believe I've addressed the test failures and your latest concerns. I think this is ready for a final look.

@jgraettinger (Contributor) left a comment

Reviewed 4 of 4 files at r4, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @saterus)

@jgraettinger (Contributor) left a comment

Still LGTM

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @saterus)

@saterus merged commit ba6e457 into gazette:master on Dec 21, 2021
@jgraettinger deleted the alex/add-shard-unassign-cmd branch on May 10, 2023