@whitehawk
Adbdev 7579 - CI ONLY

bimboterminator1 and others added 26 commits December 20, 2024 05:53
Implement cluster validation possibility

This is the first commit toward an MVP of a new rebalance utility, gprebalance.
The utility is intended for the situation when a cluster is left in an
unbalanced state after a resize (expand or shrink). The balanced state is
defined very simply: if the number of segments per host is equal across all
hosts, the cluster is balanced. A proper implementation of an optimal rebalance
algorithm involves many other aspects, which will be implemented in the next
patches.
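
The balance definition above can be sketched as a small predicate. This is only
an illustration: the function name and the input shape (a list with one host
name per segment, plus the list of all cluster hosts) are assumptions, not the
utility's actual interface.

```python
from collections import Counter


def is_balanced(segment_hosts, all_hosts):
    """Return True if every host in all_hosts carries the same number of segments.

    segment_hosts: one host name per segment (hypothetical input shape).
    all_hosts: every host in the cluster, including hosts with no segments yet.
    """
    counts = Counter(segment_hosts)
    per_host = [counts.get(host, 0) for host in all_hosts]
    # Balanced means there is at most one distinct per-host count.
    return len(set(per_host)) <= 1
```

A freshly added host with no segments makes the cluster unbalanced under this
definition, which is the situation the utility targets after an expand.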

This patch adds the skeleton of the future utility, providing initial validation
of whether a rebalance is possible. It includes checks that validate some basic
aspects: whether segments can be distributed uniformly and whether the target
mirroring strategy can be achieved. Validation is provided through separate
classes, which differs from the approach taken in the gpexpand utility. Some
unit tests have also been added. Validation of available disk space is not
implemented, since it cannot be achieved at this initial validation step.
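
A minimal sketch of validation through separate classes is shown below. All
class names, the strategy names "group" and "spread", and the concrete
conditions are assumptions made for illustration; the checks in the patch may
be defined differently.

```python
class ValidationError(Exception):
    """Raised by a check when the rebalance cannot proceed."""


class UniformDistributionCheck:
    """Hypothetical check: primaries must divide evenly across hosts."""

    def __init__(self, num_primaries, num_hosts):
        self.num_primaries = num_primaries
        self.num_hosts = num_hosts

    def validate(self):
        if self.num_hosts <= 0:
            raise ValidationError("no target hosts")
        if self.num_primaries % self.num_hosts != 0:
            raise ValidationError("segments cannot be distributed uniformly")


class MirroringCheck:
    """Hypothetical check of the target mirroring strategy (assumed rules)."""

    def __init__(self, strategy, num_primaries, num_hosts):
        self.strategy = strategy
        self.num_primaries = num_primaries
        self.num_hosts = num_hosts

    def validate(self):
        per_host = self.num_primaries // self.num_hosts
        # Assumption: group mirroring needs at least two hosts.
        if self.strategy == "group" and self.num_hosts < 2:
            raise ValidationError("group mirroring needs at least 2 hosts")
        # Assumption: spread mirroring needs more hosts than primaries per host.
        if self.strategy == "spread" and self.num_hosts <= per_host:
            raise ValidationError("spread mirroring needs more hosts")


def run_checks(checks):
    """Run every check and collect error messages instead of stopping early."""
    errors = []
    for check in checks:
        try:
            check.validate()
        except ValidationError as exc:
            errors.append(str(exc))
    return errors
```

Keeping each check in its own class lets unit tests exercise one condition at a
time.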
The gprebalance skeleton is complemented with additional options from the MVP
specification.
This code proposes the rebalance algorithm. GpRebalance.createPlan() returns a
Plan represented by a list of Moves. The algorithm itself produces an
intuitive greedy solution by manually setting the final balanced state.
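
A greedy plan of this kind could look roughly like the sketch below. It is not
GpRebalance.createPlan() itself: the Move tuple, the create_plan function, and
the placement dictionary are hypothetical, and every host that appears in the
placement is assumed to be listed in hosts.

```python
from collections import namedtuple

# Hypothetical move record: which segment goes from which host to which host.
Move = namedtuple("Move", ["segment", "source", "target"])


def create_plan(placement, hosts):
    """Greedily move segments from overloaded hosts to underloaded ones.

    placement: dict mapping segment id -> current host (assumed input shape).
    hosts: all hosts of the resized cluster.
    """
    target = len(placement) // len(hosts)  # desired segments per host
    by_host = {host: [] for host in hosts}
    for segment, host in sorted(placement.items()):
        by_host[host].append(segment)

    plan = []
    for receiver in hosts:
        while len(by_host[receiver]) < target:
            # Take from the first host that still holds more than the target.
            donor = next(h for h in hosts if len(by_host[h]) > target)
            segment = by_host[donor].pop()
            by_host[receiver].append(segment)
            plan.append(Move(segment, donor, receiver))
    return plan
```

For four segments on h1 and an empty h2, this yields two moves from h1 to h2.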
The proposed code contains the main framework for rebalance execution.
Some options are not fully implemented and are expected to be finished in
subsequent tasks.

The code describes the following segment movement approach. First, we create
a movement plan: simple steps stating which segment moves to which host.
The steps in a plan can be of different types:

1. Mirror-only moves.
2. Both primary and mirror are moved to different hosts.
3. Primary-only moves.
4. Primary and mirror are swapped.

For each type of movement we determine the target dirs and ports on the target
hosts that are able to hold the moved segment. The DiskFree and DiskUsage
commands are used for this.
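
The directory selection can be sketched as follows. This is a local stand-in
only: it uses Python's shutil.disk_usage instead of the DiskFree and DiskUsage
commands, and the function name and parameters are hypothetical.

```python
import shutil


def choose_target_dir(candidate_dirs, required_bytes,
                      free_bytes=lambda path: shutil.disk_usage(path).free):
    """Return the first candidate dir with enough free space, or None.

    required_bytes: size of the segment to move (what DiskUsage would report).
    free_bytes: callable returning free bytes for a path (what DiskFree would
    report); injectable so that remote hosts or tests can supply their own.
    """
    for path in candidate_dirs:
        if free_bytes(path) >= required_bytes:
            return path
    return None
```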

The movements, in turn, are composite and imply extra actions, including
segment role switching.

1. Mirror-only moves use a single gprecoverseg call to perform the movement.
2. If we move a primary and mirror pair, the strategy is as follows. The mirror
is first moved via gprecoverseg to the primary's target host. Then the roles
are switched. Then the ex-primary (the new mirror) is moved to the mirror's
target host.
3. Primary-only moves imply two role switches: switch, move, switch.
4. A primary-mirror swap is executed similarly to the 2nd type. The mirror is
moved to the primary dir on its own host. Switch. The ex-primary is moved to
the mirror dir on its own host.

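
The decomposition of each move type into elementary actions can be summarized
in a small lookup. The type names and step labels below are hypothetical and
only mirror the descriptions above; "recover" stands for one gprecoverseg call
and "switch" for a primary/mirror role switch.

```python
# Hypothetical move type names mapped to their elementary actions.
MOVE_STEPS = {
    "mirror_only": [
        "recover: mirror -> mirror target host",
    ],
    "primary_and_mirror": [
        "recover: mirror -> primary target host",
        "switch",
        "recover: ex-primary -> mirror target host",
    ],
    "primary_only": [
        "switch",
        "recover: ex-primary -> primary target host",
        "switch",
    ],
    "swap": [
        "recover: mirror -> primary dir",
        "switch",
        "recover: ex-primary -> mirror dir",
    ],
}


def expand_move(move_type):
    """Return a copy of the elementary steps for a move type."""
    return list(MOVE_STEPS[move_type])


def count_switches(move_type):
    """Count role switches implied by a move type."""
    return expand_move(move_type).count("switch")
```

In this sketch every data movement goes through a mirror, so a primary is only
ever relocated after it has been switched to the mirror role.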
The status management is written only in general terms and may contain errors.

Cleanup is prepared by RekGRpth

Co-authored-by: Georgy Shelkovy <[email protected]>
This PR introduces the rollback handler in the gprebalance MVP. The rollback
function creates a new plan of movements by calculating the difference between
the current configuration and the original state loaded from the previously
pickled plan.
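
The idea of deriving a rollback plan from a diff can be sketched as below. The
dictionary shape (segment id -> host) and both function names are assumptions;
the real plan object pickled by the utility is richer than this.

```python
import pickle


def save_original_state(state):
    """Serialize the original placement, as the plan is pickled before running."""
    return pickle.dumps(state)


def rollback_plan(pickled_original, current):
    """Return (segment, from_host, to_host) tuples that restore the original.

    Only segments whose current host differs from the original one are listed.
    """
    original = pickle.loads(pickled_original)
    plan = []
    for segment in sorted(original):
        now = current.get(segment)
        if now != original[segment]:
            plan.append((segment, now, original[segment]))
    return plan
```

If the rebalance was interrupted after moving only some segments, the resulting
plan contains exactly those segments and nothing else.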
The changes in this patch provide a prototype for status tracking of mirror
moves during rebalance. First, this patch removes the use of a gpdb table for
the overall execution status. Second, the status manager is rewritten to
track the execution process with a status file only. If a movement step,
represented by a gprecoverseg process, fails, the corresponding status (FAILED)
is first written to the internal status struct and then flushed to disk.
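
A file-only status manager of this kind might be sketched as follows. The class,
its method names, and the JSON file format are assumptions for illustration; the
patch may store the status differently.

```python
import json
import os


class StatusManager:
    """Hypothetical status tracker backed by a single status file."""

    def __init__(self, path):
        self.path = path
        self.states = {}  # internal status struct: step id -> state

    def set_state(self, step_id, state):
        # Update the in-memory struct first, then flush it to disk.
        self.states[step_id] = state
        self.flush()

    def flush(self):
        # Write to a temporary file and rename it, so that a crash in the
        # middle of a write does not leave a truncated status file behind.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as handle:
            json.dump(self.states, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, self.path)

    @classmethod
    def load(cls, path):
        """Rebuild a manager from an existing status file."""
        manager = cls(path)
        with open(path) as handle:
            manager.states = json.load(handle)
        return manager
```

On restart, StatusManager.load() would give the utility the last flushed state
of every step, including any step marked FAILED.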

Another main purpose of these changes is to determine the state of a
gprecoverseg run. The code in analyze_gprecoverseg_states() tries to implement
the SRS diagram for determining the gprecoverseg status. It processes the
following scenarios:
1. A mirror move failed after pg_hba.conf had been updated on the primary. In this
case the primary marks the mirror as down.
2. A mirror move failed after gp_segment_configuration had been updated. Here our code
tries to determine whether pg_basebackup was executed successfully or not.

Depending on the basebackup state, the algorithm tries either to start up the
backed-up mirror or to roll back the configuration changes and recover the old
mirror.
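
The two scenarios and the follow-up decision can be condensed into a small
decision function. This is not analyze_gprecoverseg_states() itself: the
function name, its boolean inputs, and the returned labels are hypothetical.

```python
def analyze_failed_mirror_move(config_updated, basebackup_ok):
    """Classify a failed mirror move and pick the follow-up (sketch only).

    config_updated: True if gp_segment_configuration was already updated
    (scenario 2); False if the failure happened earlier, after pg_hba.conf
    was updated on the primary (scenario 1).
    basebackup_ok: True if pg_basebackup finished successfully; only
    meaningful when config_updated is True.
    """
    if not config_updated:
        # Scenario 1: the primary marks the mirror as down.
        return "MIRROR_MARKED_DOWN"
    if basebackup_ok:
        # Scenario 2, backup complete: try to start the backed-up mirror.
        return "START_NEW_MIRROR"
    # Scenario 2, backup incomplete: undo the configuration change and
    # recover the old mirror.
    return "ROLLBACK_CONFIG_AND_RECOVER_OLD_MIRROR"
```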
+ Support table expand from alter rebalance

3 participants