Description
Why are we implementing it? (sales eng)
This documentation is currently missing. Properly documenting migration methods for different scenarios helps users trust Citus as a large-scale data solution.
What are the typical use cases?
Migrate a self-hosted Citus cluster into a new setup (perhaps more powerful hardware, a different geographical location, or a new networking architecture), hopefully with minimal downtime.
How does this work? (devs)
On idea level
- deploy a new Citus cluster with the same number of workers (1..N) as the original cluster
- copy the data over and keep it in sync while clients keep using the old cluster
- switch over the clients to the new cluster
I presume step 2 could be achieved using publications & subscriptions of non-Citus-metadata tables ("workload" tables) between each old and new worker pair N. The clusters would eventually become in sync regarding workload data, while the new cluster would keep its own unique metadata (mainly pg_dist_node).
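The per-worker-pair pub/sub idea could be sketched roughly as below. This is only an illustration under assumptions not stated in the issue: the table names (`events_102008`, `events_102012`), publication/subscription name, and connection string are all hypothetical placeholders, and the exact set of shard-backing tables to publish would need to be determined per worker.

```sql
-- On OLD worker k: publish the workload (shard) tables, not Citus metadata.
-- Table names here are illustrative placeholders.
CREATE PUBLICATION migrate_workload
    FOR TABLE events_102008, events_102012;

-- On NEW worker k, after the schema has been copied over
-- (e.g. with pg_dump --schema-only from the old worker):
CREATE SUBSCRIPTION migrate_workload
    CONNECTION 'host=old-worker-k.example.com dbname=app user=repl'
    PUBLICATION migrate_workload;

-- The initial table sync copies the existing rows; afterwards logical
-- replication streams changes, keeping the pair in sync until switchover.
```

Because logical replication operates on row data rather than on-disk format, the subscriber side can run a newer PostgreSQL major version than the publisher, which is what makes the "free" version upgrade possible.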
Using pub/sub would also allow installing the new cluster with a newer PG version, making the PG version upgrade a "free" byproduct.
We've very successfully done a few pub/sub migrations of this kind to newer PG versions with regular Patroni clusters, with downtime in the single-digit seconds, so it would be nice if the same could be done with Citus as well.
Corner cases, gotchas
Are there relevant blog posts or outside documentation about the concept/feature?
I created an issue in the main Citus repo before learning that this documentation repo exists as well.
More details about my specific use case are there, and in a Citus Slack message I wrote earlier.