This page describes how to apply long-running schema changes in Vitess/MySQL
without disrupting ongoing operations. Examples for long-running changes on
large databases are ALTER TABLE (for example to add a column), OPTIMIZE
TABLE or large-scale data changes (e.g. populating a column or clearing out
values).
If a schema change is not long-running, please use the simpler vtctl ApplySchema instead.
One solution to realize such long-running schema changes is to use a temporary table and keep it in sync with triggers as originally proposed by Shlomi and further refined by others (Percona's pt-online-schema-change, Square's Shift).
Here we describe an alternative solution which uses a combination of MySQL's statement based replication and backups to apply the changes to all tablets. Since the long-running schema changes are applied to an offline tablet, ongoing operations are not affected. We called this process schema swap due to the way it's done, and therefore we refer to it by this name throughout the document.
This tutorial outlines the necessary steps for a schema swap and is based on the Vitess Kubernetes Getting Started Guide.
At the high level, a schema swap comprises the following phases:
You may be wondering: Why does this work?
The key here is that the new schema is backward compatible with respect to statements sent by the app. The replication stream remains backward compatible as well because we use statement based replication. As a consequence, the new schema must not be used until it has been changed on all tablets. If the schema would have been used e.g. when an insert uses a new column, replication would break on tablets which have the old schema. Swapping schema on all tablets first ensures this doesn't happen.
Also note that the changes are applied to only one tablet and then all other tablets are restored from the backup. This is more efficient than applying the long-running changes on every single tablet.
Now let's carry out an actual schema swap based on our Guestbook example schema. We'll add a column to it.
We assume that you have followed the Vitess Kubernetes Getting Started Guide up to and including the step "9. Create a table".
Another way to start the schema swap is to execute vtctlclient command:
vitess/examples/local$ ./lvtctl.sh WorkflowCreate schema_swap -keyspace=test_keyspace -sql='SQL statement'
From this point on all you need to do is watch how the schema swap process is progressing. Try expanding the displayed nodes in vtctld UI and look at the logs of all the actions that process is doing. Once the UI shows "Schema swap is finished" you can start using the new schema, it will be propagated to all tablets.