Add support for xtrabackup, xtrabackup-stream, mydumper, mysqldump streaming methods for slave provisioning #21
Description
- On the agent, read the MySQL `datadir`, `log_error` and `innodb_log_group_home_dir` settings from my.cnf. We can use a notification mechanism (such as fsnotify) to subscribe to changes in the config files and reload the configuration dynamically when they change. This keeps the agent in sync with the MySQL configuration without restarting orchestrator-agent
- [AGENT] New agent function `DeleteDirContents`, which we will use later to remove the contents of the MySQL datadir and to implement `/api/delete-mysql-backupdir` and `/api/delete-mysql-datadir`
- [AGENT] New agent API `/api/delete-mysql-backupdir`, which we will use later to delete backups after the seed completes
- [AGENT] Change `/api/delete-mysql-datadir` from using the command from config to using `DeleteDirContents`
- [AGENT] New agent API `/api/mysql-backupdir-available-space`. Will be used in pre-seed checks. Create a new API on the agent and add the backup directory's available space to `type Agent struct`
- [AGENT] New agent API `/api/available-seed-methods`. The agent needs to report the available seed methods, which depend on the binaries installed on it. For now we will support LVM, xtrabackup, xtrabackup-stream, mydumper and mysqldump. Create a new API on the agent and add `seedMethods` to `type Agent struct`
- [AGENT] New agent API `/api/mysql-info`. The agent needs to report:
  - the MySQL version
  - the MySQL datadir location
  - whether this server is already a slave (`SHOW SLAVE STATUS` returns output; the check fails if our targetHost is already a slave)
  - whether this server is already a master (`SHOW SLAVE HOSTS`; the check fails if our targetHost is already a master)
  - whether it currently has active connections to MySQL (`SELECT COUNT(*) FROM INFORMATION_SCHEMA.PROCESSLIST WHERE USER NOT IN ('MySQLUser', 'other_system_users')`; the check fails if the number of active connections to targetHost is larger than some threshold)

  We will use it in the pre-seed checks on targetHost and sourceHost. Create a new API on the agent, a new `type MySQLInfo struct`, and add `MySQLInfo` to `type Agent struct`
- [AGENT] New agent API `/api/mysql-databases`. The agent needs to report the databases present on the server, their engines and their sizes. We need this because:
  - Not all engines support xtrabackup (for example TokuDB)
  - Sometimes we want to restore only some user databases and keep the old mysql db with its login\password information (partial backup)
  - Physical size is the size of the database on disk (`du database_folder`). We need it for partial xtrabackup, to check that there is enough space in the backup directory on sourceHost and in the backup directory\data directory on targetHost
  - Logical size is the estimated size of the data in the database. We need it for partial mydumper\mysqldump, to check that there is enough space in the backup directory on sourceHost and in the backup directory\data directory on targetHost. We can estimate it as follows:
    -- if we compress the backup, multiply physical size * 0.5
    -- if we don't compress, multiply physical size * 0.8
Create a new API on the agent, a new `type MySQLDatabases struct`, a new `type MySQLDatabaseInfo struct`, and add `MySQLDatabases` to `type Agent struct`.
As a result we will get JSON like this:
{"MySQLDatabases":{"employees":{"Engines":["InnoDB"],"PhysicalSize":186929946,"LogicalSize":112157967},"log":{"Engines":["InnoDB"],"PhysicalSize":32307770,"LogicalSize":19384662},"test":{"Engines":["InnoDB","MyISAM"],"PhysicalSize":116501,"LogicalSize":69900},"test2":{"Engines":["InnoDB"],"PhysicalSize":106921,"LogicalSize":64152}},"InnoDBLogSize":100663296}
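A minimal sketch of how the two proposed structs could map onto this JSON, together with the size estimation described above. The field names are taken from the JSON sample; `ParseMySQLDatabases` and `EstimateLogicalSize` are hypothetical helper names, not existing agent functions.

```go
package main

import "encoding/json"

// MySQLDatabaseInfo describes a single database as reported by the agent.
type MySQLDatabaseInfo struct {
	Engines      []string
	PhysicalSize int64 // bytes on disk (du of the database folder)
	LogicalSize  int64 // estimated size of a logical dump, in bytes
}

// MySQLDatabases mirrors the JSON sample returned by /api/mysql-databases.
type MySQLDatabases struct {
	MySQLDatabases map[string]MySQLDatabaseInfo
	InnoDBLogSize  int64
}

// EstimateLogicalSize applies the factors from the proposal:
// 0.5 of the physical size for a compressed backup, 0.8 otherwise.
// Integer arithmetic avoids float rounding on byte counts.
func EstimateLogicalSize(physicalSize int64, compressed bool) int64 {
	if compressed {
		return physicalSize * 5 / 10
	}
	return physicalSize * 8 / 10
}

// ParseMySQLDatabases decodes the agent's JSON reply (hypothetical helper).
func ParseMySQLDatabases(payload []byte) (MySQLDatabases, error) {
	var dbs MySQLDatabases
	err := json.Unmarshal(payload, &dbs)
	return dbs, err
}
```

Keeping the per-database entries in a map keyed by database name makes it cheap to look up only the databases named in an optional database list.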
- To collect this information the agent needs access to the MySQL instance installed on the server, so we need to add new configuration variables `MySQLTopologyUser` and `MySQLTopologyPassword` to the agent config
- [AGENT] Add new agent configuration variable `MySQLBackupDir` - path where we will store backups
- [AGENT] Add new agent configuration variable `SeedPort` - port we will use to stream xtrabackup and copy data
- [AGENT] Add new agent configuration variable `MySQLReplicationUser` - user that will be used for replication
- [AGENT] Add new agent configuration variable `MySQLReplicationPassword` - password for `MySQLReplicationUser`
- [AGENT] Add new agent configuration variable `MySQLBackupUsersOnTargetHost`. When we perform a partial backup, we also need to back up the system databases. If `MySQLBackupUsersOnTargetHost` is empty, before restoring the backup on targetHost we first back up the mysql database on it and restore it after the seed operation completes. If `MySQLBackupUsersOnTargetHost` is set, we back up only these users and restore them after the seed operation completes
- [AGENT] Add some agent configuration variables for MyDumper\xtrabackup (such as the number of threads)
- [ORCHESTRATOR] Change the GetAgent function so that we get all the new data and put it into `type Agent struct`
- [ORCHESTRATOR] Let's keep func Seed as the main entry point for all seeding operations, but add an additional param seedMethod and switch on it inside the function to start the right operations for each seed method. We should also add an optional parameter with a list of databases to copy; if it is missing, we copy all databases. For xtrabackup stream we also need a bool parameter streamToDatadir: if it is true, we stream the backup directly to the MySQL datadir and add some prerequisite actions; if it is false, we stream the backup to MySQLBackupDir. So the API call for seed would look like `agent-seed/:targetHost/:sourceHost/:seedMethod/:streamToDatadir/:optionalListOfDatabases`
- [ORCHESTRATOR] Change the logic of the Submit agent function. Right now, when an agent is submitted for the first time, we get only basic data from it (hostname, port, token, last_submitted), and we have to wait for the AgentPollMinutes interval to pass so that the agent becomes outdated and we fetch the additional data using UpdateAgentInfo. This interval is 60 min by default, so we get all the data and can start seeding only after 60 min. We can add logic so that, if an agent is submitted for the first time and we have no information about it in the orchestrator db, we run UpdateAgentInfo immediately and get all the other data
- [ORCHESTRATOR] Change the SubmitSeedEntry function: add seedMethod as a param to it and as a column to the agent_seed table
- [ORCHESTRATOR] We will keep updateSeedStateEntry and updateSeedComplete as they are and use them to track all the other seed methods
- [AGENT] New agent API `/api/start-local-backup/:seedId/:seedMethod/:optionalListOfDatabases`. Will be used on sourceHost to:
  - create a folder to store the backup (possible naming convention: use the backup date)
  - start mysqldump/mydumper/xtrabackup (without streaming) with the necessary params (threads, optionalListOfDatabases if present)
  - check that the replication user exists on sourceHost. If it does not exist, create it and `GRANT REPLICATION SLAVE ON *.* TO` it

  This API call should return the path to the created backup folder
- [AGENT] New agent API `/api/receive-backup/:seedId/:seedMethod/:backupFolder` (backupFolder must be URL encoded):
  - if backupFolder == `config.Config.MySQLDatadir`:
    -- back up users to `config.Config.MySQLBackupDir` (either all, or only those specified in `config.Config.MySQLBackupUsersOnTargetHost`)
    -- stop MySQL
    -- remove everything from the MySQL datadir
    -- remove ib_logfiles from innodb_log_group_home_dir
  - else create the backup folder
  - start netcat listening on `SeedPort` with the `--wait` parameter set to 30 sec, so that netcat terminates after the seeding operation completes
  - if seedMethod = xtrabackup-stream, pipe netcat to xbstream. For the other methods, pipe it to tar
- [AGENT] New agent API `/api/send-local-backup/:seedId/:targetHost/:backupFolder`. Will be used on sourceHost to tar.gz the contents of the backup folder and send the archive to targetHost on `SeedPort` using netcat
- [AGENT] New agent API `/api/start-streaming-backup/:seedId/:targetHost/:optionalListOfDatabases`. Will be used to start an xtrabackup stream and send it to targetHost on `SeedPort` using netcat.
  -- Check that the replication user exists on sourceHost. If it does not exist, create it and `GRANT REPLICATION SLAVE ON *.* TO` it
- [AGENT] New agent API
`/api/start-restore/:seedId/:seedMethod/:sourceHost/:sourcePort/:backupFolder/:optionalListOfDatabases` will be used on targetHost to:
  - if `optionalListOfDatabases` is not empty - add replicate-do-db to targetHost my.cnf and restart MySQL
  - back up users to `config.Config.MySQLBackupDir` (either all, or only those specified in `config.Config.MySQLBackupUsersOnTargetHost`)
  - [MYSQLDUMP]:
    -- if the backup is compressed, gunzip it, then execute all backup.sql files in `backupFolder`
    -- create replication user
    -- execute `START SLAVE`
  - [MYDUMPER]:
    -- run `myloader` with the necessary params
    -- create replication user
    -- parse the `metadata` file in MySQLBackupDir and execute `CHANGE MASTER TO` and `START SLAVE` (2 different cases for GTID and positional replicas; will create a separate function for this)
  - [XTRABACKUP FULL\XTRABACKUP STREAM FULL TO BACKUPDIR] (`optionalListOfDatabases` is empty, `backupFolder` != `config.Config.MySQLDatadir`, `seedMethod` is xtrabackup or xtrabackup-stream):
    -- run `xtrabackup --prepare` on the copied backup in `backupFolder`
    -- stop MySQL
    -- remove everything from `config.Config.MySQLDatadir`
    -- remove ib_logfiles from innodb_log_group_home_dir
    -- run `xtrabackup --copy-back --target-dir=backupdir` (or maybe use `--move-back`?)
    -- start MySQL
    -- parse `xtrabackup_binlog_info` in the MySQL datadir and execute `CHANGE MASTER TO` and `START SLAVE`
  - [XTRABACKUP PARTIAL\XTRABACKUP STREAM PARTIAL TO BACKUPDIR] (`optionalListOfDatabases` is not empty, `backupFolder` != `config.Config.MySQLDatadir`, `seedMethod` is xtrabackup or xtrabackup-stream):
    -- run `xtrabackup --prepare` on the copied backup in `backupFolder`
    -- stop MySQL
    -- remove everything from `config.Config.MySQLDatadir`
    -- remove ib_logfiles from innodb_log_group_home_dir
    -- run `xtrabackup --copy-back --target-dir=backupdir`
    -- start MySQL
    -- if partial backup - run `mysql_update` to restore system databases
    -- create replication user
    -- parse `xtrabackup_binlog_info` in the MySQL datadir and execute `CHANGE MASTER TO` and `START SLAVE`
  - [XTRABACKUP STREAM FULL TO DATADIR] (`optionalListOfDatabases` is empty, `backupFolder` = `config.Config.MySQLDatadir`, `seedMethod` is xtrabackup-stream):
    -- run `xtrabackup --prepare` on `config.Config.MySQLDatadir`
    -- start MySQL
    -- parse `xtrabackup_binlog_info` in the MySQL datadir and execute `CHANGE MASTER TO` and `START SLAVE`
  - [XTRABACKUP STREAM PARTIAL TO DATADIR] (`optionalListOfDatabases` is not empty, `backupFolder` = `config.Config.MySQLDatadir`, `seedMethod` is xtrabackup-stream):
    -- run `xtrabackup --prepare` on `config.Config.MySQLDatadir`
    -- start MySQL
    -- run `mysql_update` to restore system databases
    -- create replication user
    -- parse `xtrabackup_binlog_info` in the MySQL datadir and execute `CHANGE MASTER TO` and `START SLAVE`
  - restore users
- [AGENT] New agent API `/api/cleanup/:seedId`. Will be used on targetHost and sourceHost to remove the contents of `config.Config.MySQLBackupDir` after the seed process
All these changes are completely backward compatible (except that for now we will need to pass "lvm" as seedMethod when we use the "Seed" button on an agent page in the Snapshots area) and won't affect the current agent and orchestrator workflow.
To support these new seed methods, we can reuse part of the logic of the current executeSeed function: divide the backup\restore flow into discrete operations and use the UpdateSeedStateEntry function to track progress.
As a rough draft, I propose the following scenarios for the new seed methods:
Basic checks for all types of seed methods:
- Check that there are no active seeds for targetHost and sourceHost (query the agent_seed table)
- Check that MySQL has the same major version on both targetHost and sourceHost
- Check that binary logging is enabled on sourceHost
- Check that targetHost isn't a master for some other hosts
- Check that targetHost isn't a slave of some other host
- Check that there are no database connections on targetHost
- [XTRABACKUP ONLY] Check that no database on sourceHost uses the TokuDB engine
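The checks above could be collected in a single function that returns every problem it finds, so the user sees all blockers at once. This is only a sketch: `hostInfo` and its fields are illustrative stand-ins for the data from `/api/mysql-info` and `/api/mysql-databases`, and the active-seed check is left out because it needs the orchestrator database.

```go
package main

import (
	"fmt"
	"strings"
)

// hostInfo holds the facts the checks need. Field names are illustrative.
type hostInfo struct {
	Version           string // e.g. "5.7.21-log"
	LogBinEnabled     bool
	IsSlave           bool
	HasSlaves         bool
	ActiveConnections int
	Engines           []string
}

// majorVersion returns "5.7" for "5.7.21-log".
func majorVersion(v string) string {
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return v
	}
	return parts[0] + "." + parts[1]
}

// basicSeedChecks returns a description of every failed check;
// an empty result means the seed may proceed.
func basicSeedChecks(source, target hostInfo, seedMethod string, maxConnections int) []string {
	var problems []string
	if majorVersion(source.Version) != majorVersion(target.Version) {
		problems = append(problems, fmt.Sprintf("major versions differ: source %s, target %s",
			majorVersion(source.Version), majorVersion(target.Version)))
	}
	if !source.LogBinEnabled {
		problems = append(problems, "binary logging is disabled on sourceHost")
	}
	if target.HasSlaves {
		problems = append(problems, "targetHost is a master for other hosts")
	}
	if target.IsSlave {
		problems = append(problems, "targetHost is already a slave")
	}
	if target.ActiveConnections > maxConnections {
		problems = append(problems, fmt.Sprintf("targetHost has %d active connections", target.ActiveConnections))
	}
	// The TokuDB restriction applies to xtrabackup and xtrabackup-stream only.
	if strings.HasPrefix(seedMethod, "xtrabackup") {
		for _, engine := range source.Engines {
			if strings.EqualFold(engine, "TokuDB") {
				problems = append(problems, "sourceHost has TokuDB tables, which xtrabackup cannot copy")
				break
			}
		}
	}
	return problems
}
```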
Draft for mysqldump/mydumper/xtrabackup:
- Call `agent-seed/:targetHost/:sourceHost/:seedMethod/:streamToDatadir/:optionalListOfDatabases` (`:streamToDatadir` = false)
- Run GetAgent function for sourceHost
- Run GetAgent function for targetHost
- Run basic checks
- Get the size of the backup and check that we have enough space in the MySQLBackupDir folders on sourceHost and targetHost:
  -- [MYDUMPER/MYSQLDUMP] calculate the backup size as the sum of logicalSize of the needed databases
  -- [XTRABACKUP FULL] (`optionalListOfDatabases` is empty) - calculate the backup size as the size of the MySQL datadir (use agent API `/api/mysql-du`) + some space for the ib_logfile which will be created during `xtrabackup --prepare`
  -- [XTRABACKUP PARTIAL] calculate the backup size as the sum of physicalSize of the needed databases + some space for the ib_logfile which will be created during `xtrabackup --prepare`
- Check that we have enough space in the targetHost MySQL datadir:
  -- [MYDUMPER/MYSQLDUMP/XTRABACKUP PARTIAL] - calculate it as the sum of physicalSize of the needed databases
  -- [XTRABACKUP FULL] (`optionalListOfDatabases` is empty) calculate it as the size of the MySQL datadir (use agent API `/api/mysql-du`) + some space for the ib_logfile which will be created during `xtrabackup --prepare`
- Check that MySQL on sourceHost is running
- Start the backup process on sourceHost: `/api/start-local-backup/:seedId/:seedMethod/:optionalListOfDatabases`. This API call will return the path to the directory with the backup
- Start receiving on targetHost: `/api/receive-backup/:seedId/:seedMethod/:backupFolder`. If `:streamToDatadir` = true then use `Agent.MySQLInfo.MySQLDatadirPath` as `:backupFolder`, else use the path returned from `/api/start-local-backup/:seedId/:seedMethod/:optionalListOfDatabases`
- Start sending the backup archive on sourceHost: `/api/send-local-backup/:seedId/:targetHost/:backupFolder`
- Start the restore process on targetHost: `/api/start-restore/:seedId/:seedMethod/:sourceHost/:sourcePort/:backupFolder/:optionalListOfDatabases`. If `:streamToDatadir` = true then use `Agent.MySQLInfo.MySQLDatadirPath` as `:backupFolder`, else use the path returned from `/api/start-local-backup/:seedId/:seedMethod/:optionalListOfDatabases`
- Run `/api/cleanup/:seedId` on sourceHost
- Run `/api/cleanup/:seedId` on targetHost
- Run UpdateSeedComplete
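The backup-size rules from this draft can be expressed as one small function. The sketch below assumes the per-database sizes come from `/api/mysql-databases` and the datadir size from `/api/mysql-du`; `requiredBackupSpace`, `databaseSize` and the use of the InnoDB log size as the "some space for ib_logfile" allowance are assumptions made for illustration.

```go
package main

import "fmt"

// databaseSize carries the two sizes reported per database, in bytes.
type databaseSize struct {
	PhysicalSize int64
	LogicalSize  int64
}

// requiredBackupSpace returns the bytes needed in MySQLBackupDir.
// wanted is the optional list of databases; empty means all databases.
func requiredBackupSpace(seedMethod string, wanted []string, all map[string]databaseSize,
	datadirSize, innodbLogSize int64) (int64, error) {
	logical := seedMethod == "mysqldump" || seedMethod == "mydumper"
	physical := seedMethod == "xtrabackup" || seedMethod == "xtrabackup-stream"
	if !logical && !physical {
		return 0, fmt.Errorf("unsupported seed method %q", seedMethod)
	}
	if physical && len(wanted) == 0 {
		// Full xtrabackup: the whole datadir plus room for the redo log.
		return datadirSize + innodbLogSize, nil
	}
	if len(wanted) == 0 {
		// Full logical dump: every known database.
		for name := range all {
			wanted = append(wanted, name)
		}
	}
	var total int64
	for _, name := range wanted {
		db, ok := all[name]
		if !ok {
			return 0, fmt.Errorf("unknown database %q", name)
		}
		if logical {
			total += db.LogicalSize
		} else {
			total += db.PhysicalSize
		}
	}
	if physical {
		// Partial xtrabackup also creates ib_logfile during --prepare.
		total += innodbLogSize
	}
	return total, nil
}
```

The result would then be compared against the value returned by `/api/mysql-backupdir-available-space` on both hosts.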
Draft for xtrabackup-stream:
- Call `agent-seed/:targetHost/:sourceHost/:seedMethod/:streamToDatadir/:optionalListOfDatabases`
- Run GetAgent function for sourceHost
- Run GetAgent function for targetHost
- Run basic checks
- [IF streamToDataDir = false] Get the size of the backup and check that we have enough space in the MySQLBackupDir folders on sourceHost and targetHost:
  -- [XTRABACKUP FULL] (`optionalListOfDatabases` is empty) - calculate the backup size as the size of the MySQL datadir (use agent API `/api/mysql-du`) + some space for the ib_logfile which will be created during `xtrabackup --prepare`
  -- [XTRABACKUP PARTIAL] calculate the backup size as the sum of physicalSize of the needed databases + some space for the ib_logfile which will be created during `xtrabackup --prepare`
- Check that we have enough space in the targetHost datadir:
  -- [XTRABACKUP PARTIAL] - calculate it as the sum of physicalSize of the needed databases
  -- [XTRABACKUP FULL] (`optionalListOfDatabases` is empty) calculate it as the size of the MySQL datadir (use agent API `/api/mysql-du`) + some space for the ib_logfile which will be created during `xtrabackup --prepare`
- Check that MySQL on sourceHost is running
- Start receiving on targetHost: `/api/receive-backup/:seedId/:seedMethod/:backupFolder`. If `:streamToDatadir` = true then use `Agent.MySQLInfo.MySQLDatadirPath` as `:backupFolder`, else use a newly generated folder under `Agent.MySQLInfo.MySQLBackupdirPath`
- Start streaming the backup on sourceHost: `/api/start-streaming-backup/:seedId/:targetHost/:optionalListOfDatabases`
- Start the restore process on targetHost: `/api/start-restore/:seedId/:seedMethod/:sourceHost/:sourcePort/:backupFolder/:optionalListOfDatabases`. If `:streamToDatadir` = true then use `Agent.MySQLInfo.MySQLDatadirPath` as `:backupFolder`, else use the same newly generated folder under `Agent.MySQLInfo.MySQLBackupdirPath`
- Run `/api/cleanup/:seedId` on sourceHost
- Run `/api/cleanup/:seedId` on targetHost
- Run UpdateSeedComplete
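The restore step of the xtrabackup drafts ends with parsing `xtrabackup_binlog_info` and issuing `CHANGE MASTER TO`. A rough sketch of that part for positional replication follows; the function names are hypothetical, and the GTID case (which also needs the GTID set applied before starting replication) is only parsed here, not turned into a statement.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// binlogCoordinates is the content of xtrabackup_binlog_info:
// binlog file name, position and, when GTID is enabled, the GTID set.
type binlogCoordinates struct {
	LogFile string
	LogPos  int64
	GTIDSet string // empty when the source does not use GTID
}

// parseXtrabackupBinlogInfo parses the whitespace-separated file content.
func parseXtrabackupBinlogInfo(content string) (binlogCoordinates, error) {
	fields := strings.Fields(content)
	if len(fields) < 2 {
		return binlogCoordinates{}, fmt.Errorf("expected at least file and position, got %d fields", len(fields))
	}
	pos, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		return binlogCoordinates{}, fmt.Errorf("invalid binlog position %q: %w", fields[1], err)
	}
	coords := binlogCoordinates{LogFile: fields[0], LogPos: pos}
	if len(fields) > 2 {
		// A long GTID set may be wrapped over several lines.
		coords.GTIDSet = strings.Join(fields[2:], "")
	}
	return coords, nil
}

// quoteSQL wraps s in single quotes, escaping backslashes and quotes.
func quoteSQL(s string) string {
	s = strings.ReplaceAll(s, `\`, `\\`)
	s = strings.ReplaceAll(s, `'`, `\'`)
	return "'" + s + "'"
}

// changeMasterStatement builds the positional CHANGE MASTER TO statement.
func changeMasterStatement(host string, port int, user, password string, c binlogCoordinates) string {
	return fmt.Sprintf(
		"CHANGE MASTER TO MASTER_HOST=%s, MASTER_PORT=%d, MASTER_USER=%s, MASTER_PASSWORD=%s, MASTER_LOG_FILE=%s, MASTER_LOG_POS=%d",
		quoteSQL(host), port, quoteSQL(user), quoteSQL(password), quoteSQL(c.LogFile), c.LogPos)
}
```

Here the user and password would come from `MySQLReplicationUser` and `MySQLReplicationPassword`, and host and port from the `:sourceHost` and `:sourcePort` parameters of `/api/start-restore`.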
We will also need to add these methods to the agents page in the orchestrator UI, but that's something I need to research a bit, because I'm not good at all this frontend-developer stuff :)
Do not forget to set 700 permissions for orchestrator-agent.conf.json in order to secure passwords