This repository was archived by the owner on Jun 21, 2023. It is now read-only.

Add support for xtrabackup, xtrabackup-stream, mydumper, mysqldump streaming methods for slave provisioning #21

Open
21 of 26 tasks
MaxFedotov opened this issue Apr 12, 2018 · 3 comments

MaxFedotov commented Apr 12, 2018

  • On agent get mysql datadir, log_error, innodb_log_group_home_dir from my.cnf. We can use some type of notification mechanism (like fsnotify) to subscribe to changes in config files and reload the configuration dynamically on change. This will help us stay in sync with changes in the MySQL configuration without needing to restart orchestrator-agent
  • [AGENT] New agent function DeleteDirContents, which we will use later in order to remove mysql datadir contents and for /api/delete-mysql-backupdir and /api/delete-mysql-datadir
  • [AGENT] New agent API /api/delete-mysql-backupdir, which we will use later in order to delete backups after seed will be completed.
  • [AGENT] Change /api/delete-mysql-datadir from using command from config to DeleteDirContents
  • [AGENT] New agent API – /api/mysql-backupdir-available-space Will be used in pre-seed checks. Create new API on agent, add backupDirAvailiable space to type Agent struct
  • [AGENT] New agent API /api/available-seed-methods. The agent needs to provide information about available seed methods; this depends on the binaries installed on it. Right now we will support LVM, xtrabackup, xtrabackup-stream, mydumper and mysqldump. Create new API on agent, add seedMethods to type Agent struct
  • [AGENT] New agent API /api/mysql-info. The agent needs to provide the MySQL version, the MySQL datadir location, whether this server is already a slave (SHOW SLAVE STATUS returns output; the seed will fail if our targetHost is already a slave), whether this server is already a master (SHOW SLAVE HOSTS returns output; the seed will fail if our targetHost is already a master), and whether it currently has active connections to MySQL (SELECT COUNT(*) FROM INFORMATION_SCHEMA.PROCESSLIST WHERE USER NOT IN ('MySQLUser', 'other_system_users'); the seed will fail if the number of active connections to targetHost is larger than some threshold). We will use it in pre-seed checks on targetHost and sourceHost. Create new API on agent, new type MySQLInfo struct and add type MySQLInfo struct to type Agent struct
  • [AGENT] New agent API /api/mysql-databases. The agent needs to provide information about the databases present on the server, their engines and their sizes. We need this because:
  • Not all engines support xtrabackup (for example TokuDB)
  • Sometimes we want to restore only some user databases and save old mysql db with logins\pass information (partial backup)
  • Physical size is the size of database on disk (du database_folder) - we will need it for partial xtrabackup in order to check if there is enough space in backup directory on sourceHost and in backup directory\data directory on targetHost
  • Logical size is the size of the data in the database (an estimate) - we will need it for partial mydumper\mysqldump in order to check if there is enough space in backup directory on sourceHost and in backup directory\data directory on targetHost. We can calculate it using some estimations:
    -- if we compress backup, multiply physical size * 0.5
    -- if we don't compress, multiply physical size * 0.8
    Create Api on agent, new type MySQLDatabases struct, new type MySQLDatabaseInfo struct and add MySQLDatabases to type Agent struct.
    As a result we will get JSON like this:
    {"MySQLDatabases":{"employees":{"Engines":["InnoDB"],"PhysicalSize":186929946,"LogicalSize":112157967},"log":{"Engines":["InnoDB"],"PhysicalSize":32307770,"LogicalSize":19384662},"test":{"Engines":["InnoDB","MyISAM"],"PhysicalSize":116501,"LogicalSize":69900},"test2":{"Engines":["InnoDB"],"PhysicalSize":106921,"LogicalSize":64152}},"InnoDBLogSize":100663296}
  • In order to do this the agent will need access to the MySQL instance installed on the server, so we need to add new configuration variables MySQLTopologyUser and MySQLTopologyPassword to the agent config
  • [AGENT] Add new agent configuration variable MySQLBackupDir - path where we will store backups
  • [AGENT] Add new agent configuration variable SeedPort- port we will use in order to stream xtrabackup and copy data
  • [AGENT] Add new agent configuration variable MySQLReplicationUser- user that will be used for replication
  • [AGENT] Add new agent configuration variable MySQLReplicationPassword- password for MySQLReplicationUser
  • [AGENT] Add new agent configuration variable MySQLBackupUsersOnTargetHost. When we perform partial backup, we also need to backup system databases. If MySQLBackupUsersOnTargetHost is empty, before restoring backup on targetHost we will first backup mysql database on it and restore after seed operation completes. If MySQLBackupUsersOnTargetHost is set, we will backup only these users and restore them after seed operation completes
  • [AGENT] Add some agent configuration variables for MyDumper\xtrabackup (like number of threads etc…)
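The /api/mysql-databases response and the size estimation described above could be modelled roughly as follows. This is only a sketch: the struct names come from the proposal, while the field layout is inferred from the sample JSON, and ParseMySQLDatabases and EstimateLogicalSize are hypothetical helper names, not existing orchestrator-agent functions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MySQLDatabaseInfo describes one database as reported by the agent.
// Field names are inferred from the sample JSON in the proposal.
type MySQLDatabaseInfo struct {
	Engines      []string
	PhysicalSize int64
	LogicalSize  int64
}

// MySQLDatabases mirrors the top-level JSON document.
type MySQLDatabases struct {
	MySQLDatabases map[string]MySQLDatabaseInfo
	InnoDBLogSize  int64
}

// ParseMySQLDatabases (hypothetical helper) decodes the agent response body.
func ParseMySQLDatabases(body []byte) (MySQLDatabases, error) {
	var dbs MySQLDatabases
	err := json.Unmarshal(body, &dbs)
	return dbs, err
}

// EstimateLogicalSize (hypothetical helper) applies the rough ratios from
// the proposal: 0.5 of the physical size for a compressed backup,
// 0.8 of the physical size otherwise.
func EstimateLogicalSize(physicalSize int64, compressed bool) int64 {
	if compressed {
		return physicalSize / 2
	}
	return physicalSize * 8 / 10
}

func main() {
	body := []byte(`{"MySQLDatabases":{"test":{"Engines":["InnoDB","MyISAM"],"PhysicalSize":116501,"LogicalSize":69900}},"InnoDBLogSize":100663296}`)
	dbs, err := ParseMySQLDatabases(body)
	if err != nil {
		panic(err)
	}
	for name, info := range dbs.MySQLDatabases {
		fmt.Println(name, info.Engines, EstimateLogicalSize(info.PhysicalSize, true))
	}
}
```

Integer arithmetic is used for the ratios so the estimate stays an int64 byte count, like the sizes in the JSON.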

  • [ORCHESTRATOR] Change GetAgent function so we get all new data and put it into type Agent struct
  • [ORCHESTRATOR] Let’s keep func Seed as the main entry point for all seeding operations, but add an additional param seedMethod and a switch in this function that starts different operations for different seed methods. We should also add an optional parameter with a list of databases to copy; if this parameter is missing, we copy all databases. For xtrabackup stream we also need another bool parameter, streamToDatadir: if it is true, we stream the backup directly to the MySQL datadir and add some prerequisite actions; if it is false, we stream the backup to MySQLBackupDir. So the API call for seed would look like agent-seed/:targetHost/:sourceHost/:seedMethod/:streamToDatadir/:optionalListOfDatabases
  • [ORCHESTRATOR] Change logic of the Submit agent function. Right now, when an agent is submitted for the first time, we get only basic data from it (hostname, port, token, last_submitted), so we have to wait for the AgentPollMinutes interval to pass before the agent is considered outdated and UpdateAgentInfo fetches the additional data. This interval is 60 min by default, so we get all data and can start seeding only after 60 min. We can add logic so that if an agent is submitted for the first time and we don’t have information about it in the orchestrator db, we run UpdateAgentInfo immediately and get all other data
  • [ORCHESTRATOR] Change func SubmitSeedEntry function. Add seedMethod as param to it and to agent_seed table
  • [ORCHESTRATOR] We will keep updateSeedStateEntry and updateSeedComplete the same as they are and will use them to track all other seed methods
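The proposed change to the Submit agent logic (fetch full agent data immediately on first submission) could look roughly like this. It is a sketch only: agentStore is a hypothetical in-memory stand-in for the orchestrator backend table, and the updateInfo callback stands in for UpdateAgentInfo.

```go
package main

import "fmt"

// agentStore is a hypothetical in-memory stand-in for the orchestrator
// backend table that records submitted agents.
type agentStore struct {
	known map[string]bool
}

// SubmitAgent records the agent and reports whether this was its first
// submission. On a first submission it calls updateInfo right away
// instead of waiting for the regular poll interval.
func (s *agentStore) SubmitAgent(hostname string, updateInfo func(hostname string) error) (bool, error) {
	if s.known[hostname] {
		return false, nil
	}
	s.known[hostname] = true
	if err := updateInfo(hostname); err != nil {
		return true, err
	}
	return true, nil
}

func main() {
	store := &agentStore{known: map[string]bool{}}
	update := func(hostname string) error {
		fmt.Println("fetching full agent info for", hostname)
		return nil
	}
	for i := 0; i < 2; i++ {
		first, err := store.SubmitAgent("db1.example.com", update)
		fmt.Println("first submission:", first, "error:", err)
	}
}
```

Already known agents are left to the regular polling, so the existing behaviour does not change for them.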

  • [AGENT] New agent API /api/start-local-backup/:seedId/:seedMethod/:optionalListOfDatabases
    Will be used on sourceHost to:
  • create a folder to store backup(possible naming convention using backup date)
  • start mysqldump/mydumper/xtrabackup(without streaming) with necessary params (threads, optionalListOfDatabases if they are present)
  • Check that replication user exists on sourceHost. If it does not exist, create it and GRANT REPLICATION SLAVE ON *.* TO it
    This API call should return path to created folder with backup
  • [AGENT] New agent API /api/receive-backup/:seedId/:seedMethod/:backupFolder (backupFolder must be url encoded)
  • if backup folder == config.Config.MySQLDatadir:
    -- backup users to config.Config.MySQLBackupDir (either all, or only those specified in config.Config.MySQLBackupUsersOnTargetHost)
    -- stop MySQL
    -- remove everything from MySQL datadir
    -- remove ib_logfiles from innodb_log_group_home_dir
  • else create backup folder
  • start netcat to listen on SeedPort port with --wait parameter = 30 sec in order to terminate netcat after the seeding operation completes
  • if seedMethod = xtrabackup-stream pipe netcat to xbstream. For others pipe it to tar
  • [AGENT] New agent API /api/send-local-backup/:seedId/:targetHost/:backupFolder will be used on sourceHost to tar.gz contents of backup folder and send archive to targetHost on SeedPort port using netcat
  • [AGENT] New agent API /api/start-streaming-backup/:seedId/:targetHost/:optionalListOfDatabases Will be used to start xtrabackup stream and send it using netcat to targetHost on SeedPort.
    -- Check that replication user exists on sourceHost. If it does not exist, create it and GRANT REPLICATION SLAVE ON *.* TO it
  • [AGENT] New agent API /api/start-restore/:seedId/:seedMethod/:sourceHost/:sourcePort/:backupFolder/:optionalListOfDatabases will be used on targetHost to:
  • if optionalListOfDatabases is not empty - add replicate-do-db to targetHost my.cnf and restart MySQL
  • backup users to config.Config.MySQLBackupDir (either all, or only those specified in config.Config.MySQLBackupUsersOnTargetHost)
  • [MYSQLDUMP]:
    -- if backup is compressed gunzip it and execute all backup.sql file in backupFolder
    -- create replication user
    -- execute START SLAVE
  • [MYDUMPER]:
    -- run myloader with necessary params
    -- create replication user
    -- parse metadata file in MySQLBackupDir and execute CHANGE MASTER TO and START SLAVE (2 different cases for GTID and positional replicas – will create separate function for this)
  • [XTRABACKUP FULL\XTRABACKUP STREAM FULL TO BACKUPDIR](optionalListOfDatabases is empty, backupFolder != config.Config.MySQLDatadir, seedMethod xtrabackup or xtrabackup-stream):
    -- run xtrabackup --prepare on copied backup in backupFolder
    -- stop MySQL
    -- remove everything from config.Config.MySQLDatadir
    -- remove ib_logfiles from innodb_log_group_home_dir
    -- run xtrabackup --copy-back --target-dir=backupdir (or maybe use --move-back??)
    -- start MySQL
    -- parse xtrabackup_binlog_info in MySQL datadir and execute CHANGE MASTER TO and START SLAVE
  • [XTRABACKUP PARTIAL\XTRABACKUP STREAM PARTIAL TO BACKUPDIR](optionalListOfDatabases is not empty, backupFolder != config.Config.MySQLDatadir, seedMethod xtrabackup or xtrabackup-stream):
    -- run xtrabackup --prepare on copied backup in backupFolder
    -- stop MySQL
    -- remove everything from config.Config.MySQLDatadir
    -- remove ib_logfiles from innodb_log_group_home_dir
    -- run xtrabackup --copy-back --target-dir=backupdir
    -- start MySQL
    -- if partial backup - run mysql_upgrade to restore system databases
    -- create replication user
    -- parse xtrabackup_binlog_info in MySQL datadir and execute CHANGE MASTER TO and START SLAVE
  • [XTRABACKUP STREAM FULL TO DATADIR](optionalListOfDatabases is empty, backupFolder = config.Config.MySQLDatadir, seedMethod xtrabackup-stream):
    -- run xtrabackup --prepare on config.Config.MySQLDatadir
    -- start MySQL
    -- parse xtrabackup_binlog_info in MySQL datadir and execute CHANGE MASTER TO and START SLAVE
  • [XTRABACKUP STREAM PARTIAL TO DATADIR](optionalListOfDatabases is not empty, backupFolder = config.Config.MySQLDatadir, seedMethod xtrabackup-stream):
    -- run xtrabackup --prepare on config.Config.MySQLDatadir
    -- start MySQL
    -- run mysql_upgrade to restore system databases
    -- create replication user
    -- parse xtrabackup_binlog_info in MySQL datadir and execute CHANGE MASTER TO and START SLAVE
  • restore users
  • [AGENT] New agent API /api/cleanup/:seedId Will be used on targetHost and sourceHost in order to remove contents of config.Config.MySQLBackupdir after seed process
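The "parse xtrabackup_binlog_info and execute CHANGE MASTER TO and START SLAVE" step in the restore scenarios above could be sketched like this. Assumptions: the file holds whitespace-separated binlog file name, position and, when GTIDs are in use, a GTID set; BinlogInfo, ParseBinlogInfo and ChangeMasterStatements are hypothetical names; replication credentials are omitted; host names are interpolated without escaping, which a real implementation would have to handle. The GTID branch follows the usual MySQL procedure (set gtid_purged, then use auto-positioning) and would need adjusting for MariaDB.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// BinlogInfo holds the replication coordinates recorded by xtrabackup.
type BinlogInfo struct {
	File    string
	Pos     int64
	GTIDSet string // empty when no GTID set was recorded
}

// ParseBinlogInfo parses the content of xtrabackup_binlog_info.
// A GTID set spanning several lines is joined back into one string.
func ParseBinlogInfo(content string) (BinlogInfo, error) {
	fields := strings.Fields(content)
	if len(fields) < 2 {
		return BinlogInfo{}, fmt.Errorf("unexpected xtrabackup_binlog_info content: %q", content)
	}
	pos, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		return BinlogInfo{}, fmt.Errorf("invalid binlog position %q: %v", fields[1], err)
	}
	return BinlogInfo{File: fields[0], Pos: pos, GTIDSet: strings.Join(fields[2:], "")}, nil
}

// ChangeMasterStatements builds the statements for the two cases named in
// the proposal: GTID-based and positional replication.
func ChangeMasterStatements(info BinlogInfo, sourceHost string, sourcePort int) []string {
	if info.GTIDSet != "" {
		return []string{
			fmt.Sprintf("SET GLOBAL gtid_purged='%s'", info.GTIDSet),
			fmt.Sprintf("CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%d, MASTER_AUTO_POSITION=1", sourceHost, sourcePort),
			"START SLAVE",
		}
	}
	return []string{
		fmt.Sprintf("CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%d, MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d",
			sourceHost, sourcePort, info.File, info.Pos),
		"START SLAVE",
	}
}

func main() {
	info, err := ParseBinlogInfo("mysql-bin.000003\t154\n")
	if err != nil {
		panic(err)
	}
	for _, stmt := range ChangeMasterStatements(info, "source.example.com", 3306) {
		fmt.Println(stmt)
	}
}
```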

All these changes are completely backward compatible (except that for now we will need to add “lvm” as the seedMethod param when we use the “Seed” button on an agent page in the Snapshots area) and won’t affect the current agent and orchestrator workflow.
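A minimal sketch of the seedMethod switch inside Seed might look as follows. The constant names, the SeedFlow function and the flow labels it returns are hypothetical. Rejecting streamToDatadir for methods other than xtrabackup-stream is an assumption based on the drafts, where only xtrabackup-stream streams directly to the datadir.

```go
package main

import "fmt"

// Seed method names as listed in the proposal.
const (
	SeedMethodLVM              = "lvm"
	SeedMethodXtrabackup       = "xtrabackup"
	SeedMethodXtrabackupStream = "xtrabackup-stream"
	SeedMethodMydumper         = "mydumper"
	SeedMethodMysqldump        = "mysqldump"
)

// SeedFlow (hypothetical) picks which flow Seed would start for a given
// seed method. The returned labels are illustrative only.
func SeedFlow(seedMethod string, streamToDatadir bool) (string, error) {
	switch seedMethod {
	case SeedMethodLVM:
		// The existing snapshot-based flow stays as it is.
		return "lvm-snapshot", nil
	case SeedMethodXtrabackup, SeedMethodMydumper, SeedMethodMysqldump:
		if streamToDatadir {
			return "", fmt.Errorf("seed method %q cannot stream to the datadir", seedMethod)
		}
		return "local-backup", nil
	case SeedMethodXtrabackupStream:
		if streamToDatadir {
			return "stream-to-datadir", nil
		}
		return "stream-to-backupdir", nil
	default:
		return "", fmt.Errorf("unknown seed method %q", seedMethod)
	}
}

func main() {
	for _, method := range []string{"lvm", "mysqldump", "xtrabackup-stream", "rsync"} {
		flow, err := SeedFlow(method, false)
		fmt.Println(method, flow, err)
	}
}
```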


In order to support these new seed methods, we can reuse part of the logic of the current executeSeed function: divide the backup\restore flow into finite operations and use the UpdateSeedStateEntry function to track progress.
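As an illustration of dividing a flow into finite operations, the xtrabackup restore scenarios can be expressed as an ordered list of steps that is executed one by one while each outcome is reported. This is a sketch: XtrabackupRestoreSteps and RunSteps are hypothetical names, the step labels are shorthand for the commands in the scenarios, and the track callback stands in for UpdateSeedStateEntry.

```go
package main

import (
	"errors"
	"fmt"
)

// errStepFailed is a sample error used by the demo below.
var errStepFailed = errors.New("step failed")

// XtrabackupRestoreSteps lists the restore operations for the four
// xtrabackup scenarios (full or partial, to backupdir or to datadir).
func XtrabackupRestoreSteps(partial, toDatadir bool) []string {
	steps := []string{"xtrabackup --prepare"}
	if !toDatadir {
		// The backup was received into the backup dir and must be copied back.
		steps = append(steps, "stop MySQL", "empty datadir", "remove ib_logfiles", "xtrabackup --copy-back")
	}
	steps = append(steps, "start MySQL")
	if partial {
		steps = append(steps, "restore system databases", "create replication user")
	}
	return append(steps, "CHANGE MASTER TO and START SLAVE")
}

// RunSteps executes the steps in order, reports every outcome through
// track, and stops at the first failure.
func RunSteps(steps []string, execute func(step string) error, track func(step string, err error)) error {
	for _, step := range steps {
		err := execute(step)
		track(step, err)
		if err != nil {
			return fmt.Errorf("step %q failed: %w", step, err)
		}
	}
	return nil
}

func main() {
	steps := XtrabackupRestoreSteps(true, false)
	err := RunSteps(steps,
		func(step string) error {
			if step == "start MySQL" {
				return errStepFailed // simulate a failure halfway through
			}
			return nil
		},
		func(step string, err error) { fmt.Printf("%-35s err=%v\n", step, err) })
	fmt.Println(err)
}
```

Keeping the step list as data makes it easy to record the last completed step in the agent_seed state table and to see where a failed seed stopped.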

As a rough draft, I propose the following scenarios for the new seed methods:

Basic checks for all types of seed methods:

  • Check that there are no active seeds for targetHost and sourceHost (query agent_seed table)
  • Check that MySQL has the same major versions on both targetHost and sourceHost
  • Check that binary logging is enabled on sourceHost
  • Check that targetHost isn’t master for some other hosts
  • Check that targetHost isn't slave for some other hosts
  • Check that there are no database connections on targetHost
  • [XTRABACKUP ONLY] Check that there is no TokuDB engine in a databases on sourceHost
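The basic checks above could be collected into a single function that returns every failed check, so the caller can show all problems at once. This is a sketch: HostInfo and PreSeedChecks are hypothetical, the fields would be filled from the agent data (/api/mysql-info, /api/mysql-databases) and the agent_seed table, and applying the TokuDB check to both xtrabackup and xtrabackup-stream is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// HostInfo (hypothetical) carries the facts the checks need for one host.
type HostInfo struct {
	MajorVersion      string
	HasActiveSeed     bool
	BinlogEnabled     bool
	IsMaster          bool
	IsSlave           bool
	ActiveConnections int
	Engines           []string
}

// PreSeedChecks returns a description of every failed check; an empty
// result means the seed may start.
func PreSeedChecks(source, target HostInfo, seedMethod string) []string {
	var failures []string
	if source.HasActiveSeed || target.HasActiveSeed {
		failures = append(failures, "an active seed already exists for sourceHost or targetHost")
	}
	if source.MajorVersion != target.MajorVersion {
		failures = append(failures, fmt.Sprintf("MySQL major versions differ: %s vs %s", source.MajorVersion, target.MajorVersion))
	}
	if !source.BinlogEnabled {
		failures = append(failures, "binary logging is disabled on sourceHost")
	}
	if target.IsMaster {
		failures = append(failures, "targetHost is a master for other hosts")
	}
	if target.IsSlave {
		failures = append(failures, "targetHost is already a slave")
	}
	if target.ActiveConnections > 0 {
		failures = append(failures, "targetHost has active database connections")
	}
	if strings.HasPrefix(seedMethod, "xtrabackup") {
		for _, engine := range source.Engines {
			if strings.EqualFold(engine, "TokuDB") {
				failures = append(failures, "sourceHost has TokuDB tables, which xtrabackup cannot copy")
				break
			}
		}
	}
	return failures
}

func main() {
	source := HostInfo{MajorVersion: "5.7", BinlogEnabled: true, Engines: []string{"InnoDB", "TokuDB"}}
	target := HostInfo{MajorVersion: "5.6"}
	for _, failure := range PreSeedChecks(source, target, "xtrabackup") {
		fmt.Println("check failed:", failure)
	}
}
```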

Draft for mysqldump/mydumper/xtrabackup:

  • Call agent-seed/:targetHost/:sourceHost/:seedMethod/:streamToDatadir/:optionalListOfDatabases (:streamToDatadir = false)
  • Run GetAgent function for sourceHost
  • Run GetAgent function for targetHost
  • Run basic checks
  • Get size of backup and check that we have enough space in MySQLBackupDir folders on sourceHost and targetHost:
    -- [MYDUMPER/MYSQLDUMP] calculate backup size as the sum of logicalSize of the needed databases
    -- [XTRABACKUP FULL] (optionalListOfDatabases is empty) – calculate backup size as a size of MySQL datadir (use agent API /api/mysql-du) + some space for ib_logfile which will be created during xtrabackup --prepare
    -- [XTRABACKUP PARTIAL] calculate backup size as a sum of physicalSize of needed databases + some space for ib_logfile which will be created during xtrabackup --prepare
  • Check that we have enough space in targetHost MySQL datadir:
    -- [MYDUMPER/MYSQLDUMP/XTRABACKUP PARTIAL] - calculate it as a sum of physicalSize of needed databases
    -- [XTRABACKUP FULL] (optionalListOfDatabases is empty) calculate it as a size of MySQL datadir (use agent API /api/mysql-du) + some space for ib_logfile which will be created during xtrabackup --prepare
  • Check that MySQL on sourceHost is running
  • Start backup process on sourceHost /api/start-local-backup/:seedId/:seedMethod/:optionalListOfDatabases. This API call will return path to directory with backup
  • Start receiving on targetHost /api/receive-backup/:seedId/:seedMethod/:backupFolder. If :streamToDatadir = true then use Agent.MySQLInfo.MySQLDatadirPath as :backupFolder, else use path returned from /api/start-local-backup/:seedId/:seedMethod/:optionalListOfDatabases
  • Start sending backup archive on sourceHost /api/send-local-backup/:seedId/:targetHost/:backupFolder
  • Start restore process on targetHost /api/start-restore/:seedId/:seedMethod/:sourceHost/:sourcePort/:backupFolder/:optionalListOfDatabases. If :streamToDatadir = true then use Agent.MySQLInfo.MySQLDatadirPath as :backupFolder, else use path returned from /api/start-local-backup/:seedId/:seedMethod/:optionalListOfDatabases
  • Run /api/cleanup/:seedId on sourceHost
  • Run /api/cleanup/:seedId on targetHost
  • Run UpdateSeedComplete
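The backup size rules in this draft can be sketched as one function. Assumptions: DatabaseSize and RequiredBackupSpace are hypothetical names, the per-database sizes come from /api/mysql-databases, datadirSize comes from /api/mysql-du, and the "some space for ib_logfile" allowance is passed in as ibLogSize (for example the InnoDBLogSize value reported by the agent).

```go
package main

import "fmt"

// DatabaseSize holds the two sizes reported per database.
type DatabaseSize struct {
	PhysicalSize int64
	LogicalSize  int64
}

// RequiredBackupSpace returns the number of bytes a backup is expected to
// need. An empty selected list means all databases (a full backup).
func RequiredBackupSpace(seedMethod string, dbs map[string]DatabaseSize, selected []string, datadirSize, ibLogSize int64) (int64, error) {
	names := selected
	if len(names) == 0 {
		for name := range dbs {
			names = append(names, name)
		}
	}
	var physical, logical int64
	for _, name := range names {
		size, ok := dbs[name]
		if !ok {
			return 0, fmt.Errorf("unknown database %q", name)
		}
		physical += size.PhysicalSize
		logical += size.LogicalSize
	}
	switch seedMethod {
	case "mydumper", "mysqldump":
		return logical, nil
	case "xtrabackup", "xtrabackup-stream":
		if len(selected) == 0 {
			// Full backup: whole datadir plus the redo logs created by --prepare.
			return datadirSize + ibLogSize, nil
		}
		return physical + ibLogSize, nil
	default:
		return 0, fmt.Errorf("unsupported seed method %q", seedMethod)
	}
}

func main() {
	dbs := map[string]DatabaseSize{
		"employees": {PhysicalSize: 186929946, LogicalSize: 112157967},
		"log":       {PhysicalSize: 32307770, LogicalSize: 19384662},
	}
	required, err := RequiredBackupSpace("xtrabackup", dbs, []string{"employees"}, 0, 100663296)
	fmt.Println(required, err)
}
```

The result would then be compared with the available space reported for MySQLBackupDir on both hosts and for the datadir on targetHost.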

Draft for xtrabackup-stream:

  • Call agent-seed/:targetHost/:sourceHost/:seedMethod/:streamToDatadir/:optionalListOfDatabases
  • Run GetAgent function for sourceHost
  • Run GetAgent function for targetHost
  • Run basic checks
  • [IF streamToDataDir = false] Get size of backup and check that we have enough space in MySQLBackupDir folders on sourceHost and targetHost:
    -- [XTRABACKUP FULL] (optionalListOfDatabases is empty) – calculate backup size as a size of MySQL datadir (use agent API /api/mysql-du) + some space for ib_logfile which will be created during xtrabackup --prepare
    -- [XTRABACKUP PARTIAL] calculate backup size as a sum of physicalSize of needed databases + some space for ib_logfile which will be created during xtrabackup --prepare
  • Check that we have enough space in targetHost datadir:
    -- [XTRABACKUP PARTIAL] - calculate it as a sum of physicalSize of needed databases
    -- [XTRABACKUP FULL] (optionalListOfDatabases is empty) calculate it as a size of MySQL datadir (use agent API /api/mysql-du) + some space for ib_logfile which will be created during xtrabackup --prepare
  • Check that MySQL on sourceHost is running
  • Start receiving on targetHost /api/receive-backup/:seedId/:seedMethod/:backupFolder. If :streamToDatadir = true then use Agent.MySQLInfo.MySQLDatadirPath as :backupFolder, else use Agent.MySQLInfo.MySQLBackupdirPath\generate_new_folder_name
  • Start streaming backup on sourceHost /api/start-streaming-backup/:seedId/:targetHost/:optionalListOfDatabases
  • Start restore process on targetHost /api/start-restore/:seedId/:seedMethod/:sourceHost/:sourcePort/:backupFolder/:optionalListOfDatabases. If :streamToDatadir = true then use Agent.MySQLInfo.MySQLDatadirPath as :backupFolder, else use Agent.MySQLInfo.MySQLBackupdirPath\generate_new_folder_name
  • Run /api/cleanup/:seedId on sourceHost
  • Run /api/cleanup/:seedId on targetHost
  • Run UpdateSeedComplete
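The order of agent calls in the xtrabackup-stream draft can be sketched as a list of (host, path) pairs. Assumptions: AgentCall and StreamSeedCalls are hypothetical, the paths follow the API definitions from the task list (with :seedId and :targetHost for start-streaming-backup), the database list is encoded as comma-separated names, and the backup folder is URL encoded as the receive-backup definition requires.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// AgentCall is one request orchestrator would send to an agent.
type AgentCall struct {
	Host string
	Path string
}

// StreamSeedCalls returns the agent calls for an xtrabackup-stream seed in
// the order given in the draft. backupFolder is either the MySQL datadir
// (streamToDatadir = true) or a new folder inside the backup dir.
func StreamSeedCalls(seedID int64, sourceHost, targetHost string, sourcePort int, backupFolder string, databases []string) []AgentCall {
	const method = "xtrabackup-stream"
	folder := url.PathEscape(backupFolder) // the folder must fit into one path segment
	dbSuffix := ""
	if len(databases) > 0 {
		dbSuffix = "/" + strings.Join(databases, ",")
	}
	return []AgentCall{
		{targetHost, fmt.Sprintf("/api/receive-backup/%d/%s/%s", seedID, method, folder)},
		{sourceHost, fmt.Sprintf("/api/start-streaming-backup/%d/%s%s", seedID, targetHost, dbSuffix)},
		{targetHost, fmt.Sprintf("/api/start-restore/%d/%s/%s/%d/%s%s", seedID, method, sourceHost, sourcePort, folder, dbSuffix)},
		{sourceHost, fmt.Sprintf("/api/cleanup/%d", seedID)},
		{targetHost, fmt.Sprintf("/api/cleanup/%d", seedID)},
	}
}

func main() {
	calls := StreamSeedCalls(7, "source.example.com", "target.example.com", 3306, "/var/lib/mysql", nil)
	for _, call := range calls {
		fmt.Println(call.Host, call.Path)
	}
}
```

The receiver is started on targetHost before the stream starts on sourceHost, so netcat is already listening on SeedPort when data arrives.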

Also we will need to add these methods to agents page in orchestrator UI, but that’s the thing that I need to research a bit, because I’m not good at all in all this frontend-developers stuff :)

Do not forget to set 700 permissions for orchestrator-agent.conf.json in order to secure passwords

MaxFedotov commented Apr 13, 2018

@shlomi-noach Hi Shlomi.
Can you please take a look at this draft? I'd like to know your opinion and maybe get some comments or criticism before starting to implement it

@colinmollenhour

What about the CLONE command added in 8.0.17? It seems this could greatly simplify orchestrator's slave provisioning even making it possible without the use of orchestrator-agent at all?


ronivay commented Jan 26, 2022

Any updates on this? Would love to see said seed options being implemented.
