This is a generalized version of the popular schickling/dockerfiles container images postgres-backup-s3 and mysql-backup-s3, which take SQL backups of the relevant database type, optionally encrypt the backup file, and upload it to Amazon S3 (or an API-compatible data store). Those images work well but have a number of limitations:
- the backing store must be S3 or some compatible datastore supported by the `aws` CLI
- authentication must use static credentials supplied in non-standard environment variables `S3_ACCESS_KEY_ID`, etc. (at least for the postgresql backup image)
- the encryption algorithm used by the postgresql image is `aes-256-cbc`, which lacks in-built authentication and has other known shortcomings
This tool uses `rage` for encryption and `rclone` (rather than the `aws` CLI) to upload the backup files, meaning you can use any supported rclone backend as the data store, including but not limited to:
- S3 or compatible data stores, using any authentication method supported by the AWS Go SDK (static credentials, IAM roles for containers, `AssumeRoleWithWebIdentity`, etc.)
- Other cloud storage services such as Google Drive or Google Cloud, Azure Blob or File Storage, Dropbox, and many others
- Private cloud object stores such as OpenStack Swift
- An SFTP, SMB or WebDav file server
The ghcr.io/gatenlp/postgresql-backup-rclone and ghcr.io/gatenlp/mariadb-backup-rclone images are designed to defer where possible to the underlying tools' native configuration mechanisms rather than introducing our own configuration mechanism. This makes them slightly more complicated to set up when compared to the original schickling/dockerfiles images, but opens up the full flexibility of the underlying tools. The examples folder has some sample manifests that show how you might deploy a Kubernetes CronJob that does daily backups to a variety of data stores.
There are a small number of environment variables that are interpreted directly by the script:
- `BACKUP_DATABASES`: the names of the databases from the target server that you want to back up, separated by commas. Each named database will be backed up to a different file.
  - alternatively, you can set `BACKUP_ALL=true` to dump all databases into a single file (the `--all-databases` option to `mysqldump`, or the `pg_dumpall` tool for PostgreSQL)
- `BACKUP_FILE_NAME`: a file name or file name pattern to which the backup will be written. The pattern may include `strftime` date formatting directives to include the date and time of the backup as part of the file name, and may include subdirectories. For example `%Y/%m/backup-%Y-%m-%dT%H-%M-%S` would include the full date and time in the file name, and place it in a folder named for the year and month, e.g. `2025/08/backup-2025-08-12T13-45-15`. The pattern should include only ASCII letters, numbers, `_`, `-`, `/` and `.` characters; anything else will be changed to a hyphen, and a `.sql.gz` suffix will be added if it is not already present.
  - if not using `BACKUP_ALL` mode, the `BACKUP_FILE_NAME` should include a placeholder `$DB` or `${DB}` which will be replaced by the name of the database. This is required if more than one database is named by `BACKUP_DATABASES`
- `REMOTE_NAME`: name of the `rclone` "remote" that defines the target datastore - this can be either the name of a remote that is configured with the standard rclone environment variables or configuration file, or a connection string starting with `:` that provides the remote configuration inline, e.g. `:s3,env_auth`. The default value if not specified is `store`, which would then typically be configured with environment variables of the form `RCLONE_CONFIG_STORE_{option}`.
- `UPLOAD_PREFIX`: optional prefix to prepend to the generated file name to give the final location within the rclone remote. For example, if the remote is S3 this could be the name of the bucket.
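For example, a sketch of the script-level settings for a PostgreSQL backup might look like this (database names, bucket and prefix are hypothetical; the database connection and rclone configuration still need to be supplied, as described in the following sections):

```sh
# Hypothetical script-level settings only; database connection and rclone
# configuration (covered below) are still required.
docker run --rm \
  -e BACKUP_DATABASES=appdb,auditdb \
  -e BACKUP_FILE_NAME='%Y/%m/$DB-%Y-%m-%dT%H-%M-%S' \
  -e REMOTE_NAME=store \
  -e UPLOAD_PREFIX=my-backup-bucket/daily \
  ghcr.io/gatenlp/postgresql-backup-rclone
```

Note the single quotes around the file name pattern, which prevent the shell from expanding the `$DB` placeholder before it reaches the container.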
The parameters for connection to the database are provided using the native methods of each database client. Typically this is a set of environment variables, command-line options, a bind-mounted configuration file, or a combination of these.
The pg_dump tool can take configuration from environment variables, files, and/or command-line parameters. In most cases you will probably use the following environment variables:
- `PGHOST`: hostname of the database server
- `PGPORT`: port number, if not the default 5432
- `PGUSER`: username to authenticate
- `PGPASSWORD`: password for that username
  - if you specify `PGPASSWORD_FILE` then the script will read the contents of that file into the `PGPASSWORD` variable
  - alternatively you can provide your own `.pgpass`-formatted file with credentials, and reference that with the `PGPASSFILE` environment variable
Any additional command-line options passed to the container will be forwarded unchanged to pg_dump or pg_dumpall as appropriate.
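For instance, a complete PostgreSQL run might look like the following sketch (hostname, secret path and the forwarded `--no-owner` option are illustrative assumptions; the rclone configuration for the `store` remote is covered below):

```sh
# Hypothetical PostgreSQL backup: connection via PG* variables, password
# read from a mounted secret file, one extra option forwarded to pg_dump.
docker run --rm \
  -e PGHOST=db.internal.example.com \
  -e PGUSER=backup_user \
  -e PGPASSWORD_FILE=/run/secrets/pg-password \
  -v /srv/secrets/pg-password:/run/secrets/pg-password:ro \
  -e BACKUP_DATABASES=appdb \
  -e BACKUP_FILE_NAME='$DB-%Y-%m-%d' \
  -e REMOTE_NAME=store \
  -e UPLOAD_PREFIX=my-backup-bucket \
  ghcr.io/gatenlp/postgresql-backup-rclone --no-owner
```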
The mariadb-dump tool can take configuration from environment variables, option files, and/or command-line parameters. In most cases you will probably use the following environment variables or parameters:
- `MYSQL_HOST` or `--host=...`: hostname of the database server
- `MYSQL_TCP_PORT` or `--port=...`: port number, if not the default 3306
- `MYSQL_PWD` or `--password=...`: password for authentication
  - if you specify the environment variable `MYSQL_PWD_FILE` then the script will read the contents of that file into the `MYSQL_PWD` variable
- `--user=...`: username for authentication - note that `mariadb-dump` does not provide an environment variable alternative for this option; it can only be supplied on the command line or in an option file.
Any additional command-line options passed to the container will be forwarded unchanged to the `mysqldump` command.
Alternatively you can provide the connection and authentication details in a my.cnf-style "options file" bind-mounted into the container at /etc/mysql, or at some other location if you specify an argument of --defaults-extra-file=/path/to/my.cnf.
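As a sketch (hostname, secret path and database name are hypothetical), a MariaDB backup could be run like this, with the username forwarded as a command-line option:

```sh
# Hypothetical MariaDB backup: connection via MYSQL_* variables, password
# read from a mounted secret file, username forwarded on the command line.
docker run --rm \
  -e MYSQL_HOST=db.internal.example.com \
  -e MYSQL_PWD_FILE=/run/secrets/mysql-password \
  -v /srv/secrets/mysql-password:/run/secrets/mysql-password:ro \
  -e BACKUP_DATABASES=appdb \
  -e BACKUP_FILE_NAME='$DB-%Y-%m-%d' \
  -e REMOTE_NAME=store \
  -e UPLOAD_PREFIX=my-backup-bucket \
  ghcr.io/gatenlp/mariadb-backup-rclone --user=backup_user
```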
There are three basic ways to configure rclone to talk to your data store:
1. use environment variables `RCLONE_CONFIG_STORE_*`
2. bind-mount an `rclone.conf` file into your container, and set `RCLONE_CONFIG=/path/to/rclone.conf`
3. set `REMOTE_NAME` to a full connection string starting with `:`
In most cases option 1 will be the simplest. The following sections provide examples for common datastore types.
Note: There is no way to pass command line parameters through to `rclone`, but every parameter to `rclone` has an environment variable equivalent - take the long option form, replace the leading `--` with `RCLONE_`, change the remaining hyphens to underscores and convert to upper-case. E.g. `--max-connections 3` on the command line becomes `RCLONE_MAX_CONNECTIONS=3` in the environment.
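A couple more examples of this translation rule, using standard rclone flags:

```sh
# --transfers 4                    becomes:
export RCLONE_TRANSFERS=4
# --s3-storage-class STANDARD_IA   becomes:
export RCLONE_S3_STORAGE_CLASS=STANDARD_IA
```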
For an S3 bucket on AWS, set:

- `RCLONE_CONFIG_STORE_TYPE=s3`
- `RCLONE_CONFIG_STORE_PROVIDER=AWS`
- `RCLONE_CONFIG_STORE_ENV_AUTH=true`

Additional notes:

- If your bucket uses `SSE_KMS` server-side encryption then you should also set `RCLONE_IGNORE_CHECKSUM=true`, since SSE breaks the checksum tests that rclone normally attempts to perform
- By default, `rclone` will check whether the bucket exists before uploading to it, and make a `HEAD` request after uploading each file to check that the upload was successful. These checks require read access to the bucket, so if your credentials have "write-only" permission (i.e. the IAM policy permits `s3:PutObject` but not `s3:GetObject`), then you will need to disable these checks by setting `RCLONE_S3_NO_CHECK_BUCKET=true` and `RCLONE_S3_NO_HEAD=true`
You then need to provide the region name and credentials, in some form that the AWS SDK understands. The region is set in the variable AWS_REGION, e.g. AWS_REGION=us-east-1. For credentials, the most common option is AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for static credentials, but other supported authentication schemes include
- `AWS_WEB_IDENTITY_TOKEN_FILE`, `AWS_ROLE_ARN` and `AWS_ROLE_SESSION_NAME` to assume an IAM role from a JWT token
- `AWS_CONTAINER_CREDENTIALS_FULL_URI` (and `AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE`) to set an HTTP/HTTPS endpoint that serves temporary credentials, and an authorization token to use when calling it - this is set up for you automatically when using "pod identity" in EKS
UPLOAD_PREFIX would then be set to the bucketname/prefix where you want to store your backups.
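Putting this together, a static-credentials AWS S3 setup might look like the following sketch (region, bucket and key values are hypothetical):

```sh
# Hypothetical AWS S3 configuration with static credentials.
export RCLONE_CONFIG_STORE_TYPE=s3
export RCLONE_CONFIG_STORE_PROVIDER=AWS
export RCLONE_CONFIG_STORE_ENV_AUTH=true
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=AKIAEXAMPLE            # hypothetical access key
export AWS_SECRET_ACCESS_KEY=exampleSecretKey   # hypothetical secret key
export UPLOAD_PREFIX=my-backup-bucket/postgres
```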
To use server-side encryption with a customer-provided key:
- `RCLONE_S3_SSE_CUSTOMER_ALGORITHM=AES256`
- `RCLONE_S3_SSE_CUSTOMER_KEY=<your key>` or `RCLONE_S3_SSE_CUSTOMER_KEY_BASE64=<your key in base64>`
The same approach works for other services or self-hosted datastores that are compatible with the S3 API; you just need to set `RCLONE_CONFIG_STORE_PROVIDER=Minio` (or whatever provider you are using) and `AWS_ENDPOINT_URL_S3` to point to your provider's endpoint.
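For instance, a self-hosted MinIO deployment might be configured like this (endpoint, bucket and credentials are hypothetical):

```sh
# Hypothetical MinIO (S3-compatible) configuration.
export RCLONE_CONFIG_STORE_TYPE=s3
export RCLONE_CONFIG_STORE_PROVIDER=Minio
export RCLONE_CONFIG_STORE_ENV_AUTH=true
export AWS_ENDPOINT_URL_S3=https://minio.example.com:9000
export AWS_ACCESS_KEY_ID=minio-access-key       # hypothetical
export AWS_SECRET_ACCESS_KEY=minio-secret-key   # hypothetical
export UPLOAD_PREFIX=backups/mariadb
```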
For Azure Blob Storage, set `RCLONE_CONFIG_STORE_TYPE=azureblob`.
If you are authenticating using a container-level or account-level SAS token then the only other required environment variable would be
`RCLONE_CONFIG_STORE_SAS_URL=https://accountname.blob.core.windows.net/container?<sastoken>`
For any other authentication style, you must specify the account name
`RCLONE_CONFIG_STORE_ACCOUNT=accountname`
and then either RCLONE_CONFIG_STORE_KEY={storage-account-key} for shared key authentication, or RCLONE_CONFIG_STORE_ENV_AUTH=true for Entra ID authentication. The env_auth option will handle authentication with a service principal, workload identity (if running in AKS), or managed service identity as appropriate.
UPLOAD_PREFIX would specify the container name and any prefix within that container - the account name is part of the remote definition and comes from RCLONE_CONFIG_STORE_ACCOUNT or the SAS URL.
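A sketch of the shared-key variant (account name and key are hypothetical):

```sh
# Hypothetical Azure Blob Storage configuration using shared key auth.
export RCLONE_CONFIG_STORE_TYPE=azureblob
export RCLONE_CONFIG_STORE_ACCOUNT=mybackupaccount
export RCLONE_CONFIG_STORE_KEY=exampleAccountKey==   # hypothetical storage account key
export UPLOAD_PREFIX=backups/postgres                # container name + prefix
```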
For an SFTP server, set:

- `RCLONE_CONFIG_STORE_TYPE=sftp`
- `RCLONE_CONFIG_STORE_SHELL_TYPE=none`
- `RCLONE_CONFIG_STORE_HOST=sftp-server-hostname`
- `RCLONE_CONFIG_STORE_USER=myuser`
- `RCLONE_CONFIG_STORE_KEY_FILE=/path/to/privatekey`
- `RCLONE_CONFIG_STORE_KNOWN_HOSTS_FILE=/path/to/known_hosts`
You will need to mount the private key and known_hosts file into your container, and set `UPLOAD_PREFIX` to the path on the server where you want to store the backup files. Relative paths are resolved against the home directory of the authenticating user; if you want to store the files elsewhere, set the prefix to an absolute path (starting with `/`, e.g. `UPLOAD_PREFIX=/mnt/backups`).
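For example (server name and key paths are hypothetical; the `BACKUP_*` and database connection variables from the earlier sections are still required):

```sh
# Hypothetical SFTP configuration; the private key and known_hosts file
# are mounted read-only into the container.
docker run --rm \
  -e RCLONE_CONFIG_STORE_TYPE=sftp \
  -e RCLONE_CONFIG_STORE_SHELL_TYPE=none \
  -e RCLONE_CONFIG_STORE_HOST=sftp.example.com \
  -e RCLONE_CONFIG_STORE_USER=backup \
  -e RCLONE_CONFIG_STORE_KEY_FILE=/keys/id_ed25519 \
  -e RCLONE_CONFIG_STORE_KNOWN_HOSTS_FILE=/keys/known_hosts \
  -e UPLOAD_PREFIX=/mnt/backups \
  -v /srv/backup-keys:/keys:ro \
  ghcr.io/gatenlp/postgresql-backup-rclone
```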
For an SMB file server, set:

- `RCLONE_CONFIG_STORE_TYPE=smb`
- `RCLONE_CONFIG_STORE_HOST=smb-server-hostname`
- `RCLONE_CONFIG_STORE_USER=myuser`
- `RCLONE_CONFIG_STORE_PASS=mypassword`
- `RCLONE_CONFIG_STORE_DOMAIN=workgroup`
The UPLOAD_PREFIX should be of the form sharename/path
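A sketch with hypothetical server, share and credentials:

```sh
# Hypothetical SMB configuration.
export RCLONE_CONFIG_STORE_TYPE=smb
export RCLONE_CONFIG_STORE_HOST=fileserver.example.com
export RCLONE_CONFIG_STORE_USER=backup
export RCLONE_CONFIG_STORE_PASS=examplePassword   # hypothetical
export RCLONE_CONFIG_STORE_DOMAIN=WORKGROUP
export UPLOAD_PREFIX=backups/databases            # sharename/path
```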
By default, the SQL dump files are stored as-is in the remote data store. If this is an off-site backup it may be desirable to have the files encrypted before upload.
These images support encryption using rage, which implements the https://age-encryption.org/v1 spec for file encryption. It encrypts the data stream with a random key using the ChaCha20-Poly1305 cipher, then encrypts the session key using an elliptic curve asymmetric cipher. The rage implementation can use the SSH ed25519 key format, and that is the simplest way to enable encryption:
- Generate a public & private key pair using `ssh-keygen -t ed25519`
- Mount the public key into your backup container
- Set `ENCRYPT_RECIPIENTS_FILE=/path/to/id_ed25519.pub`
This will encrypt all files using the given public key (adding a .age extension to the file name) before uploading them to the data store. If you need to restore from such a file then you can decrypt it using the corresponding private key, e.g. for PostgreSQL:
`rage -d -i /path/to/id_ed25519 mydb.sql.gz.age | gunzip | psql -X -d newdb`

Alternatively you can generate standard age key pairs using `rage-keygen` and then specify the `age1...` recipient string directly in the environment variable `ENCRYPT_RECIPIENTS`, then use the corresponding private keys to decrypt when you need to restore from the backups.
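A sketch of that age-native workflow (file names are hypothetical):

```sh
# Generate an age key pair; the secret key is written to key.txt and the
# corresponding age1... public key is printed by rage-keygen.
rage-keygen -o key.txt

# Configure the backup container with the public (recipient) key only -
# the secret key never needs to be present at backup time.
export ENCRYPT_RECIPIENTS=age1...   # hypothetical recipient (public key)

# Restore later using the secret key.
rage -d -i key.txt mydb.sql.gz.age | gunzip | psql -X -d newdb
```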
Images are built using docker buildx bake - running this on its own will build both the postgresql and mariadb images for your local architecture and load them into your docker image store to be run on your local machine.
To build just one or the other image, specify the name to the bake command, e.g. docker buildx bake mariadb.
To build multi-platform images and push them to a registry, use:
`PROD=true DBBR_REGISTRY=ghcr.io/gatenlp/ docker buildx bake --push`

`PROD=true` enables multi-platform image building (your buildx builder must be capable of generating these), and `DBBR_REGISTRY` is the registry prefix to which the images should be pushed. By default the images are tagged with both `:latest` and `:rclone-vX.Y.Z` for the version of rclone that they include.