pg_rewind Benefits in PostgreSQL
pg_rewind is a PostgreSQL utility designed to synchronize a diverged PostgreSQL instance with
another instance. It is typically used in high availability (HA) and replication setups when a
former primary node needs to be rejoined as a standby after failover.
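Instead of taking a full base backup, pg_rewind finds the point where the two nodes' timelines
diverged and copies from the new primary only the data blocks that changed on the old primary
after that point, plus configuration and other non-relation files. For a large database with
little divergence, this is much faster than a fresh copy.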
Benefits of pg_rewind
Benefit | Description
Faster Recovery | Unlike a full backup and restore, pg_rewind only copies the changes made after divergence, making it significantly faster.
Efficient Data Synchronization | Instead of transferring the entire database, it transfers only the modified data blocks, reducing the amount of data copied.
Minimal Downtime | Reduces downtime when reintroducing a failed primary node as a standby in replication setups.
Reduces Storage and Network Overhead | Since only changed data is copied, it minimizes storage and network bandwidth usage.
Seamless Integration with Replication and Failover | Works well with streaming replication and automated failover tools like Patroni, Repmgr, or Pacemaker.
When to Use pg_rewind?
• After a failover where the original primary has diverged from the new primary.
• When a former primary needs to be reattached as a standby.
• To avoid a full backup and restore when recovering a failed node.
Limitations of pg_rewind
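• Only works if the WAL (Write-Ahead Log) reaching back to the point of divergence is still available on the target node, or can be restored from an archive.
• Requires wal_log_hints = on or data checksums on the target cluster, with full_page_writes enabled.
• Requires the target node to have been cleanly shut down.
• Cannot be used across major PostgreSQL versions; both nodes must run the same major version.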
Step-by-Step Guide to Using pg_rewind in PostgreSQL HA Setup
Step 1: Ensure Prerequisites
• PostgreSQL version 9.5 or later.
• The diverged node must have been part of the same cluster as the new primary.
• WAL back to the point of divergence must still be present on the diverged node, or be restorable from the archive.
• The target node must be shut down cleanly.
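Some of these prerequisites can be read from the control file of the old primary. The following is a minimal sketch, not a definitive check: rewind_ready is a hypothetical helper name, it assumes pg_controldata prints its report in English, and it also tests for wal_log_hints or data checksums, which pg_rewind requires on the target.

```shell
#!/bin/sh
# Sketch: decide from a pg_controldata report (read on stdin) whether a
# data directory looks eligible for pg_rewind. rewind_ready is a
# hypothetical helper name, not part of PostgreSQL.
rewind_ready() {
    report=$(cat)

    # The cluster must have been shut down cleanly
    # ("shut down" or "shut down in recovery").
    printf '%s\n' "$report" |
        grep -Eq '^Database cluster state: +shut down' || return 1

    # Either wal_log_hints is on ...
    if printf '%s\n' "$report" | grep -Eq '^wal_log_hints setting: +on'; then
        return 0
    fi

    # ... or data checksums are enabled (checksum version other than 0).
    printf '%s\n' "$report" |
        grep -Eq '^Data page checksum version: +[1-9]'
}
```

A possible invocation, run as the postgres user after the node has been stopped: pg_controldata /var/lib/pgsql/data | rewind_ready && echo "ok to rewind".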
Step 2: Stop the Old Primary Node
systemctl stop postgresql
Step 3: Prepare the Diverged Node
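pg_rewind rewrites the existing data directory in place, so the directory must stay where it is: an empty directory has no control file, and pg_rewind cannot run against it. If disk space allows, take a copy as a safety net, then confirm ownership and permissions.
cp -a /var/lib/pgsql/data /var/lib/pgsql/data_old
chown -R postgres:postgres /var/lib/pgsql/data
chmod 700 /var/lib/pgsql/data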
Step 4: Use pg_rewind to Resync the Diverged Node
pg_rewind --target-pgdata=/var/lib/pgsql/data \
  --source-server="host=NEW_PRIMARY_IP user=postgres" --progress
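Because pg_rewind modifies the data directory, it can help to rehearse the operation first. The sketch below uses pg_rewind's --dry-run option, which performs the checks without changing anything, and runs the real rewind only if the rehearsal succeeds. The function name rewind_with_rehearsal is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch: rehearse with --dry-run, then run the real rewind only if the
# rehearsal passes. rewind_with_rehearsal is a hypothetical helper name.
rewind_with_rehearsal() {
    pgdata=$1            # target data directory (the old primary)
    source_conninfo=$2   # libpq connection string for the new primary

    # Rehearsal: same checks, no changes written to the target.
    pg_rewind --target-pgdata="$pgdata" \
        --source-server="$source_conninfo" --dry-run || return 1

    # Real run, reporting progress as in the guide's command.
    pg_rewind --target-pgdata="$pgdata" \
        --source-server="$source_conninfo" --progress
}
```

A possible invocation with the values used in this guide: rewind_with_rehearsal /var/lib/pgsql/data "host=NEW_PRIMARY_IP user=postgres".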
Step 5: Start the Node in Standby Mode
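• Create the standby configuration. On PostgreSQL 11 and earlier this is a recovery.conf file, written as shown here. PostgreSQL 12 and later no longer read recovery.conf; they expect an empty standby.signal file in the data directory, with primary_conninfo set in postgresql.conf or postgresql.auto.conf.
echo "standby_mode = 'on'" > /var/lib/pgsql/data/recovery.conf
echo "primary_conninfo = 'host=NEW_PRIMARY_IP user=replication password=yourpassword'" >> /var/lib/pgsql/data/recovery.conf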
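For PostgreSQL 12 and later, where the server refuses to start if a recovery.conf file is present, the equivalent setup might look like the sketch below. The helper name configure_standby is hypothetical, and the sketch assumes the connection string contains no single quotes. It appends to postgresql.auto.conf, where a later entry overrides an earlier one, and it should run after pg_rewind because pg_rewind copies configuration files from the source.

```shell
#!/bin/sh
# Sketch for PostgreSQL 12+: mark the node as a standby and point it at
# the new primary. configure_standby is a hypothetical helper name.
configure_standby() {
    pgdata=$1     # data directory of the rewound node
    conninfo=$2   # libpq connection string (assumed to hold no single quotes)

    # An empty standby.signal file makes the server start in standby mode.
    touch "$pgdata/standby.signal"

    # Later entries in postgresql.auto.conf override earlier ones.
    printf "primary_conninfo = '%s'\n" "$conninfo" \
        >> "$pgdata/postgresql.auto.conf"
}
```

A possible invocation with the values used in this guide: configure_standby /var/lib/pgsql/data "host=NEW_PRIMARY_IP user=replication password=yourpassword".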
Step 6: Restart the Node
systemctl start postgresql
Step 7: Verify Replication
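The pg_stat_replication view is populated on the sending server, so run this query on the new primary; the rejoined node should appear as a row with state = streaming.
psql -c "SELECT * FROM pg_stat_replication;"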
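The rejoined node can also be checked from its own side, since pg_stat_replication only has rows on the server that sends WAL. The sketch below assumes psql can connect locally as a suitably privileged user; standby_is_streaming is a hypothetical helper name. It relies on pg_is_in_recovery() and the pg_stat_wal_receiver view (available from PostgreSQL 9.6).

```shell
#!/bin/sh
# Sketch: succeed only if the local node is in recovery and its WAL
# receiver is streaming. standby_is_streaming is a hypothetical helper.
standby_is_streaming() {
    # pg_is_in_recovery() returns t on a standby, f on a primary.
    in_recovery=$(psql -XAtc "SELECT pg_is_in_recovery();") || return 1

    # pg_stat_wal_receiver holds one row while the WAL receiver runs.
    status=$(psql -XAtc "SELECT status FROM pg_stat_wal_receiver;") || return 1

    [ "$in_recovery" = "t" ] && [ "$status" = "streaming" ]
}
```

A possible invocation on the rejoined node: standby_is_streaming && echo "standby is streaming from the new primary".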
Following these steps, a failed primary node can be reintegrated as a standby in a PostgreSQL
HA setup without a full base backup and restore.