Migrating a Postgres Database with Replication over WireGuard

Migrating your Postgres application’s database is a stage of your cloud migration that requires careful planning. Unlike flat files that can be transferred with a simple rsync, databases are particularly sensitive to consistency issues as the sequence which they receive writes is relevant to their state.

Previously we wrote about migrating MySQL databases and migrating Redis databases between different environments using master/slave replication. This article discusses the specific considerations of migrating Postgres using streaming replication over a WireGuard point-to-point VPN.

Whether you are moving from on-prem to the cloud or from cloud-to-cloud, there are two main approaches to migrating a database:

Snapshot and Import – The snapshot method involves taking a SQL dump from the source and transferring the snapshot to the destination using a secure tool such as scp or rsync. The state of the database (as of the snapshot) is imported by replaying the SQL statements into the new Postgres server.
Primary/Standby Replication – The replication method involves setting up streaming replication between the source (primary) and the destination (standby) Postgres server. An initial checkpoint (i.e. snapshot) is transferred to the standby. Then, the primary ships its logs to the standby as it receives new writes from the application server.

When the application is ready to be cut over, the streaming replication is stopped, and the standby is promoted to the primary. Once promoted, the standby will cease to merely be a read-only replica. It will become writable by the application server.

To minimize migration downtime and reflect the database’s state up to the moment of the migration, the streaming replication method is the best option for migrating an application that uses a Postgres database.

Although Postgres can be configured with encryption of data across a network, setting up a point-to-point VPN link between the primary & standby servers can also facilitate a secure data transfer for database migration. Using WireGuard, our cloud migration team achieved 20 to 25ms latency between a customer’s site and the public cloud transferring Postgres data on a typical business broadband connection.

Setting up Point-to-Point WireGuard VPN

This article assumes you have two Postgres servers with public-facing IP addresses of 192.0.2.102 (Primary) and 198.51.100.102 (Standby). For the WireGuard point-to-point VPN we will use the subnet 10.0.0.0/24 and assign the servers the VPN IP addresses 10.0.0.1 and 10.0.0.2 respectively. This article discusses setting up a WireGuard VPN in greater detail.

If the 10.0.0.0/24 subnet is already being used for another local network adapter, you can choose a different subnet for the WireGuard network.

The port for WireGuard may be the default port of 51820, or a different, available high port of your choosing, should be open on both servers to the public IP of their counterpart.

Install WireGuard on both servers following the instructions for the applicable distribution.

Generate a keypair on each server using the wg genkey command and configure the config files /etc/wireguard/wg0.conf as follows.

Primary Server (192.0.2.102 / 10.0.0.1)

[Interface]
PrivateKey = generated_private_key
ListenPort = 51820
SaveConfig = true
Address = 10.0.0.1/24

[Peer]
PublicKey = public_key_of_standby_server
AllowedIPs = 10.0.0.2/32
Endpoint = 198.51.100.102:51820
PersistentKeepalive = 10

Standby Server (198.51.100.102 / 10.0.0.2)

[Interface]
PrivateKey = generated_private_key
ListenPort = 51820
SaveConfig = true
Address = 10.0.0.2/24

[Peer]
PublicKey = public_key_of_primary_server
AllowedIPs = 10.0.0.1/32
Endpoint = 192.0.2.102:51820
PersistentKeepalive = 10

The PersistentKeepalive parameter of 10s is important when either server is behind a NAT. Otherwise, the WireGuard connection will intermittently drop.

To start the WireGuard network adapter on each server, run the sudo systemctl start wg-quick@wg0 command. Once the network adapter is enabled, sudo wg can be used to monitor its status. If you observe successful handshakes and can connect to the open ports from one server to another, enable the WireGuard adapter on reboot with the sudo systemctl enable wg-quick@wg0 command.

Preparing the Primary Postgres Server for Replication

Prior to initiating primary/standby replication between the primary and standby Postgres servers, it is necessary to transfer the Postgres checkpoint to the standby. This point-to-point VPN prepared in the previous step will facilitate doing that securely.

This article assumes both Postgres servers are using the default Postgres port of 5432.

First, modify the following values for /etc/postgresql/X.X/postgresql.conf on the primary server – where X.X is the PostgreSQL version number.

listen_addresses = ’127.0.0.1, 10.0.0.1’
…
wal_level = hot_standby
…
synchronous_commit = local
…
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/X.X/main/archive/%f'
…
max_wal_senders = 2
wal_keep_segments = 10
…
synchronous_standby_names = 'pgstandby001'

Create an archive directory where the logs to be shipped to the standby will be stored.

mkdir -p /var/lib/postgresql/X.X/main/archive/
chmod 700 /var/lib/postgresql/X.X/main/archive/
chown -R postgres:postgres /var/lib/postgresql/X.X/main/archive/

Add the network addresses of the local, primary, and standby server to the end of the pg_hba.conf file.

# Localhost
host    replication     replica          127.0.0.1/32            md5

# Primary IP address
host    replication     replica          10.0.0.1/32            md5

# Standby IP address
host    replication     replica          10.0.0.2/32            md5

Restart Postgres on the primary server with the sudo systemctl restart postgresql command.

Now, create a replication user replicator that will be used for replication.

su - postgres
psql
CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'changeth!s';

Initiating Replication from Standby Postgres Server

The steps below this divider should be undertaken on the standby server.

Next, switch to the shell of the standby server – which should already have Postgres installed. Before proceeding, stop the running Postgres service with the sudo systemctl stop postgresql command.

Modify the following values for /etc/postgresql/X.X/postgresql.conf on the standby server – where X.X is the PostgreSQL version number.

listen_addresses = ’127.0.0.1, 10.0.0.2’
…
wal_level = hot_standby
…
synchronous_commit = local
…
max_wal_senders = 2
wal_keep_segments = 10
…
synchronous_standby_names = 'pgstandby001'
…
hot_standby = on

Now, backup the existing Postgres data directory as copying the checkpoint from the primary server will overwrite it.

su - postgres

Logging in to the Postgres user will take you into the data directory of the Postgres server, /var/lib/postgresql/.

Switch to the directory corresponding to your Postgres server’s version number, X.X.

Then, rename the main directory storing the current Postgres data into a backup directory using the mv command. Finally, make a new main directory using mkdir and set the correct permissions 700 (rwx for owner) with chmod.

cd X.X
mv main main-old
mkdir main/
chmod 700 main/

To kick off the data copy from the primary Postgres server, run the pg_basebackup command from the standby Postgres server.

pg_basebackup -h 10.0.0.1 -p 5432 -U replicator -D /var/lib/postgresql/X.X/main -P --xlog

Depending on how large the Postgres database being replicated is, this may take several minutes to hours. Be patient if the process doesn’t start right away as the standby server has to wait for a checkpoint from the primary.

After the checkpoint has finished copying, create and save the recovery.conf file in the freshly copied data directory /var/lib/postgresql/X.X/main/ of the standby Postgres server.

standby_mode = 'on'
primary_conninfo = 'host=10.0.0.1 port=5432 user=replicator password=changeth!s application_name=pgstandby001'
restore_command = 'cp /var/lib/postgresql/X.X/main/archive/%f %p'
trigger_file = '/tmp/postgresql.trigger.5432'

Since the recovery.conf contains a sensitive password for the replicator user, the permissions should be updated to 600 (rw for owner).

chmod 600 recovery.conf

Start the standby Postgres server using the command sudo systemctl start postgresql and the replication should be underway. To check that streaming replication is functioning, switch to the shell of the primary Postgres server and run the following commands.

su - postgres
psql -c "select application_name, state, sync_priority, sync_state from pg_stat_replication;"
psql -x -c "select * from pg_stat_replication;"

Promoting the Standby Postgres Server to Primary

During replication, the standby Postgres server will be in read-only mode so that its transactions cannot diverge from those of the primary Postgres server, which would break replication.

When ready to promote the standby server to primary and make it writable by clients (e.g. psql console, pgAdmin, and applications), one must create the trigger file at the path specified in recovery.conf. For the recovery.conf denoted above, create the following empty file on the standby Postgres server using touch to cease replicating transactions from the primary, in effect promoting it to the primary.

touch /tmp/postgresql.trigger.5432

However, it is important to note that any new transactions will not be replicated back to the primary server. By design, it will not be possible to resume replication without copying a checkpoint from the server containing the state you’d like to keep.

In order for your PostgreSQL applications to be able to connect to the new primary (formerly standby), it may be necessary to make the following changes on 10.0.0.2.

Remove the lines which were added to pg_hba.conf for the replication.
Update the passwords of the PostgreSQL users with ALTER USER user_name WITH PASSWORD 'new_password';

This article introduces a technique for migrating a PostgreSQL database with continuous replication to minimize the potential downtime of migrating an application from on-prem to cloud, or from cloud-to-cloud. Our cloud migration experts have successfully led a variety of application migration projects for organizations of all sizes. We have the experience to help you minimize disruption to your users while ensuring the necessary data is successfully migrated to the cloud.