We recently helped a client switch from Heroku PostgreSQL to managing their own PostgreSQL on AWS EC2. In this blog post, we’ll look at what Heroku PostgreSQL offers and where it falls short, and outline the steps we followed during the migration.
What is Heroku PostgreSQL?
Heroku Postgres is a Database-as-a-Service (DBaaS) platform that simplifies managing your SQL databases. It offers a fully managed database experience, including automated backups, scaling, and maintenance, with plans ranging from small projects to large-scale applications. Plus, it integrates with Heroku’s platform, which makes it easy to deploy and manage your databases in the cloud. This means developers can focus on building their applications without worrying about managing the database infrastructure.
What are the limitations of Heroku PostgreSQL?
- Heroku PostgreSQL offers a managed environment, but it has restrictions on customization and control over database settings. Some organizations may need specific configurations or extensions not supported by Heroku PostgreSQL.
- As data volumes and usage grow, the pricing of Heroku PostgreSQL can increase. Migrating to self-managed PostgreSQL on EC2 has the potential to reduce costs for extensive deployments.
- Heroku does not allow the creation of replica instances outside of its cloud infrastructure. It also lacks features such as superuser access and the replication privilege for users.
- Heroku PostgreSQL does not support logical replication, which is crucial for executing major version upgrades with near-zero downtime.
Migration Process
After evaluating multiple options, we used the WAL log-shipping method to migrate away from Heroku PostgreSQL. This involved reaching out to Heroku’s support team to get access to the automated backups and WAL files placed on AWS S3. Here’s a diagram illustrating the setup:
- Primary Database: This refers to the primary PostgreSQL database on Heroku.
- WAL Files: These are transaction logs that record all the changes made to a database. They are needed for recovery in case of a database crash.
- Amazon S3: This is an object storage service provided by AWS that offers scalability, data availability, and security. The Heroku PostgreSQL backups and WAL archives are stored here.
- Standby Database: This refers to a standby or replica PostgreSQL database. In this configuration, the WAL files will be fetched from AWS S3 and applied to the standby to keep it synchronized with the primary database.
- WAL-E: This is the backup tool used by Heroku to preserve base backups and perform continuous archiving of PostgreSQL WAL files.
Log-Shipping using WAL-E
In this section, we’ll install the WAL-E backup tool and walk through the entire process of creating a replica of the Heroku PostgreSQL database on an EC2 PostgreSQL instance. The following steps are performed on the EC2 instance.
1: Install Lzop
LZOP is a file compression tool similar to gzip. It utilizes the LZO data compression library, offering significantly faster compression and decompression speeds compared to gzip, at the cost of a lower compression ratio. Heroku uses the LZOP tool for storing WAL files and backups on S3.
sudo apt-get install lzop
2: Wal-e Installation
We installed Python 3.5 to work with the WAL-E installation. This decision was made due to reported issues on GitHub indicating that WAL-E might hang or throw errors with newer Python versions. Further details can be found here: https://github.com/wal-e/wal-e/issues/322
sudo python3 -m pip install envdir
sudo python3 -m pip install wal-e[aws]
The following errors were encountered while installing WAL-E on the EC2 instance.
Error 1:
c/_cffi_backend.c:2:10: fatal error: Python.h: No such file or directory
2 | #include <Python.h>
| ^~~~~~~~~~
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
----------------------------------------
ERROR: Failed building wheel for cffi
Running setup.py clean for cffi
Failed to build cffi
ERROR: launchpadlib 1.10.13 requires testresources, which is not installed.
Installing collected packages: setuptools, Cython, pycparser, cffi, greenlet
Running setup.py install for cffi: started
Running setup.py install for cffi: finished with status 'error'
Resolution:
Install the following packages:
sudo apt-get install python3.5-dev
sudo apt install python3-testresources
sudo apt-get install libffi-dev
Error 2:
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 /usr/share/python-wheels/pep517-0.8.2-py2.py3-none-any.whl/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpbk17_dri
cwd: /tmp/pip-install-1ka0ajfk/gevent
Complete output (1 lines):
/usr/bin/python3: can't find '__main__' module in '/usr/share/python-wheels/pep517-0.8.2-py2.py3-none-any.whl/pep517/_in_process.py'
----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3 /usr/share/python-wheels/pep517-0.8.2-py2.py3-none-any.whl/pep517/_in_process.py get_requires_for_build_wheel /tmp/tmpbk17_dri Check the logs for full command output.
Resolution:
Here we need to upgrade the pip version:
sudo pip install -U pip
Verify the WAL-E installation:
wal-e --help
WAL-E connectivity with AWS S3
The Heroku support team provided us with AWS S3 credentials for accessing the backups and WAL files. Let’s create secure variables containing the S3 credentials, one file per variable, within a directory.
mkdir -p /mnt/data/env
ls -l /mnt/data/env
-rw-rw-r-- 1 ec2-user ec2-user 21 Dec 20 17:22 AWS_ACCESS_KEY_ID
-rw-rw-r-- 1 ec2-user ec2-user 10 Dec 20 17:22 AWS_REGION
-rw-rw-r-- 1 ec2-user ec2-user 41 Dec 20 17:22 AWS_SECRET_ACCESS_KEY
-rw-rw-r-- 1 ec2-user ec2-user 92 Dec 20 17:21 WALE_S3_PREFIX
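For reference, the files can be populated along these lines (a sketch with placeholder values; use the exact credentials and S3 prefix provided by Heroku support — each envdir file holds only the value, nothing else):
printf '%s' 'AKIAXXXXXXXXXXXXXXXX' > /mnt/data/env/AWS_ACCESS_KEY_ID
printf '%s' 'us-east-1' > /mnt/data/env/AWS_REGION
printf '%s' 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' > /mnt/data/env/AWS_SECRET_ACCESS_KEY
printf '%s' 's3://example-heroku-backups/wal-prefix' > /mnt/data/env/WALE_S3_PREFIX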
To verify the above credentials, run the following command to list the backups inside the S3 bucket.
envdir /mnt/data/env wal-e backup-list
This command will return all the completed physical backups created by Heroku PostgreSQL.
3: Fetch Backups
Use the wal-e backup-fetch command to restore the latest base backup of your Heroku Postgres database into the PostgreSQL data directory volume of your EC2 instance.
envdir /mnt/data/env wal-e backup-fetch --blind-restore /mnt/data/database/ LATEST
Blind Restore: If you are unable to reproduce tablespace storage structures prior to running backup-fetch, you can set the --blind-restore flag. This directs WAL-E to skip the symlink verification process and place all data directly in the PG_DATABASE_DIRECTORY/pg_tblspc path.
Note: Make sure that your PostgreSQL service is stopped and the data directory is empty.
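A quick sanity check before fetching might look like this (a sketch; the service name assumes a default apt-based PostgreSQL install and may differ on your instance):
sudo systemctl stop postgresql    # stop any running cluster (service name is an assumption)
ls -A /mnt/data/database/         # should print nothing if the data directory is empty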
After fetching the backup, we’ll configure the standby by enabling WAL recovery and setting the parameters accordingly.
4: Configure Standby
postgresql.auto.conf
Check the contents of postgresql.auto.conf and remove any unnecessary parameters. In the case of Heroku, we found this:
cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
aiven.pg_security_agent = 'on'
Comment out the aiven.pg_security_agent parameter and save the file.
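A one-liner for this (a sketch assuming the data directory restored in the previous step; you can also edit the file by hand):
sed -i 's/^aiven.pg_security_agent/# aiven.pg_security_agent/' /mnt/data/database/postgresql.auto.conf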
Note: The files listed below are not included in Heroku backups. We must manually create and configure them.
postgresql.conf
Get the default PostgreSQL configuration file for your major version. We are using PostgreSQL 13 in our example, and the configuration file can be taken from the main PostgreSQL repository.
When creating a read-only standby, certain parameter values must match between the primary and the replica database, or else the standby will fail to apply the WAL logs. The following parameters need to be set:
| Parameter | Value |
| --- | --- |
| listen_addresses | '*' |
| wal_level | logical |
| shared_buffers | Set to at least 25% of available memory |
| effective_cache_size | Set to around 70-80% of available memory |
| max_worker_processes | 20 (must be set to the same or a higher value than on the master server) |
| max_connections | 500 (the maximum number of concurrent connections to the database server; this also needs to be the same or higher than on the primary server. At Heroku, this value is set to 500 as per the plan) |
| hot_standby | on |
| restore_command | 'envdir /database/env wal-e wal-fetch %f %p' |
| recovery_target_timeline | 'latest' |
restore_command is the main parameter here; it defines the command used to retrieve an archived WAL file. %f is replaced by the name of the desired WAL file, and %p is replaced by the path to copy the WAL file to (the data directory).
Note: The values of the parameters above may vary depending on the instance tier of Heroku PostgreSQL.
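Putting these together, a minimal standby-related excerpt of postgresql.conf might look like the following (a sketch: the memory values assume a 16 GB instance, and the restore_command points at the envdir created earlier; adjust both to your environment):
listen_addresses = '*'
wal_level = logical
shared_buffers = 4GB                  # assumption: ~25% of 16 GB RAM
effective_cache_size = 12GB           # assumption: ~75% of 16 GB RAM
max_worker_processes = 20
max_connections = 500
hot_standby = on
restore_command = 'envdir /mnt/data/env wal-e wal-fetch %f %p'
recovery_target_timeline = 'latest'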
In addition to the above parameters, you can set others such as:
- log_destination
- logging_collector
- work_mem
- effective_io_concurrency
- max_wal_size
- min_wal_size
- wal_compression
Note: Ensure that proper logging is enabled on the replica so that errors can be detected easily.
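For example, a basic logging setup on the replica could look like this (a sketch; paths and verbosity are assumptions you can tune):
logging_collector = on
log_destination = 'stderr'
log_directory = 'log'                 # relative to the data directory
log_filename = 'postgresql-%a.log'    # one file per weekday
log_min_messages = warning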
pg_hba.conf
The basic entries will look like this:
# "local" is for Unix domain socket connections only
local all postgres peer
local all all md5
# IPv4 local connections:
host all all 0.0.0.0/0 md5
# IPv6 local connections:
host all all ::1/128 md5
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication postgres peer
host replication all 127.0.0.1/32 md5
host replication all ::1/128 md5
standby.signal
Create an empty standby.signal file inside the data directory to start log shipping. If the standby.signal file is present in the data directory, the server will start in standby mode and enter recovery. (This works for PostgreSQL v12 and above.)
touch standby.signal
5: Remove the old WAL folder
Delete the existing pg_wal folder and create a new one. On Heroku Postgres servers, WAL files are stored on a separate volume from the main data directory, so when restoring a backup with WAL-E, the WAL directory will be a symlink.
rm -rf /DATA_DIRECTORY/pg_wal    # removes the old symlink
mkdir /DATA_DIRECTORY/pg_wal
6: Remove old tablespace directory
Heroku has separate volumes for temporary tables and for operations such as disk-based hashing and sorting. We need to remove this tablespace directory, as our self-hosted PostgreSQL won’t be able to find the underlying volume.
rm -rf /database/pg_tblspc/16386/
Note that the temp tablespace folder might sometimes be named differently.
7: Start database
/usr/lib/postgresql/13/bin/pg_ctl -D /mnt/data/database/ start
The startup process will take some time as the server fetches WAL files from S3 and attempts to reach a consistent state. During this period, the database will not be accessible.
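To follow the recovery progress, you can tail the server log until it reports a consistent state (a sketch; the log path assumes the logging settings shown earlier):
tail -f /mnt/data/database/log/postgresql-*.log
# watch for a line similar to: "consistent recovery state reached at <LSN>"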
After a while, once PostgreSQL reaches a consistent state, you can log in using psql. The database will be in read-only mode, and you won’t be able to create any new objects. This can be verified using the following command:
select pg_is_in_recovery();
The value should be ‘t’.
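While the standby is running, you can also check how recent the replayed data is (a small sketch using standard recovery functions):
select pg_last_xact_replay_timestamp();   -- commit time of the last transaction replayed from WAL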
The log shipping process is now completed. In the next section, we will outline the steps for promoting the standby database.
Promotion Steps
This section is segmented into three parts.
Pre-check Steps
1: Stop the application.
2: Run the following command to check for active connections on the primary. Let the sessions complete or cancel them as required.
select * from pg_stat_activity where state = 'active';
3: Run the following command multiple times on the primary to check the current WAL file. We need to make sure that all WAL files are replicated to the standby.
select pg_walfile_name(pg_current_wal_lsn());
Once the WAL file name stays constant, switch the WAL file so that the contents of the current WAL file are archived and shipped to the standby, using the following command:
select pg_switch_wal();
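As a complementary check on the standby side (a sketch; pg_walfile_name() cannot be run during recovery, so compare LSNs instead):
-- Run on the standby: the replayed LSN should stop advancing and match
-- the primary's pg_current_wal_lsn() once all WAL has been applied.
select pg_last_wal_replay_lsn();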
Promotion
Once confirmed, promote the standby:
/usr/lib/postgresql/13/bin/pg_ctl -D /mnt/data/database/ promote
Post-check Steps
1: Check whether the promotion was successful by running this command on the new primary:
select pg_is_in_recovery();
The value should be ‘f’.
2: Now change the required parameters inside the postgresql.conf file as per the current hardware. We also need to enable archiving. Make sure that the archive directory exists with read/write permissions.
vi /mnt/data/database/postgresql.conf
archive_mode = on
archive_command = 'cp %p /mnt/data/archives/%f'
restore_command = ''
3: Restart the database for these settings to take effect:
/usr/lib/postgresql/13/bin/pg_ctl -D /mnt/data/database/ restart
4: Drop the temporary tablespace used by Heroku.
select * from pg_tablespace;    -- check for the correct tablespace name
Output:
  oid  |  spcname   | spcowner | spcacl | spcoptions
-------+------------+----------+--------+------------
  1663 | pg_default |       10 |        |
  1664 | pg_global  |       10 |        |
 16386 | ephemeral  |       10 |        |
We are going to drop the ephemeral tablespace, as it is not required for our installation:
DROP TABLESPACE ephemeral;
5: Connect applications and start transactions.
6: After promotion, make sure to take a full physical database backup, including WAL files, using the pg_basebackup utility. A sample command looks like this:
/usr/lib/postgresql/13/bin/pg_basebackup -D /mnt/data/backups/fullbkp_$(date +"%d%m%y") --checkpoint=fast -Xs -P
This marks the completion of our migration process.