Creating a high-availability PostgreSQL cluster ensures your database is always online, even during failures.
In this guide, we’ll walk you through setting up a 3-node HA PostgreSQL cluster using Patroni on Ubuntu 24.04. Patroni simplifies managing PostgreSQL clusters and ensures automatic failover. Follow these steps to build a reliable, single-zone cluster for robust database performance and resilience.
Our setup includes five dedicated nodes, all running Ubuntu 24.04 as the base operating system:
- 3 data nodes with Patroni
- 1 node for ETCD
- 1 node for HAProxy
How is ETCD used by Patroni?
- Leader Election: It stores information about which PostgreSQL node is the current leader, ensuring only one node acts as the primary database.
- Configuration Storage: Patroni uses ETCD to store and retrieve cluster configuration settings.
- Health Monitoring: Nodes register their status in ETCD, allowing Patroni to monitor and manage the cluster’s health and trigger failover if needed.
- Data Consistency: ETCD ensures that all nodes in the cluster have a consistent view of the leader and configuration data, maintaining synchronization.
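Once the cluster described below is running, you can see these keys directly in ETCD. A minimal sketch, assuming the namespace (Cluster) and scope (postgres) configured later in this guide and the v2 API enabled on the ETCD node:
# List everything Patroni keeps for this cluster (leader, members, config, ...)
curl "http://<etcdnode_ip>:2379/v2/keys/Cluster/postgres/?recursive=true"
# Or fetch just the current leader key
curl "http://<etcdnode_ip>:2379/v2/keys/Cluster/postgres/leader"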
Why do we use HAProxy?
- Load Balancing: HAProxy distributes client connections across multiple PostgreSQL replicas, optimizing resource use and enhancing performance.
- High Availability: It ensures continuous database access by redirecting connections to healthy nodes during failures or leader changes.
- Read/Write Splitting: HAProxy can route write queries to the primary node and read queries to replica nodes, improving query performance and scaling read operations.
- Connection Management: It handles client connections efficiently, managing timeouts, retries, and queuing, enhancing overall cluster stability and user experience.
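In this guide HAProxy makes those decisions by health-checking Patroni's REST API on port 8008 (see the haproxy.cfg later): the primary answers HTTP 200 on /primary, while replicas answer 200 on /replica. You can probe this by hand once the cluster is up; the node IP below is a placeholder:
# Returns 200 only on the current primary, 503 on replicas
curl -s -o /dev/null -w '%{http_code}\n' http://<nodeN_ip>:8008/primary
# The inverse check, useful for routing read-only traffic to replicas
curl -s -o /dev/null -w '%{http_code}\n' http://<nodeN_ip>:8008/replica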
Set hostnames and update the package index on all nodes
sudo hostnamectl set-hostname data-node1 && sudo apt-get update
sudo hostnamectl set-hostname data-node2 && sudo apt-get update
sudo hostnamectl set-hostname data-node3 && sudo apt-get update
sudo hostnamectl set-hostname ETCD && sudo apt-get update
sudo hostnamectl set-hostname HAProxy && sudo apt-get update
Restart your terminal after executing the above commands on respective nodes.
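Optionally, give every node a hosts entry so they can reach each other by name; the IPs below are placeholders, and the rest of this guide keeps using raw IP addresses:
cat <<'EOF' | sudo tee -a /etc/hosts
<node1_ip>     data-node1
<node2_ip>     data-node2
<node3_ip>     data-node3
<etcdnode_ip>  etcd
<haproxy_ip>   haproxy
EOF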
Set up data-node1, data-node2 and data-node3
Install PostgreSQL
sudo apt-get install postgresql-common -y
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
sudo apt -y install postgresql
sudo systemctl stop postgresql
sudo ln -s /usr/lib/postgresql/16/bin/* /usr/sbin/
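The symlink above assumes the PGDG repository installed PostgreSQL 16; confirm the installed major version so the path matches:
psql --version
ls /usr/lib/postgresql/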
Install required system packages
sudo apt-get install net-tools python3-pip python3-dev libpq-dev python3-venv -y
Configure cluster environment and Patroni
We are going to use the postgres user to configure Patroni.
sudo su postgres
cd $HOME
python3 -m venv patroni-packages
source patroni-packages/bin/activate
pip3 install --upgrade setuptools pip
pip install psycopg[binary] patroni python-etcd
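Still inside the virtual environment, you can confirm the tooling was installed:
patroni --version
patronictl version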
cd $HOME/patroni-packages
touch patroni.yml
mkdir -p data
chmod 700 data
Place the following content inside the patroni.yml file, where N is the number of your current node, so update it accordingly.
scope: postgres
namespace: Cluster
name: data-nodeN

restapi:
  listen: <nodeN_ip>:8008
  connect_address: <nodeN_ip>:8008

etcd:
  host: <etcdnode_ip>:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:

  initdb:
    - encoding: UTF8
    - data-checksums

  pg_hba:
    - host replication replicator 127.0.0.1/32 md5
    - host replication replicator <node1_ip>/0 md5
    - host replication replicator <node2_ip>/0 md5
    - host replication replicator <node3_ip>/0 md5
    - host all all 0.0.0.0/0 md5

  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb

postgresql:
  listen: <nodeN_ip>:5432
  connect_address: <nodeN_ip>:5432
  data_dir: /var/lib/postgresql/patroni-packages/data
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: '123'
    superuser:
      username: postgres
      password: '123'
  parameters:
    unix_socket_directories: '.'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
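Before wiring Patroni into systemd, you can optionally sanity-check the file; recent Patroni releases include a configuration validator (run as the postgres user from ~/patroni-packages with the venv active):
patroni --validate-config patroni.yml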
Exit from the postgres user and log in with a user who has sudo access, because we are going to create a service file for Patroni.
sudo touch /etc/systemd/system/patroni.service
# Place the following content in the patroni.service file
[Unit]
Description=High availability PostgreSQL Cluster
After=syslog.target network.target
[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/var/lib/postgresql/patroni-packages/bin/patroni /var/lib/postgresql/patroni-packages/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no
[Install]
WantedBy=multi-user.target
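After saving the unit file, reload systemd so it picks up the new service; enabling it is optional but makes Patroni start on boot:
sudo systemctl daemon-reload
sudo systemctl enable patroni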
Set up the ETCD node
Since the ETCD package is not available in the Ubuntu 24.04 repositories, we can't install it with apt-get, so we download the pre-built binaries and use them to set up the ETCD node.
wget https://github.com/etcd-io/etcd/releases/download/v3.4.32/etcd-v3.4.32-linux-amd64.tar.gz
tar xzf etcd-v3.4.32-linux-amd64.tar.gz
etcd-v3.4.32-linux-amd64/etcd \
  --listen-peer-urls="http://$ETCD_NODE_PRIVATE_IP:2380" \
  --listen-client-urls="http://localhost:2379,http://$ETCD_NODE_PRIVATE_IP:2379" \
  --initial-advertise-peer-urls="http://$ETCD_NODE_PRIVATE_IP:2380" \
  --initial-cluster="default=http://$ETCD_NODE_PRIVATE_IP:2380," \
  --advertise-client-urls="http://$ETCD_NODE_PRIVATE_IP:2379" \
  --initial-cluster-token="etcd-cluster" \
  --initial-cluster-state="new" \
  --enable-v2=true
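The command above assumes $ETCD_NODE_PRIVATE_IP has already been exported in the current shell (the value is a placeholder), and it runs ETCD in the foreground, so keep that session open (or wrap it in tmux or a systemd unit). Once it is up, verify it responds:
export ETCD_NODE_PRIVATE_IP=<etcd_node_private_ip>
curl "http://$ETCD_NODE_PRIVATE_IP:2379/health"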
Start Patroni on data-node1, data-node2 and data-node3
sudo systemctl start patroni
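To confirm each node came up, check the service status and Patroni's monitoring endpoint, which returns a JSON description of the node's role and state (the node IP is a placeholder):
sudo systemctl status patroni --no-pager
curl http://<nodeN_ip>:8008/patroni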
Set up the HAProxy node
sudo apt-get install haproxy -y
sudo vim /etc/haproxy/haproxy.cfg
Replace the existing content of the haproxy.cfg file with the following.
global
    maxconn 100
    log 127.0.0.1 local2

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen postgres
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 <node1_ip>:5432 maxconn 100 check port 8008
    server node2 <node2_ip>:5432 maxconn 100 check port 8008
    server node3 <node3_ip>:5432 maxconn 100 check port 8008
Restart the HAProxy service
sudo systemctl restart haproxy
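The stats listener defined above exposes a status page on port 7000, useful for watching which backend HAProxy currently considers healthy; open it in a browser or query it from the shell (the HAProxy node IP is a placeholder):
curl http://<haproxy_ip>:7000/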
Verify the cluster
From any data node run this command.
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Leader | running | 2 | |
| data-node2 | 172.31.50.185 | Replica | streaming | 2 | 0 |
| data-node3 | 172.31.48.62 | Replica | streaming | 2 | 0 |
+------------+---------------+---------+-----------+----+-----------+
Connect from a client through HAProxy
mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=#
postgres=#
postgres=# create database test;
CREATE DATABASE
postgres=# \l
List of databases
Name | Owner | Encoding | Locale Provider | Collate | Ctype | ICU Locale | ICU Rules | Access privileges
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
template0 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
test | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
(4 rows)
postgres=#
Testing failover
Stop the Patroni service on the current leader, data-node1, and then check the cluster state:
$ sudo systemctl stop patroni
$ sudo su postgres
$ cd
$ cd patroni-packages/
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Replica | stopped | | unknown |
| data-node2 | 172.31.50.185 | Leader | running | 3 | |
| data-node3 | 172.31.48.62 | Replica | streaming | 3 | 0 |
+------------+---------------+---------+-----------+----+-----------+
postgres@data-node1:~/patroni-packages$
mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# create database test1;
CREATE DATABASE
postgres=# \l
List of databases
Name | Owner | Encoding | Locale Provider | Collate | Ctype | ICU Locale | ICU Rules | Access privileges
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
template0 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
test | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
test1 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
(5 rows)
The client can still connect and execute queries with the same IP.
Testing a second failover
Let's bring the new leader, data-node2, down as well.
$ sudo systemctl stop patroni
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) -----+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+------------+--------------+--------+---------+----+-----------+
| data-node3 | 172.31.48.62 | Leader | running | 4 | |
+------------+--------------+--------+---------+----+-----------+
postgres@data-node1:~/patroni-packages$
mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# create database test3;
CREATE DATABASE
postgres=# \l
List of databases
Name | Owner | Encoding | Locale Provider | Collate | Ctype | ICU Locale | ICU Rules | Access privileges
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
template0 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
test | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
test1 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
test3 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
(6 rows)
There is still no issue with the client connection.
Node recovery
Start Patroni again on data-node1 and data-node2:
$ sudo systemctl start patroni
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Replica | running | 2 | 8 |
| data-node2 | 172.31.50.185 | Replica | streaming | 3 | 4 |
| data-node3 | 172.31.48.62 | Leader | running | 4 | |
+------------+---------------+---------+-----------+----+-----------+
Patroni successfully reattached the failed nodes to the cluster once they recovered.
Testing replication
postgres@data-node3:~$ psql -h 172.31.48.62 -p 5432 -d postgres -U postgres
Password for user postgres:
psql (16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# \l
List of databases
Name | Owner | Encoding | Locale Provider | Collate | Ctype | ICU Locale | ICU Rules | Access privileges
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
template0 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
test | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
test1 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
test3 | postgres | UTF8 | libc | C.UTF-8 | C.UTF-8 | | |
(6 rows)
We can see that all the databases we created from the client are available on data-node3 as well, so replication is working. You can verify the same on data-node1 and data-node2.
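You can also confirm streaming replication from the current leader (data-node3 in this run); every healthy replica should appear with state 'streaming':
psql -h 172.31.48.62 -p 5432 -U postgres -d postgres \
  -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"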
Resources