Setting Up a High Availability 3-Node PostgreSQL Cluster with Patroni on Ubuntu 24.04

Creating a high-availability PostgreSQL cluster ensures your database is always online, even during failures. 

In this guide, we’ll walk you through setting up a 3-node HA PostgreSQL cluster using Patroni on Ubuntu 24.04. Patroni simplifies managing PostgreSQL clusters and ensures automatic failover. Follow these steps to build a reliable, single-zone cluster for robust database performance and resilience.

Our structure includes 5 dedicated nodes all are running ubuntu 24.04 as the base operating system.

  • 3 data nodes with Patroni
  • 1 node for ETCD
  • 1 node for HAProxy

How ETCD used by Patroni?

Leader Election: It stores information about which PostgreSQL node is the current leader, ensuring only one node acts as the primary database.
Configuration Storage: Patroni uses ETCD to store and retrieve cluster configuration settings.
Health Monitoring: Nodes register their status in ETCD, allowing Patroni to monitor and manage the cluster’s health and trigger failover if needed.
Data Consistency: ETCD ensures that all nodes in the cluster have a consistent view of the leader and configuration data, maintaining synchronization.

Why do we use HAProxy?

Load Balancing: HAProxy distributes client connections across multiple PostgreSQL replicas, optimizing resource use and enhancing performance.
High Availability: It ensures continuous database access by redirecting connections to healthy nodes during failures or leader changes.
Read/Write Splitting: HAProxy can route write queries to the primary node and read queries to replica nodes, improving query performance and scaling read operations.
Connection Management: It handles client connections efficiently, managing timeouts, retries, and queuing, enhancing overall cluster stability and user experience

Set hostnames and update the package manager on all nodes

sudo hostnamectl set-hostname data-node1 && sudo apt-get update
sudo hostnamectl set-hostname data-node2 && sudo apt-get update
sudo hostnamectl set-hostname data-node3 && sudo apt-get update
sudo hostnamectl set-hostname ETCD && sudo apt-get update
sudo hostnamectl set-hostname HAProxy && sudo apt-get update

Restart your terminal after executing the above commands on respective nodes.

Setup data-node1, data-node2 and data-node3 

Install PostgreSQL

sudo apt-get install postgresql-common -y
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
sudo apt -y install postgresql
sudo systemctl stop postgresql
sudo ln -s /usr/lib/postgresql/16/bin/* /usr/sbin/

Install required system packages

sudo apt-get install net-tools python3-pip python3-dev libpq-dev python3-venv -y

Configure cluster environment and Patroni

We are going to use Postgres user in order to configure Patroni.

sudo su postgres
cd $HOME
python3 -m venv patroni-packages
source patroni-packages/bin/activate
pip3 install --upgrade setuptools pip
pip install psycopg[binary] patroni python-etcd
cd $HOME/patroni-packages
touch patroni.yml
mkdir -p data
chmod 700 data

Place this content inside the patroni.yml file — Where N is number of your current node so update accordingly.

scope: postgres
namespace: Cluster
name: data-nodeN

restapi:
    listen: <nodeN_ip>:8008
    connect_address: <nodeN_ip>:8008

etcd:
    host: <etcdnode_ip>:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:

  initdb:
  - encoding: UTF8
  - data-checksums

  pg_hba:
  - host replication replicator 127.0.0.1/32 md5
  - host replication replicator <node1_ip>/0 md5
  - host replication replicator <node2_ip>/0 md5
  - host replication replicator <node3_ip>/0 md5
  - host all all 0.0.0.0/0 md5

  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb

postgresql:
  listen: <nodeN_ip>:5432
  connect_address: <nodeN_ip>:5432
  data_dir: /var/lib/postgresql/patroni-packages/data
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: '123'
    superuser:
      username: postgres
      password: '123'
  parameters:
      unix_socket_directories: '.'

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

Exit from postgre user and login with a user who have sudo access because we are going to create a service file for Patroni.

sudo touch /etc/systemd/system/patroni.service

# Place folloiwng content in patroni.service file

[Unit]
Description=High availability PostgreSQL Cluster
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/var/lib/postgresql/patroni-packages/bin/patroni /var/lib/postgresql/patroni-packages/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.targ

Setup ETCD node

Since the ETCD package is not available on the ubuntu-24.04 repository we couldn’t install it via apt-get so we had to download the pre-built binaries and use them to set the ETCD node.

wget https://github.com/etcd-io/etcd/releases/download/v3.4.32/etcd-v3.4.32-linux-amd64.tar.gz

tar xzf etcd-v3.4.32-linux-amd64.tar.gz


etcd-v3.4.32-linux-amd64/etcd --listen-peer-urls="http://$ETCD_NODE_PRIVATE_IP:2380" --listen-client-urls="http://localhost:2379,http://$ETCD_NODE_PRIVATE_IP:2379" --initial-advertise-peer-urls="http://$ETCD_NODE_PRIVATE_IP:2380" --initial-cluster="default=http://$ETCD_NODE_PRIVATE_IP:2380," --advertise-client-urls="http://$ETCD_NODE_PRIVATE_IP:2379" --initial-cluster-token="etcd-cluster" --initial-cluster-state="new" --enable-v2=true

Start patroni on data-node1, data-node2 and data-node3

sudo systemctl start patroni

Setup HAProxy node

sudo apt-get install haproxy -y 
sudo vim /etc/haproxy/haproxy.cfg

Replace the existing content from haproxy.cfg file with this.

global

        maxconn 100
        log     127.0.0.1 local2

defaults
        log global
        mode tcp
        retries 2
        timeout client 30m
        timeout connect 4s
        timeout server 30m
        timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen postgres
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 <node1_ip>:5432 maxconn 100 check port 8008
    server node2 <node2_ip>:5432 maxconn 100 check port 8008
    server node3 <node3_ip>:5432 maxconn 100 check port 8008

Restart the HAProxy service

sudo systemctl restart haproxy

Verify the cluster 

From any data node run this command.

postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member     | Host          | Role    | State     | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Leader  | running   |  2 |           |
| data-node2 | 172.31.50.185 | Replica | streaming |  2 |         0 |
| data-node3 | 172.31.48.62  | Replica | streaming |  2 |         0 |
+------------+---------------+---------+-----------+----+-----------+

Connect from the client with HAProxy

mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=#
postgres=#
postgres=# create database test;
CREATE DATABASE
postgres=# \l
                                                   List of databases
   Name    |  Owner   | Encoding | Locale Provider | Collate |  Ctype  | ICU Locale | ICU Rules |   Access privileges  
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres  | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
template0 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
template1 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
test      | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
(4 rows)
postgres=#

Testing failover

$ sudo systemctl stop patroni
$ sudo su postgres
$ cd
$ cd patroni-packages/
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member     | Host          | Role    | State     | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Replica | stopped   |    |   unknown |
| data-node2 | 172.31.50.185 | Leader  | running   |  3 |           |
| data-node3 | 172.31.48.62  | Replica | streaming |  3 |         0 |
+------------+---------------+---------+-----------+----+-----------+
postgres@data-node1:~/patroni-packages$
mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# create database test1;
CREATE DATABASE
postgres=# \l
                                                   List of databases
   Name    |  Owner   | Encoding | Locale Provider | Collate |  Ctype  | ICU Locale | ICU Rules |   Access privileges  
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres  | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
template0 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
template1 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
test      | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test1     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
(5 rows)

The client can still connect and execute queries with the same IP.

Testing second failover

lets make node 1 down as well.

$ sudo systemctl stop patroni
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) -----+----+-----------+
| Member     | Host         | Role   | State   | TL | Lag in MB |
+------------+--------------+--------+---------+----+-----------+
| data-node3 | 172.31.48.62 | Leader | running |  4 |           |
+------------+--------------+--------+---------+----+-----------+
postgres@data-node1:~/patroni-packages$
mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# create database test3;
CREATE DATABASE
postgres=# \l
                                                   List of databases
   Name    |  Owner   | Encoding | Locale Provider | Collate |  Ctype  | ICU Locale | ICU Rules |   Access privileges  
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres  | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
template0 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
template1 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
test      | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test1     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test3     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
(6 rows)

There is still no issue with the client connection.

Nodes recovery

$ sudo systemct start patroni 
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member     | Host          | Role    | State     | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Replica | running   |  2 |         8 |
| data-node2 | 172.31.50.185 | Replica | streaming |  3 |         4 |
| data-node3 | 172.31.48.62  | Leader  | running   |  4 |           |
+------------+---------------+---------+-----------+----+-----------+

Patroni successfully attached the failed nodes back to the cluster once they are recovered.

Testing replication

postgres@data-node3:~$ psql -h 172.31.48.62 -p 5432 -d postgres -U postgres
Password for user postgres:
psql (16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# \l
                                                   List of databases
   Name    |  Owner   | Encoding | Locale Provider | Collate |  Ctype  | ICU Locale | ICU Rules |   Access privileges  
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres  | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
template0 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
template1 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
test      | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test1     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test3     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
(6 rows)

We can see all the databases that we created from clients are actually available on node3 as well so replication is working you can verify the same on node 1 and node 2 as well.

Resources

Leave A Comment