StormaticsStormatics

Setting Up a High Availability 3-Node PostgreSQL Cluster with Patroni on Ubuntu 24.04

Creating a high-availability PostgreSQL cluster ensures your database is always online, even during failures. 

In this guide, we’ll walk you through setting up a 3-node HA PostgreSQL cluster using Patroni on Ubuntu 24.04. Patroni simplifies managing PostgreSQL clusters and ensures automatic failover. Follow these steps to build a reliable, single-zone cluster for robust database performance and resilience.

Our structure includes 5 dedicated nodes all are running ubuntu 24.04 as the base operating system.

  • 3 data nodes with Patroni
  • 1 node for ETCD
  • 1 node for HAProxy

How ETCD used by Patroni?

Leader Election: It stores information about which PostgreSQL node is the current leader, ensuring only one node acts as the primary database.
Configuration Storage: Patroni uses ETCD to store and retrieve cluster configuration settings.
Health Monitoring: Nodes register their status in ETCD, allowing Patroni to monitor and manage the cluster’s health and trigger failover if needed.
Data Consistency: ETCD ensures that all nodes in the cluster have a consistent view of the leader and configuration data, maintaining synchronization.

Why do we use HAProxy?

Load Balancing: HAProxy distributes client connections across multiple PostgreSQL replicas, optimizing resource use and enhancing performance.
High Availability: It ensures continuous database access by redirecting connections to healthy nodes during failures or leader changes.
Read/Write Splitting: HAProxy can route write queries to the primary node and read queries to replica nodes, improving query performance and scaling read operations.
Connection Management: It handles client connections efficiently, managing timeouts, retries, and queuing, enhancing overall cluster stability and user experience

Set hostnames and update the package manager on all nodes

sudo hostnamectl set-hostname data-node1 && sudo apt-get update
sudo hostnamectl set-hostname data-node2 && sudo apt-get update
sudo hostnamectl set-hostname data-node3 && sudo apt-get update
sudo hostnamectl set-hostname ETCD && sudo apt-get update
sudo hostnamectl set-hostname HAProxy && sudo apt-get update

Restart your terminal after executing the above commands on respective nodes.

Setup data-node1, data-node2 and data-node3 

Install PostgreSQL

sudo apt-get install postgresql-common -y
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
sudo apt -y install postgresql
sudo systemctl stop postgresql
sudo ln -s /usr/lib/postgresql/16/bin/* /usr/sbin/

Install required system packages

sudo apt-get install net-tools python3-pip python3-dev libpq-dev python3-venv -y

Configure cluster environment and Patroni

We are going to use Postgres user in order to configure Patroni.

sudo su postgres
cd $HOME
python3 -m venv patroni-packages
source patroni-packages/bin/activate
pip3 install --upgrade setuptools pip
pip install psycopg[binary] patroni python-etcd
cd $HOME/patroni-packages
touch patroni.yml
mkdir -p data
chmod 700 data

Place this content inside the patroni.yml file — Where N is number of your current node so update accordingly.

scope: postgres
namespace: Cluster
name: data-nodeN
restapi:
    listen: <nodeN_ip>:8008
    connect_address: <nodeN_ip>:8008
etcd:
    host: <etcdnode_ip>:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
  initdb:
  - encoding: UTF8
  - data-checksums
  pg_hba:
  - host replication replicator 127.0.0.1/32 md5
  - host replication replicator <node1_ip>/0 md5
  - host replication replicator <node2_ip>/0 md5
  - host replication replicator <node3_ip>/0 md5
  - host all all 0.0.0.0/0 md5
  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb
postgresql:
  listen: <nodeN_ip>:5432
  connect_address: <nodeN_ip>:5432
  data_dir: /var/lib/postgresql/patroni-packages/data
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: '123'
    superuser:
      username: postgres
      password: '123'
  parameters:
      unix_socket_directories: '.'
tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

Exit from postgre user and login with a user who have sudo access because we are going to create a service file for Patroni.

sudo touch /etc/systemd/system/patroni.service

# Place folloiwng content in patroni.service file

[Unit]
Description=High availability PostgreSQL Cluster
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/var/lib/postgresql/patroni-packages/bin/patroni /var/lib/postgresql/patroni-packages/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.targ

Setup ETCD node

Since the ETCD package is not available on the ubuntu-24.04 repository we couldn’t install it via apt-get so we had to download the pre-built binaries and use them to set the ETCD node.

wget https://github.com/etcd-io/etcd/releases/download/v3.4.32/etcd-v3.4.32-linux-amd64.tar.gz

tar xzf etcd-v3.4.32-linux-amd64.tar.gz

etcd-v3.4.32-linux-amd64/etcd –listen-peer-urls=”http://$ETCD_NODE_PRIVATE_IP:2380″ –listen-client-urls=”http://localhost:2379,http://$ETCD_NODE_PRIVATE_IP:2379″ –initial-advertise-peer-urls=”http://$ETCD_NODE_PRIVATE_IP:2380″ –initial-cluster=”default=http://$ETCD_NODE_PRIVATE_IP:2380,” –advertise-client-urls=”http://$ETCD_NODE_PRIVATE_IP:2379″ –initial-cluster-token=”etcd-cluster” –initial-cluster-state=”new” –enable-v2=true

Start patroni on data-node1, data-node2 and data-node3

sudo systemctl start patroni

Setup HAProxy node

sudo apt-get install haproxy -y 
sudo vim /etc/haproxy/haproxy.cfg

Replace the existing content from haproxy.cfg file with this.

global
        maxconn 100
        log     127.0.0.1 local2
defaults
        log global
        mode tcp
        retries 2
        timeout client 30m
        timeout connect 4s
        timeout server 30m
        timeout check 5s
listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /
listen postgres
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 <node1_ip>:5432 maxconn 100 check port 8008
    server node2 <node2_ip>:5432 maxconn 100 check port 8008
    server node3 <node3_ip>:5432 maxconn 100 check port 8008

Restart the HAProxy service

sudo systemctl restart haproxy

Verify the cluster 

From any data node run this command.

postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member     | Host          | Role    | State     | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Leader  | running   |  2 |           |
| data-node2 | 172.31.50.185 | Replica | streaming |  2 |         0 |
| data-node3 | 172.31.48.62  | Replica | streaming |  2 |         0 |
+------------+---------------+---------+-----------+----+-----------+

Connect from the client with HAProxy

mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=#
postgres=#
postgres=# create database test;
CREATE DATABASE
postgres=# l
                                                   List of databases
   Name    |  Owner   | Encoding | Locale Provider | Collate |  Ctype  | ICU Locale | ICU Rules |   Access privileges  
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres  | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
template0 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
template1 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
test      | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
(4 rows)
postgres=#

Testing failover

$ sudo systemctl stop patroni
$ sudo su postgres
$ cd
$ cd patroni-packages/
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member     | Host          | Role    | State     | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Replica | stopped   |    |   unknown |
| data-node2 | 172.31.50.185 | Leader  | running   |  3 |           |
| data-node3 | 172.31.48.62  | Replica | streaming |  3 |         0 |
+------------+---------------+---------+-----------+----+-----------+
postgres@data-node1:~/patroni-packages$
mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# create database test1;
CREATE DATABASE
postgres=# l
                                                   List of databases
   Name    |  Owner   | Encoding | Locale Provider | Collate |  Ctype  | ICU Locale | ICU Rules |   Access privileges  
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres  | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
template0 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
template1 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
test      | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test1     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
(5 rows)

The client can still connect and execute queries with the same IP.

Testing second failover

lets make node 1 down as well.

$ sudo systemctl stop patroni
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) -----+----+-----------+
| Member     | Host         | Role   | State   | TL | Lag in MB |
+------------+--------------+--------+---------+----+-----------+
| data-node3 | 172.31.48.62 | Leader | running |  4 |           |
+------------+--------------+--------+---------+----+-----------+
postgres@data-node1:~/patroni-packages$
mapmac@192 bin % ./psql -h 35.175.134.23 -p 5000 -d postgres -U postgres
Password for user postgres:
psql (16.0, server 16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# create database test3;
CREATE DATABASE
postgres=# l
                                                   List of databases
   Name    |  Owner   | Encoding | Locale Provider | Collate |  Ctype  | ICU Locale | ICU Rules |   Access privileges  
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres  | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
template0 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
template1 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
test      | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test1     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test3     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
(6 rows)

There is still no issue with the client connection.

Nodes recovery

$ sudo systemct start patroni 
postgres@data-node1:~/patroni-packages$ bin/patronictl -c patroni.yml list
+ Cluster: postgres (7374714524234176830) ---------+----+-----------+
| Member     | Host          | Role    | State     | TL | Lag in MB |
+------------+---------------+---------+-----------+----+-----------+
| data-node1 | 172.31.58.171 | Replica | running   |  2 |         8 |
| data-node2 | 172.31.50.185 | Replica | streaming |  3 |         4 |
| data-node3 | 172.31.48.62  | Leader  | running   |  4 |           |
+------------+---------------+---------+-----------+----+-----------+</pre >

Patroni successfully attached the failed nodes back to the cluster once they are recovered.

Testing replication

postgres@data-node3:~$ psql -h 172.31.48.62 -p 5432 -d postgres -U postgres
Password for user postgres:
psql (16.3 (Ubuntu 16.3-1.pgdg24.04+1))
Type "help" for help.
postgres=# l
                                                   List of databases
   Name    |  Owner   | Encoding | Locale Provider | Collate |  Ctype  | ICU Locale | ICU Rules |   Access privileges  
-----------+----------+----------+-----------------+---------+---------+------------+-----------+-----------------------
postgres  | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
template0 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
template1 | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           | =c/postgres          +
           |          |          |                 |         |         |            |           | postgres=CTc/postgres
test      | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test1     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
test3     | postgres | UTF8     | libc            | C.UTF-8 | C.UTF-8 |            |           |
(6 rows)

We can see all the databases that we created from clients are actually available on node3 as well so replication is working you can verify the same on node 1 and node 2 as well.

Resources

Frequently Asked Questions (FAQs)

Q. What are the core components required for a Patroni-based high availability PostgreSQL cluster?

A robust architecture typically consists of three data nodes managed by Patroni for replication, a dedicated ETCD node for distributed consensus and state management, and an HAProxy node to route client traffic and perform load balancing across the available replicas.

ETCD functions as the Distributed Configuration Store (DCS) that manages leader election mechanics, stores dynamic cluster configurations, and continuously monitors node health to ensure every component maintains a consistent view of the database state.

HAProxy abstracts the physical database IP addresses from the application layer, automatically detecting state changes via health checks and redirecting traffic from failed or promoting nodes to the new primary to minimize downtime during leader elections.

Patroni detects the primary node’s failure through missing heartbeat updates in ETCD, immediately initiates a leader election among the available replicas, and promotes the most up-to-date replica to primary while updating the cluster connection information.

Upon restarting, Patroni identifies the node’s divergent timeline, automatically reconfigures it as a replica, and synchronizes missing data from the current leader (often leveraging pg_rewind) to restore full cluster redundancy without manual intervention.

Leave A Comment