Your PostgreSQL HA cluster promotes a new primary. Patroni says everything is healthy. But your application is still talking to the old, dead node. Welcome to the OCI VIP problem.
If you have built PostgreSQL high availability clusters on AWS or Azure, you have probably gotten comfortable with how virtual IPs work. You assign a VIP, your failover tool moves it, and your application reconnects to the new primary. Clean. Simple. Done.
Then you try the same thing on Oracle Cloud Infrastructure and something quietly goes wrong.
The cluster promotes. Patroni (or repmgr, or whatever you are using) does its job. The standby becomes the new primary. But the VIP does not follow. Your application keeps sending traffic to the old node — the one that just failed. From the outside, it looks like the database is down. From the inside, everything looks green.
This is one of the more frustrating failure modes we have worked through in production. Not because it is hard to fix, but because it is hard to catch. It passes every test you throw at it right up until the moment it matters.
Let me walk you through why this happens, how to fix it, and how to pick the right approach for your environment.
Why OCI Handles VIPs Differently
On AWS, a secondary private IP can float between instances within a subnet. You call assign-private-ip-addresses and it moves. On Azure, you update a NIC’s IP configuration. In both cases, your failover tool can handle this natively, or with a small callback script.
OCI does not work that way.
On OCI, a virtual IP (implemented as a secondary private IP on a VNIC, the Virtual Network Interface Card) is explicitly bound to a specific instance. It cannot float between instances the way it does on AWS or Azure. When your primary fails and the standby gets promoted, the VIP stays attached to the old instance's VNIC. It does not move on its own, and standard failover tooling does not know it needs to make OCI API calls to move it.
Here is what is happening under the hood. On OCI, each compute instance has a primary VNIC with a primary private IP. You can add secondary private IPs to that VNIC, and those secondary IPs function as your VIPs. But reassigning that secondary IP to a different instance requires an explicit API call to detach it from one VNIC and attach it to another. The networking layer will not do this for you just because the IP was brought down on one host and brought up on another.
This is worth repeating: the IP address being “up” at the OS level on the new node does not mean OCI’s networking fabric is routing traffic to it. You have to tell OCI to move the secondary IP assignment at the cloud control plane level. Otherwise, packets continue to arrive at the old VNIC — which is attached to an instance that is either down or no longer the primary.
This is not a PostgreSQL limitation. It is not a Patroni limitation. It is an OCI networking behavior that breaks assumptions baked into most HA tooling.
The Silent Failure: What It Actually Looks Like
Here is the scenario that catches teams off guard.
You set up a three-node Patroni cluster on OCI. Primary on node-1, synchronous standby on node-2, async standby on node-3. You configure a VIP on node-1. You test failover. Patroni promotes node-2. You check patronictl list and everything looks correct — node-2 is the leader, node-1 is a replica (or down).
But the application never reconnects.
The VIP is still registered on node-1’s VNIC inside OCI’s networking fabric. Even though node-2 brought the VIP up at the OS level (using ip addr add), OCI is not routing traffic to node-2. The packets are still going to node-1’s VNIC.
If node-1 is completely down, the application gets connection timeouts. If node-1 is up but demoted to a replica, something worse happens — the application connects successfully but hits a read-only node and starts throwing errors on every write.
Either way, your failover did not actually fail over from the application’s perspective. And unless you are specifically testing application connectivity after failover (not just checking cluster state), you will miss this during testing.
Two Approaches That Work in Production
We have deployed both of these in production on OCI with PostgreSQL clusters targeting 99.99% availability. They solve the same problem from different angles, and the right choice depends on your architecture and your team’s operational preferences.
Approach 1: HAProxy with Health Checks (Skip the VIP Entirely)
The most direct way to sidestep the OCI VIP problem is to stop relying on a VIP for routing. Instead, put HAProxy in front of your PostgreSQL cluster and let it figure out which node is the primary.
Here is how it works. HAProxy sits between your application and the PostgreSQL nodes. It performs health checks against each node’s Patroni REST API. Patroni exposes endpoints that tell you exactly what role each node is playing:
- GET /primary returns HTTP 200 only on the current primary
- GET /replica returns HTTP 200 only on replicas
- GET /health returns HTTP 200 when PostgreSQL is up and running
You configure HAProxy to use these endpoints as health checks. When a failover happens, Patroni promotes the new primary. HAProxy’s next health check cycle detects the role change and starts routing traffic to the new leader. No VIP movement required. No OCI API calls. No callback scripts.
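Before wiring this into HAProxy, it is worth spot-checking the endpoints by hand: exactly one node should answer 200 on /primary. A tiny helper along these lines can script that sanity check (count_primaries is a hypothetical name; each status code would come from the curl shown in the comment):

```shell
# Given one HTTP status code per node, count how many claim to be primary.
# In practice each code comes from something like:
#   curl -s -o /dev/null -w '%{http_code}' http://10.0.1.10:8008/primary
count_primaries() {
  local count=0 code
  for code in "$@"; do
    [ "$code" = "200" ] && count=$((count + 1))
  done
  echo "$count"
}
```

Run it against the codes you collected; anything other than a count of 1 means the cluster and your health checks disagree about who the leader is.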
Here is a stripped-down HAProxy configuration for this pattern:
```
global
    maxconn 1000

defaults
    mode tcp
    timeout connect 5s
    timeout client 30s
    timeout server 30s

listen postgresql_primary
    bind *:5432
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 maxconn 500
    server pg-node1 10.0.1.10:5432 check port 8008
    server pg-node2 10.0.1.11:5432 check port 8008
    server pg-node3 10.0.1.12:5432 check port 8008

listen postgresql_replicas
    bind *:5433
    balance roundrobin
    option httpchk GET /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 maxconn 500
    server pg-node1 10.0.1.10:5432 check port 8008
    server pg-node2 10.0.1.11:5432 check port 8008
    server pg-node3 10.0.1.12:5432 check port 8008
```
Port 8008 is the default Patroni REST API port. HAProxy checks that endpoint every 3 seconds. If a node fails the health check 3 times in a row, HAProxy removes it from the pool. Once it passes 2 consecutive checks, it comes back in.
The postgresql_primary listener only routes to whichever node responds with HTTP 200 on /primary. After a failover, only the new primary will return 200 on that endpoint. The switchover happens automatically on the next health check cycle.
What to watch out for:
- HAProxy itself becomes a component you need to keep available. Run it on at least two nodes with Keepalived (or behind an OCI load balancer) to avoid making it a single point of failure.
- Health check intervals determine your detection speed. With inter 3s and fall 3, you are looking at roughly 9 seconds before HAProxy pulls a failed node. For most production workloads targeting 99.99%, that is well within acceptable RTO.
- Your application connection string points to HAProxy, not to individual PostgreSQL nodes. This is a change from VIP-based setups, so plan your connection string management accordingly.
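For instance, libpq-style connection URLs for the two listeners might look like this (host, user, and database names are placeholders):

```
# Writes go to the primary listener
postgresql://app_user@haproxy.internal:5432/appdb

# Read-only traffic goes to the replica listener
postgresql://app_user@haproxy.internal:5433/appdb
```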
When this approach makes sense:
This is the simpler path if you are building fresh or if you already use HAProxy elsewhere in your stack. You eliminate the VIP problem entirely rather than working around it. You also get read/write splitting for free by running separate listeners for primary and replica traffic.
Approach 2: Post-Promotion Hooks with OCI API Calls (Keep the VIP)
If you want to preserve VIP-based failover — maybe because your application connection strings are already hardcoded to a VIP, or because your network architecture depends on a stable IP — you can make it work on OCI. You just need to teach your failover manager how to talk to OCI’s API.
The idea is straightforward. When Patroni promotes a new primary, it runs a callback script. That script uses the OCI CLI to detach the VIP (secondary private IP) from the old primary’s VNIC and attach it to the new primary’s VNIC. At the OS level, the script also adds the IP to the new node’s network interface.
This is a two-step operation:
1. OCI control plane: reassign the secondary private IP to the new node's VNIC using oci network vnic assign-private-ip
2. OS level: add the IP address to the local network interface using ip addr add
Both steps matter. The OCI API call moves the routing at the cloud networking level. The ip addr add makes the IP reachable at the OS level. Skip either one and you are back to the silent failure.
Here is a reference callback script:
```bash
#!/bin/bash
# OCI VIP failover callback for Patroni
# Called with: action role cluster_name

ACTION=$1
ROLE=$2
CLUSTER=$3

VIP="10.0.1.100"
CIDR="24"
INTERFACE="ens3"

# VNIC OCIDs for each node - populate these with your actual values
NODE1_VNIC="ocid1.vnic.oc1.REGION.NODE1_VNIC_OCID"
NODE2_VNIC="ocid1.vnic.oc1.REGION.NODE2_VNIC_OCID"
NODE3_VNIC="ocid1.vnic.oc1.REGION.NODE3_VNIC_OCID"

HOSTNAME=$(hostname -s)
LOG="/var/log/patroni/vip-failover.log"

log_msg() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$ACTION] [$ROLE] $1" >> "$LOG"
}

get_local_vnic() {
    case "$HOSTNAME" in
        pg-node1) echo "$NODE1_VNIC" ;;
        pg-node2) echo "$NODE2_VNIC" ;;
        pg-node3) echo "$NODE3_VNIC" ;;
        *)
            log_msg "ERROR: Unknown hostname $HOSTNAME"
            return 1
            ;;
    esac
}

assign_vip() {
    local VNIC_ID
    # get_local_vnic runs in a subshell here, so check its output
    # explicitly - an exit inside it would not stop this function
    VNIC_ID=$(get_local_vnic)
    if [ -z "$VNIC_ID" ]; then
        log_msg "ERROR: Could not determine local VNIC OCID"
        exit 1
    fi
    log_msg "Assigning VIP $VIP to VNIC $VNIC_ID"

    # Reassign at the OCI control plane
    # --unassign-if-already-assigned handles detaching from the old VNIC
    /usr/bin/oci network vnic assign-private-ip \
        --unassign-if-already-assigned \
        --vnic-id "$VNIC_ID" \
        --ip-address "$VIP" \
        --auth instance_principal \
        >> "$LOG" 2>&1
    if [ $? -ne 0 ]; then
        log_msg "ERROR: OCI API call failed"
        exit 1
    fi

    # Add the IP at the OS level
    ip addr add "${VIP}/${CIDR}" dev "$INTERFACE" label "${INTERFACE}:vip" 2>/dev/null

    # Send a gratuitous ARP to update the local network
    arping -c 3 -U -I "$INTERFACE" "$VIP" >> "$LOG" 2>&1
    log_msg "VIP $VIP successfully assigned"
}

remove_vip() {
    log_msg "Removing VIP $VIP from local interface"
    # "dev" takes the real interface name, not the label
    ip addr del "${VIP}/${CIDR}" dev "$INTERFACE" 2>/dev/null
    log_msg "VIP $VIP removed"
}

# Main logic
case "$ACTION" in
    on_start|on_role_change)
        # Older Patroni releases pass "master" instead of "primary"
        if [ "$ROLE" = "primary" ] || [ "$ROLE" = "master" ]; then
            assign_vip
        else
            remove_vip
        fi
        ;;
    on_stop)
        remove_vip
        ;;
esac

exit 0
```
And here is the relevant section of your patroni.yml:
```yaml
postgresql:
  callbacks:
    on_start: /opt/patroni/scripts/oci_vip_callback.sh
    on_stop: /opt/patroni/scripts/oci_vip_callback.sh
    on_role_change: /opt/patroni/scripts/oci_vip_callback.sh
```
Patroni passes three arguments to every callback script: the action (on_start, on_stop, on_role_change), the current role (primary, or master on older Patroni releases), and the cluster name. The script uses these to decide whether to grab or release the VIP.
Prerequisites you cannot skip:
- OCI CLI installed on every node. The script calls the oci command directly. Make sure it is in your PATH and accessible to the user running Patroni.
- Instance principal authentication. Instead of storing API keys on each node, configure a dynamic group in OCI that includes your PostgreSQL instances, and attach a policy that allows them to manage VNICs. The --auth instance_principal flag in the script handles this. The policy statements you need look something like:
Allow dynamic-group pg-cluster-nodes to use vnics in compartment YOUR_COMPARTMENT
Allow dynamic-group pg-cluster-nodes to use private-ips in compartment YOUR_COMPARTMENT
- VNIC OCIDs for every node. You can find these in the OCI console under Compute → Instance → Attached VNICs, or programmatically via the instance metadata endpoint.
- The arping utility. After reassigning the IP, you want to send gratuitous ARPs so that other hosts on the subnet update their ARP caches. This speeds up convergence. Install arping (it is in the iputils package on most distributions).
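If you go the metadata route for the VNIC OCIDs, a small parser sketch looks like this (first_vnic_id is a hypothetical helper; the commented curl shows the real IMDSv2 call, which requires the Bearer Oracle authorization header, and the vnicId field name matches OCI's metadata schema):

```shell
# Pull the first vnicId out of the IMDS JSON response. The real call is:
#   curl -s -H "Authorization: Bearer Oracle" \
#     http://169.254.169.254/opc/v2/vnics/
first_vnic_id() {
  grep -o '"vnicId"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed 's/.*"\(ocid1[^"]*\)"$/\1/'
}
```

On a multi-VNIC instance you would match on the privateIp field instead of blindly taking the first entry.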
What to watch out for:
- The OCI API call takes a few seconds. This is time added to your failover window on top of what Patroni takes to detect the failure, acquire the leader lock, and promote. Plan for roughly 5–15 seconds total depending on your Patroni configuration.
- If the API call fails (network issue, permission error, OCI service hiccup), the VIP does not move. Your script should log aggressively so you can diagnose this after the fact. Consider adding a retry loop with a short backoff.
- Test the OCI API call in isolation first. SSH into each node and run the oci network vnic assign-private-ip command manually. Confirm that the VIP actually moves in the OCI console. Do this before you wire it into Patroni.
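A minimal retry wrapper for that API call might look like this (retry is a hypothetical helper sketched with a fixed delay rather than full exponential backoff; the commented usage shows how it would wrap the real CLI invocation):

```shell
# Retry a command up to $1 times, sleeping $2 seconds between attempts.
retry() {
  local max=$1 delay=$2
  shift 2
  local attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Hypothetical usage inside the callback:
# retry 3 2 /usr/bin/oci network vnic assign-private-ip \
#   --unassign-if-already-assigned --vnic-id "$VNIC_ID" \
#   --ip-address "$VIP" --auth instance_principal
```

Three attempts with a 2-second delay covers most transient control plane hiccups while keeping the worst-case addition to your failover window bounded.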
When this approach makes sense:
This is the right path if you need to keep your application connection strings pointing at a stable IP address and you do not want to introduce HAProxy into your architecture. It preserves the traditional VIP model — the IP moves, the application reconnects to the same address. The tradeoff is added complexity in the callback script and a dependency on OCI API availability during failover.
Which Approach Should You Pick?
Both work. Both are production-validated. Here is how to decide.
Go with HAProxy if you are deploying a new cluster, you want the simplest operational model, or you want read/write splitting without extra infrastructure. HAProxy removes the VIP problem entirely and gives you health-check-based routing as a bonus. The tradeoff is that HAProxy becomes a component you need to keep available, and your application needs to connect through it rather than directly to a VIP.
Go with OCI API callbacks if you have existing applications that connect to a VIP and you cannot change the connection strings, or if your network security model requires traffic to go directly to a specific IP rather than through a proxy. The tradeoff is more moving parts during failover (OCI API call latency, authentication setup, callback script maintenance).
In either case, test failover with your application connected, not just by watching patronictl list. The whole point of this blog is that cluster state can look perfect while your application is talking to the wrong node. The only test that matters is: “After failover, does my application successfully write to the new primary?”
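One way to script that write-path check, assuming psql is available (the helper name, address, and credentials are placeholders):

```shell
# "SELECT pg_is_in_recovery()" returns f on a writable primary, t on a replica.
is_writable_primary() {
  [ "$1" = "f" ]
}

# Real usage after a failover drill (address and credentials are placeholders):
# result=$(psql "host=10.0.1.100 user=app_user dbname=appdb connect_timeout=5" \
#            -tAc "SELECT pg_is_in_recovery();")
# is_writable_primary "$result" && echo "failover verified"
```

A connection that succeeds but lands on a replica fails this check, which is exactly the read-only trap described above.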
A Few Things We Have Learned the Hard Way
Do not rely on the OS-level IP alone. We have seen teams add the VIP with ip addr add on the new primary without making the OCI API call. The IP shows up on the interface, ping works from the same host, but traffic from other hosts in the subnet does not arrive. OCI’s virtual network routes packets based on the VNIC assignment in the control plane, not based on what the OS thinks.
Test OCI API permissions before your first failover drill. Instance principal authentication requires a dynamic group and a policy. If the policy is wrong, the OCI CLI call fails silently (or with an error that only shows up in the script log). Run the oci network vnic assign-private-ip command by hand on each node and verify it works before you trust it in a callback.
Log everything in your callback script. When failover does not work at 2 AM, the first thing you want is a log that shows exactly what the callback tried to do and what OCI responded with. Timestamp it. Include the action, the role, the VNIC ID, and the API response. Future-you will be grateful.
If you choose HAProxy, make it redundant. A single HAProxy instance in front of your PostgreSQL cluster is a single point of failure. Run two HAProxy instances with Keepalived (or use OCI’s Network Load Balancer) so that losing one proxy does not take down access to your database. This is basic hygiene but we still see it skipped more often than you would expect.
Plan for the OCI API being slow or temporarily unavailable. The assign-private-ip call is usually fast, but cloud APIs have bad days. If you are using the callback approach, add retry logic with exponential backoff. Two or three retries with 2-second intervals covers most transient issues without significantly extending your failover window.
Closing Thought
The VIP problem on OCI is not exotic. It is just different from what most PostgreSQL HA guides assume. If you are following a tutorial written for AWS or Azure and deploying on OCI, this is the specific place where things quietly break.
The fix is not complicated once you know what to look for. Either route around the VIP with HAProxy, or teach your failover tool to speak OCI’s API language. Both paths get you to production-grade failover with RPO and RTO close to zero.
What takes longer is finding the problem in the first place — because the cluster looks healthy the whole time. If this blog saves you that debugging session, it did its job.