When designing a highly available PostgreSQL cluster, two popular tools often come into the conversation: Pgpool-II and Patroni. Both are widely used in production environments, offer solid performance, and aim to improve resilience and reduce downtime; however, they take different approaches to achieving this goal.
We often get questions during webinars/talks and customer calls about which tool is better suited for production deployments. So, we decided to put together this blog to help you understand the differences and guide you in choosing the right solution based on your specific use case.
Before we dive into comparing these two great tools for achieving high availability, let’s first take a quick look at some of the key components involved in building a highly available and resilient setup.
Load Balancing
Load balancing helps distribute incoming SELECT queries evenly across read replicas. By offloading read traffic from the primary node, it can focus on handling write operations more efficiently. Heavy read workloads like reporting queries or dashboards can be directed to standby nodes, reducing the burden on the primary. This can also increase the overall transactions per second (TPS)
Connection Pooling
Connection pooling helps manage database connections efficiently, especially in high-concurrency environments. PostgreSQL has a connection limit per server, and opening/closing connections is expensive. Connection poolers can maintain a pool of persistent connections and reuse them for incoming client requests, which reduces overhead and boosts performance. This is equally important when many clients or applications interact with the database simultaneously.
Auto Failover
Auto failover ensures that when the primary database goes down, another healthy standby is automatically promoted to take its place, minimizing manual intervention. Auto failover is a central component to achieving true high availability and reducing downtime during node failures.
Consensus-Based Voting
This mechanism prevents split-brain scenarios, where two nodes might incorrectly assume they are the primary. Before promoting a new primary, the cluster members vote and reach an agreement through this process. This ensures there’s always only one active primary at any time and the cluster state remains consistent.
Replication
Replication keeps data in sync across multiple PostgreSQL nodes. PostgreSQL natively supports streaming replication, where changes from the primary are continuously sent to standby nodes. This can be synchronous (safer but slightly slower) or asynchronous (faster but with some risk of data loss during failover). Replication forms the backbone of any HA setup; without it, failover wouldn’t be possible.
Downtime
Downtime is the period when the database becomes unavailable due to failures, maintenance, or network disruptions. In high availability (HA) systems, the primary goal is to minimize this unavailability. For mission-critical applications, the Recovery Time Objective (RTO) is usually predefined to ensure that downtime stays within acceptable limits and recovery can be planned accordingly.
Recovery Time Objective (RTO)
RTO is the maximum acceptable amount of time your system can remain unavailable after a failure. For databases, this means how quickly you can detect an issue, fail over to a standby, and resume service. With proper planning and the help of HA tools, RTO can often be brought down to a matter of seconds.
Recovery Point Objective (RPO)
RPO defines how much data you are willing to lose in the event of a failure. In PostgreSQL, this often depends on the replication method. With synchronous replication, RPO can be zero (no data loss), while asynchronous replication may result in a small lag, meaning some transactions could be lost if a failover happens.
Replication Lag
Replication lag measures the delay between when a transaction is committed on the primary and when it becomes visible on the standby. High lag can compromise HA guarantees, especially if failover occurs while lag is present.
Now that we have covered the key concepts involved in designing high availability and resilient systems, let’s explore the main differences between Pgpool and Patroni.
Pgpool vs. Patroni: Key Differences
The table below highlights the key differences between these two HA tools.
Feature | Pgpool | Patroni |
Open source | Yes – pgpool-II License | Yes – MIT License |
Supported platforms | Unix-compatible platforms | Unix-compatible platforms and Windows |
Load balancer | yes (built in) | No |
Connection pooling | Yes (built in) | No |
Proxy server | Yes (built in) | No |
Consensus mechanism | Yes (built in) | Yes requires DCS |
Setup complexity for the cluster | High | Medium to High |
Auto-failover type | Script based | DCS driven |
Query caching | Yes (built in) | No |
Split-Brain prevention method | Watchdog and heartbeat | DCS |
Requires witness node | No | Yes |
Tool Available to Automate Deployment | pg_cirrus | No |
Conclusion
Both Pgpool-II and Patroni are powerful tools for achieving high availability in PostgreSQL. While HA can be achieved with either, Pgpool-II stands out for its built-in features, such as load balancing, connection pooling, and query caching, making it a solid standalone choice if your application relies heavily on these capabilities. In contrast, Patroni is purpose-built for robust, automatic failover and consistency, leveraging a distributed consensus system to maintain cluster health and integrity.
Although features like load balancing, connection pooling, and proxy server support aren’t built into Patroni, they can still be achieved by integrating other tools available within the PostgreSQL ecosystem.
This blog highlights how to use Patroni to set up a 3-node PostgreSQL cluster.
You can also explore tools like pg_cirrus, which simplify the deployment of 3-node HA clusters with Pgpool by eliminating the need for manual setup and configuration, saving both time and effort.
Ultimately, there’s no one-size-fits-all answer. Your application’s architecture, traffic patterns, and failover requirements should guide your decision.