
On July 16, 2025, Stormatics hosted an insightful webinar titled “Designing Highly Available Postgres for Continuous Operations”, featuring our Boriss Mejías, PostgreSQL Solutions Architect at EDB.
The session offered a deep dive into strategies for designing highly available Postgres for continuous operations. Packed with actionable insights and real-world examples, the webinar covered replication, auto-failover, and clustering, giving attendees practical tools to build reliable and resilient database environments.
The Q&A session featured several pressing questions from attendees, with detailed explanations provided by Boriss included in this blog post:
1) How can I filter or differentiate read and write queries in PostgreSQL?
The only way to do this is to parse the query to know if it is read-only or if the query writes something. But even when parsing the query there are some problems, for instance, `SELECT f(42);` looks like a read query because it is a simple SELECT of a function, but we don’t know if the function f(x) runs some write operations internally.
If what you are searching for is transparent load-balancing of read queries to the standby nodes, then pgpool-ii is the most popular tool, with its advantages and disadvantages. Nowadays, the idea of having two types of channels to communicate to the database, one for read-writes, and one type for read queries, is increasing its popularity, because basically, the application knows which are the read only queries. This approach prevents parsing the query twice. This is usually achieved by providing two connections in pgboucer. In more modern deployments with kubernetes, the CloudNativePG operator publishes a read-write service pointing to the primary, a read-only service redirecting to any standby, and a read service just connecting to any database node.
2) What is the recommended approach for implementing a multi-primary (read/write) setup in PostgreSQL?
EDB offers Postgres Distributed (PGD), based on logical bi-directional replication (originally known as BDR), which is the reference software for other multi-primary extensions such as pg-edge. The important thing is to identify if the use-case matches a mutli-primary architecture. If that is the case, when choosing the software, make sure it handles DDL replication automatically, and that it offers automatic conflict-resolution. More information about PGD can be found here: https://www.enterprisedb.com/docs/pgd/latest/
3) When upgrading with pg_upgrade in a 3-node HA cluster, which node should be upgraded first, and what is the proper sequence?
You actually need to stop all notes before running pg_upgrade, and before stopping the nodes, you need to make sure the standby nodes are fully synchronized with the primary. Once all nodes are stopped, first upgrade the primary, and only then the standby nodes (this order is the contrary to minor upgrades). Note also that there is no full guarantee that standby will be correctly upgraded, and the process requires the usage of other tools such as rsync. The steps are documented in here: https://www.postgresql.org/docs/current/pgupgrade.html
4) Does synchronous replication cause a noticeable performance penalty during commits?
This is an interesting case, because each transaction in itself will be much slower, because it requires the roundtrip to a replica. However, since queries in the primary node are all running concurrently, and streaming replication sends multiple queries at once, the measurement of transactions per second (tps) is usually decreased by less than 10%, at least in the majority of cases we have observed. This value will depend on your workload and network quality. We definitely recommend testing and measuring.
5) Do write-once disks significantly reduce the risk of backup corruption?
Cheksums are the common solution to reduce the risk of backup corruption. Write-once actually targets prevention against ransomware, because once the backup is written, it cannot be encrypted by the malicious software. Now, if the data is not verified with a checksum, it can still be written with corruption, and stays like that forever. Note that write-one devices will increase your cost of storage.
Missed the session? Watch the recording here.