StormaticsStormatics

Understanding Disaster Recovery in PostgreSQL

System outages, hardware failures, or accidental data loss can strike without warning. What determines whether operations resume smoothly or grind to a halt is the strength of the disaster recovery setup. PostgreSQL is built with powerful features that make reliable recovery possible. This post takes a closer look at how these components work together behind the scenes to protect data integrity, enable consistent restores, and ensure your database can recover from any failure scenario.
Read More

Understanding PostgreSQL WAL and optimizing it with a dedicated disk

If you manage a PostgreSQL database with heavy write activity, one of the most important components to understand is the Write-Ahead Log (WAL). WAL is the foundation of PostgreSQL’s durability and crash recovery as it records every change before it’s applied to the main data files. But because WAL writes are synchronous and frequent, they can also become a serious performance bottleneck when they share the same disk with regular data I/O.
Read More

The Hidden Bottleneck in PostgreSQL Restores and its Solution

In July 2025, during the PG19-1 CommitFest, I reviewed a patch targeting the lack of parallelism when adding foreign keys in pg_restore. Around the same time, I was helping a client with a large production migration where pg_restore dragged on for more than 24 hours and crashed multiple times. In this blog, I will talk about the technical limitations in PostgreSQL, the proposed fix, and a practical workaround for surviving large restores.
Read More

Cold, Warm, and Hot Standby in PostgreSQL: Key Differences

When working with customers, a common question we get is: “Which standby type is best for our HA needs?” Before answering, we ensure they fully understand the concepts behind each standby type and provide the necessary guidance A standby server is essentially a copy of your primary database that can take over if the primary fails. There are different types of standby setups, each with its own use cases, pros, and cons. In this blog, we will discuss the three types: Cold Standby, Warm Standby, and Hot Standby.
Read More

Achieving High Availability in PostgreSQL: From 90% to 99.999%

When you are running mission-critical applications, like online banking, healthcare systems, or global e-commerce platforms, every second of downtime can cost millions and damage your business reputation. That’s why many customers aim for four-nines (99.99%) or five-nines (99.999%) availability for their applications n this post, we will walk through what those nines really mean and, more importantly, which PostgreSQL cluster setup will get you there.
Read More

A Guide to Deploying Production-Grade Highly Available Systems in PostgreSQL

In today’s digital landscape, downtime isn’t just inconvenient, it’s costly. No matter what business you are running, an e-commerce site, a SaaS platform, or critical internal systems, your PostgreSQL database must be resilient, recoverable, and continuously available. So in short: High Availability (HA) is not a feature you enable; it’s a system you design.
Read More

Replication Types and Modes in PostgreSQL

Data is a key part of any mission-critical application. Losing it can lead to serious issues, such as financial loss or harm to a business’s reputation. A common way to protect against data loss is by taking regular backups, either manually or automatically. However, as data grows, backups can become large and take longer to complete.
Read More

Disaster Recovery Guide with pgbackrest

Recently, we worked with a client who was manually backing up their 800GB PostgreSQL database using pg_dump, which was growing rapidly and had backups stored on the same server as the database itself. This setup had several critical issues: - Single point of failure: If the server failed, both the database and its backups would be lost. - No point-in-time recovery: Accidental data deletion couldn’t be undone. - Performance bottlenecks: Backups consumed local storage, impacting database performance. To address these risks, we replaced their setup with pgBackRest, shifting backups to a dedicated backup server with automated retention policies and support for point-in-time recovery (PITR). This guide will walk you through installing, configuring, and testing pgBackRest in a real-world scenario where backups will be configured on a dedicated backup server, separate from the data node itself.
Read More

From 99.9% to 99.99%: Building PostgreSQL Resilience into Your Product Architecture

Most teams building production applications understand that “uptime” matters. I am writing this blog to demonstrate how much difference an extra 0.09% makes. At 99.9% availability, your system can be down for over 43 minutes every month. At 99.99%, that window drops to just over 4 minutes. If your product is critical to business operations, customer workflows, or revenue generation, those 39 extra minutes of downtime each month can be the difference between trust and churn.
Read More