StormaticsStormatics

3 Features I am Looking Forward to in PostgreSQL 18

It is that time of the year again. The first release candidate of PostgreSQL 18 is out, and things look promising. We should expect General Availability in the next 2-4 weeks. Exciting times! Over the past many years and as many releases, the PostgreSQL community has done a phenomenal job of being disciplined about the annual release process. And we have done so averaging 150+ new features with each release!
Read More

PostgreSQL Database SLAs: Why Hidden Issues Often Break Customer Commitments

SLAs feel reassuring when signed—but their substance lies in what happens behind the scenes. Often, the most damaging breaches don’t stem from cloud outages or server failures, but from invisible issues hidden in how PostgreSQL was initially set up and configured. Increasingly sluggish queries, split-brain scenarios, silent backup failures, any of these can suddenly explode into customer-facing crises. 1. Slow Queries: The Sneaky SLA Saboteur The Hidden Cost of Delayed Queries A seemingly minor tuning oversight, like a missing index or outdated statistics, can turn a 200 ms query into a 10-second slog. It might not seem urgent initially, but as concurrency increases, cascading delays build up. A Slow Query Accelerated 1000× In one case study, an engineer faced a painfully slow query that scanned 50 million rows through a sequential scan—even though it was a simple query filtering on two columns (col_1, col_2) and selecting by id. After creating an index using those columns plus an INCLUDE (id) clause, the query performance improved dramatically: what had taken seconds dropped to just milliseconds, representing up to a 1,000× improvement in the worst-case runtime. [Ref: Learnings from a slow query analysis in PostgreSQL] This shows how even a simple query, if not indexed properly, can pose an SLA risk as data volume increases.
Read More

When PostgreSQL performance slows down, here is where to look first

PostgreSQL is built to perform. However, as workloads increase and systems evolve, even the most robust setups can begin to show signs of strain. Whether you are scaling a product or supporting enterprise SLAs, performance slowdowns tend to surface when you least want them to. If you are a technology leader overseeing a team of developers who manage PostgreSQL as part of a broader application stack, or you are responsible for uptime and customer satisfaction at scale, knowing where to look first can make all the difference.
Read More

DBA as a Service for PostgreSQL: Expert-Led Support for Databases That Power Your Business

Let us start with some simple math. 24/7 coverage means 168 hours a week. A full-time engineer typically works a 40-hour week. That means you need 4.2 people just to ensure round-the-clock presence — and that is before factoring in weekends, public holidays, personal days, or unplanned absences. Realistically, you need a team of six to ensure someone is available at all times to look after your database. That is why we built DBA as a Service at Stormatics. It is a managed PostgreSQL operations partnership, designed specifically for high-growth teams that need reliable, expert-led care for their database layer, allowing them to stay focused on product delivery.
Read More

From 99.9% to 99.99%: Building PostgreSQL Resilience into Your Product Architecture

Most teams building production applications understand that “uptime” matters. I am writing this blog to demonstrate how much difference an extra 0.09% makes. At 99.9% availability, your system can be down for over 43 minutes every month. At 99.99%, that window drops to just over 4 minutes. If your product is critical to business operations, customer workflows, or revenue generation, those 39 extra minutes of downtime each month can be the difference between trust and churn.
Read More

Checklist: Is Your PostgreSQL Deployment Production-Grade?

One of the things I admire most about PostgreSQL is its ease of getting started. I have seen many developers and teams pick it up, launch something quickly, and build real value without needing a DBA or complex tooling. That simplicity is part of what makes PostgreSQL so widely adopted. However, over time, as the application grows and traffic increases, new challenges emerge. Queries slow down, disk usage balloons, or a minor issue leads to unexpected downtime. This is a journey I have witnessed unfold across many teams. I don’t think of it as a mistake or an oversight; it is simply the natural progression of a system evolving from development to production scale. The idea behind this blog is to help you assess your current situation and identify steps that can enhance the robustness, security, and scalability of your PostgreSQL deployment.
Read More

Step by Step Guide on Setting Up Physical Streaming Replication in PostgreSQL

Physical streaming replication in PostgreSQL allows you to maintain a live copy of your database on a standby server, which continuously receives updates from the primary server’s WAL (Write-Ahead Log). This standby (or hot standby) can handle read-only queries and be quickly promoted to primary in case of failover, providing high availability and disaster recovery. In this guide, I will walk through provisioning a primary PostgreSQL 16 server and a standby server on Linux, configuring them for streaming replication, and verifying that everything works. I assume you are an experienced engineer familiar with Linux, but new to PostgreSQL replication, so I will keep it friendly and straightforward. Figure: Real-time data streaming from a primary PostgreSQL server (left) to a standby server (right). The standby constantly applies WAL records received from the primary over a network connection, keeping an up-to-date copy of the database ready for failover. Step 1: Prepare Two Linux Servers and Install PostgreSQL 16 Before diving into PostgreSQL settings, set up two Linux servers (virtual or physical). One will act as the primary database server, and the other as the standby (read replica). For a smooth replication setup, both servers should be as similar as possible in OS, hardware, and PostgreSQL version. In particular, ensure the following prerequisites: PostgreSQL 16 is installed on both servers via the official PostgreSQL repositories. Both servers must run the same major PostgreSQL version and architecture (mixing different versions won’t work for physical replication). If you haven’t installed PostgreSQL yet, do so now (e.g., on Ubuntu: sudo apt install postgresql-16, or on RHEL/CentOS: use the PostgreSQL Yum repository). Make sure the PostgreSQL service is running on the primary server. Network connectivity: The standby must be able to reach the primary on the PostgreSQL port (default 5432). If the servers are in a cloud environment like AWS EC2, configure the security group or firewall to allow the standby’s IP to connect to the primary on port 5432. For example, in AWS you’d add an inbound rule permitting the standby’s private IP address (or subnet) access to port 5432 on the primary. It is best to use private network interfaces for replication to reduce latency and avoid exposing the database publicly. System settings: Ensure your servers have the necessary OS user and permissions for PostgreSQL. The installation usually creates a postgres UNIX user that owns the data directories. You will run many commands as this postgres user. Also, verify that important prerequisites like consistent time sync (NTP) are in place, as it is generally good practice for database servers (though not specific to replication). With the infrastructure ready, let’s proceed to configure the primary PostgreSQL server to accept replication connections.
Read More

Improving PostgreSQL Performance with Partitioning

My recommended methodology for performance improvement of PostgreSQL starts with query optimization. The second step is architectural improvements, part of which is the partitioning of large tables. Partitioning in PostgreSQL is one of those advanced features that can be a powerful performance booster. If your PostgreSQL tables are becoming very large and sluggish, partitioning might be the cure. The Big Table Problem Large tables tend to grow uncontrollably, especially in OLTP or time-series workloads. As millions or billions of rows accumulate, you begin to notice: Slow queries due to full table scans or massive indexes. Heavy I/O usage, especially when indexes cannot fit in memory. Bloated memory during operations like sorting or joining. Increased maintenance cost, with longer VACUUM, ANALYZE, and REINDEX times. Hard-to-manage retention policies, as purging old rows becomes expensive. These problems are amplified in cloud-hosted databases, where every IOPS, GB, or CPU upgrade increases cost.
Read More

SELECT FOR UPDATE – Reduce Contention and Avoid Deadlocks to Improve Performance in PostgreSQL

Relational databases are at the heart of countless applications around the world, from high-traffic e-commerce websites to enterprise resource planning (ERP) systems and financial services. Concurrency management—where multiple database transactions operate on the same data simultaneously—is critical to getting good performance and avoiding problems like deadlocks or data inconsistencies. When multiple transactions need to modify the same rows, ensuring data consistency can become tricky. A single wrong approach to locking can lead to suboptimal performance or even bring your application to a standstill as numerous transactions block one another. One tool in PostgreSQL’s arsenal to handle concurrency is SELECT FOR UPDATE. It allows you to lock specific rows before updating them, preventing other transactions from modifying those rows until your transaction completes. In this blog, we will dive deep into SELECT FOR UPDATE in PostgreSQL. We will explore how it helps in reducing contention, avoiding deadlocks, and ultimately boosting performance when dealing with highly concurrent applications.
Read More