StormaticsStormatics

Automating PostgreSQL Index Tuning Using AI

If you have a slow query, one of the obvious moves is to add an index. So you look at the WHERE clause, pick a column, run CREATE INDEX, and test again. Sometimes it helps, often it doesn't. And now you have an index sitting there, not helping reads, but slowing down every write, because INSERT, UPDATE, and DELETE all have to maintain it. And it gets worse as your system grows.Five queries are manageable. You can reason about column choices, test combinations, and check EXPLAIN output. When you are dealing with fifty queries across a dozen tables, you are evaluating hundreds of possible column combinations manually, each one potentially breaking something in production if you get the locking wrong.
Read More

Critical PMM Alerts Every PostgreSQL DBA Must Track

Have you ever left for home on a Friday evening feeling confident about your work for the day, at peace knowing your system would survive the coming weekend? We’ve all felt that way at some point. Meanwhile, the disk on the server had quietly reached 90% utilization. Write-Ahead Log (WAL) files had accumulated enormously, one long-running query had been running for over an hour, and nobody noticed because, some time earlier, the dashboard had looked fine.
Read More

Long-running transactions, job queues, and the cascade that wreaks havoc

A scheduled PostgreSQL migration step held an open transaction snapshot for hours during the initial data copy. A job queue running at production write velocity began to slow down. Twelve hours later, the queue was seven million rows deep, the primary was pinned at 100 percent CPU across 24 cores, and customer support was fielding complaints about delayed transactions.
Read More

Before You Move a Single Row, Plan Your Cutover

You have deployed your new cluster. Now comes the work of moving your data and cutting over to it. Reading that sentence, you might assume cutover is something you figure out at the end, after the migration is done. And in practice, that is the order in which things happen. But technically, it is your cutover strategy that decides how you migrate, not the other way around. The strategy you pick determines how you configure replication, how many slots you provision, how you handle schema changes, and what your rollback path looks like. So before you touch replication, decide how you want to cut over. In this post, I will walk through the two most common cutover strategies, what each one costs you, and what each one gives you back. Approach 1: Cut Over One Database at a Time Imagine you are moving an entire office to a new location. One way to do it is to move one team at a time, finance this week, engineering next week. Each team settles in before the next one arrives. If something goes wrong with finance’s move, it doesn’t affect engineering. You fix the problem, learn from it, and continue. That’s exactly how this approach works with databases. You pick one database, migrate it, test it, cut it over, confirm everything is fine, and then move to the next. Why This Approach Makes Sense 1. It’s Easier to Manage When you are watching one database go through a cutover, you know exactly where to look if something breaks. Your team isn’t juggling ten things at once. Attention is focused, and problems surface quickly. 2. Issues Show Up Early The first database you cut over is like a fire drill. You discover what your runbook missed, what monitoring didn’t catch, what your rollback steps actually look like in practice, all with limited impact. By the time you reach database number five, your team is smooth and confident. 3. Replication Slots Don’t Pile Up During logical replication, the publisher (your old cluster) has to keep a replication slot open for each active subscriber. These slots hold WAL data, so the subscriber doesn’t miss anything. If you keep many databases in sync at the same time, those slots pile up, resulting in more disk usage, more memory pressure, and more risk. When you cut over one database at a time, you drop the slot as soon as that cutover is done. The publisher breathes easier. 4. Rolling Back Is Simple If something goes wrong after cutting over a single database, you only need to revert one database. Your reverse replication path is short and clean. You are not trying to undo a dozen cutovers at once. 5. CPU Stays Calm Managing replication for many databases at the same time puts a real load on the publisher. WAL senders, replication workers, and the constant work of tracking changes across many slots can spike your CPU at the worst possible moment, right when you need the system to be stable. One database at a time keeps the load predictable. 6. Downtime Can Be Just Seconds A well-prepared cutover for a single database is fast. You pause writes, confirm the subscriber has caught up (lag is zero), update your connection strings, and resume. 7. Less Room for Human Error Cutover is a high-stress, time-sensitive operation. The fewer things your team has to do at once, the fewer mistakes happen. One database at a time means one checklist, one confirmation, one rollback plan, not ten of them running in parallel. 8. The Migration Can Be Done in Hours or a Few Days Each database is self-contained. You don’t have to wait for every database in the cluster to be ready before you can start cutting over. Even if a database is huge, the migration window is limited to just that one.
Read More

PostgreSQL’s Growing Role in AI Infrastructure

Around 80% of LLMs and AI-powered applications choose either PostgreSQL or Supabase when selecting a database. That level of adoption says something important about where engineering teams are placing their trust. Supabase has become one of the most common starting points for AI products. Most AI frameworks support PostgreSQL and pgvector directly. For many teams, PostgreSQL is already part of the stack before the AI layer is even introduced.
Read More

SSL in PostgreSQL

SSL stands for Secure Sockets Layer. Its successor is TLS (Transport Layer Security), but in the PostgreSQL world, and in most documentation, people still call it SSL out of old habit. Don’t let that confuse you. When someone says “SSL” in a Postgres context, they mean modern TLS-based encryption. Here’s the problem it solves. By default, when your application connects to PostgreSQL, everything travels across the network in plain text. Usernames. Passwords. Every query you run. Every row of data that comes back. If anyone can intercept that traffic, someone on the same network, a compromised internal service, they can read all of it. A basic packet sniffer is enough. No special skills needed.
Read More

You have a Patroni leader election. You are only halfway to PostgreSQL high availability.

A PostgreSQL primary loses power at 2am. Writes resume in under thirty seconds. The on-call engineer reads the alert in the morning, sees that the cluster healed itself, and goes back to coffee. That is the outcome PostgreSQL high availability is supposed to deliver.A working Patroni cluster, on its own, gets you partway there. The leader election runs. A standby gets promoted. The cluster state in etcd stays consistent. Then the application keeps trying to reach an IP address that points at the wrong node now, the old primary needs a manual rejoin, and the on-call engineer is on a conference bridge instead of in bed.
Read More