PostgreSQL High Availability on OCI: Why Your Failover Passes Every Test But Breaks in Production

If you have built PostgreSQL high availability clusters on AWS or Azure, you have probably gotten comfortable with how virtual IPs work. You assign a VIP, your failover tool moves it, and your application reconnects to the new primary. Clean. Simple. Done.

Then you try the same thing on Oracle Cloud Infrastructure, and something quietly goes wrong. The cluster promotes. Patroni (or repmgr, or whatever you are using) does its job. The standby becomes the new primary. But the VIP does not follow. Your application keeps sending traffic to the old node, the one that just failed. From the outside, it looks like the database is down. From the inside, everything looks green.
Read More

pgNow: Instant PostgreSQL Performance Diagnostics in Minutes

pgNow is a lightweight PostgreSQL diagnostic tool developed by Redgate that provides quick visibility into database performance without requiring agents or complex setup. It connects directly to a PostgreSQL instance and delivers real-time insights into query workloads, active sessions, index usage, configuration health, and vacuum activity, helping DBAs quickly identify performance bottlenecks. Because it runs as a simple desktop application, you can be up and running in minutes.
Read More

Thinking of PostgreSQL High Availability as Layers

High availability for PostgreSQL is often treated as a single, big, dramatic decision: “Are we doing HA or not?” That framing pushes teams into two extremes:

- a “hero architecture” that costs a lot and still feels tense to operate, or
- a minimalistic architecture that everyone hopes will just keep running.

A calmer way to design this is to treat HA and DR as layers. You start with a baseline, then add specific capabilities only when your RPO/RTO and budget justify them. Let us walk through the layers from “single primary” to “multi-site DR posture”.

Start with outcomes. Before topology, align on three things:

1. Failure scope
   a. A database host fails
   b. A zone or data center goes away
   c. A full region outage happens
   d. Human error
2. RPO (Recovery Point Objective)
   a. We can tolerate up to 15 minutes of data loss
   b. We want close to zero
3. RTO (Recovery Time Objective)
   a. We can be back in 30 minutes
   b. We want service back in under 2 minutes

Here is my stance (and it saves money!): you get strong availability outcomes by layering in the right order.
Read More

How PostgreSQL Scans Your Data

To understand how PostgreSQL scans data, we first need to understand how PostgreSQL stores it. A table is stored as a collection of 8KB pages (by default) on disk. Each page has a header, an array of item pointers (also called line pointers), and the actual tuple data growing from the bottom up. Each tuple has its own header containing visibility info: xmin, xmax, cmin/cmax, and infomask bits.
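This on-disk layout can be inspected directly with the `pageinspect` extension that ships with PostgreSQL. A minimal sketch, assuming a hypothetical table named `accounts` and a superuser connection:

```sql
-- Requires the pageinspect contrib extension (superuser only).
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Header of the first 8KB page: lower/upper mark the free gap between
-- the line-pointer array (growing down) and tuple data (growing up).
SELECT lower, upper, special, pagesize
FROM page_header(get_raw_page('accounts', 0));

-- Per-tuple metadata, including the visibility fields mentioned above:
SELECT lp, lp_off, lp_len, t_xmin, t_xmax, t_infomask
FROM heap_page_items(get_raw_page('accounts', 0));
```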
Read More

Fixing ORM Slowness by 80% with Strategic PostgreSQL Indexing

Modern applications heavily rely on ORMs (Object-Relational Mappers) for rapid development. While ORMs accelerate development, they often generate queries that are not fully optimized for database performance. In such environments, database engineers have limited control over query structure, leaving indexing and database tuning as the primary performance optimization tools.
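As an illustration of that idea, consider a hypothetical ORM-generated query over an `orders` table (both the schema and the index name below are made up for the example):

```sql
-- Typical ORM output: SELECT *, equality filters, a sort, a LIMIT.
SELECT * FROM orders
WHERE customer_id = 42 AND status = 'shipped'
ORDER BY created_at DESC
LIMIT 20;

-- A composite index matching the filter columns and the sort direction
-- lets PostgreSQL satisfy both the WHERE and the ORDER BY from the index:
CREATE INDEX CONCURRENTLY idx_orders_customer_status_created
    ON orders (customer_id, status, created_at DESC);
```

`CREATE INDEX CONCURRENTLY` avoids blocking writes while the index builds, which matters when you cannot change the queries themselves.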
Read More

PostgreSQL Materialized Views: When Caching Your Query Results Makes Sense (And When It Doesn’t)

Your dashboard queries are timing out at 30 seconds. Your BI tool is showing spinners. Your users are refreshing the page, wondering if something's broken. You've indexed everything. You've tuned shared_buffers. You've rewritten the query three times. The problem isn't bad SQL - it's that you're forcing PostgreSQL to aggregate, join, and scan millions of rows every single time someone opens that report.
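A minimal sketch of the caching approach, assuming a hypothetical `orders` table and a daily-sales report:

```sql
-- Precompute the expensive aggregation once:
CREATE MATERIALIZED VIEW daily_sales AS
SELECT order_date, SUM(amount) AS total_sales, COUNT(*) AS order_count
FROM orders
GROUP BY order_date;

-- A unique index is required for CONCURRENTLY, which lets the view be
-- refreshed without locking out readers:
CREATE UNIQUE INDEX ON daily_sales (order_date);
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales;
```

The trade-off is staleness: the view only reflects data as of the last refresh, which is exactly the "when it doesn't make sense" half of the question.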
Read More

Unlocking High-Performance PostgreSQL: Key Memory Optimizations

PostgreSQL can scale extremely well in production, but many deployments run on conservative defaults that are safe yet far from optimal. The crux of performance optimization is to understand what each setting really controls, how settings interact under concurrency, and how to verify impact with real metrics.

This guide walks through the two most important memory parameters:

- shared_buffers
- work_mem
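For orientation, here is a hedged sketch of adjusting both parameters with `ALTER SYSTEM`; the values are illustrative starting points only, not recommendations for your workload:

```sql
-- shared_buffers: a common rule of thumb is ~25% of RAM on a
-- dedicated database server. Changing it requires a restart.
ALTER SYSTEM SET shared_buffers = '4GB';

-- work_mem is allocated per sort/hash node, per query, per connection,
-- so size it conservatively under high concurrency. A reload suffices.
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();
```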
Read More

Unused Indexes In PostgreSQL: Risks, Detection, And Safe Removal

Indexes exist to speed up data access. They allow PostgreSQL to avoid full table scans, significantly reducing query execution time for read-heavy workloads. From real production experience, we have observed that well-designed, targeted indexes can improve query performance by 5× or more, especially on large transactional tables. However, indexes are not free. In this blog, we discuss the issues unused indexes can cause and how to remove them safely from production systems, with a rollback plan.
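A minimal detection-and-removal sketch (the index name in the `DROP` is hypothetical):

```sql
-- Indexes never scanned since statistics were last reset.
-- Check every replica too: idx_scan counters are per-instance.
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

-- CONCURRENTLY avoids blocking writes; the rollback plan is simply
-- re-creating the index (also CONCURRENTLY) if a regression appears.
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_legacy_status;
```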
Read More

PostgreSQL Column Limits

If you’ve ever had a deployment fail with “tables can have at most 1600 columns”, you already know this isn’t an academic limit. It shows up at the worst time: during a release, during a migration, or right when a customer escalation is already in flight. But here’s the more common reality: most teams never hit 1,600 columns; they hit the consequences of wide tables first.
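One way to spot trouble early is to count columns per table from the catalogs; the 200-column threshold below is an arbitrary early-warning value for illustration, not the hard cap:

```sql
-- Tables with unusually many columns. Note: dropped columns still
-- consume column slots, so remove the attisdropped filter to see the
-- count that the hard per-table limit actually applies to.
SELECT c.relname, COUNT(*) AS column_count
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
WHERE c.relkind = 'r' AND a.attnum > 0 AND NOT a.attisdropped
GROUP BY c.relname
HAVING COUNT(*) > 200
ORDER BY column_count DESC;
```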
Read More