PostgreSQL is widely adopted because it removes licensing constraints and gives companies like OpenAI, Lovable, and Supabase a reliable foundation for running production systems at scale. Once deployed, however, the cost conversation shifts away from licensing and toward how efficiently the database supports the workload it is running.
The initial optimizations, such as indexes, query design, and access paths, are often left unchanged as systems evolve. Over time, these outdated choices lead to inefficiencies, causing slowdowns and resource exhaustion. Instead of addressing the root cause, teams often scale infrastructure and add operational overhead. As a result, the system continues to function, but at a significantly higher cost than necessary. If you understand where the inefficiency is coming from, you can fix it properly instead of just throwing more resources at it.
In practice, this often becomes visible through signals such as instance sizes increasing over time, recurring performance issues, slower reporting as data grows, or more engineering time spent investigating database behaviour.
Query behaviour
Query execution is one of the primary drivers of resource usage in PostgreSQL. Each query determines how data is accessed, how much of it is read, and how much processing is required before a result is returned.
As workloads grow, even small inefficiencies in query design or indexing can lead to disproportionate increases in CPU and I/O usage. A query that reads more rows than required or follows a less efficient access path continues to do so at scale, which increases total database time and drives infrastructure demand.
Looking at cumulative execution time provides a clear view of where resources are being used. Queries that execute frequently, even if individually fast, often contribute the most to overall load. This is where optimization efforts tend to have the most impact.
What to do
- Rank queries by total database time to identify where CPU and I/O are being spent. This requires the pg_stat_statements extension, which is not enabled by default: it must be added to shared_preload_libraries and activated with a server restart.
- Review execution plans and buffer usage for the highest-impact queries.
- Focus on improving access paths and index coverage to reduce unnecessary reads before scaling infrastructure.
The goal is to reduce unnecessary work before adding more infrastructure.
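Ranking by total database time can be done directly against the pg_stat_statements view. A minimal sketch, assuming PostgreSQL 13 or later (where the column is total_exec_time rather than the older total_time):

```sql
-- Top 10 statements by cumulative execution time, with each
-- statement's share of total load across all tracked statements.
SELECT
    round(total_exec_time::numeric, 0) AS total_ms,
    calls,
    round(mean_exec_time::numeric, 2)  AS mean_ms,
    round((100 * total_exec_time /
           sum(total_exec_time) OVER ())::numeric, 1) AS pct_of_load,
    left(query, 80)                    AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

A statement with a high pct_of_load and a high call count but a small mean_ms is the frequent-but-individually-fast pattern described above: it rarely shows up in slow-query logs, yet it is often where optimization pays off most.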
Data and storage efficiency
PostgreSQL’s concurrency model retains previous versions of rows to ensure consistent reads across transactions. As data is updated over time, these older versions remain on disk until they are reclaimed.
In high-write environments, the rate of change can outpace cleanup if not tuned to match the workload. Tables grow beyond their logical size as dead rows accumulate, and indexes develop their own bloat through page splits and retained entries that are no longer reachable. Both increase storage usage and the amount of data each query must process, though they respond to different remediation strategies.
As a result, operations that interact with these tables require more I/O, take longer to complete, and contribute to a higher overall resource footprint. Backup size and restore time also increase for physical backups, which copy the full data directory including dead tuples. Logical backups using pg_dump are largely unaffected, as they export live data only.
What to do
- Track table growth and dead tuples as part of regular operations – a table growing faster than its live data is an early signal that cleanup is falling behind.
- Tune autovacuum behaviour for high-write tables so settings keep pace with change. The defaults are conservative and rarely sufficient for tables with sustained update or delete volume.
- Use partitioning or retention strategies where data growth needs to remain controlled.
This gives you early visibility into where cleanup is falling behind and where storage growth is no longer explained by live data alone.
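Both checks can be sketched in SQL. The first query surfaces tables where dead tuples are a large share of live rows; the second shows per-table autovacuum tuning, where the table name "orders" and the specific values are illustrative assumptions rather than recommendations:

```sql
-- Tables with notable dead-tuple accumulation, and when
-- autovacuum last ran on each (threshold is an arbitrary example).
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup, 0), 1) AS dead_pct,
       last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;

-- Per-table autovacuum tuning for a hypothetical high-write table.
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor  = 0.02,  -- default 0.2: vacuum sooner
    autovacuum_analyze_scale_factor = 0.01   -- keep planner statistics fresher
);
```

Lowering the scale factor per table, rather than globally, keeps vacuum pressure focused on the tables that actually churn.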
Connection patterns
PostgreSQL allocates a backend process for each connection. Each process reserves a baseline of memory even when idle, and that footprint grows significantly while queries are running.
As applications scale, connection counts often grow in line with application instances rather than actual query concurrency. The result is a large share of available memory dedicated to maintaining idle connections rather than serving the active workload.
When a significant portion of memory is used in this way, less is available for caching and query execution. This shifts more work to disk and increases both latency and I/O demand. Aligning connection count with actual workload concurrency allows the system to use memory more effectively and maintain stable performance as demand changes.
What to do
- Compare connection count with actual query concurrency to understand how much memory is being held by idle backends.
- Use a connection pooler to align backend connections with workload demand.
- Ensure memory is used for query execution and caching rather than idle sessions.
This allows memory to serve caching and query execution rather than idle backend processes, which improves both stability and cost efficiency.
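The comparison between connection count and query concurrency can start from a simple breakdown of backends by state. A sketch against pg_stat_activity (the backend_type filter assumes PostgreSQL 10 or later):

```sql
-- Client backends grouped by state: a large "idle" count relative
-- to "active" is the pattern described above.
SELECT state, count(*) AS connections
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state
ORDER BY connections DESC;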
Memory configuration
PostgreSQL is designed to start safely on a wide range of systems, which is why its default configuration is conservative. In production, this typically means available memory is underutilized, and more work ends up on disk than necessary.
Memory plays a central role in performance. It determines how much data can be served from cache, how efficiently operations such as sorting and hashing are performed, and how often the database relies on disk.
When configuration reflects the actual workload and hardware, more work is handled in memory and the need for disk access is reduced. This improves both performance and cost efficiency by lowering I/O demand and stabilizing query execution times.
What to do
- Adjust memory settings based on hardware and workload characteristics. The two settings with the most direct impact are shared_buffers, which controls how much memory PostgreSQL uses for its own cache, and work_mem, which sets the memory available per sort or hash operation before spilling to disk. Both have workload-dependent optimal values that differ significantly from their defaults.
- Validate buffer hit ratios to confirm that frequently accessed data is being served from cache rather than disk.
- Watch for temporary file activity. A consistent spill to disk during query execution is a direct signal that work_mem is too low for the operations being run.
- Treat I/O increases that are not explained by data growth as a signal that memory configuration may need revisiting.
This work should be grounded in workload behaviour as the objective is to reduce avoidable disk activity and make performance more consistent under load.
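The buffer hit ratio and temporary file checks can be run from one view. A sketch against pg_stat_database; the 99% figure is a commonly cited rule of thumb for hot working sets, not a hard threshold:

```sql
-- Cache hit ratio and disk-spill activity per database. Persistently
-- low hit ratios suggest shared_buffers is undersized; non-trivial
-- temp_files/temp_bytes suggest work_mem is too low for the workload.
SELECT datname,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2)
           AS cache_hit_pct,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_spill
FROM pg_stat_database
WHERE datname NOT LIKE 'template%'
ORDER BY blks_read DESC;
```

Counters here accumulate since the last statistics reset, so trends over time matter more than a single snapshot.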
Operational effort
Infrastructure usage is visible in metrics and billing. The effort required to maintain performance is reflected in how teams spend their time.
As systems grow, recurring investigation into performance behaviour can become part of regular operations. Queries are reviewed, execution plans are analyzed, and configuration is adjusted to maintain stability. With the right visibility, these activities remain controlled and predictable.
Monitoring that surfaces changes in workload, data growth, and resource usage allows teams to act early and keep systems aligned with demand. This reduces the effort required to maintain performance and keeps operational work focused on improvement rather than recovery.
What to do
- Establish visibility across query performance, table health, and system resource usage as a baseline.
- Treat recurring issues as structured work rather than isolated incidents.
- Act on early signals before they turn into production pressure.
The goal is to give engineers enough visibility to address issues while they are still small and keep operational time focused on improvement rather than interruption.
Infrastructure decisions
Infrastructure scaling is an important part of running PostgreSQL in production. As workloads grow, additional compute, storage, or replicas may be required to maintain performance.
The most effective scaling decisions are those that align closely with workload requirements. When infrastructure is added to support real demand, it improves capacity and stability. When it is added to absorb inefficiencies, it increases cost without addressing the underlying behaviour of the system.
A clear understanding of where resources are being used allows teams to scale with confidence and ensure that infrastructure reflects actual workload needs.
What to do
- Before increasing capacity, verify that resource usage is explained by workload volume rather than query inefficiency, bloat, or misconfiguration.
- Use query performance data, storage metrics, and I/O patterns to distinguish between demand-driven pressure and avoidable overhead.
- Where inefficiency is the driver, address it first. Infrastructure added on top of an inefficient system scales the problem alongside the capacity.
- When workload demand is genuinely the constraint, scale with confidence.
This keeps infrastructure aligned with actual workload needs and prevents cost from drifting upward without a matching improvement in performance.
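The verification step above can begin with a quick triage that checks the inefficiency channels from earlier sections before any instance resize. A sketch, with purely illustrative signals:

```sql
-- Quick triage before scaling: if any of these stand out, address
-- the inefficiency first rather than adding capacity on top of it.
SELECT 'dead tuples (total)'   AS signal, sum(n_dead_tup)::text AS value
FROM pg_stat_user_tables
UNION ALL
SELECT 'temp spill (total)', pg_size_pretty(sum(temp_bytes))
FROM pg_stat_database
UNION ALL
SELECT 'idle connections', count(*)::text
FROM pg_stat_activity
WHERE state = 'idle' AND backend_type = 'client backend';
```

If all three are unremarkable and query plans are already efficient, the remaining pressure is far more likely to be genuine workload demand.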
Conclusion
PostgreSQL provides a cost advantage by removing licensing constraints and allowing teams to focus investment on infrastructure and operations. It rarely becomes expensive because of scale limits. In most cases, cost increases come from inefficiencies that were never revisited as the system evolved or as the workload changed, often because they remain hidden without dedicated review.
Efficiency is determined by how well the database is aligned with the workload it serves. Query behaviour, data growth, connection patterns, and configuration all contribute to how efficiently the system runs. When these areas are actively managed, PostgreSQL continues to deliver stable performance with predictable cost as the systems scale.
FAQs
Does PostgreSQL performance cost money even though the license is free?
Yes. The license is free, but poor performance has direct infrastructure costs. Slow queries force instance upgrades. Table bloat inflates storage bills. Idle connections hold memory that pushes systems into larger instance tiers. Misconfigured memory drives unnecessary disk I/O charges. These costs are preventable through tuning, not through purchasing anything.
Is it cheaper to scale up the PostgreSQL instance or tune performance?
Tuning is almost always cheaper, but it requires expertise. Scaling up an instance is immediate and requires no diagnostic work, which is why it is the default response to performance pressure. However, scaling up an untuned database adds a recurring cost at every tier upgrade without addressing the root cause. Performance tuning resolves the root cause, and the extra spend is no longer needed. The investment is expertise time upfront; the return is ongoing infrastructure savings.

