Recently, we worked on a production PostgreSQL database where a customer reported that a specific SELECT query was performing extremely slowly. The issue was critical since this query was part of a daily business process that directly impacted their operations.
Identifying the Problem
During execution, we noticed that whenever this query ran alongside others, CPU usage spiked dramatically, sometimes reaching as high as 80%. That pointed to PostgreSQL doing a lot of extra work behind the scenes, most likely because of poor query planning and inefficient execution paths.
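A quick way to see which statements are driving that load while it happens is the pg_stat_activity view, which lists every session together with its current query and how long it has been running. The sketch below is a generic diagnostic query, not the exact one we used during this incident:

    -- List currently active statements, longest-running first.
    SELECT pid,
           now() - query_start AS runtime,
           state,
           left(query, 80)     AS query_preview
    FROM   pg_stat_activity
    WHERE  state = 'active'
    ORDER  BY runtime DESC;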
Understanding the Query
When we began debugging, the first step was to analyze the query itself. It turned out to be a large and highly complex query that joined more than six tables, referenced some of them multiple times, and spread those joins across several parts of the statement, making it even harder to follow.
Given its size and intricate structure, rewriting or restructuring the query wasn’t a practical option for optimization — there was a real risk of breaking the existing business logic.
Discovering the Root Cause
Upon further investigation, we found that the tables involved in the joins had never been analyzed. That meant PostgreSQL had no statistics at all about their data distribution, leading the query planner to choose inefficient execution plans.
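PostgreSQL records the last manual and automatic analyze time for every table in the pg_stat_user_tables view; a table that has never been analyzed shows NULL in both columns. A check along these lines makes that easy to spot (again a general-purpose query rather than the exact one from the incident):

    -- Tables with no recorded ANALYZE, manual or automatic.
    SELECT relname,
           last_analyze,
           last_autoanalyze
    FROM   pg_stat_user_tables
    WHERE  last_analyze IS NULL
      AND  last_autoanalyze IS NULL
    ORDER  BY relname;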
The Turning Point: Running ANALYZE
We manually ran the ANALYZE command on all tables involved in the query and then re-executed the same SELECT statement.
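The commands themselves are simple. The table names below are placeholders, since the customer's schema isn't reproduced here; ANALYZE without any arguments refreshes statistics for every table in the current database:

    -- Placeholder table names; the real query joined more than six tables.
    ANALYZE VERBOSE orders;
    ANALYZE VERBOSE order_items;
    ANALYZE VERBOSE customers;

    -- Or refresh statistics for every table in the current database:
    ANALYZE;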
To our surprise, the results were remarkable:
The query, which had previously failed to complete even after 10 minutes, now finished in under 20 seconds, an improvement of at least 96.7%.
The overall CPU utilization dropped from over 60% to under 10%, representing an 83% reduction in system load.
Why This Happened
The root cause was that PostgreSQL's query planner relies heavily on table statistics to choose the most efficient execution plan. Since the statistics were missing (the tables had never been analyzed), PostgreSQL made incorrect assumptions about data size and distribution. As a result, it settled on a poor plan and performed far more CPU-intensive work than necessary.
Once the tables were analyzed, PostgreSQL recalculated the statistics, allowing the planner to pick the most efficient path — leading to a drastic drop in both query time and CPU load.
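The statistics ANALYZE collects are visible in the pg_stats view, which makes it easy to confirm that the planner now has per-column data to work with. The table name here is again a placeholder:

    -- Per-column statistics gathered by ANALYZE (placeholder table name).
    SELECT attname,
           n_distinct,
           null_frac,
           most_common_vals
    FROM   pg_stats
    WHERE  tablename = 'orders';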
Key Takeaway
This real-world case reinforces a critical point:
Keeping your PostgreSQL statistics up to date is not optional: it's essential for performance and stability. If you notice queries running slower than expected or unusually high CPU usage, check when ANALYZE was last run.
PostgreSQL should run ANALYZE automatically as part of its autovacuum process; if that isn't happening, review your autovacuum settings or adjust the analyze thresholds so it runs as intended.
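The relevant settings can be inspected with SHOW, and busy tables can be given a lower per-table threshold so autoanalyze kicks in sooner. The values and table name below are purely illustrative, not a recommendation for every workload:

    -- Current analyze-related autovacuum settings.
    SHOW autovacuum;
    SHOW autovacuum_analyze_threshold;
    SHOW autovacuum_analyze_scale_factor;

    -- Example: trigger autoanalyze sooner on one large, frequently updated table.
    ALTER TABLE orders SET (autovacuum_analyze_scale_factor = 0.02);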
Final Thoughts
A single command — ANALYZE — turned a system from near-stalling to lightning fast. Regular maintenance, monitoring, and tuning of PostgreSQL statistics can save you hours of debugging and significantly improve performance.

