Performance optimization in a production database is crucial, but over-optimizing can add complexity without delivering real improvements.
In this blog post, I’ll share two very basic EXPLAIN ANALYZE outputs from a production system. A user asked us to help optimize these plans (I’ve changed the table and column names for privacy).
We will look at how to spot slow parts of a query, improve performance the right way, and avoid unnecessary tuning.
Plan A: Identifying and Resolving a Bottleneck
Execution Plan A (Before Optimization)
Nested Loop  (cost=1000.42..25607.38 rows=1 width=129) (actual time=78.521..90.445 rows=0 loops=1)
  ->  Gather  (cost=1000.00..25598.95 rows=1 width=65) (actual time=78.520..90.443 rows=0 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on <table_1> e  (cost=0.00..24598.85 rows=1 width=65) (actual time=75.351..75.351 rows=0 loops=3)
              Filter: (((<column_1>) = '<date_value>'::date) AND ((<column_2>)::text = '<event_type>'::text))
              Rows Removed by Filter: <number_removed_rows>
  ->  Index Scan using <index_name> on <table_2> a  (cost=0.42..8.43 rows=1 width=41) (never executed)
        Index Cond: ((<column_3>)::text = (<column_4>)::text)
        Filter: ((<column_5>)::text = '<default_value>'::text)
Planning Time: 0.466 ms
Execution Time: 90.580 ms
Analysis
Execution Time: The total execution time is 90.580 ms, which is relatively high for a query that ultimately returns zero rows.
Parallel Sequential Scan: The Parallel Seq Scan on <table_1> takes 75.351 ms, more than 80% of the total execution time, so this is the main focus area for optimization.
Index Scan Never Executed: The Index Scan on <table_2> was planned but never executed: the outer side of the nested loop returned zero rows, so there was nothing to probe. It is not contributing to the problem for now.
Optimization Approach
Create an Index on the Sequential Scan Column(s): The query filters on <column_1> and <column_2>, so adding an index on these columns can significantly improve performance.
CREATE INDEX idx_table1_composite ON <table_1>(<column_1>, <column_2>);
If <column_1> has low cardinality (few distinct values), a B-tree index might not help much, and a partitioning strategy could be considered.
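For illustration, here is a minimal sketch of range partitioning on <column_1>; the column types, partition name, and date boundaries are assumptions, not taken from the actual schema.
-- Hypothetical sketch: a range-partitioned version of <table_1>, keyed on <column_1>
CREATE TABLE <table_1>_partitioned (
    <column_1> date NOT NULL,
    <column_2> text
    -- ... remaining columns of <table_1>
) PARTITION BY RANGE (<column_1>);
-- One partition per month; the boundaries below are illustrative only
CREATE TABLE <table_1>_2024_01
    PARTITION OF <table_1>_partitioned
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
Queries filtering on <column_1> then only touch the relevant partition(s) instead of scanning the whole table.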
If queries almost always filter <column_2> on the same value, a partial index could be more efficient.
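For example, a sketch of such a partial index, assuming queries mostly look for the single '<event_type>' value seen in the plan above (the index name is made up):
-- Hypothetical partial index: only rows matching the common <column_2> value are indexed,
-- which keeps the index small while still covering the date filter
CREATE INDEX idx_table1_date_partial
    ON <table_1> (<column_1>)
    WHERE (<column_2>)::text = '<event_type>';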
Vacuum and Analyze: If an index already exists but isn't being used, running VACUUM ANALYZE might help PostgreSQL choose a better plan by refreshing the table statistics the planner relies on.
VACUUM ANALYZE <table_1>;
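After creating the index and refreshing statistics, re-run the statement under EXPLAIN ANALYZE to confirm the planner actually switches away from the sequential scan. The SELECT below is a guess at the original query shape, reconstructed from the filters visible in Plan A:
-- Hypothetical re-check; the WHERE clause mirrors the filters shown in Plan A
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM <table_1> e
WHERE <column_1> = '<date_value>'
  AND <column_2> = '<event_type>';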
Plan B: Avoiding Over-Optimization
Execution Plan B (Already Optimized)
Nested Loop  (cost=0.14..18.32 rows=1 width=642) (actual time=0.003..0.004 rows=0 loops=1)
  Join Filter: ((<column_1>)::text = (<column_2>)::text)
  ->  Index Scan using <index_name> on <table_1> e  (cost=0.14..8.15 rows=1 width=546) (actual time=0.003..0.003 rows=0 loops=1)
        Index Cond: (<column_3> = '<date_value>'::date)
  ->  Seq Scan on <table_2> ps  (cost=0.00..10.16 rows=1 width=318) (never executed)
        Filter: ((<column_4>)::text = '<default_value>'::text)
Planning Time: 0.424 ms
Execution Time: 0.032 ms
Analysis
Execution Time: The total execution time is 0.032 ms, which is extremely fast.
Index Scan is Efficient: The Index Scan on <table_1> takes 0.003 ms, which is negligible.
Unexecuted Sequential Scan: There is a Seq Scan on <table_2>, but it was never executed because the outer index scan returned zero rows.
Optimization Decision
No Need to Optimize Further: The execution time is already well under 1 ms, so any additional indexing or tuning wouldn't yield a noticeable improvement.
Avoid Creating Unnecessary Indexes: Adding an index on <table_2> would be pointless since the Seq Scan was never executed.
Monitor Query Performance Trends: Instead of optimizing blindly, observe whether performance degrades over time due to increasing data volume.
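One simple way to watch those trends is the pg_stat_statements extension (assuming it is installed; the column names below are for PostgreSQL 13 and later):
-- Ten statements with the highest average execution time
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;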
Key Takeaways
Identify the Right Bottleneck: Not all slow queries need optimization; focus on the operations consuming the most execution time.
Index Wisely: Adding indexes can help, but unnecessary indexes increase storage usage and slow down writes (a quick check for unused indexes follows this list).
Use the Right Index Type: Composite, BRIN, and partial indexes can be more effective than a plain single-column B-tree index in certain scenarios.
Avoid Over-Optimization: If execution time is already sub-millisecond, further optimization might not be needed.
Regular Maintenance is Crucial: VACUUM, ANALYZE, and EXPLAIN ANALYZE should be part of your routine database health checks.
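As a quick check for the "Index Wisely" point above, pg_stat_user_indexes shows indexes that have never been scanned since the statistics were last reset; treat a zero count as a prompt to investigate, not as proof the index is useless.
-- Indexes with no recorded scans since the last statistics reset
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname, indexrelname;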
By following these principles, you can optimize queries effectively without introducing unnecessary complexity. Always let the EXPLAIN ANALYZE output guide your decisions rather than making assumptions about performance issues.