Understanding PostgreSQL Fill Factor: Benefits, Drawbacks, and Best Practices

One of the optimization strategies in PostgreSQL involves configuring the fill factor for tables appropriately. In this article, we’ll explore the concept of fill factor, including its definition, default settings, methods for adjustment, advantages, disadvantages, recommended approaches along with benchmarking using pgbench utility.

What is Fill Factor?

In PostgreSQL, the fill factor decides how much of a data page can be filled with rows before it’s considered full. It determines the amount of space that should be left free on each page for future updates.

The fill factor ranges from 10 to 100, with 100 being the default for packing as many rows as possible. When a lower fill factor is specified, INSERT operations leave some space on each page for updates, improving efficiency by allowing updated rows to remain on the same page. However, for tables with few updates, keeping the fill factor at 100 is best. Remember, adjusting the fill factor can create space for updates, but setting it too low may waste space and increase disk usage for scanning tables.

Default Fill Factor in Indexes and Tables

By default, tables in PostgreSQL have a fill factor of 100%, meaning PostgreSQL attempts to fill each page with as many tuples as possible, leaving no free space.

However, indexes in PostgreSQL utilize a default fill factor of 90%. This means that during the initial index build and when extending the index at the right (with the largest key values), leaf pages are filled up to 90% capacity. If pages become completely full afterward, they will split, which can result in degradation of the index performance..

How to set the Fill Factor?

The following commands allow you to manage the fill factor for tables and indexes in PostgreSQL.

To specify the fill factor during table creation, you can use the following syntax:

CREATE TABLE test(

id INT,

name TEXT

) WITH (fillfactor = 70);

For an existing table, you can adjust the fill factor using the ALTER TABLE statement:

ALTER TABLE test SET (fillfactor = 70);

Make sure to execute VACUUM FULL to rewrite the table and apply the fill factor on existing data:

VACUUM FULL test;

For indexes, you can set the fill factor with the ALTER INDEX statement:

ALTER INDEX idx_test SET (fillfactor = 70);

Then, you need to reindex the index concurrently:

REINDEX INDEX CONCURRENTLY idx_test;

Benefits of Setting Fill Factor

Setting the fill factor for tables in PostgreSQL provides several advantages. Firstly, when existing rows are updated, it reduces the need for page splits on indexes by leaving free space on each page. This also helps in minimizing index fragmentation.

Additionally, in PostgreSQL, a lower fill factor may increase the occurrence of HOT (Heap-Only Tuple) updates. This method optimizes performance by storing updated data on the same page as the original row, potentially eliminating the need for extra index updates and table bloat. Hot updates also lead to fewer WAL (Write-Ahead Logging) files being generated.

Further Reading: PostgreSQL Internals Part 2: Understanding Page Structure



Disadvantages of Setting Inappropriate Fill Factor
  • When a lower fill factor is set, it results in the creation of more pages within the database. Consequently, during table scans, the process tends to slow down compared to scans conducted on tables with a higher fill factor. This is because the database system needs to traverse through a larger number of pages to retrieve the required data, impacting the overall scan performance negatively.
  • Assigning more space per page for future updates can lead to higher disk usage, especially for large databases with limited storage capacity.
  • Choosing a lower fill factor may also result in space wastage if the table does not require frequent updates.

Best Practices Related to Fill Factor

  • Determine the appropriate fill factor for your tables and indexes by evaluating your workload and data access patterns. Take into account the UPDATEs pattern on the table and how much space a block might need for a new version of tuples.
  • Monitor key performance metrics of your database, including disk I/O and the ratio of hot updates. Keep a close eye on the n_tup_hot_upd and n_tup_dead column in the pg_stat_all_tables view to track the occurrence of HOT updates.

Example

We utilized the pgbench utility to create tables with fill factors of 100 and 70 in order to evaluate the table size, hot tuple updates and fragmentation.

- Pgbench scale: 1000 (equivalent to 100 million rows)

– Pgbench use case: simple update

– Clients: 8

Commands:

- For initialization:

/usr/lib/postgresql/16/bin/pgbench -i -s 1000
/usr/lib/postgresql/16/bin/pgbench -i -s 1000 -F 70

– For benchmark:

/usr/lib/postgresql/16/bin/pgbench -b simple-update -c 8 -j 8 -T 600

Results:

Tablename Size after InsertionHot updatesdead_tuples
pgbench_accounts(fillfactor = 100)13GB8199351809733
pgbench_accounts(fillfactor = 70)18GB17538711031998

Observations

  • The table with a fill factor of 70 is approximately 5GB larger than the table with a fill factor of 100.
  • However, the hot update ratio for the table with a fill factor of 70 is nearly twice that of the table with a fill factor of 100, while the dead tuple ratio is almost halved.

Concluding Thoughts

Optimizing the fill factor is important for fine-tuning PostgreSQL performance. By grasping the concepts of how fill factor affects storage, fragmentation, and write performance, you can tune your PostgreSQL database to suit your workload effectively with proper experimentation and monitoring.



Continue reading:
Learn how Whop made a seamless database infrastructure migration from Heroku to AWS with the help of team Stormatics – leading to a strong foundation for high availability and effective disaster management.

Comments (4)

  • Fabrice Chapuis

    July 8, 2024 - 1:44 pm

    Ok, that’s right, after an update, no “old row” are added in the index. Thanks for your clarification.

  • Fabrice

    July 5, 2024 - 6:36 pm

    Hello, Thanks for this great article. I have a question about what is mentioned under “Benefits of Setting Fill Factor”
    “This method optimizes performance by storing updated data on the same page as the original row, potentially eliminating the need for extra index updates and table bloat”
    why “eliminating table bloat” if fillfactor is less than 100%, hot update will be more frequent and old version of row will point on the new one. The row with old version would not be vacuumed, then page bloat will increase?
    What is your opinion on this point?

    • Umair Shahid

      July 5, 2024 - 7:09 pm

      Hello Fabrice, and thank you for raising the point for clarification.

      The way HOT (Heap-only Tuples) updates work is that PostgreSQL simply creates a pointer from the old row to the new row in the index, hence the need to update the index is eliminated. Because the index is not updated, there is no ‘old version’ that requires to be vacuumed. Please note that this comment is specific to the space taken up by the index. It is not applicable to entire data rows.

      I hope this clarifies your concern. You can also read more about Heap-only Tuples (HOT) here: https://stormatics.tech/umairs-planet-postgresql/improving-update-query-performance-using-heap-only-tuples-hot

Leave A Comment