Stormatics
(678) 345-3456
380 Albert St, Melbourne, Australia

Blog Details

Understanding PostgreSQL Fill Factor: Benefits, Drawbacks, and Best Practices

One of the optimization strategies in PostgreSQL involves configuring the fill factor for tables appropriately. In this article, we’ll explore the concept of fill factor, including its definition, default settings, methods for adjustment, advantages, disadvantages, recommended approaches along with benchmarking using pgbench utility.

What is Fill Factor?

In PostgreSQL, the fill factor decides how much of a data page can be filled with rows before it’s considered full. It determines the amount of space that should be left free on each page for future updates.

The fill factor ranges from 10 to 100, with 100 being the default for packing as many rows as possible. When a lower fill factor is specified, INSERT operations leave some space on each page for updates, improving efficiency by allowing updated rows to remain on the same page. However, for tables with few updates, keeping the fill factor at 100 is best. Remember, adjusting the fill factor can create space for updates, but setting it too low may waste space and increase disk usage for scanning tables.

Default Fill Factor in Indexes and Tables

By default, tables in PostgreSQL have a fill factor of 100%, meaning PostgreSQL attempts to fill each page with as many tuples as possible, leaving no free space.

However, indexes in PostgreSQL utilize a default fill factor of 90%. This means that during the initial index build and when extending the index at the right (with the largest key values), leaf pages are filled up to 90% capacity. If pages become completely full afterward, they will split, which can result in degradation of the index performance..

How to set the Fill Factor?

The following commands allow you to manage the fill factor for tables and indexes in PostgreSQL.

To specify the fill factor during table creation, you can use the following syntax:

CREATE TABLE test(

id INT,

name TEXT

) WITH (fillfactor = 70);

For an existing table, you can adjust the fill factor using the ALTER TABLE statement:

ALTER TABLE test SET (fillfactor = 70);

Make sure to execute VACUUM FULL to rewrite the table and apply the fill factor on existing data:

VACUUM FULL test;

For indexes, you can set the fill factor with the ALTER INDEX statement:

ALTER INDEX idx_test SET (fillfactor = 70);

Then, you need to reindex the index concurrently:

REINDEX INDEX CONCURRENTLY idx_test;

Benefits of Setting Fill Factor

Setting the fill factor for tables in PostgreSQL provides several advantages. Firstly, when existing rows are updated, it reduces the need for page splits on indexes by leaving free space on each page. This also helps in minimizing index fragmentation.

Additionally, in PostgreSQL, a lower fill factor may increase the occurrence of HOT (Heap-Only Tuple) updates. This method optimizes performance by storing updated data on the same page as the original row, potentially eliminating the need for extra index updates and table bloat. Hot updates also lead to fewer WAL (Write-Ahead Logging) files being generated.

Further Reading: PostgreSQL Internals Part 2: Understanding Page Structure



Disadvantages of Setting Inappropriate Fill Factor
  • When a lower fill factor is set, it results in the creation of more pages within the database. Consequently, during table scans, the process tends to slow down compared to scans conducted on tables with a higher fill factor. This is because the database system needs to traverse through a larger number of pages to retrieve the required data, impacting the overall scan performance negatively.
  • Assigning more space per page for future updates can lead to higher disk usage, especially for large databases with limited storage capacity.
  • Choosing a lower fill factor may also result in space wastage if the table does not require frequent updates.

Best Practices Related to Fill Factor

  • Determine the appropriate fill factor for your tables and indexes by evaluating your workload and data access patterns. Take into account the UPDATEs pattern on the table and how much space a block might need for a new version of tuples.
  • Monitor key performance metrics of your database, including disk I/O and the ratio of hot updates. Keep a close eye on the n_tup_hot_upd and n_tup_dead column in the pg_stat_all_tables view to track the occurrence of HOT updates.

Example

We utilized the pgbench utility to create tables with fill factors of 100 and 70 in order to evaluate the table size, hot tuple updates and fragmentation.

- Pgbench scale: 1000 (equivalent to 100 million rows)

– Pgbench use case: simple update

– Clients: 8

Commands:

- For initialization:

/usr/lib/postgresql/16/bin/pgbench -i -s 1000
/usr/lib/postgresql/16/bin/pgbench -i -s 1000 -F 70

– For benchmark:

/usr/lib/postgresql/16/bin/pgbench -b simple-update -c 8 -j 8 -T 600

Results:

Tablename  Size after Insertion Hot updates dead_tuples
pgbench_accounts(fillfactor = 100) 13GB 819935 1809733
pgbench_accounts(fillfactor = 70) 18GB 1753871 1031998

Observations

  • The table with a fill factor of 70 is approximately 5GB larger than the table with a fill factor of 100.
  • However, the hot update ratio for the table with a fill factor of 70 is nearly twice that of the table with a fill factor of 100, while the dead tuple ratio is almost halved.

Concluding Thoughts

Optimizing the fill factor is important for fine-tuning PostgreSQL performance. By grasping the concepts of how fill factor affects storage, fragmentation, and write performance, you can tune your PostgreSQL database to suit your workload effectively with proper experimentation and monitoring.



Continue reading:
Learn how Whop made a seamless database infrastructure migration from Heroku to AWS with the help of team Stormatics – leading to a strong foundation for high availability and effective disaster management.

Leave A Comment