Big Data and PostgreSQL – Scaling out by partitioning tables

Database Concurrency: Two phase Locking (2PL) to MVCC – Part 2

In the first part of our blog series titled “Database Concurrency: Two phase Locking (2PL) to MVCC“, we explored the fundamentals of concurrency, its benefits, common anomalies, and the concept of Two-Phase Locking (2PL). 

In part 2, we’re diving deeper into Multi-Version Concurrency Control (MVCC) – a pivotal aspect of PostgreSQL’s approach to handling transactions. This piece will uncover how PostgreSQL employs MVCC across its various transaction isolation levels: Read Committed, Repeatable Read, and Serializable.

What is MVCC in PostgreSQL?

Multi-Version Concurrency Control (MVCC) is used in PostgreSQL to manage transaction isolation and concurrency effectively. This method keeps various versions of rows inside the table, to reduce blockages due to concurrent access. However, as these versions accumulate, a cleanup process known as VACUUM is necessary to keep the tables in order. It allows multiple transactions to access the same data concurrently without conflicts, while still preserving transaction isolation.

Here’s a comparison between MVCC and 2PL

Differences between 2PL and MVCC

Aspect Multi-Version Concurrency Control (MVCC) Two-Phase Locking (2PL)
Locking Mechanism Uses multiple versions of data to manage concurrent access without locking. Uses locks on data items to manage access, blocking other transactions until the lock is released.
Performance in High-Concurrency Typically performs better as it avoids locking, reducing contention. May suffer from lock contention, reducing performance under high concurrency.
Handling Read Operations Read operations do not usually block writes, as each transaction sees a snapshot of data. Read operations require shared locks, which can block other transactions.
Write Operations Generates new versions of data, allowing concurrent transactions to proceed without waiting. Holds locks until the work is completed(check 2PL types), potentially delaying concurrent transactions.
Phases of Operation Continuously creates new data versions as transactions proceed, without distinct phases. Operates in two phases – acquiring locks and then releasing them.
Isolation Level Management Managed by controlling which data version is visible to a transaction using snapshot,commit logs, hintbits and other visibility rules. Managed through the granularity and duration of locks.
Risk of Deadlocks Lower risk as it avoids locking data items. Higher risk due to the locking of data items.
Overhead Overhead from maintaining multiple versions of data and periodic cleanup processes like vacuuming. Significant resource usage for managing locks, particularly in highly concurrent databases. However, since it doesn’t create new rows, there’s no requirement for additional cleanup. 

How MVCC Works in PostgreSQL?

Transaction ID

In PostgreSQL, each transaction is given a unique Transaction ID (XID) by its transaction manager. These XIDs are essential for monitoring transactions and ensuring both the visibility of data and the consistency of the database.

To find the current transaction ID, we can use the following SQL command:

SELECT txid_current();id_current();

Additionally, PostgreSQL records transaction information in every row of each table. This data helps determine if a row is visible to a particular transaction. 

  • xmin field contains the XID of the transaction that inserted the row, 
  • xmax holds the XID of the transaction that either deleted or updated the row. If a row hasn’t been deleted or updated, xmax is set to 0, indicating it is INVALID.

To understand other system columns in depth, please go through the following link:

https://andreas.scherbaum.la/blog/archives/1117-ctid-and-other-PostgreSQL-table-internals.html

We can also check these values by specifying xmin and xmax in the select query.

select xmin,xmax,id from a order by id;

   xmin | xmax | id

-----------+------+-----

 679276257 | 0 |   1

 679276257 | 0 |   2

 679276257 | 0 |   3

Snapshot in PostgreSQL

PostgreSQL utilizes a form of Multi-Version Concurrency Control known as Snapshot Isolation (SI). In this approach, each transaction operates on a unique snapshot of data, which contains a view of committed data before the snapshot was created.

The core concept of multiversion concurrency is that multiple versions of the same row can coexist inside a table. The snapshot acquired by a transaction is crucial for determining which tuples (rows) are visible. Each transaction is designed to see only a single version, or none, of each row, thereby creating a consistent view of the data in line with the ACID principles at a specific point in time.

To view the current transaction’s snapshot, PostgreSQL provides the pg_current_snapshot function. 

SELECT pg_current_snapshot();

 pg_current_snapshot 

---------------------

 100:107:100,102,105

(1 row)

In this representation, denoted as ‘xmin:xmax:xip_list’, the components signify:

  • xmin: The lower boundary of the snapshot, which is represented by the id of the oldest active transaction. Transactions IDs lower than xmin are either committed(thus their changes are part of the snapshot) or aborted (so their changes are excluded).
  • xmax: The upper boundary of the snapshot, represented by the first unassigned transaction ID. This value is one greater than the transaction ID of the most recently committed transaction. Transaction IDs equal to or greater than xmax are either in progress or non-existent, meaning their changes are not yet visible.
  • xip_list (list of active transaction IDs at the time of the snapshot): This list includes only the active transaction IDs that fall between xmin and xmax.

For example, in the snapshot ‘100:107:100,102,105‘, We can say that:

  • xmin is ‘100’ which is the oldest active transaction id.
  • Xip_list is ‘100,102,105’ which shows currently in progress transaction ids and their changes will be invisible. Transactions 101,103,104,106 are committed or aborted and their changes are visible.  
  • xmax is 107 which shows the upper boundary.

What is the difference between snapshot xmin and table header xmin?

While both instances of xmin relate to transaction IDs and data visibility, the xmin in pg_current_snapshot defines the lower boundary of a transaction snapshot for visibility checks across the database, and the xmin in a table’s tuple header identifies the transaction that created that specific tuple, playing a crucial role in row-level visibility and version control.

How snapshots work with isolation levels?

  • In the Read Committed mode, a new snapshot is obtained every time a query is executed during a transaction. Consequently, if changes are committed to the database by another transaction while the first transaction is in progress, any queries that follow in the initial transaction will see these new changes. This prevents dirty reads but still prone to Non-repeatable reads and phantom reads.
  • In the Repeatable Read mode, PostgreSQL takes a snapshot at the beginning of the transaction, which is then applied to all queries executed within that transaction. This guarantees that every read operation within the transaction will observe the database in the state it was at the start of the transaction, unaffected by concurrent modifications made by other transactions. 
  • In the Serializable mode, which represents the most strict level of isolation, PostgreSQL employs a sophisticated approach known as Serializable Snapshot Isolation (SSI). This method involves actively detecting potential serialization conflicts and, if necessary, aborting transactions to ensure data consistency. This process emulates a scenario where transactions are executed one after another in a sequential order.

What happens when two transactions are updating the same row?

A lost update happens when two concurrent transactions read the same row and then both try to update it. In such a case, the update made by one transaction might be overwritten by the other, resulting in the first update being lost. 

PostgreSQL implements row-level locking to handle this situation. When a transaction initiates an update on a row, PostgreSQL locks that specific row. If another transaction attempts to update the same row while it’s locked, it must wait until the initial transaction either commit or rollback. PostgreSQL uses the first-updater-win scheme to avoid lost update anomalies.

Behavior in Different Isolation Levels

  • Read Committed
    • If T1 updates a row, T2 cannot see this change until T1 commits.
    • If T2 tries to update the same row, it will wait for T1 to commit or rollback. If Transaction A commits, Transaction B will see the committed data and can update it.
  • Repeatable Reads
    • If T1 updates a row, T2 won’t see this change if it has already begun.
    • If T2 tries to update the same row, it will be blocked until T1 commits or rolls back. If T1 commits, T2 attempts to update the row will fail with a serialization error like this
      ERROR:  could not serialize access due to read/write dependencies among transactions
  • Serializable
    • Both T1 and T2 would see a consistent snapshot of the database as it was at the start of their transactions.
    • If both try to update the same row, PostgreSQL’s SSI mechanism detects potential serialization anomalies. If both transactions try to commit, the other will be checked for serialization conflict.
    •  If conflict occurs, only the first-committed transaction is committed and the other transactions are aborted (by the first-committer-win scheme) with same serialization error

DML Operations Versioning

Inserts

In PostgreSQL, when inserting a tuple, the Free Space Map (FSM) of the respective table or index is checked to find a suitable page for insertion. The FSM keeps track of the available space on each page within the table file. The inserted tuple is assigned an xmin value corresponding to the transaction ID of the inserting transaction. Initially, xmax is set to 0 for this tuple, indicating that it hasn’t yet been deleted or updated.

insert into table values(999);

select xmin,xmax,id from a order by id;

   xmin | xmax | id

-----------+------+-----

 679276257 | 0 |   1

 679276257 | 0 |   2

 679276257 | 0 |   3

 680463046 | 0 | 999 => New row

Deletes

Upon deleting a row, PostgreSQL doesn’t immediately remove it from the disk. Instead, the row is flagged for deletion. This is accomplished by assigning the tuple’s xmax field the transaction ID of the current transaction that is performing the deletion. Here, xmax reflects the ID of the first transaction that deleted the tuple. 

T1
Begin;
– Current txid = 680463328

Delete from a where id = 999;

—------------------
We are not committing T1 for now. Let’s open a new session and check this value.

T2
select xmin,xmax,id from a order by id;

   xmin |   xmax | id

-----------+-----------+-----

 679276257 |      0 |   1

 679276257 |      0 |   2

 679276257 |      0 |   3

 680463319 | 680463328 | 999

Once this transaction(T1) is committed, the row becomes obsolete, invisible to other running transactions(Based on their isolation level) and is classified as a dead row.

Updates:

When a row is updated in PostgreSQL, the system doesn’t simply overwrite the existing data. Instead, PostgreSQL executes an update as a combination of a delete and an insert operation. The original row version is marked for deletion (its xmax is updated to the current transaction ID), and concurrently, a new version of the row is created (with its xmin set to the current transaction ID). 

This method allows other transactions to continue accessing the old versions based on their snapshots. This strategy is crucial for ensuring that write operations do not block read operations and vice versa.

T1
Update a set id = 64 where id = 2;
– Current txid = 680463328

To explore the details mentioned, you can refer to the pageinspect module which shows the raw contents of a table and its hidden columns. For more information, please visit the provided link:

https://www.cybertec-postgresql.com/en/whats-in-an-xmax/
select xmin,xmax,id from a;

   xmin |       xmax    | id

-----------+-----------+-----

 679276257 |      0     |   1

 679276257 | 680463328 |   2

 679276257 |      0     |   3

 680463328|      0    |   64

In this scenario, the xmax value has been assigned the transaction ID (680463328) for the tuple with id = 2, indicating that this particular tuple was deleted by the transaction with the specified ID. Subsequently, a new row has been created, and its xmin value is set to the Transaction ID (TXID) with a value of 64.

Transaction Commit Log 

In PostgreSQL, the clog stands for Commit Log which is a fundamental component in the system’s Multi-Version Concurrency Control (MVCC) mechanism. It tracks transaction statuses, playing a key role in implementing MVCC by assisting in the visibility of tuples across various transactions.

The clog records the state of every transaction in the database, with potential statuses being “in progress”, “committed”, “sub-committed”, or “aborted”. While the meanings of most of these statuses are self-explanatory, the term “sub-committed” specifically refers to subtransactions within a larger transaction process.

Hint Bits in PostgreSQL

Hint bits in PostgreSQL are an optimization technique used to optimize the process of determining the visibility of tuples. They are metadata attached to table tuples and are instrumental in enhancing the efficiency of visibility checks.

Setting of Hint Bits

  • When a tuple is accessed for the first time after being modified by a transaction, PostgreSQL needs to check the commit log (clog) to confirm the transaction’s status – whether it’s committed or aborted.
  • If it is determined that the transaction has been committed, PostgreSQL marks the tuple with a hint bit that signifies it was inserted by a committed transaction. Conversely, if the transaction was aborted, a different hint bit is set on the tuple.

Row visibility

  • During the process of visibility checks, PostgreSQL initially examines these hint bits.
  • If the hint bits reveal that the tuple was created by a transaction that has since been committed, the tuple is considered visible without needing to access the clog..
  • This approach is especially advantageous for tuples that are accessed repeatedly, as it substantially decreases the frequency of clog lookups, thereby improving system performance.

Visibility Rules

Visibility check rules in PostgreSQL are a set of criteria used to determine the visibility or invisibility of each tuple. These rules utilize the xmin and xmax values of the tuple, the commit log (clog), and the transaction snapshot obtained. These rules, while complex due to their coverage of various scenarios and edge cases, can be broadly outlined as follows:

  • Changes made by transactions are visible in a snapshot if those transactions were committed prior to the creation of the snapshot.
  • Transactions have the ability to see their own changes even if they are uncommitted. However, if a transaction is aborted, its changes will not be visible in any snapshot.
  • A tuple is considered visible in a snapshot if the snapshot includes the changes of the committed transaction indicated by xmin, and if xmax is invalid (set to 0), indicating that the tuple has been inserted but not deleted.
  • A tuple is visible in a snapshot if its xmax is not equal to the current transaction ID and is in progress (uncommitted).

For a more detailed understanding, including examples, you can refer to this comprehensive guide: PostgreSQL Row Visibility Rules.

These rules collectively help in determining the visibility of tuples based on their transactional statuses. They are fundamental for maintaining database consistency and integrity in the context of transactional operations.

MVCC Drawbacks

Here are some of the key disadvantages:

  • Increased Disk Space Usage: MVCC works by maintaining multiple versions of data rows. This leads to more disk space being consumed, as older versions of rows are retained until they can be removed safely.
  • Vacuum Overhead: In PostgreSQL, the MVCC system requires periodic vacuuming to remove old, unneeded row versions. This vacuuming can be resource-intensive and might impact performance if not appropriately scheduled and managed.
  • Transaction ID Wraparound Risk: PostgreSQL uses a limited set of transaction IDs (XIDs). Reaching the upper limit of available XIDs necessitates a resource-heavy wraparound process. Although regular maintenance can mitigate this issue, it still poses a potential risk.
  • Challenges with Long-Running Transactions: Transactions that run for extended periods can prevent the removal of outdated row versions. This can lead to increased storage use, known as table or index bloat, which may in turn affect the overall performance of the database.
  • Increased Complexity in Query Planning: With multiple versions of rows present, PostgreSQL’s query planner has to work harder to optimize queries. This increased complexity can sometimes lead to less efficient query plans due to incorrect row estimations. 

Vacuum Process

In PostgreSQL, old row versions that are no longer visible to any transactions are periodically purged by the vacuum process. This cleanup is vital for several reasons:

  • The vacuum process reclaims space occupied by these obsolete row versions, freeing up storage for new data.
  • Regular vacuuming is crucial to avoid the issue of transaction ID wraparound. Since PostgreSQL uses a finite number of transaction IDs, vacuuming ensures that old IDs are recycled, thereby preventing potential data integrity issues.

For a more detailed exploration of the vacuum process in PostgreSQL, including its functionalities, types and significance, you can visit the following resource: Understanding Vacuum in PostgreSQL.

Comments (2)

  • Abraham Dombroski

    February 26, 2024 - 1:19 am

    Certainly completely with your conclusions and think that you’ve made some excellent points. Also, I like design of your respective site plus the easier navigation. I’ve bookmarked your website and may return often!

  • George Gettinger

    February 25, 2024 - 5:55 pm

    Nice site, nice and easy on the eyes and great content too. How long have you been blogging for?

Leave A Comment