Readers don’t block writers, and writers don’t block readers

Readers don’t block writers, and writers don’t block readers
Photo by Alain Bonnardeaux / Unsplash
💡
This blog post will explain how Multi-Version Concurrency Control (MVCC) works in PostgreSQL to manage concurrent transactions.

Multi-Version Concurrency Control (MVCC) is a concurrency control technique that allows multiple transactions to read and write data at the same time by keeping different versions of the data. Unlike lock-based techniques (like Two-Phase Locking), which use locks to prevent concurrent data access, MVCC allows transactions to operate on different versions of the data without blocking each other.

MVCC also solves issues such as preventing dirty reads and minimizing lock contention and deadlocks, which enhances concurrency and performance. By tackling these challenges, MVCC ensures that data stays consistent, isolated, and accurate.

Let's see how PostgreSQL implements MVCC.

PostgreSQL's MVCC relies on four internal fields (columns) — txid, xmin, xmax, and ctid—to handle concurrency, enabling transactions to operate on various data versions without conflicts.

  • txid - is a unique identifier (transaction ID) that PostgreSQL assigns when a transaction starts
  • xmin - stores the txid of the transaction that inserted the row
  • xmax - stores the txid of the transaction that deleted the row
  • ctid - is the physical location of the row within its table

Let's explore how these fields are used to control the visibility of rows during insert, delete, and update operations. For this example, we’ll use an E-Commerce database, specifically focusing on the Products table.

Let's start by creating a database:

CREATE DATABASE eCommerce;

Create a table Products:

CREATE TABLE PRODUCTS (ID SERIAL, NAME VARCHAR, PRICE DECIMAL)

Load the pageinspect extension into the current database. This extension allows you to inspect the contents of the database pages:

CREATE EXTENSION pageinspect;

Insert Scenario

A transaction inserts a new row into the Products table

xmin is assigned a transaction ID of 1364.
xmax is set to 0 (the row has not been deleted).
ctid is set to point to the physical storage location.
A separate transaction reads the row
The second transaction can see the row since this transaction's ID is a later version compared to the transaction ID of the insert transaction (1364).xmax is set to 0 (the row has not been deleted).
🔔
Until the transaction is committed, xmin will reference the transaction that created or modified the row. However, the row remains invisible to other transactions until the commit occurs.

Delete Scenario

A transaction inserts a new row into the Products table

xmin is assigned a transaction ID of 1366.
xmax is set to 0 (the row has not been deleted).
A separate transaction deletes a row from the Products table

xmin remains unchanged (1366).
xmax is set to the transaction ID of the transaction that invoked the delete operation (1367).
ctid remains unchanged.

The row becomes invisible to any transaction that begins after this transaction was committed.

Update Scenario

A transaction inserts a new row into the Products table

xmin is assigned a transaction ID of 1368.
xmax is set to 0 (the row has not been deleted).
ctid is set to point to the physical storage location.
A separate transaction updates a row in the Products table

Phase 1 (mark the row as deleted):
xmin remains unchanged (1368).
xmax is set to the transaction ID of the transaction that invoked the delete operation (1369).
ctid is set to point to the new location of the updated row.

Phase 2 (create a new row):
xmin is assigned a transaction ID of 1369.
xmax is set to 0 (the row has not been deleted).
ctid is set to point to the new location of the updated row.

Pros and Cons of MVCC

The advantages of MVCC in PostgreSQL are enhanced concurrency, non-blocking reads, and robust transaction isolation. However, its drawbacks include storage overhead from row versioning, the necessity for frequent vacuuming to avoid performance issues, and potential slowdowns in write-heavy environments. MVCC is well-suited for read-heavy applications but demands careful attention to vacuuming and storage management.

References