
The Future of PostgreSQL: How a 64-bit Transaction Counter Solves Scaling Issues

Original author: @Loxmatiymamont

What Is the Transaction Counter? A Brief Overview of MVCC in PostgreSQL

One of the standout features of PostgreSQL is how it handles concurrent access to data by multiple users. For example, read transactions don’t block write transactions, which is an obvious benefit. But even more interestingly, write transactions don’t block read transactions. This is all made possible by the Multi-Version Concurrency Control (MVCC) mechanism. In simple terms, MVCC allows each transaction to work with its own version of the data. At any given time, the database can have multiple versions of the same data, with visibility rules determining which version a given transaction sees.

For MVCC to work, PostgreSQL needs to track which transactions are still active and which have finished. To do this, each transaction is assigned a unique identifier, called a transaction ID (xid), which you can retrieve with txid_current() (or pg_current_xact_id() on PostgreSQL 13 and later). The transaction counter is responsible for handing out these IDs.
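
You can watch the counter advance from psql; a minimal illustration (the values returned will of course differ):

    -- In autocommit mode each statement is its own transaction,
    -- so consecutive calls return increasing IDs
    SELECT txid_current();  -- e.g. 745
    SELECT txid_current();  -- e.g. 746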

It’s important to note that we don’t use terms like ‘larger’ or ‘smaller’ when talking about transaction IDs. Instead, we use ‘older’ or ‘younger’. This is because transaction IDs are compared modulo 2^32, meaning they form a circular structure. Transactions that are behind the current one by up to 2^31 (roughly 2 billion) are considered ‘older’ (in the past), and those ahead by up to the same amount are considered ‘younger’ (in the future).

Figure: graphical interpretation of the circular transaction ID space

Additionally, each record (tuple) has the system fields xmin and xmax, which hold the ID of the transaction that inserted that row version (xmin) and of the one that deleted it (xmax). There are other fields as well, such as the record’s flag bits, which we’ll discuss later.
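
These fields can be inspected on any table; a minimal illustration (the table t is just an example, and the IDs you see will differ):

    CREATE TABLE t (id int);
    INSERT INTO t VALUES (1);

    -- xmin holds the inserting transaction's ID; xmax is 0 for a live row
    SELECT xmin, xmax, id FROM t;

    -- An UPDATE creates a new row version whose xmin is the updating
    -- transaction's ID; the old version's xmax is set to that same ID
    UPDATE t SET id = 2;
    SELECT xmin, xmax, id FROM t;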

A Brief History of the Transaction Counter

Until version 8.2, PostgreSQL would crash once the transaction counter reached its maximum value of around 4 billion transactions. Back in 2006, actually reaching this limit was hard to imagine, even if theoretically possible. The only way to continue after a counter overflow was to dump the database and recreate it. Needless to say, this wasn’t a user-friendly solution. In version 8.2, PostgreSQL introduced a mechanism to reset the transaction counter, or rather, to make it cycle. From that point on, avoiding database stoppages required constant background vacuuming.

When successful, this mechanism ensured the transaction counter wrapped around rather than the DBMS shutting down. Old transaction IDs were replaced with a special FrozenTransactionId value, ensuring that the range of transaction IDs still in use never exceeded the 4-billion limit. Additionally, each time the counter wrapped around, an ‘epoch counter’ was incremented to track how many times this reset had occurred.

In theory, this worked well, but in practice, it created a bottleneck. Delayed vacuuming or long-running transactions posed a real threat of the server stopping and needing to run in single-user mode to perform vacuuming, resulting in unpredictable downtime.

What’s the Problem?

There’s nothing wrong with the basic mechanism of the transaction counter. It works well and has been in place for as long as PostgreSQL has existed. The problem is that this mechanism was designed when a 4-byte unsigned integer (around 4.29 billion) seemed like an enormous number, practically unreachable. But, as many of our clients have pointed out, 49,710 transactions per second (the rate at which a 32-bit counter is exhausted in exactly one day: 2^32 / 86,400 seconds) isn’t some far-fetched fantasy. Retailers, factories, and even government agencies face transactional workloads where thousands of active sessions are normal. For them, the transaction counter wraps around at least once a day. In other words, they live on a ticking time bomb, hoping nothing goes wrong when the vacuum cycle triggers.

But that’s not the only problem. As mentioned earlier, records also contain flags, and this is where the VACUUM FREEZE process comes in. This aggressive ‘freezing’ of tuples marks old records as visible to all transactions, giving them the special FrozenTransactionId (in modern versions, by setting flag bits in the tuple header rather than overwriting xmin). So, for the system to remain stable, two conditions must be met (a quick way to check the first one is shown after the list):

  1. The difference between the current transaction ID and the oldest unfrozen transaction ID doesn’t exceed 2^31.

  2. The VACUUM FREEZE process is running regularly, ensuring condition 1 is met.
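
A common way to check the first condition is to look at the age of the oldest unfrozen transaction ID in each database and compare it with autovacuum_freeze_max_age (200 million by default):

    -- Age, in transactions, of the oldest unfrozen XID per database
    SELECT datname, age(datfrozenxid) AS xid_age
    FROM pg_database
    ORDER BY xid_age DESC;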

This sounds straightforward, but in reality, systems with high transactional loads expose several challenges related to 32-bit transaction IDs:

  • Long-running transactions can prevent freezing from happening. The gap between the oldest record and the current transaction continues to grow.

  • On modern hardware, you can hit the transaction ID limit in a single day. At what point should you start worrying that the vacuuming process won’t complete in time? In five minutes? An hour? During peak business hours?

  • How do you define a ‘long-running’ transaction? What if it’s a heavy report-generation process that runs for hours? And if a transaction is stuck, how do you decide when to terminate it? (One way to spot the candidates is shown right after this list.)
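
For illustration, the sessions whose transactions have been open the longest, each of which holds back the freezing horizon, can be listed like this:

    -- Longest-open transactions first; 'query' shows what each is doing
    SELECT pid, now() - xact_start AS xact_duration, state, query
    FROM pg_stat_activity
    WHERE xact_start IS NOT NULL
    ORDER BY xact_start
    LIMIT 5;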

In practice, DBAs must be ready to take drastic measures if the gap between the oldest transaction and the current transaction ID grows too large. Even with proper monitoring, alerts, and regular drills, it ultimately falls on the DBA’s shoulders to mitigate any potential losses the business might face — even if their actions are technically correct.

In other words, if you can afford to stop your DB cluster at any time, run it in single-user mode for VACUUM FREEZE, and wait for hours, then this problem may not concern you. But if you live in the real world of high-load systems where Postgres is part of a critical infrastructure, you’ll eventually face these issues.

What Did We Do? Developed the 64-bit xid!

At this point, it may seem like a simple fix: just change the data type of the transaction counter to a 64-bit integer, push it to GitHub, and the problem is solved. But it’s not that simple. First, Postgres’s entire codebase expects the counter to be a 32-bit value, which would require a massive refactor of the code. But that’s not the main issue.

As mentioned earlier, each tuple (or record) stores xmin and xmax. If we naively switched to 64-bit transaction IDs, each record would need 16 bytes for these fields instead of the current 8. On large tables, that overhead alone could significantly increase the size of the database.

That’s why we at Postgres Professional decided to take the lead in developing a 64-bit transaction ID. The work took several years, but we ended up with an implementation that expands the transaction ID space to 18,446,744,073,709,551,615 (2^64 - 1, about 18 quintillion) without bloating tuples: roughly speaking, a 64-bit ‘base’ is stored once per page, while tuple headers keep their compact 32-bit xmin and xmax as offsets from that base. This implementation is now used in most PostgreSQL forks, whether commercial or open source.

We’ve been using 64-bit transaction IDs for over five years now, and we can confidently say that the solution is stable and reliable. The problem of transaction counter overflow has become more hypothetical than real: with a 64-bit counter, even a workload that burns through 2^32 transactions every single day would need about 2^32 days (on the order of ten million years) to run out of IDs.

This solution was detailed in an article by Maxim Orlov, and the patch was proposed for inclusion in vanilla PostgreSQL. However, the community rejected it due to its size and complexity. So, for the past several years, we’ve been working on gradually integrating this patch into the vanilla version of PostgreSQL. Maxim presented an update on this work at a conference this summer.

The delay in patch acceptance hasn’t stopped developers of other PostgreSQL forks, though; in fact, almost every fork now supports 64-bit transaction IDs. And that’s fine with us: we approached the situation with the intention of raising awareness about the issue and giving other developers a chance to evaluate and contribute their own solutions. Our position isn’t to push our implementation as the only one, but rather to focus the community on the 64-bit transaction ID idea and welcome improvements and suggestions.

Conclusion

The first part of the problem (raising awareness) is definitely a success. The idea has garnered the attention it deserves, but implementing such a revolutionary change takes time, and not just for technical reasons. As digitalization speeds up and transactional loads continue to grow globally, the PostgreSQL community is seeing how important and timely this issue is. Meanwhile, we’ve accumulated many success stories from customers running Postgres Pro with 64-bit transaction IDs in production systems; the new transaction counter is the default there, and it has proven effective and stable.

So, we continue our work, hoping that PostgreSQL will be able to handle the growing load and expand its market share even further.
