Hello, Habr! With this article I start a set of series (or a series of sets? In a word, the idea is grandiose) about the internal structure of PostgreSQL.
The material will be based on training courses (in Russian) on administration that Pavel pluzanov and I are creating. Not everyone likes to watch videos (I definitely do not), and reading slides, even with comments, is no good at all.
Unfortunately, the only course available in English at the moment is 2-Day Introduction to PostgreSQL 11.
Of course, the articles will not be exactly the same as the content of the courses. I will talk only about how everything is organized, omitting administration itself, but I will try to do it in more detail and more thoroughly. And I believe that knowledge like this is as useful to an application developer as it is to an administrator.
I will target readers who already have some experience with PostgreSQL and understand, at least in general terms, what is what. The text will be too difficult for beginners. For example, I will not say a word about how to install PostgreSQL and run psql.
The material in question does not vary much from version to version, but I will use the current version, vanilla PostgreSQL 11.
The first series deals with issues related to isolation and multiversion concurrency, and the plan of the series is as follows:
- Isolation as understood by the standard and PostgreSQL (this article).
- Forks, files, pages — what is happening at the physical level.
- Row versions, virtual transactions and subtransactions.
- Data snapshots and the visibility of row versions; the event horizon.
- In-page vacuum and HOT updates.
- Normal vacuum.
- Autovacuum.
- Transaction id wraparound and freezing.
Off we go!
And before we start, I would like to thank Elena Indrupskaya for translating the articles into English.