It's official! PostgreSQL 15 is out, and the community is abuzz discussing all the new features of the fresh release.
Meanwhile, the October CommitFest for PostgreSQL 16 had come and gone, with its own notable additions to the code.
If you missed the July CommitFest, our previous article will get you up to speed in no time.
Here are the patches I want to talk about:
SYSTEM_USER function
Frozen pages/tuples information in autovacuum's server log
pg_stat_get_backend_idset returns the actual backend ID
Improved performance of ORDER BY / DISTINCT aggregates
Faster bulk-loading into partitioned tables
Optimized lookups in snapshots
Bidirectional logical replication
pg_auth_members: pg_auth_members: role membership granting management
pg_auth_members: role membership and privilege inheritance
pg_receivewal and pg_recvlogical can now handle SIGTERM
commit: 0823d061
The SYSTEM_USER
function is a part of the SQL standard which has not been implemented in PostgreSQL so far. It displays information about the system user connected to the database server and the authentification method used:
SELECT session_user, system_user;
session_user | system_user
--------------+--------------
alice | peer:student
The example shows that alice
has connected to the database as student
(OS username) using peer authentification. For trust authentification, the function always returns NULL.
These connection details are also obtainable from the server log (if the parameter log_connections
is on).
Frozen pages/tuples information in autovacuum's server log
commit: d977ffd9
The autovacuum log and the VACUUM (verbose)
output now display additional information on the number of frozen pages and tuples in a given table.
CREATE TEMP TABLE t AS SELECT 42;
VACUUM (freeze, verbose) t;
INFO: aggressively vacuuming "demo.pg_temp_3.t"
INFO: finished vacuuming "demo.pg_temp_3.t": index scans: 0
pages: 0 removed, 1 remain, 1 scanned (100.00% of total)
tuples: 0 removed, 1 remain, 0 are dead but not yet removable
removable cutoff: 945, which was 0 XIDs old when operation ended
new relfrozenxid: 945, which is 1 XIDs ahead of previous value
frozen: 1 pages from table (100.00% of total) had 1 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 91.912 MB/s, avg write rate: 0.000 MB/s
buffer usage: 5 hits, 4 misses, 0 dirtied
WAL usage: 1 records, 0 full page images, 188 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
VACUUM
See the line that starts with frozen:
.
pg_stat_get_backend_idset returns the actual backend ID
commit: d7e39d72
Consider a system that runs several concurrent sessions. The sessions work with temporary tables, which are stored in temporary schemas. The list of temporary schemas:
SELECT n.nspname,
regexp_replace(n.nspname, '^pg_temp_', '')::int AS backend_id
FROM pg_namespace n
WHERE n.nspname ~ '^pg_temp_';
nspname | backend_id
-----------+------------
pg_temp_5 | 5
pg_temp_3 | 3
pg_temp_4 | 4
(3 rows)
Let me explain this query. The backend_id
column lists backend IDs. The system uses them to name temporary schemas. Backend IDs may be reused, and a new backend may get the same name as a recently terminated one. When a backend terminates, its temporary schemas don't get deleted, but simply remain unused until a new backend with the same ID is created, at which point they are assigned to it.
But how do you draw a line between a temporary schema and a backend process? You may need to do that, for example, to terminate a hung up session when objects in a temporary schema prevent freezing, potentially leading to transaction ID wraparound.
The updated function pg_stat_get_backend_idset
and related statistics functions can help you out with that. Before, the function returned indexes from the array of open sessions as the backend IDs. However, an index value for a session may change after opening or terminating a session and rebuilding statistics. The new pg_stat_get_backend_idset
now reliably returns actual backend session IDs, which can be linked to backend_id from the query above.
This allows us to retrieve a list of backend IDs and temporary schemas:
WITH tmp AS (
SELECT n.nspname,
regexp_replace(n.nspname, '^pg_temp_', '')::int AS backend_id
FROM pg_namespace n
WHERE n.nspname ~ '^pg_temp_'
)
SELECT pg_stat_get_backend_pid(backend_id) AS pid,
nspname
FROM tmp
LEFT JOIN pg_stat_get_backend_idset() AS b(backend_id)
USING (backend_id);
pid | nspname
--------+-----------
336661 | pg_temp_3
349872 | pg_temp_4
| pg_temp_5
(3 rows)
Also note that the pg_temp_5
schema is not currently used.
Improved performance of ORDER BY / DISTINCT aggregates
commit: 1349d279
Aggregates with ORDER BY
and DISTINCT
can now use indexes to skip sorting tuples when the inputs are known to be pre-sorted.
Example:
SELECT flight_no,
array_agg(status ORDER BY status),
array_agg(DISTINCT status)
FROM flights
GROUP BY flight_no;
In both cases, array_agg
has to sort the statuses for each flight. Now let's create an index:
CREATE INDEX ON flights (flight_no, status);
The PostgreSQL 16 planner knows that the input from the index comes in pre-sorted. Therefore, the statuses from each group can be passed to the transition function right away, without the need to wait for them all to be fetched and sorted.
As a result, the operation executes on my laptop 1.5 times faster in PostgreSQL 16 than an identical operation in PostgreSQL 15.
Faster bulk-loading into partitioned tables
commit: 3592e0ff
The patch increases bulk-loading and update speeds for partitioned tables. Operations that load records into the same partition will spend less time looking for specific partitions for each record. When 16 (the value is hard-coded) records in a row are loaded into the same partition, the partition's name is cached, and if the following records must go to the same partition, they are processed faster.
The optimization works for RANGE and LIST partitions, but it's of no use for HASH partitioned tables.
Developer testing shows a 15-20% decrease in execution time.
Optimized lookups in snapshots
commit: 37a6e5df, b6ef1675, 82739d4a
The patch optimizes linear searches of arrays (first commit) for x86-64 (second commit) and ARM (third commit). The new algorithm using SIMD instructions was applied to snapshot->xip
array search. At a large number (hundreds) of concurrent writers, it significantly increases the visibility check speed, which is an overall performance boost.
Bidirectional logical replication
commit: 36628396
Bidirectional replication occurs when server A subscribes to table changes from server B, and server B subscribes to the same table changes from server A. As a result, all changes made on either server will be applied to both.
Previous PostgreSQL releases didn't support this replication setup. Originally, a subscriber replicated all changes made to the publisher, be it SQL commands or replicated changes. When server A runs an SQL command, server B receives and runs the command as well. The same changes, then, are sent to the first server, resulting in an infinite replication loop. To avoid it, PostgreSQL returns an error when attempting to create a second subscription that will result in a loop.
To implement bidirectional replication, the developers needed a way to apply only changes made by SQL commands but not those from replication. And they found one!
PostgreSQL can distinguish between the two sources of changes, as all logical replication changes are written into WAL together with their origins. Now, when you create a subscription, you can set the publisher to not publish any changes with replication origins.
Let's set up bidirectional replication between node_1
and node_2
, both located on the same server, on ports 5401 and 5402.
wal_level
must allow logical replication on both nodes:
node_1=# SHOW wal_level;
wal_level
-----------
logical
(1 row)
Let's create a table and a publication. The node column in the primary key may look redundant, but it's helpful for avoiding replication conflicts.
node_1=# CREATE TABLE test (
node text DEFAULT 'node_1',
id int,
PRIMARY KEY (node, id)
);
node_1=# CREATE PUBLICATION pub1 FOR TABLE test;
Next, let's create a table with an identical structure on the second node and a publication for it:
node_2=# CREATE TABLE test (
node text DEFAULT 'node_2',
id int,
PRIMARY KEY (node, id)
);
node_2=# CREATE PUBLICATION pub2 FOR TABLE test;
Subscribe node_1
to the node_2
publication, and vice versa.
node_1=# CREATE SUBSCRIPTION sub1_pub2
CONNECTION 'port=5402 user=postgres dbname=postgres'
PUBLICATION pub2
WITH (copy_data = false, origin = none);
node_2=# CREATE SUBSCRIPTION sub2_pub1
CONNECTION 'port=5401 user=postgres dbname=postgres'
PUBLICATION pub1
WITH (copy_data = false, origin = none);
A few things to pay attention to:
origin = none
― this is a new subscription parameter that handles replication cycles. It tells the server to only send out the changes that don't originate from replication (with a nonempty origin), i.e. changes made by SQL commands. The second possible parameter value,any
, is the default. It enables the behavior as seen in the previous versions.
copy_data = false
― disable automatic synchronization. Also set when both tables are empty, which is the case in our example.
The tables must not be modified until we complete the setup.
Let's add a row into each table:
node_1=# INSERT INTO test (id) VALUES (1);
node_2=# INSERT INTO test (id) VALUES (1);
Verify the contents.
node_1=# SELECT * FROM test ORDER BY 1,2;
node | id
--------+----
node_1 | 1
node_2 | 1
(2 rows)
node_2=# SELECT * FROM test ORDER BY 1,2;
node | id
--------+----
node_1 | 1
node_2 | 1
(2 rows)
The replication works!
Let's add another node at port 5403. The tables must remain unchanged during setup.
Create a table and a publication:
node_3=# CREATE TABLE test (
node text DEFAULT 'node_3',
id int,
PRIMARY KEY (node, id)
);
node_3=# CREATE PUBLICATION pub3 FOR TABLE test;
Subscribe the first two nodes to the publication.
node_1=# CREATE SUBSCRIPTION sub1_pub3
CONNECTION 'port=5403 user=postgres dbname=postgres'
PUBLICATION pub3
WITH (copy_data = false, origin = none);
node_2=# CREATE SUBSCRIPTION sub2_pub3
CONNECTION 'port=5403 user=postgres dbname=postgres'
PUBLICATION pub3
WITH (copy_data = false, origin = none);
Subscribe the third node to the first two.
node_3=# CREATE SUBSCRIPTION sub3_pub1
CONNECTION 'port=5401 user=postgres dbname=postgres'
PUBLICATION pub1
WITH (copy_data = true, origin = none);
WARNING: subscription "sub3_pub1" requested copy_data with origin = NONE but might copy data that had a different origin
DETAIL: Subscribed publication "pub1" is subscribing to other publications.
HINT: Verify that initial data copied from the publisher tables did not come from other origins.
NOTICE: created replication slot "sub3_pub1" on publisher
CREATE SUBSCRIPTION
Note the subscription parameters: copy_data = true
, origin = none
. CREATE SUBSCRIPTION
returns a warning that the third node is going to copy all data from the first one, including the changes with the replication origin. This is exactly what we want for the initial synchronization of the third node: to copy the two records.
There is no need to repeat the initial synchronization when subscribing to the second node.
node_3=# CREATE SUBSCRIPTION sub3_pub2
CONNECTION 'port=5402 user=postgres dbname=postgres'
PUBLICATION pub2
WITH (copy_data = false, origin = none);
The setup is complete. Let's add a record to the third node:
node_3=# INSERT INTO test (id) VALUES (1);
Verify the contents on all nodes.
node_1=# SELECT * FROM test ORDER BY 1,2;
node | id
--------+----
node_1 | 1
node_2 | 1
node_3 | 1
(3 rows)
node_2=# SELECT * FROM test ORDER BY 1,2;
node | id
--------+----
node_1 | 1
node_2 | 1
node_3 | 1
(3 rows)
node_3=# SELECT * FROM test ORDER BY 1,2;
node | id
--------+----
node_1 | 1
node_2 | 1
node_3 | 1
(3 rows)
A multimaster out of the box!
pg_auth_members: role membership granting management
To demonstrate this, let's take three basic roles with LOGIN
privileges:
CREATE ROLE alice LOGIN;
CREATE ROLE bob LOGIN;
CREATE ROLE charlie LOGIN;
First, let's observe how the "old" PostgreSQL 15 behaves, then switch to 16.
Let's grant alice
membership in bob
. The GRANTED BY
clause of the GRANT
command allows to specify which role grants the membership. This information is recorded in the system catalog:
15=# GRANT bob TO alice GRANTED BY charlie;
15=# SELECT roleid::regrole,
member::regrole,
grantor::regrole,
*
FROM pg_auth_members
WHERE roleid::regrole::text !~ '^pg_';
roleid | member | grantor | roleid | member | grantor | admin_option
--------+--------+---------+--------+--------+---------+--------------
bob | alice | charlie | 24590 | 24589 | 24591 | f
Seeing the pg_auth_members
record alone, one may wonder how it was possible for charlie
to grant alice
membership in bob
. To do that, charlie
must be a member of bob
and have the privilege that allows granting the membership to other roles (WITH ADMIN OPTION
). And charlie
isn't even a member of bob
.
We know that GRANT
was run by superuser (postgres
), which skips these sorts of privilege checks. But the system catalog record shows that charlie
granted alice
membership in bob
while not having the appropriate privileges, which is not what happened.
But that is not all. What if we drop the role charlie
altogether?
15=# DROP ROLE charlie;
DROP ROLE
The role is no more. But what about alice
's membership in bob
, which charlie
has granted?
15=# SELECT roleid::regrole,
member::regrole,
grantor::regrole,
*
FROM pg_auth_members
WHERE roleid::regrole::text !~ '^pg_';
roleid | member | grantor | roleid | member | grantor | admin_option
--------+--------+---------+--------+--------+---------+--------------
bob | alice | 24591 | 24590 | 24589 | 24591 | f
The membership is still there, and so is grantor info! 24591 is the OID of the dropped charlie
role. Now we have nonsense in pg_auth_members
, and, what is worse, the OID may now be reused when creating another role. It is improbable, but not impossible to end up with a record that shows some random new role as the grantor for alice
's membership in bob
.
Let's move on to PostgreSQL 16.
After creating the three roles, let's try to grant alice
membership in bob
with charlie
as the grantor:
16=# GRANT bob TO alice GRANTED BY charlie;
ERROR: grantor must have ADMIN OPTION on "bob"
The command returns an error because charlie
must be a member of bob
and must have the privileges necessary to grant the membership. All as it should be.
16=# GRANT bob TO charlie WITH ADMIN OPTION;
GRANT ROLE
16=# GRANT bob TO alice GRANTED BY charlie;
GRANT ROLE
Now alice
is successfully granted the membership.
Let's try to drop the role charlie
:
16=# DROP ROLE charlie;
ERROR: role "charlie" cannot be dropped because some objects depend on it
DETAIL: privileges for membership of role alice in role bob
Or try to revoke ADMIN OPTION
:
16=# REVOKE ADMIN OPTION FOR bob FROM charlie ;
ERROR: dependent privileges exist
HINT: Use CASCADE to revoke them too.
You can only revoke ADMIN OPTION
if you add CASCADE
, and you can't drop the role. The role charlie
now has a dependency because it has granted a role membership. The dependency uses the new OID column in pg_auth_members
, and the dependency itself is recorded in pg_shdepend
. Let's have a look at it:
16=# SELECT oid,
roleid::regrole,
member::regrole,
grantor::regrole
FROM pg_auth_members
WHERE oid = 16572;
oid | roleid | member | grantor
-------+--------+--------+---------
16572 | bob | alice | charlie
16=# SELECT d.classid::regclass,
(SELECT m FROM pg_auth_members m WHERE m.oid = d.objid) AS objid,
d.refclassid::regclass,
d.refobjid::regrole
FROM pg_shdepend d\gx
-[ RECORD 1 ]-----------------------------
classid | pg_auth_members
objid | (16572,16569,16568,16570,f,t)
refclassid | pg_authid
refobjid | charlie
Granting membership on behalf of third-party roles may not be the most demanded feature, but nevertheless PostgreSQL 16 will make the behavior more logical and predictable.
pg_auth_members: role membership and privilege inheritance
While the previous patch dealt with membership granters, this one concerns grantees.
Let's look at PostgreSQL 15 first:
15-postgres=# CREATE ROLE alice LOGIN;
15-postgres=# CREATE ROLE bob LOGIN;
15-postgres=# GRANT bob TO alice;
alice
is granted membership in bob
so that it can use bob
's privileges.
15-postgres=# CREATE TABLE t AS SELECT 42 AS col;
15-postgres=# GRANT SELECT ON t TO bob;
Both bob and alice can see the table contents:
15-alice=> SELECT * FROM t;
col
-----
42
(1 row)
This works because the role alice
has the INHERIT
attribute (all roles have it by default), which makes alice
inherit all bob
's privileges automatically.
15-alice=> SELECT rolname, rolinherit FROM pg_roles WHERE rolname = 'alice';
rolname | rolinherit
---------+------------
alice | t
If this attribute is disabled, alice
will have to switch roles in order to access the table:
15-postgres=# ALTER ROLE alice NOINHERIT;
15-alice=> SELECT * FROM t;
ERROR: permission denied for table t
15-alice=> SET ROLE bob;
15-alice=> SELECT * FROM t;
col
-----
42
(1 row)
Let's say that this is the desired behavior.
Now, consider that we want to grant alice
some system monitoring capabilities. There's a predefined role pg_read_all_settings
, which has the privileges required to view system configuration parameters. We grant alice
membership in it.
15-postgres=# GRANT pg_read_all_settings TO alice;
15-alice=> SHOW data_directory;
ERROR: must be superuser or have privileges of pg_read_all_settings to examine "data_directory"
The lack of the INHERIT
attribute makes alice
unable to view the parameters without explicitly switching roles.
15-alice=> SET ROLE pg_read_all_settings;
15-alice=> SHOW data_directory;
data_directory
---------------------
/home/pal/pg15/data
But switching into the role pg_read_all_settings
is a major hassle. We want alice
to have all pg_read_all_settings
privileges by default, but we also want to have them switch to bob
explicitly when bob
's privileges are required.
And the new patch allows us to achieve that!
Let's reproduce the situation in PostgreSQL 16:
16-postgres=# CREATE ROLE alice LOGIN;
16-postgres=# CREATE ROLE bob LOGIN;
16-postgres=# CREATE TABLE t AS SELECT 42 AS col;
16-postgres=# GRANT SELECT ON t TO bob;
The GRANT
syntax has been extended. Now, when granting a role membership, you can specify whether or not the grantee inherits the role's privileges:
16-postgres=# GRANT bob TO alice WITH INHERIT FALSE;
16-postgres=# GRANT pg_read_all_settings TO alice WITH INHERIT TRUE;
Here, alice
may use pg_read_all_settings
privileges right away but needs to switch to bob
to run the get_ultimate_answer
function:
16-alice=> SHOW data_directory;
data_directory
---------------------
/home/pal/pg16/data
16-alice=> SELECT * FROM t;
ERROR: permission denied for table t
16-alice=> SET ROLE bob;
16-alice=> SELECT * FROM t;
col
-----
42
(1 row)
GRANT
data is stored in pg_auth_members
. Now it also includes the inherit_option
value:
16-postgres=# SELECT roleid::regrole, member::regrole, grantor::regrole,
admin_option, inherit_option
FROM pg_auth_members
WHERE member = 'alice'::regrole;
roleid | member | grantor | admin_option | inherit_option
----------------------+--------+----------+--------------+----------------
bob | alice | postgres | f | f
pg_read_all_settings | alice | postgres | f | t
But what about the INHERIT
attribute? Now it is only used in GRANT
whenever the WITH INHERIT
clause is not present. In this example, alice
has the INHERIT
attribute set, which means that the following command will record it in pg_auth_members
:
16-postgres=# GRANT pg_read_all_data TO alice;
The default value will change if the attribute is revoked.
16-postgres=# ALTER ROLE alice NOINHERIT;
16-postgres=# GRANT pg_write_all_data TO alice;
16-postgres=# SELECT roleid::regrole, member::regrole, grantor::regrole,
admin_option, inherit_option
FROM pg_auth_members
WHERE member = 'alice'::regrole;
roleid | member | grantor | admin_option | inherit_option
----------------------+--------+----------+--------------+----------------
bob | alice | postgres | f | f
pg_read_all_settings | alice | postgres | f | t
pg_read_all_data | alice | postgres | f | t
pg_write_all_data | alice | postgres | f | f
Important takeaway: the privilege inheritance behavior in PostgreSQL 16 changes. The INHERIT
no longer affects privilege inheritance and only affects subsequent GRANT
commands without WITH INHERIT
.
The other commit is for the ALTER DEFAULT PRIVILEGES
command. Now you need the INHERIT
attribute to be able to modify other roles' privileges. In our example, alice
can't use ALTER DEFAULT PRIVILEGES
to manage privileges for objects within bob
's schema.
pg_receivewal and pg_recvlogical now can handle SIGTERM
commit: 8b60db77
pg_receivewal
and pg_recvlogical
can run as daemons. When they do, systemd can only send SIGTERM
to stop them, but both tools expect SIGINT
to stop cleanly and just terminate on SIGTERM
. This is not an issue in itself, but when compression for pg_receivewal
is enabled, it only dumps data on disk on a clean exit.
So the developers ended up teaching the tools to recognize SIGTERM
properly.
This is all for now. Looking forward to the results of the November CommitFest!