
PostgreSQL

Object-relational database management system (ORDBMS) with an emphasis on extensibility and standards compliance


How we boosted SQL query accuracy by 33% with LLMs

Level of difficulty: Medium
Reading time: 8 min
Views: 465

Traditional approaches to SQL query generation often rely on instruction-tuned language models, but these can be inefficient and inaccurate. In this article, we’ll explore a new method based on reinforcement learning for model fine-tuning, which can improve both the accuracy and efficiency of SQL generation.


OAuth 2.0 authorization in PostgreSQL using Keycloak as an example

Level of difficulty: Easy
Reading time: 27 min
Views: 763

Hello, Habr! We continue the series of articles on the new features of the Tantor Postgres 17.5.0 DBMS, and today we will talk about authorization support via the OAuth 2.0 Device Authorization Flow. This modern and secure access method allows applications to request access to PostgreSQL on behalf of the user through an external identity and access management provider, such as Keycloak, which is especially convenient for cloud environments and microservice architectures (the feature will also appear in PostgreSQL 18). In this article, we'll walk step by step through setting up OAuth authorization in PostgreSQL with Keycloak: configuring Keycloak, preparing PostgreSQL, writing an OAuth token validator for PostgreSQL, and verifying successful authorization via psql using the Device Flow.


Shardman. A quick guide for the architect

Reading time: 22 min
Views: 756

The myth of the magical fast=true parameter is still alive and well, but in distributed databases, another contender appears: distributed=true. Neither one will save you if you don’t rethink your schema, sharding keys, sequences, queries, and migration process. We walk through every corner with a clear-eyed approach — from choosing sharding keys and colocated tables to CDC, topologies, and foreign key constraints — showing where performance really improves, where it gets more expensive, and how to deal with it.
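
As a rough illustration of the colocation idea only (plain DDL, hypothetical table names; the actual Shardman syntax for distributing tables is not shown here): both tables carry the sharding key, so an order and its line items land on the same shard and the join stays local.

```sql
-- Sketch: customer_id is assumed to be the sharding key.
-- Both tables include it, so orders and their items are colocated
-- and the join below does not have to cross shards.
CREATE TABLE orders (
    customer_id bigint NOT NULL,
    order_id    bigint NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (customer_id, order_id)
);

CREATE TABLE order_items (
    customer_id bigint  NOT NULL,
    order_id    bigint  NOT NULL,
    item_no     int     NOT NULL,
    amount      numeric NOT NULL,
    PRIMARY KEY (customer_id, order_id, item_no)
);

-- A query that always filters on the sharding key can be routed to a single shard.
SELECT o.order_id, sum(i.amount)
FROM orders o
JOIN order_items i USING (customer_id, order_id)
WHERE o.customer_id = 42
GROUP BY o.order_id;
```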


How to successfully migrate from Oracle to Postgres Pro Enterprise

Level of difficulty: Medium
Reading time: 8 min
Views: 626

Migration from Oracle to vanilla PostgreSQL hits roadblocks with packages, autonomous transactions, and collections—they simply don’t exist there. We’ll break down why ora2pg stumbles, how native implementations of these mechanisms in Postgres Pro Enterprise make life easier, and how ora2pgpro translates PL/SQL semantically correctly, without hacks or crude regex.
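
For context on why autonomous transactions hurt in vanilla PostgreSQL, here is the usual workaround there: a second connection via the dblink extension, so the inner write commits independently of the outer transaction. A sketch with made-up table names; native autonomous transactions in Postgres Pro Enterprise remove the need for this.

```sql
-- Vanilla-PostgreSQL workaround sketch: emulate an autonomous transaction
-- with dblink. The audit row is committed by a separate connection,
-- so it survives even if the outer transaction rolls back.
CREATE EXTENSION IF NOT EXISTS dblink;

CREATE TABLE IF NOT EXISTS audit_log (msg text, logged_at timestamptz DEFAULT now());

BEGIN;
SELECT dblink_exec(
    'dbname=' || current_database(),
    $$INSERT INTO audit_log (msg) VALUES ('attempted a risky update')$$
);
-- ... the risky work itself ...
ROLLBACK;  -- the audit_log row is still there
```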


PostgreSQL 18: Part 5 or CommitFest 2025-03

Level of difficulty: Medium
Reading time: 34 min
Views: 760

September 25th marks the release of PostgreSQL 18. This article covers the March CommitFest and concludes the series on the new features of the upcoming release. It turned out quite large: the last CommitFest of the cycle is traditionally the biggest and the richest in new features.

You can find previous reviews of PostgreSQL 18 CommitFests here: 2024-07, 2024-09, 2024-11, 2025-01.


Global indexes for partitions in Postgres Pro: uniqueness without hacks

Level of difficulty: Medium
Reading time: 5 min
Views: 652

When there’s no filter on the partitioning key, local indexes turn into a marathon across partitions. The new gbtree keeps a single catalog of keys and jumps straight to the row by primary key. In this article, we’ll show the algorithm, real numbers and limitations (primary key is mandatory, ON CONFLICT does not work) — and where this eases the pain in CRM/billing.
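
To make the limitation concrete, this is what stock PostgreSQL does today with a hypothetical orders table: a unique index on a partitioned table must include the partitioning key, and a lookup that does not filter on that key has to probe every partition. The gbtree creation syntax itself is Postgres Pro specific and is not shown here.

```sql
-- Stock PostgreSQL behaviour on a hypothetical partitioned table.
CREATE TABLE orders (
    order_id   bigint      NOT NULL,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- Fails: a unique index must include the partitioning key.
-- CREATE UNIQUE INDEX ON orders (order_id);

-- Works, but only enforces uniqueness together with the partition key.
CREATE UNIQUE INDEX ON orders (order_id, created_at);

-- Without a filter on created_at, every partition has to be considered.
EXPLAIN SELECT * FROM orders WHERE order_id = 12345;
```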


Postgres Pro TDE — security and performance

Level of difficulty: Medium
Reading time: 14 min
Views: 1K

TDE comes in many flavors — from encryption at the TAM level to full-cluster encryption and tablespace markers. We take a close look at Percona, Cybertec/EDB, Pangolin/Fujitsu, and show where you lose performance and reliability, and where you gain flexibility.

On top of that, Vasily Bernstein, deputy head of product development, and Vladimir Abramov, senior security engineer, will share how Postgres Pro Enterprise implements key rotation without rewriting entire tables — and why AES-GCM was the clear choice.


The Russian trace in the history of the PostgreSQL logo

Level of difficulty: Easy
Reading time: 7 min
Views: 1.8K

The story of the PostgreSQL logo was shared by Oleg Bartunov, CEO of Postgres Professional, who personally witnessed these events and preserved an archive of correspondence and visual design development for the database system.

Our iconic PostgreSQL logo — our beloved “Slonik” — has come a long way. Soon, it will turn thirty! Over the years, its story has gathered plenty of myths and speculation. As a veteran of the community, I decided it’s time to set the record straight, relying on the memories of those who were there. Who actually came up with it? Why an elephant? How did it end up in a diamond, and how did the Russian word “slonik” become a part of the global IT vocabulary?


START: how to defeat hallucinations and teach LLMs accurate calculations

Level of difficulty: Easy
Reading time: 3 min
Views: 963

START is an open-source LLM designed for precise calculations and code verification. It addresses two major issues that most standard models face: hallucinations and errors in multi-step calculations. This article explains why these problems arise and how START solves them.


How we loaded a petabyte into PostgreSQL before New Year — and what happened next

Level of difficulty: Medium
Reading time: 17 min
Views: 1.2K

It all started as a joke by the office coffee machine. But, as with every decent joke, it suddenly sounded worth trying — and before we knew it, we were knee-deep in an experiment that turned out to be anything but trivial, complete with a whole minefield of gotchas.

It began simply: while everyone else was busy debating hardware tuning and squeezing out extra TPS from their systems, we thought — why not just shove a huge chunk of data into PostgreSQL and see how it holds up? Like, really huge. Say, a one-petabyte database. Let’s see how it survives that.

It was December 10, the boss wanted the report by January 20, and New Year was less than a month away. And that itch that all engineers know? It hit hard.
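
The article covers the real pipeline; purely to show the starting point, a toy version of "just shove data in" might look like the sketch below (synthetic rows via generate_series, table name made up; an actual petabyte obviously needs COPY, batching, and partitioning rather than one INSERT).

```sql
-- Toy bulk-load sketch: generate synthetic rows straight in SQL.
-- This produces roughly 10 GB, nowhere near a petabyte.
CREATE TABLE bigdata (
    id      bigint,
    payload text
);

INSERT INTO bigdata (id, payload)
SELECT g, repeat('x', 1000)          -- ~1 kB of filler per row
FROM generate_series(1, 10000000) AS g;
```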


How to load test PostgreSQL database and not miss anything

Level of difficulty: Medium
Reading time: 14 min
Views: 1K

When load testing Tantor Postgres or other PostgreSQL-based databases with the standard pgbench tool, specialists often end up with non-representative results and have to rerun the tests because details of the environment (DBMS configuration, server characteristics, PostgreSQL version) were not recorded. In this article we review the author's pg_perfbench, which is designed to address this issue. It makes scenarios repeatable, prevents the loss of important data, and streamlines result comparison by registering all parameters in a single template. It also automatically launches pgbench with TPC-B load generation, collects all metadata about the testing environment, and generates a structured report.
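
The SQL side of "record the environment" is straightforward. A minimal sketch of what is worth capturing alongside every pgbench run (pg_perfbench automates this; the queries below are just standard catalog views):

```sql
-- Capture the basics of the test environment next to every pgbench run.
SELECT version();                                -- exact PostgreSQL build

SELECT name, setting, unit
FROM pg_settings
WHERE source NOT IN ('default', 'override');     -- only settings changed from defaults
```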


We’ve learned how to migrate databases from Oracle to Postgres Pro at 41 TB/day

Level of difficulty: Easy
Reading time: 3 min
Views: 952

41 TB/day from Oracle to Postgres Pro without stopping the source system — not theory, but numbers from our latest tests. We broke the migration into three stages: fast initial load, CDC from redo logs, and validation, and wrapped them into ProGate. In this article, we’ll explain how the pipeline works, why we chose Go, and where the bottlenecks hide.


Partition and rule: sharing practical knowledge about partitioning in Postgres Pro

Level of difficulty: Medium
Reading time: 11 min
Views: 939

Declarative partitioning may sound complex, but in reality it’s just a way to tell your database how best to organize large tables — so it can optimize queries and make maintenance easier. Let’s walk through how it works and when declarative partitioning can save the day.
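
A minimal sketch of what declarative partitioning looks like in plain PostgreSQL (hypothetical measurements table, range partitioning by month):

```sql
-- Declarative range partitioning: the parent only describes the rule,
-- the partitions hold the data, and the planner prunes by the key.
CREATE TABLE measurements (
    sensor_id int         NOT NULL,
    taken_at  timestamptz NOT NULL,
    value     double precision
) PARTITION BY RANGE (taken_at);

CREATE TABLE measurements_2025_01 PARTITION OF measurements
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE measurements_2025_02 PARTITION OF measurements
    FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');

-- Only measurements_2025_01 is scanned, thanks to partition pruning.
EXPLAIN SELECT avg(value)
FROM measurements
WHERE taken_at >= '2025-01-10' AND taken_at < '2025-01-20';
```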


Getting started with pgpro-otel-collector

Level of difficulty: Easy
Reading time: 4 min
Views: 772

Now that pgpro-otel-collector has had its public release, I’m excited to start sharing more about the tool — and to kick things off, I’m launching a blog series focused entirely on the Collector.

The first post is an intro — a practical guide to installing, configuring, and launching the collector. We’ll also take our first look at what kind of data the collector exposes, starting with good old Postgres metrics.


Getting to know PPEM 2

Level of difficulty: Easy
Reading time: 7 min
Views: 507

Postgres Pro recently announced the release of Enterprise Manager 2, commonly known as PPEM.

In short, PPEM is an administration tool designed for managing and monitoring Postgres databases. Its primary goal is to assist DBAs in their daily tasks and automate routine operations. In this article, I'll take a closer look at what PPEM has to offer. My name is Alexey, and I'm part of the PPEM development team.


Redundant statistics slow down your Postgres? Try sampling in pg_stat_statements

Level of difficulty: Medium
Reading time: 11 min
Views: 657

pg_stat_statements is the standard PostgreSQL extension used to track query statistics: number of executions, total and average execution time, number of returned rows, and other metrics. This information lets you analyze query behavior over time, identify problem areas, and make informed optimization decisions. However, in systems with high contention, pg_stat_statements itself can become a bottleneck and cause performance drops. In this article, we analyze the scenarios in which the extension becomes a source of problems, how sampling works, and in which cases it can reduce the overhead.
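
For reference, the baseline (non-sampled) use of the extension looks like this; the sampling knob discussed in the article is not shown, since its exact parameter name depends on the product and version.

```sql
-- Standard pg_stat_statements usage: find the most expensive queries.
-- Requires shared_preload_libraries = 'pg_stat_statements' and a restart.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT query,
       calls,
       total_exec_time,                           -- milliseconds, PostgreSQL 13+
       total_exec_time / NULLIF(calls, 0) AS mean_ms,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```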


The performance engineer: a detective licensed to kill… bottlenecks

Level of difficulty: Easy
Reading time: 5 min
Views: 762

Picture this: a mission-critical SQL query is crawling along. Not for an hour. Not for two. Fifteen hours. A full workday of the system slowly grinding through data while the business bleeds money and users teeter on the edge of a nervous breakdown. And then — cue the dramatic music — in walks the performance engineer.

After a few hours of intense analysis and a couple of pinpoint code tweaks, the same query that took 15 hours now completes in just… two minutes. Sounds like magic? Nope. This is the thrilling (and very real) world of performance engineering.


How we implemented vector search in Postgres Pro

Level of difficulty: Easy
Reading time: 7 min
Views: 822

In this article, we’ll explore what vector search is, what problems it solves, and how the pgpro_vector extension for Postgres Pro brings powerful vector capabilities directly into a relational database — no need for separate specialized systems.
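
As a rough illustration of the workflow (using the open-source pgvector syntax as a stand-in, since the pgpro_vector API is not quoted in the teaser and may differ): store embeddings in a column, then order by distance to a query vector.

```sql
-- Illustrative only: pgvector-style syntax as a stand-in for pgpro_vector.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    title     text,
    embedding vector(3)             -- tiny dimension, just for the example
);

INSERT INTO documents (title, embedding) VALUES
    ('about elephants', '[0.9,0.1,0.0]'),
    ('about databases', '[0.1,0.8,0.2]');

-- Nearest neighbours by Euclidean distance; on real data volumes an
-- approximate index (ivfflat/hnsw in pgvector) makes this fast.
SELECT title
FROM documents
ORDER BY embedding <-> '[0.8,0.2,0.1]'
LIMIT 5;
```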


On reordering expressions in Postgres

Level of difficulty: Easy
Reading time: 4 min
Views: 609

Today, I want to talk about one of those sneaky tricks that can help speed up query execution. Specifically, this is about reordering conditions in WHERE clauses, JOINs, HAVING clauses, and so on.

The idea is simple: if a condition in an AND chain turns out to be false, or if one in an OR chain turns out to be true, there's no need to evaluate the rest. That means saved CPU cycles — and sometimes, a lot of them. Let’s break this down.
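
A minimal sketch of the idea on a hypothetical table: put the cheap and selective condition first in the AND chain, so the expensive one rarely has to be evaluated.

```sql
-- Hypothetical users table: status = 'active' is cheap and filters out
-- most rows, while the regexp match is comparatively expensive.

-- Likely cheaper: the regexp only runs for rows that pass the cheap check.
SELECT count(*)
FROM users
WHERE status = 'active'
  AND email ~* '^[a-z0-9._]+@corp\.example$';

-- Same result, but the expensive regexp may be evaluated for every row
-- if the conditions are executed in the written order.
SELECT count(*)
FROM users
WHERE email ~* '^[a-z0-9._]+@corp\.example$'
  AND status = 'active';
```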


Automated management of extended statistics in PostgreSQL

Level of difficulty: Medium
Reading time: 6 min
Views: 689

Here I describe a PostgreSQL extension I built just out of curiosity. Its purpose is to automatically manage extended column statistics. The idea came to me while finishing work on another "smart" query-driven product for improving PostgreSQL planning quality. I realized that the current architecture of PostgreSQL isn't quite ready for fully autonomous operation, that is, automatic detection of bad plans and adaptive optimizer tuning. So why not try the other way around and build an autonomous data-driven assistant?
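
For background, this is what manually managed extended statistics look like in stock PostgreSQL; the extension described above aims to automate deciding when to create and drop them (table and column names below are hypothetical).

```sql
-- Manually created extended statistics on correlated columns.
CREATE STATISTICS addr_stats (dependencies, ndistinct)
    ON city, zip_code FROM addresses;

ANALYZE addresses;

-- With functional-dependency statistics the planner no longer multiplies
-- the two selectivities as if the columns were independent.
EXPLAIN SELECT * FROM addresses
WHERE city = 'Moscow' AND zip_code = '101000';
```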

