Database Administration

Everything about database administration

PostgreSQL for WMS: a DBMS selection strategy in the era of import substitution

Level of difficulty: Medium
Reading time: 9 min
Reach and readers: 5.2K

Today we want to talk about choosing a DBMS for WMS not as a dry technical discussion, but as a strategic decision that determines the security, budget, and future flexibility of your business. This is not about "why PostgreSQL is technically better," but about why it has become the only safe, cost-effective, and future-proof solution for Russian warehouse systems in the new reality.

This is not just another database article. It is a roadmap for those who do not want to wake up one day with a paralyzed warehouse and multi-million fines because of a bad decision made yesterday. At INTEKEY we have taken this path deliberately, and today our WMS projects for the largest market players run on PostgreSQL. We know from experience where the pitfalls are and how to avoid them.

Read more

How to speed up mass data inserts in PostgreSQL when using Spring

Level of difficulty: Hard
Reading time: 17 min
Reach and readers: 8K

A common task in enterprise systems is to load large volumes of data into PostgreSQL — sometimes tens or even hundreds of millions of rows. At first glance, this seems simple: just write a loop in Java and call save() for every record. But in reality, such an approach can be painfully slow. Even a perfectly tuned PostgreSQL instance won’t help if the application is sending data inefficiently.

This article explains how to significantly accelerate bulk inserts when working with PostgreSQL through Spring and Hibernate. We’ll walk through which Spring and Hibernate settings are worth enabling, why they matter, and how much performance they can actually unlock. We’ll also look at how to build your own data-insertion layer for PostgreSQL — one that lets you switch between different insertion strategies, leverage PostgreSQL’s custom capabilities, and parallelize the process. Finally, we’ll see how to integrate this layer with Spring and what real gains each approach can deliver.

Read more

File handling in PostgreSQL: barriers and ways around them

Level of difficulty: Medium
Reading time: 9 min
Reach and readers: 9.5K

Hitting the 4-billion-row limit in a TOAST table or running into an OidGen lock during a massive document import is a PostgreSQL admin’s nightmare. Sure, architects will tell you to push files to S3 — but real life often means keeping them inside the database. In this post, application optimization lead Alexander Popov breaks down how the standard bytea and pg_largeobject mechanisms work, where their bottlenecks hide, and how Postgres Pro Enterprise helps you get around those limits.

Read more

Delivering Faster Analytics at Pinterest

Level of difficulty: Medium
Reading time: 6 min
Reach and readers: 6.6K

Pinterest is a visual discovery platform where people can find ideas like recipes, home and style inspiration, and much more. The platform offers its partners shopping capabilities as well as a significant advertising opportunity with 500+ million monthly active users. Advertisers can purchase ads directly on Pinterest or through partnerships with advertising agencies. Because of our scale, advertisers can learn from analytical data how their Pins perform and how Pinterest users interact with them. This helps them make decisions that improve how their ads perform on our platform.

Read more

Breaking data for fun

Level of difficulty: Easy
Reading time: 8 min
Reach and readers: 6.4K

Throughout their careers engineers build systems that protect data and guard it against corruption. But what if the right approach is the opposite: deliberately corrupting data, generating it out of thin air, and creating forgeries indistinguishable from the real thing?

Maksim Gramin, systems analyst at Postgres Professional, explains why creating fake data is a critical skill for testing, security, and development — and how to do it properly without turning your database into a junkyard of “John Smith” entries.

Read more

Write. Review. Commit. Repeat. Behind the scenes of Postgres Professional docs

Level of difficulty: Easy
Reading time: 3 min
Reach and readers: 6.8K

Everyone knows great documentation makes or breaks a tech product — but few realize how much work goes into it. At Postgres Professional, the docs are written with the same discipline as the code. What’s even more impressive, all of it is done by a team of just ten people. We talked to senior technical writer Ekaterina Gololobova to see how it really works — from the first task to the final commit.

Read more

PostgreSQL multi-master: a pipe dream or a practical solution?

Level of difficulty: Medium
Reading time: 7 min
Reach and readers: 6.6K

One of the open challenges in the database world is keeping a database consistent across multiple DBMS instances (nodes) that independently handle client connections. The crux of the issue is ensuring that if one node fails, the others keep running smoothly — accepting connections, committing transactions, and maintaining consistency without a hitch. Think of it like a single DBMS instance staying operational despite a faulty RAM stick or intermittent access to multiple CPU cores.

My name is Andrey Lepikhov, and I’d like to kick off a discussion about the multi-master concept in PostgreSQL: its practical value, feasibility, and the tech stack needed to make it happen. By framing the problem more narrowly, we might find a solution that’s genuinely useful for the industry.

Read more

How we boosted SQL query accuracy by 33% with LLMs

Level of difficulty: Medium
Reading time: 8 min
Reach and readers: 11K

Traditional approaches to SQL query generation often rely on instruction-tuned language models, but these can be inefficient and inaccurate. In this article, we’ll explore a new method based on reinforcement learning for model fine-tuning, which can improve both the accuracy and efficiency of SQL generation.

Read more

OAuth 2.0 authorization in PostgreSQL using Keycloak as an example

Level of difficulty: Easy
Reading time: 27 min
Reach and readers: 10K

Hello, Habr! We continue the series of articles on the innovations in the Tantor Postgres 17.5.0 DBMS. Today we will talk about support for authorization via the OAuth 2.0 Device Authorization Flow, a modern and secure access method that allows applications to request access to PostgreSQL on behalf of the user through an external identity and access management provider such as Keycloak. This is especially convenient for cloud environments and microservice architectures (the feature will also be available in PostgreSQL 18). In this article, we'll take a step-by-step look at configuring OAuth authorization in PostgreSQL with Keycloak: configure Keycloak, prepare PostgreSQL, write an OAuth token validator for PostgreSQL, and verify successful authorization via psql using the Device Flow.

Read more

Shardman. A quick guide for the architect

Reading time: 22 min
Reach and readers: 15K

The myth of the magical fast=true parameter is still alive and well, but in distributed databases, another contender appears: distributed=true. Neither one will save you if you don’t rethink your schema, sharding keys, sequences, queries, and migration process. We walk through every corner with a clear-eyed approach — from choosing sharding keys and colocated tables to CDC, topologies, and foreign key constraints — showing where performance really improves, where it gets more expensive, and how to deal with it.

Read more

How to successfully migrate from Oracle to Postgres Pro Enterprise

Level of difficulty: Medium
Reading time: 8 min
Reach and readers: 25K

Migration from Oracle to vanilla PostgreSQL hits roadblocks with packages, autonomous transactions, and collections—they simply don’t exist there. We’ll break down why ora2pg stumbles, how native implementations of these mechanisms in Postgres Pro Enterprise make life easier, and how ora2pgpro translates PL/SQL semantically correctly, without hacks or crude regex.

Read more

PostgreSQL 18: Part 5 or CommitFest 2025-03

Level of difficulty: Medium
Reading time: 34 min
Reach and readers: 19K

September 25th marks the release of PostgreSQL 18. This article covers the March CommitFest and concludes the series on the new features of the upcoming update. It turned out quite long, since the March CommitFest, the last one of the cycle, is traditionally the biggest and richest in new features.

You can find previous reviews of PostgreSQL 18 CommitFests here: 2024-07, 2024-09, 2024-11, 2025-01.

More

Postgres Pro TDE — security and performance

Level of difficulty: Medium
Reading time: 14 min
Reach and readers: 18K

TDE comes in many flavors — from encryption at the TAM level to full-cluster encryption and tablespace markers. We take a close look at Percona, Cybertec/EDB, Pangolin/Fujitsu, and show where you lose performance and reliability, and where you gain flexibility.

On top of that, Vasily Bernstein, deputy head of product development, and Vladimir Abramov, senior security engineer, will share how Postgres Pro Enterprise implements key rotation without rewriting entire tables, and why AES-GCM was the clear choice.

Read more

How we loaded a petabyte into PostgreSQL before New Year — and what happened next

Level of difficulty: Medium
Reading time: 17 min
Reach and readers: 13K

It all started as a joke by the office coffee machine. But, as with every decent joke, it suddenly sounded worth trying — and before we knew it, we were knee-deep in an experiment that turned out to be anything but trivial, complete with a whole minefield of gotchas.

It began simply: while everyone else was busy debating hardware tuning and squeezing out extra TPS from their systems, we thought — why not just shove a huge chunk of data into PostgreSQL and see how it holds up? Like, really huge. Say, a one-petabyte database. Let’s see how it survives that.

It was December 10, the boss wanted the report by January 20, and New Year was less than a month away. And that itch that all engineers know? It hit hard.

Read more

How to load test PostgreSQL database and not miss anything

Level of difficulty: Medium
Reading time: 14 min
Reach and readers: 14K

During load testing of Tantor Postgres or other PostgreSQL-based databases with the standard pgbench tool, specialists often get non-representative results and have to repeat tests because details of the environment (such as the DBMS configuration, server characteristics, and PostgreSQL version) are not recorded. In this article we review pg_perfbench, the author's tool designed to address this issue. It makes scenarios repeatable, prevents the loss of important data, and streamlines result comparison by recording all parameters in a single template. It also automatically launches pgbench with TPC-B load generation, collects all metadata on the testing environment, and generates a structured report.

Read more

Redundant statistics slow down your Postgres? Try sampling in pg_stat_statements

Level of difficulty: Medium
Reading time: 11 min
Reach and readers: 3.9K

pg_stat_statements is the standard PostgreSQL extension used to track query statistics: number of executions, total and average execution time, number of returned rows, and other metrics. This information lets you analyze query behavior over time, identify problem areas, and make informed optimization decisions. However, in systems with high contention, pg_stat_statements itself can become a bottleneck and cause performance drops. In this article, we will look at the scenarios in which the extension becomes a source of problems, how sampling is structured, and the cases in which applying it can reduce overhead.

Read more

The journey of scaling up a production Elasticsearch cluster

Reading time: 6 min
Reach and readers: 4.1K

In this article, I describe our multi-year journey of scaling a production Elasticsearch cluster, one of the vital elements of the iPrice technology stack. I cover the challenges we encountered and how we approached them.

Read more

Extending and moving a ZooKeeper ensemble

Reading time: 3 min
Reach and readers: 3.5K

Once upon a time, our DBA team had a task: move a ZooKeeper ensemble that we had been using for a ClickHouse cluster. The usual way to move an ensemble is to move its data files. That seems easy and obvious, but our ClickHouse cluster held more than 400 TB of replicated data, and all replication information had been stored in the ZooKeeper cluster from the very beginning. We couldn’t afford to lose even a single row of data. We looked for information on the internet. Unfortunately, the good tutorial we found was for version 3.4.5 and didn’t fit our version, 3.6.2. So we decided to move our ensemble by “extending” it.

Read more

In-Memory Showdown: Redis vs. Tarantool

Reading time: 13 min
Reach and readers: 6.6K

In this article, I am going to look at Redis versus Tarantool. At first glance, they are quite alike: in-memory, NoSQL, key-value. But we are going to look deeper. My goal is to find meaningful similarities and differences; I am not going to claim that one is better than the other.

There are three main parts to my story:

  • We’ll find out what an in-memory database, or IMDB, is. When and how are such databases better than disk-based solutions?
  • Then, we’ll consider their architecture. What about their efficiency, reliability, and scaling?
  • Finally, we’ll delve into technical details: data types, iterators, indexes, transactions, programming languages, replication, and connectors.

Feel free to scroll down to the most interesting part or even to the summary comparison table at the very bottom of the article.
Read more →