Development

New attack using the PhantomRShell backdoor

Level of difficulty: Medium
Reading time: 7 min
Views: 1.6K

In August, our sandbox helped prevent an attack on Russian organizations that used new malicious code. At first we assumed it was mass phishing sent from attacker-controlled servers, the kind that shows up in any organization's mail every day. But the sender of the email turned out to be entirely legitimate: it had been compromised by attackers targeting Russian defense and industrial organizations.

The hackers used a sophisticated scheme for hiding the malicious payload in polyglot archives. Polyglots are files that can be valid under the specifications of several formats at once. The payload itself is a new obfuscated variant of PhantomRShell, a tool used by the PhantomCore group (we wrote about the group earlier in our blog).

In this article, we cover the details of the attack and its possible initial vector, and give recommendations for protecting mail infrastructure against compromise and similar attacks. Interested? Read on!

Read more

What’s in Store for pg_probackup 3

Level of difficulty: Medium
Reading time: 12 min
Views: 849

While pg_probackup 3 is still in the works and not yet available to the public, let’s dive into what’s new under the hood. There’s a lot to unpack — from a completely reimagined application architecture to long-awaited features and seamless integration with other tools. 

Read more

Postgres Pro TDE — security and performance

Level of difficulty: Medium
Reading time: 14 min
Views: 754

TDE comes in many flavors — from encryption at the TAM level to full-cluster encryption and tablespace markers. We take a close look at Percona, Cybertec/EDB, Pangolin/Fujitsu, and show where you lose performance and reliability, and where you gain flexibility.

On top of that, Vasily Bernstein, deputy head of product development, and Vladimir Abramov, senior security engineer, will share how Postgres Pro Enterprise implements key rotation without rewriting entire tables, and why AES-GCM was the clear choice.

Read more

Energomera CE6806P: Bridging Analog and Digital in Energy Metering

Level of difficulty: Medium
Reading time: 10 min
Views: 866

How did engineers in the past manage to measure electrical power without modern microchips and DSPs? This article explores the Energomera CE6806P, a device created in 2006 for verifying electricity meters, yet built using 1980s-era technology.

We’ll take a closer look at its design, principles of operation, and how discrete-analog solutions were used to achieve high accuracy. The Energomera is a fascinating example of engineering and ingenuity, giving us a unique perspective on the evolution of electrical measurement devices.

Read more

The Links Theory 0.0.2

Level of difficulty: Medium
Reading time: 27 min
Views: 2.1K

This world needs a new theory — a theory that could describe all the theories on the planet. A theory that could easily describe philosophy, mathematics, physics, and psychology. The one that makes all kinds of sciences computable.

This is exactly what we are working on. If we succeed, this theory will become the unified meta-theory of everything.

A year has passed since our last publication, and our task is to share the progress with our English-speaking audience. This is still not a stable version; it’s a draft. Therefore, we welcome any feedback, as well as your participation in the development of the links theory.

As with everything we have done before, the links theory is published and released into the public domain: it belongs to humanity, which means it is yours. This work has many authors, but the work itself is far more important than any specific authorship. We hope that today it can become useful to more people.

We invite you to become a part of this exciting adventure.

Witness the birth of meta-theory

PostgreSQL 18: Part 2 or CommitFest 2024-09

Level of difficulty: Medium
Reading time: 14 min
Views: 597


Statistically, September CommitFests feature the fewest commits. Apparently, the version 18 CommitFest is an outlier. There are many accepted patches and many interesting new features to talk about.


If you missed the July CommitFest, get up to speed here: 2024-07.

Read more →

PostgreSQL 18: Part 5 or CommitFest 2025-03

Level of difficulty: Medium
Reading time: 34 min
Views: 324

September 25th marks the release of PostgreSQL 18. This article covers the March CommitFest and concludes the series on the new features of the upcoming update. It turned out quite large, as the final CommitFest of the cycle, held in March, is traditionally the biggest and richest in new features.

You can find previous reviews of PostgreSQL 18 CommitFests here: 2024-07, 2024-09, 2024-11, 2025-01.

More

The Russian trace in the history of the PostgreSQL logo

Level of difficulty: Easy
Reading time: 7 min
Views: 1.6K

The story of the PostgreSQL logo was shared by Oleg Bartunov, CEO of Postgres Professional, who personally witnessed these events and preserved an archive of correspondence and visual design development for the database system.

Our iconic PostgreSQL logo — our beloved “Slonik” — has come a long way. Soon, it will turn thirty! Over the years, its story has gathered plenty of myths and speculation. As a veteran of the community, I decided it’s time to set the record straight, relying on the memories of those who were there. Who actually came up with it? Why an elephant? How did it end up in a diamond, and how did the Russian word “slonik” become a part of the global IT vocabulary?

Read more

Getting started with pgpro-otel-collector

Level of difficulty: Easy
Reading time: 4 min
Views: 706

Now that pgpro-otel-collector has had its public release, I’m excited to start sharing more about the tool — and to kick things off, I’m launching a blog series focused entirely on the Collector.

The first post is an intro — a practical guide to installing, configuring, and launching the collector. We’ll also take our first look at what kind of data the collector exposes, starting with good old Postgres metrics.

Read more

How we loaded a petabyte into PostgreSQL before New Year — and what happened next

Level of difficulty: Medium
Reading time: 17 min
Views: 1.1K

It all started as a joke by the office coffee machine. But, as with every decent joke, it suddenly sounded worth trying — and before we knew it, we were knee-deep in an experiment that turned out to be anything but trivial, complete with a whole minefield of gotchas.

It began simply: while everyone else was busy debating hardware tuning and squeezing out extra TPS from their systems, we thought — why not just shove a huge chunk of data into PostgreSQL and see how it holds up? Like, really huge. Say, a one-petabyte database. Let’s see how it survives that.

It was December 10, the boss wanted the report by January 20, and New Year was less than a month away. And that itch that all engineers know? It hit hard.

Read more

Global indexes for partitions in Postgres Pro: uniqueness without hacks

Level of difficulty: Medium
Reading time: 5 min
Views: 320

When there’s no filter on the partitioning key, local indexes turn into a marathon across partitions. The new gbtree keeps a single catalog of keys and jumps straight to the row by primary key. In this article, we’ll show the algorithm, real numbers and limitations (primary key is mandatory, ON CONFLICT does not work) — and where this eases the pain in CRM/billing.

Read more

The Future of PostgreSQL: How a 64-bit Transaction Counter Solves Scaling Issues

Level of difficulty: Medium
Reading time: 5 min
Views: 691

For many years, the PostgreSQL community was skeptical about using this database management system (DBMS) for high-transaction environments. While PostgreSQL worked well for lab tests, mid-tier web applications, and smaller backend systems, it was believed that for heavy transactional loads, you’d need an expensive DBMS designed specifically for such purposes. As a result, PostgreSQL wasn’t particularly developed in that direction, leaving a range of issues unanswered.

However, the reality has turned out differently. More and more of our clients are encountering problems that stem from this mindset. For example, the global PostgreSQL community considers 64 cores the largest server size on which PostgreSQL can run effectively. But we’re now seeing this become a typical minimum configuration. One particular bottleneck that has emerged is the transaction counter, and this is a far more interesting issue. So, let’s dive into what the problem is, how we solved it, and what the international community thinks about it.

Read more

«Where, where have you gone», or searching for missing stations on public transport routes in OpenStreetMap

Level of difficulty: Medium
Reading time: 6 min
Views: 1.1K

OpenStreetMap (OSM) is a global project built around a geographic information database that is populated by all comers, both enthusiasts and interested companies. Anybody can contribute, but the openness has its downside: incorrect edits often get into the database. Hence, plenty of OSM data validators have been written to help keep data quality at an acceptable level.

Since 2016, there has been an open-source subway preprocessor that validates rapid transit routes in OSM (generating error reports) for completeness and logical/topological errors, and converts them into formats suitable for routing and rendering, e.g. GTFS. Besides OSM data, it takes a list of public transport (PT) networks containing reference information about the number of lines, stations, etc. per PT network. The preprocessor has proven itself in preparing PT data for applications such as Maps.me and Organic Maps.

In this article, I would like to share an approach to detecting one type of error that occurs quite often in OSM data and is somewhat challenging to detect automatically: the accidental loss of a station from a route. The source code of the validator and the described algorithm is open. But first, let's define the concepts used to represent PT data in OpenStreetMap.

Read more

START: how to defeat hallucinations and teach LLMs accurate calculations

Level of difficulty: Easy
Reading time: 3 min
Views: 780

START is an open-source LLM designed for precise calculations and code verification. It addresses two major issues that most standard models face: hallucinations and errors in multi-step calculations. This article explains why these problems arise and how START solves them.

Read more

Partition and rule: sharing practical knowledge about partitioning in Postgres Pro

Level of difficulty: Medium
Reading time: 11 min
Views: 846

Declarative partitioning may sound complex, but in reality it’s just a way to tell your database how best to organize large tables — so it can optimize queries and make maintenance easier. Let’s walk through how it works and when declarative partitioning can save the day.

Read more

The performance engineer: a detective licensed to kill… bottlenecks

Level of difficulty: Easy
Reading time: 5 min
Views: 717

Picture this: a mission-critical SQL query is crawling along. Not for an hour. Not for two. Fifteen hours. A full workday of the system slowly grinding through data while the business bleeds money and users teeter on the edge of a nervous breakdown. And then — cue the dramatic music — in walks the performance engineer.

After a few hours of intense analysis and a couple of pinpoint code tweaks, the same query that took 15 hours now completes in just… two minutes. Sounds like magic? Nope. This is the thrilling (and very real) world of performance engineering.

Read more

Whose feature is better, or how to compare the efficiency of SQL query plans

Level of difficulty: Medium
Reading time: 5 min
Views: 442

How to compare the efficiency of SQL query plans? “Measure the execution time, of course!” — an experienced reader would say. And they would be absolutely right: from a practical perspective, the more efficient DBMS is the one that delivers higher TPS. However, sometimes we need to design a system that doesn't exist yet or predict behavior under loads that haven't occurred yet. In such cases, we need a characteristic that allows us to perform a qualitative analysis of a plan or compare two plans. This post is dedicated to one such characteristic — the number of data pages read.

Read more

How to Fail Those Students Who Rely on ChatGPT

Reading time: 3 min
Views: 2.5K

We at Verilog Meetup constructed an exam/interview problem that has an interesting property: if a student tries to figure out a solution by thinking it through himself, he usually succeeds; however, if he dumps the problem on ChatGPT, the solution fails (does not pass the automated test), and the student goes into a death spiral of futility, kicking ChatGPT to get the solution right.

There is nothing weird about the problem; we do this in the industry all the time:

Read more

We’ve learned how to migrate databases from Oracle to Postgres Pro at 41 TB/day

Level of difficulty: Easy
Reading time: 3 min
Views: 877

41 TB/day from Oracle to Postgres Pro without stopping the source system — not theory, but numbers from our latest tests. We broke the migration into three stages: fast initial load, CDC from redo logs, and validation, and wrapped them into ProGate. In this article, we’ll explain how the pipeline works, why we chose Go, and where the bottlenecks hide.

Read more

The future of AI: formal grammars

Level of difficulty: Easy
Reading time: 15 min
Views: 651

Why does even the most powerful LLM sometimes produce meaningless phrases and contradictions? It all comes down to the exponential growth of possibilities (N^M) and the free copying of human errors. Read the article to learn how we use formal grammars to turn chaotic generation into controlled synthesis, strengthening the role of semantics and enforcing structural rules.

Read more