Database Administration *

Everything about database administration

kaze_no_saga Sep 26 at 08:00

PostgreSQL 18: Part 5 or CommitFest 2025-03

Medium

34 min

324

Postgres Professional corporate blogPostgreSQL * SQL * Database Administration *

Digest

Translation

PostgreSQL 18 was released on September 25th. This article covers the March CommitFest and concludes the series on the new features of this release. It turned out quite large, as the final March CommitFest is traditionally the biggest and richest in new features.

You can find previous reviews of PostgreSQL 18 CommitFests here: 2024-07, 2024-09, 2024-11, 2025-01.

melanny20 Sep 15 at 13:29

Postgres Pro TDE — security and performance

Medium

14 min

754

Postgres Professional corporate blogPostgreSQL * Server Administration * System administration * Database Administration *

Review

Translation

TDE comes in many flavors — from encryption at the TAM level to full-cluster encryption and tablespace markers. We take a close look at Percona, Cybertec/EDB, Pangolin/Fujitsu, and show where you lose performance and reliability, and where you gain flexibility.

On top of that, Vasily Bernstein, Deputy head of product development, and Vladimir Abramov, senior security engineer, will share how Postgres Pro Enterprise implements key rotation without rewriting entire tables — and why AES-GCM was the clear choice.

+11

melanny20 Aug 25 at 10:42

How we loaded a petabyte into PostgreSQL before New Year — and what happened next

Medium

17 min

1.1K

Postgres Professional corporate blogPostgreSQL * Database Administration *

Retrospective

Translation

It all started as a joke by the office coffee machine. But, as with every decent joke, it suddenly sounded worth trying — and before we knew it, we were knee-deep in an experiment that turned out to be anything but trivial, complete with a whole minefield of gotchas.

It began simply: while everyone else was busy debating hardware tuning and squeezing out extra TPS from their systems, we thought — why not just shove a huge chunk of data into PostgreSQL and see how it holds up? Like, really huge. Say, a one-petabyte database. Let’s see how it survives that.

It was December 10, the boss wanted the report by January 20, and New Year was less than a month away. And that itch that all engineers know? It hit hard.

TantorLabs Aug 22 at 05:06

How to load test PostgreSQL database and not miss anything

Medium

14 min

887

Тантор Лабс corporate blogPostgreSQL * Database Administration * High performance * IT systems testing *

Review

When load testing Tantor Postgres or other PostgreSQL-based databases with the standard pgbench tool, specialists often get non-representative results and have to repeat tests because details of the environment (DBMS configuration, server characteristics, PostgreSQL version) were not recorded. In this article we review pg_perfbench, the author's tool designed to address this issue. It makes scenarios repeatable, prevents the loss of important data, and streamlines result comparison by registering all parameters in a single template. It also automatically launches pgbench with TPC-B load generation, collects all metadata about the testing environment, and generates a structured report.
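The idea of keeping run parameters and environment details together can be sketched in a few lines of Python. This is only an illustration and not how pg_perfbench itself is implemented: the `BenchRun` class, the `build_report` function, and the report layout are hypothetical names chosen for this example. The only real pgbench options used are `-i`, `-s`, `-c`, `-j`, and `-T`, and the sketch builds the commands without executing them.

```python
import json
import platform
import shlex
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class BenchRun:
    """Parameters of one pgbench run (hypothetical structure for this sketch)."""
    dbname: str
    clients: int = 8        # passed as pgbench -c
    threads: int = 2        # passed as pgbench -j
    duration_s: int = 60    # passed as pgbench -T
    scale: int = 100        # passed as pgbench -i -s at initialization

    def init_command(self) -> List[str]:
        # Initializes the standard pgbench tables at the given scale factor.
        return ["pgbench", "-i", "-s", str(self.scale), self.dbname]

    def run_command(self) -> List[str]:
        # Runs the built-in TPC-B-like script, which is the pgbench default.
        return ["pgbench", "-c", str(self.clients), "-j", str(self.threads),
                "-T", str(self.duration_s), self.dbname]


def build_report(run: BenchRun, server_settings: Dict[str, str]) -> str:
    """Bundle run parameters, commands, and environment into one JSON document."""
    report = {
        "run": asdict(run),
        "commands": {
            "init": shlex.join(run.init_command()),
            "run": shlex.join(run.run_command()),
        },
        "host": {"os": platform.system(), "machine": platform.machine()},
        # Assumed to be collected separately, e.g. from pg_settings.
        "server_settings": server_settings,
    }
    return json.dumps(report, indent=2, sort_keys=True)
```

Storing the exact commands next to the server settings means two reports can be compared field by field before anyone compares the TPS numbers.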

TantorLabs Jul 18 at 03:43

Redundant statistics slow down your Postgres? Try sampling in pg_stat_statements

Medium

11 min

605

Тантор Лабс corporate blogPostgreSQL * SQL * Database Administration * System administration *

Tutorial

pg_stat_statements is the standard PostgreSQL extension used to track query statistics: number of executions, total and average execution time, number of returned rows, and other metrics. This information makes it possible to analyze query behavior over time, identify problem areas, and make informed optimization decisions. However, in systems with high contention, pg_stat_statements itself can become a bottleneck and cause performance drops. In this article, we will analyze in which scenarios the extension becomes a source of problems, how sampling is structured, and in which cases its application can reduce overhead.
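To make the general idea of sampling concrete, here is a toy Python model. It is an assumption-laden sketch of generic probabilistic sampling, not the mechanism used in pg_stat_statements or in the article: the `SampledQueryStats` class and its methods are hypothetical. Each execution is recorded with probability `sample_rate`, so most executions never take the shared lock, and call counts are scaled back up when read.

```python
import random
import threading
from typing import Dict, Optional


class SampledQueryStats:
    """Toy statistics collector that records only a sampled share of executions."""

    def __init__(self, sample_rate: float, seed: Optional[int] = None) -> None:
        if not 0.0 < sample_rate <= 1.0:
            raise ValueError("sample_rate must be in (0, 1]")
        self.sample_rate = sample_rate
        self._rng = random.Random(seed)
        self._lock = threading.Lock()
        self._calls: Dict[str, int] = {}
        self._total_ms: Dict[str, float] = {}

    def record(self, query_id: str, elapsed_ms: float) -> bool:
        """Record one execution; return True if it was actually sampled."""
        # Skipped executions return here without touching the shared lock.
        if self._rng.random() >= self.sample_rate:
            return False
        with self._lock:
            self._calls[query_id] = self._calls.get(query_id, 0) + 1
            self._total_ms[query_id] = self._total_ms.get(query_id, 0.0) + elapsed_ms
        return True

    def estimated_calls(self, query_id: str) -> float:
        """Scale the sampled call count back up by the sampling rate."""
        with self._lock:
            return self._calls.get(query_id, 0) / self.sample_rate

    def mean_ms(self, query_id: str) -> float:
        """Mean latency over sampled executions (0.0 if none were sampled)."""
        with self._lock:
            calls = self._calls.get(query_id, 0)
            return self._total_ms.get(query_id, 0.0) / calls if calls else 0.0
```

The trade-off is visible in the sketch: a lower `sample_rate` means fewer lock acquisitions but noisier estimates, especially for rarely executed queries.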

qqwrst Apr 4 2022 at 02:34

The journey of scaling up a production Elasticsearch cluster

6 min

3.6K

High performance * NoSQL * Database Administration * Amazon Web Services * Distributed systems *

In this article, I will tell you about our multi-year journey of scaling a production Elasticsearch cluster, which is one of the vital elements of the iPrice technology stack.
I will describe the challenges we encountered and how we approached them.

Yersin_DBA Oct 30 2021 at 17:04

Extending and moving a ZooKeeper ensemble

3 min

2.7K

Database Administration * Big Data *

Tutorial

Translation

Once upon a time our DBA team had a task: to move a ZooKeeper ensemble that we had been using for a ClickHouse cluster. Everyone is used to moving an ensemble by moving its data files. It seems easy and obvious, but our ClickHouse cluster had more than 400 TB of replicated data, and all replication information had been collected in the ZooKeeper cluster from the very beginning. We couldn’t afford to lose even a single row of data. We looked for information on the internet, but the only good tutorial we found covered version 3.4.5 and didn’t fit our version 3.6.2. So we decided to move our ensemble by extending it.

michael-filonenko Sep 1 2021 at 13:15

In-Memory Showdown: Redis vs. Tarantool

13 min

VK corporate blogTarantool * Database Administration * High performance *

In this article, I am going to look at Redis versus Tarantool. At first glance, they are quite alike: in-memory, NoSQL, key-value. But we are going to look deeper. My goal is to find meaningful similarities and differences; I am not going to claim that one is better than the other.

There are three main parts to my story:

We’ll find out what an in-memory database, or IMDB, is. When and how are they better than disk solutions?
Then, we’ll consider their architecture. What about their efficiency, reliability, and scaling?
Then, we’ll delve into technical details. Data types, iterators, indexes, transactions, programming languages, replication, and connectors.

Feel free to scroll down to the most interesting part or even the summary comparison table at the very bottom of the article.

+16

Dradmin Apr 30 2021 at 05:33

MySQL 8 Performance Benchmark

3 min

9.6K

MySQL * IT Infrastructure * *nix * Server Administration * Database Administration *

In this article, we benchmark the performance of MySQL 8 default configuration vs. innodb_dedicated_server enabled configuration vs. the configuration recommended by MySQL Performance Tuning Service.

KAPANDR Dec 29 2020 at 16:59

Tarantool: an analyst's view

8 min

VK corporate blogTarantool * Database Administration * System Analysis and Design * Internet marketing *

Hi all! I'm Andrey Kapustin. I work as a system analyst at Mail.ru Group. Our products form a unified ecosystem. Many independent infrastructures generate data in it: taxi and food delivery services, email services, social networks, etc. The faster and more precisely we can predict a client's needs, the sooner and more correctly we can offer our products.

Many system analysts and engineers are keen to know:

How to design the architecture of a trigger platform for real-time marketing?
How to arrange a data structure that would be in line with the requirements of a marketing strategy for interacting with clients?
How to ensure the stable operations of the system under very heavy workloads?

Such systems are based on technologies of high-load processing and Big Data analysis. We have accumulated considerable experience in these areas. Our expertise is in high demand on the market. I'm going to show how we help our customers to switch from off-line to on-line in their interactions with clients using Real-Time Marketing solutions based on Tarantool.

+26

Wendigoo Oct 5 2020 at 07:38

Mysql 8.x Group Replication (Master-Slave) with Docker Compose

5 min

6.2K

MySQL * Database Administration * DevOps *

This post covers the following situation: how to set up simple MySQL services with group replication in Docker. In our case, we’ll take the latest MySQL (version 8.x.x).

FYI: all the code mentioned (working and manually tested) is located here.

I will skip uninteresting steps like ‘what is MySQL, Docker and why we chose them, etc’. We want to set up a DB that is as trouble-proof as possible. That’s our plan.

AnnaPhc Aug 11 2020 at 16:05

IIoT platform databases – How Mail.ru Cloud Solutions deals with petabytes of data coming from a multitude of devices

11 min

1.9K

VK corporate blogData storage * IOT * Database Administration * Tarantool *

Hello, my name is Andrey Sergeyev and I work as Head of IoT Solution Development at Mail.ru Cloud Solutions. We all know there is no such thing as a universal database, especially when the task is to build an IoT platform capable of processing millions of events from various sensors in near real-time.

Our product Mail.ru IoT Platform started as a Tarantool-based prototype. I’m going to tell you about our journey, the problems we faced and the solutions we found. I will also show you a current architecture for the modern Industrial Internet of Things platform. In this article we will look into:

our requirements for the database, universal solutions, and the CAP theorem
whether the database + application server in one approach is a silver bullet
the evolution of the platform and the databases used in it
the number of Tarantools we use and how we came to this

+19

olku Aug 3 2020 at 15:35

Lossless ElasticSearch data migration

5 min

4.3K

DevOps * NoSQL * Database Administration *

Translation

Academic data warehouse design recommends keeping everything in normalized form, with links between entities. Rolling changes forward in relational math then provides a reliable repository with transaction support. Atomicity, Consistency, Isolation, Durability: that's all. In other words, the storage is explicitly built to update data safely. But it is not optimal for searching, especially for broad searches across many tables and fields. We need indices, a lot of indices! Volumes grow, writes slow down. SQL LIKE cannot be indexed, and JOIN with GROUP BY sends us to meditate on the query planner.

vovkins Jun 26 2020 at 08:31

Making a Tarantool-Based Investment Business Core for Alfa-Bank

10 min

1.9K

VK corporate blogTarantool * Database Administration * System Analysis and Design * High performance *

A still from «Our Secret Universe: The Hidden Life of the Cell»

Investment business is one of the most complex domains in the banking world. It's not just about credits, loans, and deposits; there are also securities, currencies, commodities, derivatives, and all kinds of complex stuff like structured products.

Recently, people have become increasingly aware of their finances. More and more of them get involved in securities trading. Individual investment accounts emerged not long ago. They allow you to trade in securities and get tax credits or avoid taxes at the same time. All clients coming to us want to manage their portfolios and see their reporting on-line. Most frequently, these are multi-product portfolios, which means that people are clients of different business areas.

Moreover, the demands of regulators, both Russian and international, are also growing.

To meet the current needs and lay a foundation for future upgrades, we've developed our Tarantool-based investment business core.

+14

dokshina Apr 13 2020 at 11:34

Deploying Tarantool Cartridge applications with zero effort (Part 2)

11 min

1.5K

VK corporate blogTarantool * Database Administration * High performance * Distributed systems *

Tutorial

We have recently talked about how to deploy a Tarantool Cartridge application. However, an application's life doesn't end with deployment, so today we will update our application and figure out how to manage topology, sharding, and authorization, and change the role configuration.

Feeling interested? Please continue reading under the cut.

+15

msgeek Mar 19 2020 at 07:00

Build apps for free with Azure Cosmos DB Free Tier

3 min

1.8K

Microsoft corporate blogMicrosoft Azure * Database Administration * Cloud computing * Cloud services *

Looking to build a new app, develop and test, or run small production workloads with Azure Cosmos DB? Our new Free Tier makes it easy to get started with no cost and save money as you build and grow new apps.

With Azure Cosmos DB Free Tier enabled, you’ll get the first 400 RU/s throughput and 5 GB storage in your account for free each month, for the lifetime of the account. That means that you can start small and grow with confidence, knowing your app will be running on a high-performance database service. You’ll only pay if your account exceeds 400 RU/s and 5 GB. Additionally, if your app has a lot of containers you can create up to 25 containers in a shared throughput database and have them all share the free 400 RU/s. You can have up to one free tier Azure Cosmos DB account per Azure subscription.
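The free-tier arithmetic can be sketched as a small Python function. It is an illustration only: the function and constant names are hypothetical, and it assumes that only the usage above the 400 RU/s and 5 GB allowances is billable, which is how the "first 400 RU/s and 5 GB for free" wording reads. Actual prices are not modeled.

```python
from typing import Tuple

# Free-tier allowances quoted in the announcement above.
FREE_RU_PER_S = 400
FREE_STORAGE_GB = 5.0


def billable_usage(provisioned_ru_per_s: int, storage_gb: float) -> Tuple[int, float]:
    """Return the (RU/s, GB) that exceed the free-tier allowances.

    Assumption for this sketch: only the part above each allowance is billed.
    """
    extra_ru = max(0, provisioned_ru_per_s - FREE_RU_PER_S)
    extra_gb = max(0.0, storage_gb - FREE_STORAGE_GB)
    return extra_ru, extra_gb
```

For example, `billable_usage(1000, 12.5)` returns `(600, 7.5)`, while an account at exactly 400 RU/s and 5 GB has nothing billable under this assumption.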

dokshina Dec 16 2019 at 09:41

Deploying Tarantool Cartridge applications with zero effort (Part 1)

8 min

VK corporate blogIT Infrastructure * Database Administration * High performance * Distributed systems *

We have already presented Tarantool Cartridge, which allows you to develop and pack distributed applications. Now let's learn how to deploy and control these applications. Don't panic, it's all under control! We have brought together all the best practices of working with Tarantool Cartridge and written an Ansible role, which will deploy the package to servers, start and join instances into replica sets, configure authorization, bootstrap vshard, enable automatic failover, and patch the cluster configuration.

Interesting, huh? Dive in, check details under the cut.

+29

vasiliy-t Oct 21 2019 at 16:09

Tarantool Kubernetes Operator

10 min

VK corporate blogKubernetes * Tarantool * Database Administration * High performance *

Kubernetes has already become a de-facto standard for running stateless applications, mainly because it can reduce time-to-market for new features. Launching stateful applications, such as databases or stateful microservices, is still a complex task, but companies have to meet the competition and maintain a high delivery rate. So they create a demand for such solutions.

We want to introduce our solution for launching stateful Tarantool Cartridge clusters: Tarantool Kubernetes Operator, more under the cut.

+34

AlanDenton Jun 23 2019 at 05:14

SQL Index Manager – a long story about SQL Server, grave digging and index maintenance

14 min

2.8K

.NET * Database Administration * Microsoft SQL Server * Programming * SQL *

Every now and then we create our own problems with our own hands… with our vision of the world… with our inaction… with our laziness… and with our fears. As a result, it seems to become very convenient to swim in the public flow of sewage patterns… because it is warm and fun, and the rest does not matter – we can smell round. But after a fail comes the realization of the simple truth – instead of generating an endless stream of causes, self-pity and self-justification, it is enough just to do what you consider the most important for yourself. This will be the starting point for your new reality.

For me, what is written below is just such a starting point. The road is expected to be long…

Let's go?

+15

ashotog Mar 10 2019 at 07:48

How to Discover MongoDB and Elasticsearch Open Databases

3 min

17K

Database Administration * Information Security * Search engines * System administration *

Some time ago, it was very “fashionable” among security researchers to find improperly configured AWS cloud storage containing various kinds of confidential information. At that time, I even published a short note about how open Amazon S3 cloud storage is discovered.

However, time passes, and the focus of research has shifted to the search for unsecured, publicly exposed databases. More than half of the known cases of large data leaks over the past year are leaks from open databases.

Today we will try to figure out how such databases are discovered by security researchers...

+16