Pull to refresh
120.67

Database Administration *

Everything about database administration

Show first
Rating limit
Level of difficulty

The journey of scaling up a production Elasticsearch cluster

Reading time6 min
Views3.4K

In this article, I will tell you about a-few-years journey of scaling the Elasticsearch cluster in production environment, which is one of the vital elements of the iPrice technology stack. 
I will describe challenges we encountered and how we approached them.

Read more

Extending and moving a ZooKeeper ensemble

Reading time3 min
Views2.5K

    Once upon a time our DBA team had a task. We had to move a ZooKeeper ensemble which we had been using for Clickhouse cluster. Everyone is used to moving an ensemble by moving its data files. It seems easy and obvious but our Clickhouse cluster had more than 400 TB replicated data. All replication information had been collected in ZooKeeper cluster from the very beginning. At the end of the day we couldn’t miss even a row of data. Then we looked for information on the internet. Unfortunately there was a good tutorial about 3.4.5 and didn’t fit our version 3.6.2. So we decided to use “the extending” for moving our ensemble.

Read more

In-Memory Showdown: Redis vs. Tarantool

Reading time13 min
Views5.6K
image

In this article, I am going to look at Redis versus Tarantool. At a first glance, they are quite alike — in-memory, NoSQL, key value. But we are going to look deeper. My goal is to find meaningful similarities and differences, I am not going to claim that one is better than the other.

There are three main parts to my story:

  • We’ll find out what is an in-memory database, or IMDB. When and how are they better than disk solutions?
  • Then, we’ll consider their architecture. What about their efficiency, reliability, and scaling?
  • Then, we’ll delve into technical details. Data types, iterators, indexes, transactions, programming languages, replication, and connectors.

Feel free to scroll down to the most interesting part or even the summary comparison table at the very bottom and the article.
Read more →

Tarantool: an analyst's view

Reading time8 min
Views1.9K
Hi all! I'm Andrey Kapustin. I work as a system analyst at Mail.ru Group. Our products form a unified ecosystem. Many independent infrastructures generate data in it: taxi and food delivery services, email services, social networks, etc. The faster and more precise we can predict a client's needs, the sooner and more correctly we can offer our products. 

Many system analysts and engineers are keen to know: 

  1. How to design the architecture of a trigger platform for real-time marketing?
  2. How to arrange a data structure that would be in line with the requirements of a marketing strategy for interacting with clients?
  3. How to ensure the stable operations of the  system under very heavy workloads? 

Such systems are based on technologies of high-load processing and Big Data analysis. We have accumulated considerable experience in these areas. Our expertise is in high demand on the market.  I'm going to show how we help our customers to switch from off-line to on-line in their interactions with clients using Real-Time Marketing solutions based on Tarantool.
Read more →

Mysql 8.x Group Replication (Master-Slave) with Docker Compose

Reading time5 min
Views5.9K

This post is handling the following situation - how to setup up simple Mysql services with group replication being dockerized. In our case, we’ll take the latest Mysql (version 8.x.x)

FYI: all mentioned code (worked and tested manually) located here.

I will skip not interested steps like ‘what is Mysql, Docker and why we choose them, etc’. We want to set up possibly trouble proof DB. That’s our plan.

Read more →

IIoT platform databases – How Mail.ru Cloud Solutions deals with petabytes of data coming from a multitude of devices

Reading time11 min
Views1.8K


Hello, my name is Andrey Sergeyev and I work as a Head of IoT Solution Development at Mail.ru Cloud Solutions. We all know there is no such thing as a universal database. Especially when the task is to build an IoT platform that would be capable of processing millions of events from various sensors in near real-time.

Our product Mail.ru IoT Platform started as a Tarantool-based prototype. I’m going to tell you about our journey, the problems we faced and the solutions we found. I will also show you a current architecture for the modern Industrial Internet of Things platform. In this article we will look into:

  • our requirements for the database, universal solutions, and the CAP theorem
  • whether the database + application server in one approach is a silver bullet
  • the evolution of the platform and the databases used in it
  • the number of Tarantools we use and how we came to this
Read more →

Lossless ElasticSearch data migration

Reading time5 min
Views4.2K


Academic data warehouse design recommends keeping everything in a normalized form, with links between. Then the roll forward of changes in relational math will provide a reliable repository with transaction support. Atomicity, Consistency, Isolation, Durability — that's all. In other words, the storage is explicitly built to safely update the data. But it is not optimal for searching, especially with a broad gesture on the tables and fields. We need indices, a lot of indices! Volumes expand, recording slows down. SQL LIKE can not be indexed, and JOIN GROUP BY sends us to meditate in the query planner.

Read more →

Making a Tarantool-Based Investment Business Core for Alfa-Bank

Reading time10 min
Views1.9K

A still from «Our Secret Universe: The Hidden Life of the Cell»

Investment business is one of the most complex domains in the banking world. It's about not just credits, loans, and deposits — there are also securities, currencies, commodities, derivatives, and all kinds of complex stuff like structured products.

Recently, people have become increasingly aware of their finances. More and more get involved in securities trading. Individual investment accounts have emerged not so long ago. They allow you to trade in securities and get tax credits or avoid taxes at the same time. All clients coming to us want to manage their portfolios and see their reporting on-line. Most frequently, these are multi-product portfolios, which means that people are clients of different business areas.

Moreover, the demands of regulators, both Russian and international, also grow.

To meet the current needs and lay a foundation for future upgrades, we've developed our Tarantool-based investment business core.
Read more →

Deploying Tarantool Cartridge applications with zero effort (Part 2)

Reading time11 min
Views1.5K


We have recently talked about how to deploy a Tarantool Cartridge application. However, an application's life doesn't end with deployment, so today we will update our application and figure out how to manage topology, sharding, and authorization, and change the role configuration.

Feeling interested? Please continue reading under the cut.
Read more →

Build apps for free with Azure Cosmos DB Free Tier

Reading time3 min
Views1.8K
Looking to build a new app, develop and test, or run small production workloads with Azure Cosmos DB? Our new Free Tier makes it easy to get started with no cost and save money as you build and grow new apps.



With Azure Cosmos DB Free Tier enabled, you’ll get the first 400 RU/s throughput and 5 GB storage in your account for free each month, for the lifetime of the account. That means that you can start small and grow with confidence, knowing your app will be running on a high-performance database service. You’ll only pay if your account exceeds 400 RU/s and 5 GB. Additionally, if your app has a lot of containers you can create up to 25 containers in a shared throughput database and have them all share the free 400 RU/s. You can have up to one free tier Azure Cosmos DB account per Azure subscription.
Read more →

Deploying Tarantool Cartridge applications with zero effort (Part 1)

Reading time8 min
Views2K


We have already presented Tarantool Cartridge that allows you to develop and pack distributed applications. Now let's learn how to deploy and control these applications. No panic, it's all under control! We have brought together all the best practices of working with Tarantool Cartridge and wrote an Ansible role, which will deploy the package to servers, start and join instances into replica sets, configure authorization, bootstrap vshard, enable automatic failover and patch cluster configuration.

Interesting, huh? Dive in, check details under the cut.
Read more →

Tarantool Kubernetes Operator

Reading time10 min
Views1.9K


Kubernetes has already become a de-facto standard for running stateless applications, mainly because it can reduce time-to-market for new features. Launching stateful applications, such as databases or stateful microservices, is still a complex task, but companies have to meet the competition and maintain a high delivery rate. So they create a demand for such solutions.

We want to introduce our solution for launching stateful Tarantool Cartridge clusters: Tarantool Kubernetes Operator, more under the cut.
Read more →

SQL Index Manager – a long story about SQL Server, grave digging and index maintenance

Reading time14 min
Views2.7K
Every now and then we create our own problems with our own hands… with our vision of the world… with our inaction… with our laziness… and with our fears. As a result, it seems to become very convenient to swim in the public flow of sewage patterns… because it is warm and fun, and the rest does not matter – we can smell round. But after a fail comes the realization of the simple truth – instead of generating an endless stream of causes, self-pity and self-justification, it is enough just to do what you consider the most important for yourself. This will be the starting point for your new reality.

For me, the written below is just such a starting point. The way is expected to be lingering…
Let's go?

How to Discover MongoDB and Elasticsearch Open Databases

Reading time3 min
Views17K

Some time ago among security researchers, it was very “fashionable” to find improperly configured AWS cloud storages with various kinds of confidential information. At that time, I even published a small note about how Amazon S3 open cloud storage is discovered.


However, time passes and the focus in research has shifted to the search for unsecured and exposed public domain databases. More than half of the known cases of large data leaks over the past year are leaks from open databases.



Today we will try to figure out how such databases are discovered by security researchers...

Read more →

Authors' contribution