In this article, I will tell you about our multi-year journey of scaling an Elasticsearch cluster in production. The cluster is one of the vital elements of the iPrice technology stack.
I will describe the challenges we encountered and how we approached them.
Database Administration
Everything about database administration
Extending and moving a ZooKeeper ensemble
Once upon a time, our DBA team had a task: we had to move a ZooKeeper ensemble that we had been using for our ClickHouse cluster. Everyone is used to moving an ensemble by moving its data files. That seems easy and obvious, but our ClickHouse cluster held more than 400 TB of replicated data, and all of its replication information had been stored in the ZooKeeper cluster from the very beginning. We could not afford to lose even a single row of data. So we looked for information on the internet. Unfortunately, the good tutorial we found covered version 3.4.5 and did not fit our version, 3.6.2. So we decided to move our ensemble by extending it.
In-Memory Showdown: Redis vs. Tarantool
In this article, I am going to look at Redis versus Tarantool. At first glance, they are quite alike: in-memory, NoSQL, key-value. But we are going to look deeper. My goal is to find meaningful similarities and differences; I am not going to claim that one is better than the other.
There are three main parts to my story:
- We’ll find out what an in-memory database, or IMDB, is. When and how are they better than disk solutions?
- Then, we’ll consider their architecture. What about their efficiency, reliability, and scaling?
- Then, we’ll delve into technical details: data types, iterators, indexes, transactions, programming languages, replication, and connectors.
Feel free to scroll down to the most interesting part or even the summary comparison table at the very bottom of the article.
MySQL 8 Performance Benchmark
In this article, we benchmark three MySQL 8 configurations: the default configuration, a configuration with innodb_dedicated_server enabled, and the configuration recommended by MySQL Performance Tuning Service.
Tarantool: an analyst's view
Many system analysts and engineers are keen to know:
- How to design the architecture of a trigger platform for real-time marketing?
- How to arrange a data structure that would be in line with the requirements of a marketing strategy for interacting with clients?
- How to ensure stable operation of the system under very heavy workloads?
Such systems are based on technologies of high-load processing and Big Data analysis. We have accumulated considerable experience in these areas. Our expertise is in high demand on the market. I'm going to show how we help our customers to switch from off-line to on-line in their interactions with clients using Real-Time Marketing solutions based on Tarantool.
MySQL 8.x Group Replication (Master-Slave) with Docker Compose
This post covers the following situation: how to set up simple, dockerized MySQL services with group replication. In our case, we’ll take the latest MySQL (version 8.x.x).
FYI: all the code mentioned (working and tested manually) is located here.
I will skip the uninteresting parts like ‘what are MySQL and Docker, and why did we choose them’. We want to set up a DB that is as trouble-proof as possible. That’s our plan.
IIoT platform databases – How Mail.ru Cloud Solutions deals with petabytes of data coming from a multitude of devices
Hello, my name is Andrey Sergeyev and I work as Head of IoT Solution Development at Mail.ru Cloud Solutions. We all know there is no such thing as a universal database, especially when the task is to build an IoT platform capable of processing millions of events from various sensors in near real-time.
Our product, Mail.ru IoT Platform, started as a Tarantool-based prototype. I’m going to tell you about our journey, the problems we faced, and the solutions we found. I will also show you the current architecture of a modern Industrial Internet of Things platform. In this article we will look into:
- our requirements for the database, universal solutions, and the CAP theorem
- whether the "database plus application server in one" approach is a silver bullet
- the evolution of the platform and the databases used in it
- the number of Tarantools we use and how we came to this
Lossless ElasticSearch data migration
Academic data warehouse design recommends keeping everything in a normalized form, with links between entities. Rolling changes forward under relational algebra then gives a reliable repository with transaction support. Atomicity, Consistency, Isolation, Durability: that's all. In other words, the storage is explicitly built to update data safely. But it is not optimal for searching, especially when a search sweeps broadly across tables and fields. We need indices, a lot of indices! Volumes grow, and writes slow down. SQL LIKE cannot be indexed, and JOIN with GROUP BY sends us off to meditate over the query planner.
Making a Tarantool-Based Investment Business Core for Alfa-Bank
A still from «Our Secret Universe: The Hidden Life of the Cell»
The investment business is one of the most complex domains in the banking world. It is not just about credits, loans, and deposits; there are also securities, currencies, commodities, derivatives, and all kinds of complex stuff like structured products.
Recently, people have become increasingly aware of their finances. More and more of them are getting involved in securities trading. Individual investment accounts emerged not so long ago. They allow you to trade in securities and get tax credits or avoid taxes at the same time. All clients coming to us want to manage their portfolios and see their reporting online. Most frequently, these are multi-product portfolios, which means that people are clients of different business areas.
Moreover, the demands of regulators, both Russian and international, are also growing.
To meet the current needs and lay a foundation for future upgrades, we've developed our Tarantool-based investment business core.
Deploying Tarantool Cartridge applications with zero effort (Part 2)
We have recently talked about how to deploy a Tarantool Cartridge application. However, an application's life doesn't end with deployment, so today we will update our application and figure out how to manage topology, sharding, and authorization, and change the role configuration.
Interested? Please continue reading under the cut.
Build apps for free with Azure Cosmos DB Free Tier
With Azure Cosmos DB Free Tier enabled, you’ll get the first 400 RU/s throughput and 5 GB storage in your account for free each month, for the lifetime of the account. That means that you can start small and grow with confidence, knowing your app will be running on a high-performance database service. You’ll only pay if your account exceeds 400 RU/s and 5 GB. Additionally, if your app has a lot of containers you can create up to 25 containers in a shared throughput database and have them all share the free 400 RU/s. You can have up to one free tier Azure Cosmos DB account per Azure subscription.
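The free tier arithmetic described above can be sketched in a few lines. This is only an illustration built on the two figures quoted in the teaser (400 RU/s and 5 GB); the function name `billable_usage` and its parameters are hypothetical, and actual Azure billing has details (pricing per unit, regions, billing granularity) that this sketch does not model.

```python
# Free allowances as quoted in the text above (assumed per account).
FREE_RU_PER_SEC = 400
FREE_STORAGE_GB = 5


def billable_usage(provisioned_ru_per_sec, stored_gb):
    """Return the (RU/s, GB) portion that exceeds the free allowance.

    Hypothetical helper for illustration: it only subtracts the free
    allowance and never returns a negative amount.
    """
    billable_ru = max(0, provisioned_ru_per_sec - FREE_RU_PER_SEC)
    billable_gb = max(0, stored_gb - FREE_STORAGE_GB)
    return billable_ru, billable_gb


if __name__ == "__main__":
    # An account that stays within the free allowance.
    print(billable_usage(400, 5))
    # An account that has grown past both limits.
    print(billable_usage(1000, 7.5))
```

Under these assumptions, an account provisioned at exactly 400 RU/s with 5 GB of data has nothing billable, while one at 1000 RU/s with 7.5 GB would be charged for the 600 RU/s and 2.5 GB above the allowance.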
Deploying Tarantool Cartridge applications with zero effort (Part 1)
We have already presented Tarantool Cartridge, which allows you to develop and package distributed applications. Now let's learn how to deploy and control these applications. Don't panic, it's all under control! We have brought together all the best practices of working with Tarantool Cartridge and written an Ansible role, which will deploy the package to servers, start and join instances into replica sets, configure authorization, bootstrap vshard, enable automatic failover, and patch the cluster configuration.
Interesting, huh? Dive in and check the details under the cut.
Tarantool Kubernetes Operator
Kubernetes has already become a de-facto standard for running stateless applications, mainly because it can reduce time-to-market for new features. Launching stateful applications, such as databases or stateful microservices, is still a complex task, but companies have to meet the competition and maintain a high delivery rate. So they create a demand for such solutions.
We want to introduce our solution for launching stateful Tarantool Cartridge clusters: Tarantool Kubernetes Operator. More under the cut.
SQL Index Manager – a long story about SQL Server, grave digging and index maintenance
For me, what is written below is just such a starting point. The road ahead is expected to be long…
How to Discover MongoDB and Elasticsearch Open Databases
Some time ago, it was very “fashionable” among security researchers to find improperly configured AWS cloud storage holding various kinds of confidential information. At that time, I even published a small note about how open Amazon S3 cloud storage is discovered.
However, time passes, and the focus of research has shifted to the search for unsecured, publicly exposed databases. More than half of the known cases of large data leaks over the past year are leaks from open databases.
Today we will try to figure out how such databases are discovered by security researchers...