Machine Learning and Data Science: Academia vs. Industry

Level of difficulty: Medium
Reading time: 8 min

Machine Learning (ML) technologies are becoming increasingly popular and have various applications, ranging from smartphones and computers to large-scale enterprise infrastructure that serves billions of requests per day. Building ML tools, however, remains difficult today because there are no industry-wide standardised approaches to development. Many engineering students who study ML and Data Science have to re-learn much of it once they begin their careers. In this article, I've compiled a list of the top five problems that every ML specialist encounters only on the job, highlighting the gap between university curricula and real-world practice.

Total votes 17: ↑17 and ↓0

7 tips to make video learning more effective

Level of difficulty: Easy
Reading time: 5 min

While video-based learning continues to rank high among current trends, a few points are regularly overlooked in the production of learning videos, particularly around user experience (UX) and user interaction.

People really enjoy watching videos. According to a survey conducted among consumers worldwide, respondents watched an average of 19 hours of online video content per week in 2022. And nearly half of all internet users watch online videos at least once a week.

Total votes 28: ↑28 and ↓0

Designing for Success: Crafting Effective Learning Experiences

Level of difficulty: Medium
Reading time: 7 min

The Challenge of Mandatory Learning
We once had several mandatory courses that every employee was expected to pass, yet many struggled to do so. Reminder emails to all participants did not solve the issue. That is when my team was brought in to develop a thorough plan to reduce the number of overdue courses to a minimum. Of course, we were asked to develop something fun and engaging.

Uncovering the Root Problems
While working on the project, we managed to uncover several problems with course assignments, including the fact that they were not offered just in time, there were too many of them, and all of them had different due dates, which made it impossible to remember when to complete them. Additionally, we found that the content itself was often dry and unengaging, further contributing to the lack of motivation among employees. Finally, we came up with a system of notifications that included clear explanatory reminder emails, an escalation system, and a redesign of the course content to make it more interactive and relevant to employees' daily work. The result was almost no overdue courses after system integration.

The Myth of Mandatory Fun
So the case at first seemed to be about motivation and engagement, but it is actually about smart course design that lets people focus on work tasks instead of worrying about course assignments. It's also about creating content that resonates with the learners and helps them see the value in the training.

Total votes 27: ↑27 and ↓0

Argo CD vs Flux CD

Level of difficulty: Easy
Reading time: 7 min

Lately I have been seeing more and more debates about two popular GitOps tools: Argo CD and Flux CD.

In fact, I consider these debates unfounded, because I am firmly convinced that both tools deserve attention and that each is good at solving its own range of problems.

I actively use both in my professional work. I want to share my opinion and my use cases with you. I hope this article helps you choose the tool best suited to your needs.

Total votes 12: ↑12 and ↓0

Harnessing the Power of Machine Learning in Fraud Prevention

Level of difficulty: Medium
Reading time: 6 min

Picture this: A thriving e-commerce platform faces a constant battle against fake reviews that skew product ratings and mislead customers. In response, the company employs cutting-edge algorithms to detect and prevent fraudulent activities. Solutions like these are crucial in the modern digital landscape, safeguarding businesses from financial losses and ensuring a seamless consumer experience.

The industry has relied on rules-based systems to detect fraud for decades. They remain a vital tool in scenarios where continuously collecting a training sample is challenging, as retraining methods and metrics can be difficult. However, machine learning outperforms rules-based systems at detecting and identifying attacks when an ongoing training sample is available.

With advancements in machine learning, fraud detection systems have become more efficient, accurate, and adaptable. In this article, I will review several ML methods for preventing fraudulent activities and discuss their weaknesses and advantages.

Total votes 11: ↑11 and ↓0

PostgreSQL 16: Part 5 or CommitFest 2023-03

Level of difficulty: Medium
Reading time: 28 min

The end of the March Commitfest concludes the acceptance of patches for PostgreSQL 16. Let’s take a look at some exciting new updates it introduced.

I hope that this review, together with the previous articles in the series (2022-07, 2022-09, 2022-11, 2023-01), will give you a coherent idea of the new features of PostgreSQL 16.

Total votes 10: ↑10 and ↓0

Building a GPT-like Model from Scratch with Detailed Theory and Code Implementation

Reading time: 14 min

Unlock the power of Transformer Neural Networks and learn how to build your own GPT-like model from scratch. In this in-depth guide, we will delve into the theory and provide a step-by-step code implementation to help you create your own miniGPT model. The final code is only 400 lines and works on both CPUs and GPUs. If you want to jump straight to the implementation, here is the GitHub repo.

Transformers are revolutionizing the world of artificial intelligence. This simple, but very powerful neural network architecture, introduced in 2017, has quickly become the go-to choice for natural language processing, generative AI, and more. With the help of transformers, we've seen the creation of cutting-edge AI products like BERT, GPT-x, DALL-E, and AlphaFold, which are changing the way we interact with language and solve complex problems like protein folding. And the exciting possibilities don't stop there - transformers are also making waves in the field of computer vision with the advent of Vision Transformers.

Total votes 25: ↑25 and ↓0

Turning a typewriter into a Linux terminal

Reading time: 3 min

Hi everyone, a few months ago I got a Brother AX-25, and since then, I've been working on turning it into a computer. It uses an Arduino to scan the custom mechanical keyboard and control the typewriter, and a Raspberry Pi is connected to the Arduino over serial so I can log into it in headless mode.

Total votes 10: ↑10 and ↓0

Backup & Recovery Solutions from China

Reading time: 9 min

Every year brings new challenges that force IT companies to look for non-trivial approaches to their customers' problems, and LANIT-Integration is no exception. Our team has already worked with many products, but we never stop discovering new ones.

In this article I would like to provide an overview of backup and recovery software from Chinese vendors and to compare these solutions with domestic ones.

Total votes 15: ↑15 and ↓0

Our new public speech synthesis in super-high quality, 10x faster and more stable

Reading time: 3 min


In our last article we made a bunch of promises about our speech synthesis.

After a lot of hard work, we have finally delivered on these promises:

  • Model size reduced 2x;
  • New models are 10x faster;
  • We added flags to control stress;
  • Now the models can make proper pauses;
  • High quality voice added (and unlimited "random" voices);
  • All speakers squeezed into the same model;
  • Input length limitations lifted, now models can work with paragraphs of text;
  • Pauses, speed and pitch can be controlled via SSML;
  • Sampling rates of 8, 24 or 48 kHz are supported;
  • Models are much more stable: they no longer omit words.

This is a true breakthrough for us, and we are not planning to stop anytime soon. We will shortly be adding as many languages as possible (the CIS languages, English, European languages, Indic languages). We are also still planning to make our models an additional 2-5x faster.

We are also planning to add phonemes and a new model for stress, as well as to reduce the minimum amount of audio required to train a high-quality voice to 5-15 minutes.

As usual, you can try our model in our repo or in Colab.

Total votes 13: ↑13 and ↓0

Q4 2021 DDoS attacks and BGP incidents

Reading time: 6 min

2021 was an action-packed year for Qrator Labs.

It started with the official celebration of our tenth anniversary, continued with massive routing incidents, and ended with the infamous Meris botnet we reported back in September.

Now it is time to look at the events of the last quarter of 2021. There are interesting details in the BGP section, like the new records in route leaks and hijacking ASes, but first things first: we start with the DDoS attack statistics.

Total votes 13: ↑13 and ↓0

New botnet with lots of cameras and some routers

Reading time: 3 min

DDoS attacks send ripples across the ocean of the Internet, produced by creatures of various sizes: botnets. Some of them feed at the top of the ocean, but there is also a category of huge deep-water monstrosities, so rare and dangerous that they are seen only once in a very long time.

In November 2021, we encountered and mitigated several attacks from a botnet that seems to be unrelated to any described or well-known ones, such as variants of Mirai, Bashlite, Hajime or Brickerbot.

Although our findings are reminiscent of Mirai, we suppose this botnet is not based purely on propagating Linux malware, but grows through a combination of brute forcing and exploiting already patched CVEs in unpatched devices. Either way, to confirm how exactly this botnet operates, we need to have a sample device to analyze, which isn’t our area of expertise.

This time, we won’t give it a name. It is not 100% clear what we are looking at, what its exact characteristics are, or how big this thing actually is. But there are some numbers, and where possible, we have done additional reconnaissance in order to better understand what we’re dealing with.

But let us first show you the data we’ve gathered, and leave conclusions closer to the end of this post.

Total votes 12: ↑12 and ↓0

Q3 2021 DDoS attacks and BGP incidents

Reading time: 7 min

The third quarter of 2021 brought a massive upheaval in the scale and intensity of DDoS attacks worldwide.

It all led up to September, when, together with Yandex, we uncovered one of the most devastating botnets since Mirai and named it Meris, as it was responsible for a series of attacks with a very high RPS rate. And as those attacks were aimed all over the world, our quarterly statistics also changed.

This quarter, we've also prepared for your consideration a slice of statistics on application-layer (L7) DDoS attacks. Without further ado, let us elaborate on the details of DDoS attack statistics and BGP incidents for Q3 2021.

Total votes 17: ↑17 and ↓0

Saving Production During the Pandemic: A Personal Experience

Reading time: 6 min

The economic crisis caused by the pandemic hit economies around the world in two ways: a decline and partial halt of economic activity due to quarantine measures in different countries, and disruptions to production and supply chains caused by lockdowns. One of the achievements the modern economy took pride in is a finely tuned global division of labor, in which logistics chains stretch across the entire planet and deliveries along them arrive just in time, without delays and, as a consequence, without reserves.

As a result, emerging from quarantine turned into a kind of aftershock for many sectors of the economy. After a period of depressed demand and disrupted production and supply chains, they ran into the effect of pent-up demand, and even a mere return to pre-crisis levels came as a shock to businesses coming out of economic hibernation. The result was market overheating and inflation in certain areas of the economy. Many consumers felt firsthand how products and services that went unclaimed at the height of the lockdowns, when people had other things on their minds, ended up in short supply or became more expensive once quarantine measures were eased.

In this situation, the businesses that won were those prepared for such a turn of events. Of course, nobody could prepare in advance for the slowdown in economic activity, since in many cases even national governments could not predict the severity of the pandemic response measures until the very last moment before adopting them. Anticipating their economic consequences a move or two ahead, however, was entirely possible.
Total votes 10: ↑10 and ↓0

Who controls App Store: Martians or AI? Closed session of Russia's Federation Council and Apple leaked online

Reading time: 2 min

A video recording of a closed session of the upper house of Russia's parliament was leaked online by Telegram channel A000MP97. In the video, Andrei Klimov, head of the Ad Hoc Sovereignty and Preventing Interference in the Domestic Affairs Commission, demands that Apple disclose who controls the App Store: people from Mars or artificial intelligence?

On September 16th, a closed session of the Commission took place, and representatives of Apple and Google were among those invited. The session discussed ways to protect the country's sovereignty, in particular the fact that the Navalny app was still available in the Apple App Store and Google Play. The services were accused of being complicit with organisations deemed extremist and banned in Russia, as well as of interfering with Russian elections.
Total votes 17: ↑17 and ↓0

Mēris botnet, climbing to the record

Reading time: 7 min


For the last five years, there have been virtually no global-scale application-layer attacks.

During this period, the industry has learned how to cope with high-bandwidth network-layer attacks, including amplification-based ones. It does not mean that botnets are now harmless.

At the end of June 2021, Qrator Labs started to see signs of a new assaulting force on the Internet: a botnet of a new kind. This is joint research we conducted together with Yandex to elaborate on the specifics of a DDoS attack enabler emerging in almost real time.

Total votes 28: ↑28 and ↓0

In-Memory Showdown: Redis vs. Tarantool

Reading time: 13 min

In this article, I am going to look at Redis versus Tarantool. At first glance, they are quite alike: in-memory, NoSQL, key-value. But we are going to look deeper. My goal is to find meaningful similarities and differences; I am not going to claim that one is better than the other.

There are three main parts to my story:

  • We’ll find out what an in-memory database, or IMDB, is. When and how are they better than disk solutions?
  • Then, we’ll consider their architecture. What about their efficiency, reliability, and scaling?
  • Then, we’ll delve into technical details. Data types, iterators, indexes, transactions, programming languages, replication, and connectors.

Feel free to scroll down to the most interesting part or even the summary comparison table at the very bottom of the article.
Total votes 9: ↑8 and ↓1
