• Big / Bug Data: Analyzing the Apache Flink Source Code


      Applications used in the field of Big Data process huge amounts of information, and this often happens in real time. Naturally, such applications must be highly reliable so that no error in the code can interfere with data processing. To achieve high reliability, one needs to keep a wary eye on the code quality of projects developed for this area. The PVS-Studio static analyzer is one of the solutions to this problem. Today, the Apache Flink project developed by the Apache Software Foundation, one of the leaders in the Big Data software market, was chosen as a test subject for the analyzer.
      Read more →
    • A Book on the API Design

        This year, each of us seeks a special way to pass the time. I am writing a book, for example. A book about one thing I love dearly: the API. (You can read about who I am and what expertise I have in APIs on my LinkedIn profile.)


        I've just finished the first large section, dedicated to API design. You can read it online, download the PDF or EPUB version, or take a look at the source code on GitHub.


        The book is distributed for free under a CC-BY-NC license. Enjoy!

      • Algorithms in Go: Sliding Window Pattern (Part II)

        • Tutorial



        This is the second part of the article covering the Sliding Window Pattern and its implementation in Go; the first part can be found here.


        Let's have a look at the following problem: we have an array of words, and we want to check whether a concatenation of these words is present in the given string. The length of all words is the same, and the concatenation must include all the words without any overlapping. Would it be possible to solve the problem with linear time complexity?


        Let's start with the string catdogcat and the target words cat and dog.




        How can we handle this problem?
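
One possible approach, sketched here as an illustration rather than as the article's own solution, is to slide a window over the string in word-sized steps, once for every alignment of the word grid. The function name containsConcat and the extra test strings are assumptions made for this example. For a fixed word length, the number of word-sized steps grows linearly with the length of the string.

```go
package main

import "fmt"

// containsConcat reports whether s contains a substring formed by
// concatenating every word in words exactly once, in any order.
// All words are assumed to have the same non-zero length.
func containsConcat(s string, words []string) bool {
	if len(words) == 0 || len(words[0]) == 0 {
		return false
	}
	wordLen := len(words[0])

	// How many copies of each word the window must contain.
	need := make(map[string]int, len(words))
	for _, w := range words {
		need[w]++
	}

	// Try every alignment of the word grid against the string.
	for offset := 0; offset < wordLen; offset++ {
		seen := make(map[string]int)
		left, count := offset, 0
		for right := offset; right+wordLen <= len(s); right += wordLen {
			word := s[right : right+wordLen]
			if need[word] == 0 {
				// Not a target word: restart the window just past it.
				seen = make(map[string]int)
				count = 0
				left = right + wordLen
				continue
			}
			seen[word]++
			count++
			// Too many copies of this word: shrink the window from the left.
			for seen[word] > need[word] {
				seen[s[left:left+wordLen]]--
				left += wordLen
				count--
			}
			if count == len(words) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(containsConcat("catdogcat", []string{"cat", "dog"})) // true
	fmt.Println(containsConcat("catcatdog", []string{"cat", "dog"})) // true
	fmt.Println(containsConcat("catxdog", []string{"cat", "dog"}))   // false
}
```

The window never moves backwards: each word-sized chunk enters it once and leaves it at most once per alignment, which is what keeps the scan from degrading into a brute-force check of every starting position.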

        Read more →
      • The Rules for Data Processing Pipeline Builders


          "Come, let us make bricks, and burn them thoroughly."
          – legendary builders

          You may have noticed by 2020 that data is eating the world. And whenever any reasonable amount of data needs processing, a complicated multi-stage data processing pipeline will be involved.


          At Bumble — the parent company operating Badoo and Bumble apps — we apply hundreds of data transforming steps while processing our data sources: a high volume of user-generated events, production databases and external systems. This all adds up to quite a complex system! And just as with any other engineering system, unless carefully maintained, pipelines tend to turn into a house of cards — failing daily, requiring manual data fixes and constant monitoring.


          For this reason, I want to share certain good engineering practises with you, ones that make it possible to build scalable data processing pipelines from composable steps. While some engineers understand such rules intuitively, I had to learn them by doing, making mistakes, fixing, sweating and fixing things again…


          So behold! I bring you my favourite Rules for Data Processing Pipeline Builders.

          Read more →
        • Development of the “YaRyadom” (“I’mNear”) application on the VK Mini Apps platform. Part 1: .NET Core

          • Translation
          The application is being developed to help people find peers who share their interests and spend time doing what they like. The project is currently in beta testing on the social network “VKontakte”. Right now I am fixing bugs and adding everything that is missing. I felt like I could use a bit of distraction and decided to write a little about the development. While writing, I decided to split the text into several parts. Here we will focus on the backend nuances I ran into and on everything the user does not see.
          Read more →
        • cGit-UI — a web interface for Git Repositories

          • Tutorial

          cGit-UI is a web interface for Git repositories. It is based on a CGI script written in C.


          This article covers installing and configuring cGit-UI to work with Nginx + uWSGI. Setting up the server components is quite simple and practically does not differ from setting up cGit.


          cGit-UI supports Markdown files, which are processed on the server side using the md4c library, which has proven itself in the KDE Plasma project. cGit-UI lets you add site verification codes and scripts from systems such as Google Analytics and Yandex.Metrika for traffic analysis. Users who want to receive donations for their projects can create and import custom donation modal dialogs.


          Instead of looking at screenshots, it is better to look at the working site to decide on installing cGit-UI on your own server.

          Read more →
        • Configuring FT4232H using the ftdi_eeprom

          • Tutorial


          The FT4232H is a USB 2.0 Hi-Speed to UART converter IC. It has four UART ports and one USB port.


          By connecting EEPROM memory to this chip, you can set specific operating modes or change the manufacturer's data.


          Let's look at an example and configure the FT4232H directly on a system running GNU/Linux. We will do this using the ftdi_eeprom utility.

          Read more →
        • Playing with Nvidia's New Ampere GPUs and Trying MIG


            Whenever the essential question arises of whether to upgrade the cards in the server room, I look through similar articles and watch such videos.


            The channel with the aforementioned video is very underrated, but its author does not deal with ML. In general, when analyzing comparisons of accelerators for ML, several things usually catch your eye:


            • The authors usually take into account only the "adequacy" for the market of new cards in the United States;
            • The ratings are far from the people and are made on very standard networks (which is probably good overall) without details;
            • The popular mantra to train more and more gigantic models makes its own adjustments to the comparison;

            The answer to the question "which card is better?" is not rocket science: cards of the 20* series never became very popular, while the 1080 Ti from Avito (the Russian Craigslist) is still very attractive (and, oddly enough, isn't getting cheaper, probably for this reason).


            All this is fine and dandy, and the standard benchmarks are unlikely to lie too much, but recently I learned about Multi-Instance GPU (MIG) technology for A100 cards and native TF32 support on Ampere devices, and I got the idea to share my experience of actually testing cards based on the Ampere architecture (3090 and A100). In this short note, I will try to answer the following questions:


            • Is the upgrade to Ampere worth it? (spoiler for the impatient: yes);
            • Is the A100 worth the money? (spoiler: in general, no);
            • Are there any cases when the A100 is still interesting? (spoiler: yes);
            • Is MIG technology useful? (spoiler: yes, but for inference and for very specific training cases).
            Read more →
          • Russian AI Cup 2020 — a new strategy game for developers



              This year, many processes have been transformed, and traditions and habits have changed. The rhythm of life is different, and there's more uncertainty and strain. But an IT person's soul craves variety, and many developers have asked us whether the annual Russian AI Cup will be held this year. Is there going to be an announcement? What is the main theme of the upcoming championship? Should I take a vacation?

              Though some changes are expected, it will be held in keeping with the best traditions. In the run-up, we will announce one of today's largest online AI programming championships — Russian AI Cup. We invite you to make history!
              Read more →
            • Vital Characteristics Of The Best Webflow Designers

                A great website is a key to success. Enhancing your brand's online presence is a must nowadays, as technological advancement has changed the global business landscape. Hence, it is important to have a Webflow developer who takes charge of the code used in website templates and designs. Webflow is a flexible website-building platform geared towards creating a consistent business site. Using this tool can help your brand excel on the web.
                Read more →
              • The Code Analyzer is wrong. Long live the Analyzer

                  Foo(std::move(buffer), line_buffer - buffer.get());

                  Combining many actions in a single C++ expression is bad practice: such code is hard to understand and maintain, and it is easy to make mistakes in it. For example, one can introduce a bug by combining different actions in the evaluation of function arguments. We agree with the classic recommendation that code should be simple and clear. Now let's look at an interesting case where the PVS-Studio analyzer is technically wrong, but from a practical point of view, the code should still be changed.
                  Read more →
                • How static code analysis helps in the GameDev industry


                    The gaming industry is constantly evolving and is developing faster than a speeding bullet. Along with the growth of the industry, the complexity of development also increases: code bases are getting larger, and the number of bugs is growing as well. Therefore, modern game projects need to pay special attention to code quality. Today we will cover one of the ways to improve your code, static analysis, and show how PVS-Studio helps in practice with game projects of various sizes.
                    Read more →
                  • Analyzing the Code Quality of Microsoft's Open XML SDK


                      My first encounter with the Open XML SDK took place when I was looking for a library that I could use to create some accounting documents in Word. After more than 7 years of working with the Word API, I wanted to try something new and easier to use. That's how I learned that Microsoft offered an alternative solution. As tradition has it, before our team adopts any program or library, we check it with the PVS-Studio analyzer.
                      Read more →
                    • How to build a high-performance application on Tarantool from scratch

                      • Tutorial

                      I came to Mail.ru Group in 2013, and I needed a queue for one task. First of all, I decided to check what the company already had. They told me they had this Tarantool product, so I checked how it worked and decided that adding a queue broker to it could work perfectly well.

                      I contacted Kostja Osipov, the senior expert in Tarantool, and the next day he gave me a 250-line script that was capable of managing almost everything I needed. Since that moment, I have been in love with Tarantool. It turned out that a small amount of code written in a fairly simple scripting language could bring a whole new level of performance to this DBMS.

                      Today, I’m going to tell you how to instantiate your own queue in Tarantool 2.2.
                      Read more →
                    • Modern Web-UI for SVN repositories

                      • Tutorial

                      cSvn is a web interface for Subversion repositories. It is based on a CGI script written in C.


                      This article covers installing and configuring cSvn to work with Nginx + uWSGI. Setting up the server components is quite simple and practically does not differ from setting up cGit.


                      cSvn supports Markdown files, which are processed on the server side using the md4c library, which has proven itself in the KDE Plasma project. cSvn lets you add site verification codes and scripts from systems such as Google Analytics and Yandex.Metrika for traffic analysis. Users who want to receive donations for their projects can create and import custom donation modal dialogs.


                      Instead of looking at screenshots, it is better to look at the working site to decide on installing cSvn on your own server.


                      Note that you can browse not only your own repositories but also third-party ones, configured for viewing over the HTTPS and SVN protocols.

                      Read more →
                    • Spring Boot app with Apache Kafka in Docker container

                      • Translation
                      • Tutorial

                      Privet, comrades!

                      In this article I’ll show how easy it is to set up a Spring Java app with the Kafka message broker. We will use Docker containers for the Kafka ZooKeeper/broker apps and configure plaintext authorization for access from both the local and the external network.

                      A link to the final project on GitHub can be found at the end of the article.

                      Read more
                    • Active Termination Drivers

                      • Tutorial


                      The easiest way to build a driver with a specified output impedance is to use an amplifier with high load capability and add a resistor to its output. The penalty is the voltage drop across this resistor: power is lost, and we need a higher supply voltage. If our driver could deliver the same voltage and current to the same load with a lower-value resistor, it would deliver the same output power at a lower supply voltage. That means lower power losses, less heat, and a longer operating time when running on a battery.
                      There is a way to solve this problem: active termination. We can synthesize the output impedance!
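
To put rough numbers on this trade-off, here is a minimal sketch that applies only Ohm's law to a series resistor feeding a resistive load. It does not model the feedback that synthesizes the output impedance, and the function name driverBudget and the component values (1 V across a 50 ohm load, 50 ohm versus 10 ohm series resistors) are assumptions chosen for illustration.

```go
package main

import "fmt"

// driverBudget returns the voltage the amplifier must produce and the power
// dissipated in the series resistor when loadVolts appears across loadOhms.
func driverBudget(loadVolts, loadOhms, seriesOhms float64) (ampVolts, lossWatts float64) {
	current := loadVolts / loadOhms // the same current flows through both resistors
	ampVolts = loadVolts + current*seriesOhms
	lossWatts = current * current * seriesOhms
	return ampVolts, lossWatts
}

func main() {
	// Hypothetical values: 1 V across a 50 ohm load.
	for _, seriesOhms := range []float64{50, 10} {
		volts, loss := driverBudget(1, 50, seriesOhms)
		fmt.Printf("Rs = %2.0f ohm: amplifier output %.2f V, resistor loss %.1f mW\n",
			seriesOhms, volts, loss*1000)
	}
}
```

With these assumed values, the 50 ohm resistor requires 2 V from the amplifier and burns 20 mW, while the 10 ohm resistor needs only 1.2 V and burns 4 mW for the same 1 V at the load.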

                      Now that we know what we want, let's design our drivers!
                      Read more →