• Introduction to Metrics for PHP Developers

      If you are a PHP developer who has heard about metrics but doesn't know where to start, this article is for you. I have prepared a test repository that will help you start working with metrics, build graphs, and set up alerts. If any of this resonates with you, read on.

      Read more
    • HDB++ TANGO Archiving System

      • Translation
      • Tutorial

      What is HDB++?

      HDB++ is a TANGO archiving system that lets you save data received from devices in a TANGO system.

      This article covers working on Linux (TangoBox 9.3, based on Ubuntu 18.04), a ready-made system where everything is already configured.

      What is the article about?

      • System architecture.
      • How to set up archiving.

      It took me about two weeks to understand the architecture and write my own Python scripts for this case.

      What is it for?

      It lets you store the history of your equipment's readings.

      • You don't need to think about how to store data in the database.
      • You just need to specify which attributes to archive from which equipment.
      Read more →
    • Top 7 Technology Trends to Look out for in 2021

      Technology is as adaptable and compatible as mankind; it finds its way through problems and situations. 2020 was one such package of uncertain events that forced businesses to adapt to digital transformation, even to the extent that many companies started to consider remote work culture a beneficial long-term model. Technological advancements like hyperautomation, AI security, and distributed cloud showed how any people-centric idea could rule the digital era. The past year clearly showed the boundless possibilities through which technology can survive or reinvent itself. With those lessons in mind, let's dive in and focus on some of the top technology trends to watch in 2021.

      Read more
    • How to Get Nice Error Reports Using SARIF in GitHub

        Let's say you use GitHub, write code, and do other fun stuff. You also use a static analyzer to improve the quality of your work and save time. At some point an idea comes up: why not view the errors the analyzer reports right in GitHub? And it would be great if they looked nice, too. So, what should you do? The answer is very simple: SARIF is right for you. This article covers what SARIF is and how to set it up. Enjoy the reading!

        Read more
      • Speech Analytics: Benefits and its New Importance in Telecommunication Technology

          Speech analytics is the process of analysing recorded speech, such as phone calls, to gather customer information to improve communication and future customer interaction. Speech analytics as a technology has been evolving especially rapidly over the last few years. It gives the ability to structure and analyse previously lost streams of insight-rich data, such as phone conversations. Empowered with this technology, operations can gather incredibly valuable business intelligence to drive call delivery performance improvements. It’s smart in that it automatically identifies focus areas in which customer service or sales teams may need additional call training, which, in turn, improves the chances of a successful call outcome. Speech analytics, as a process, can isolate the buzzwords and phrases used most frequently within a given time period, and indicate whether their usage is trending up or down. This data is highly useful to call managers for spotting changes in consumer behaviour so that action can be taken to improve customer satisfaction.

          Zadarma is a leading global VoIP provider and offers a smart speech analytics feature as part of their incredibly easy to use telecommunications offering. The tool is free as part of the wider PBX phone system bundles, included in the free recognition minutes. Zadarma’s analytics feature allows data access to every internal or external call conversation. The benefits of speech analytics include:

          Read more
        • Coins classifier Neural Network: Head or Tail?

            Home of this article: https://robotics.snowcron.com/coins/02_head_or_tail.htm

            The global objective of these articles is to build a coin classifier capable of scanning your pocket change and finding rare / valuable coins. This is the second article in the series, so let me remind you what happened earlier (https://habr.com/ru/post/538958/).

            In the previous step we got a rather large dataset composed of pairs of images, loaded from the online coin site meshok.ru. Those images were uploaded to the Internet by people we do not know, and though they are supposed to contain a coin's head in one image and its tail in the other, we cannot rule out a situation where we have two heads and no tail, or vice versa. Also, at the moment we have no idea which image contains the head and which contains the tail: this might be important when we feed data to our final classifier.

            So let's write a program to distinguish heads from tails. It is a rather simple task, involving a convolutional neural network that uses transfer learning.

            As before, we are going to use the Google Colab environment, taking advantage of the free video card it gives us access to. We will store data on Google Drive, so the first thing we need is to allow Colab to access the Drive:

            Read more
          • Implementing Offline traceroute Tool Using Python

            • Translation

            Hey everyone! This post was born from a question asked by an IT forum member. The summary of the question looked as follows:

            • There is a set of text files containing routing tables collected from various network devices.
            • Each file represents one device.
            • Device platforms and routing table formats may vary.
            • It is required to analyze a routing path from any device to an arbitrary subnet or host on-demand.
            • Resulting output should contain a list of routing table entries that are used for the routing to the given destination on each hop.
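The requirements above boil down to a longest-prefix-match lookup repeated on each hop. Below is a minimal sketch of that idea, not the article's actual script. It assumes the routing tables have already been parsed into Python data, and that each next hop has already been resolved to a device name (in real tables it is an IP address or an interface). The `TABLES` data and the `best_route` and `trace` helpers are hypothetical names introduced for illustration.

```python
import ipaddress

# Hypothetical, already-parsed routing tables (an assumption for this sketch):
# device name -> list of (prefix, next-hop device, or None if directly connected).
TABLES = {
    "r1": [("0.0.0.0/0", "r2"), ("10.0.0.0/8", "r2"), ("192.168.1.0/24", None)],
    "r2": [("10.1.0.0/16", "r3"), ("10.0.0.0/8", None)],
    "r3": [("10.1.2.0/24", None)],
}


def best_route(device, destination):
    """Return the most specific (longest-prefix) entry matching destination, or None."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in TABLES[device]:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network, next_hop))
    if not matches:
        return None
    # The longest prefix wins, as in a real routing lookup.
    return max(matches, key=lambda match: match[0].prefixlen)


def trace(device, destination, max_hops=16):
    """Follow next hops from device, recording the table entry used on each hop."""
    path = []
    for _ in range(max_hops):
        route = best_route(device, destination)
        if route is None:
            path.append((device, None))  # no route: the path ends here
            break
        network, next_hop = route
        path.append((device, str(network)))
        if next_hop is None:  # directly connected: destination reached
            break
        device = next_hop
    return path
```

Calling `trace("r1", "10.1.2.5")` walks r1, r2 and r3 and reports the entry matched on each device. The `max_hops` limit guards against routing loops in the collected data.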

            The person who asked the question worked as a TAC engineer. TAC engineers often collect, or receive from customers, text 'snapshots' of the network state for offline analysis while troubleshooting. Some automation could really save a lot of time.

            I found this task interesting and also applicable to my own needs, so I decided to write a Proof-of-Concept implementation in Python 3 for the Cisco IOS, IOS-XE, and ASA routing table formats.

            In this article, I’ll try to reconstruct the resulting script development process and my considerations behind each step.

            Let’s get started.

            Read more →
          • Prometheus in Action: from default counters to SLO-related queries

            • Tutorial

            All Prometheus metrics are based on time series: streams of timestamped values belonging to the same metric. Each time series is uniquely identified by its metric name and optional key-value pairs called labels. The metric name specifies some characteristic of the measured system, such as http_requests_total, the total number of received HTTP requests. In practice, you will often be interested in some subset of the values of a metric, for example, in the number of requests received by a particular endpoint; this is where labels come in handy. We can partition a metric by adding an endpoint label and see the statistics for a particular endpoint: http_requests_total{endpoint="api/status"}. Every scraped time series has two automatically created labels: job and instance. We will see their roles in the next section.
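To make the role of labels concrete, here is a small Python model of label matching. It is only an illustration of the idea, not Prometheus code. The `SERIES` data and the `select` helper are hypothetical, and only equality matchers are modelled.

```python
# Hypothetical in-memory model: each time series is a metric name plus labels.
SERIES = [
    {"name": "http_requests_total",
     "labels": {"endpoint": "api/status", "instance": "host1:8080"}, "value": 120.0},
    {"name": "http_requests_total",
     "labels": {"endpoint": "api/status", "instance": "host2:8080"}, "value": 95.0},
    {"name": "http_requests_total",
     "labels": {"endpoint": "api/orders", "instance": "host1:8080"}, "value": 40.0},
]


def select(series, name, **matchers):
    """Keep series with the given metric name whose labels equal every matcher."""
    return [
        s for s in series
        if s["name"] == name
        and all(s["labels"].get(key) == value for key, value in matchers.items())
    ]


# Rough analogue of the selector http_requests_total{endpoint="api/status"}
status_series = select(SERIES, "http_requests_total", endpoint="api/status")
```

Each distinct combination of label values is a separate time series, so the selector above returns two series, one per instance.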

            Prometheus provides a functional query language called PromQL. A query expression evaluates to one of four types:

            Scalar (aka float)

            String (currently unused)

            Instant Vector - a set of time series containing a single sample for each time series, all sharing the same timestamp.

            Range Vector - a set of time series containing a range of samples over time for each time series.

            At first glance, an Instant Vector might look like an array, and a Range Vector like a matrix.

            If that were the case, a Range Vector for a single time series would "downgrade" to an Instant Vector. However, that is not so:
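One way to picture the distinction is the following rough Python model. It is not Prometheus internals: the `RAW` data and the `instant_vector` and `range_vector` helpers are hypothetical, staleness handling is ignored, and the range boundaries are treated as inclusive for simplicity.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]             # (timestamp in seconds, value)
Labels = Tuple[Tuple[str, str], ...]     # hashable stand-in for a label set

# Hypothetical raw storage: every series keeps its full list of samples.
RAW: Dict[Labels, List[Sample]] = {
    (("endpoint", "api/status"),): [(0.0, 1.0), (15.0, 3.0), (30.0, 6.0)],
    (("endpoint", "api/orders"),): [(0.0, 2.0), (15.0, 2.0), (30.0, 5.0)],
}


def instant_vector(raw, at):
    """One sample per series: the latest value at or before `at`, stamped with `at`."""
    result: Dict[Labels, Sample] = {}
    for labels, samples in raw.items():
        earlier = [value for ts, value in samples if ts <= at]
        if earlier:
            result[labels] = (at, earlier[-1])
    return result


def range_vector(raw, start, end):
    """A list of samples per series: everything between start and end (inclusive here)."""
    result: Dict[Labels, List[Sample]] = {}
    for labels, samples in raw.items():
        window = [(ts, value) for ts, value in samples if start <= ts <= end]
        if window:
            result[labels] = window
    return result
```

In this model, a range vector restricted to a single series is still a mapping from a label set to a list of samples, while an instant vector maps each label set to exactly one sample. The two have different shapes even when only one series is involved.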

            Read more
          • Distributed Tracing for Microservice Architecture

            • Tutorial

            What is distributed tracing? Distributed tracing is a method used to profile and monitor applications, especially those built using a microservices architecture. Distributed tracing helps pinpoint where failures occur and what causes poor performance.

            Let’s have a look at a simple prototype. A user fetches information about a shipment from the `logistic` service. The `logistic` service does some computation and fetches data from a database. It doesn’t know the actual status of the shipment, so it has to fetch the updated status from another service, `tracking`. The `tracking` service also needs to fetch data from a database and do some computation.

            In the screenshot below, we see the whole life cycle of a request issued to the `logistic` service:
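The core idea behind tracing such a request can be sketched in a few lines of plain Python. Every unit of work becomes a span, and all spans of one request share a trace ID and point at their parent span. This is a simplified illustration that uses no real tracing library. The `Span` class, the `start_span` helper and the span names are hypothetical, and timing information is left out.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    trace_id: str                      # shared by every span of one request
    name: str
    parent_id: Optional[str] = None    # None marks the root span
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])


COLLECTED: List[Span] = []             # stand-in for a tracing backend


def start_span(name, parent=None):
    """Create a span, inheriting the trace ID from the parent if there is one."""
    trace_id = parent.trace_id if parent else uuid.uuid4().hex
    span = Span(trace_id=trace_id, name=name,
                parent_id=parent.span_id if parent else None)
    COLLECTED.append(span)
    return span


def tracking_service(parent):
    # In a real system the parent context would arrive in request headers.
    span = start_span("tracking: get status", parent)
    start_span("tracking: db query", span)
    start_span("tracking: compute", span)
    return "in transit"


def logistic_service():
    root = start_span("logistic: get shipment")
    start_span("logistic: compute", root)
    start_span("logistic: db query", root)
    status = tracking_service(root)
    return root, status
```

After one call to `logistic_service()`, `COLLECTED` holds six spans with the same `trace_id`. Following the `parent_id` links reconstructs the tree of calls across both services.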

            Read more
          • Coins Classification using Neural Networks

            • Tutorial

            See more at robotics.snowcron.com

            This is the first article in a series dedicated to coin classification.

            With countless "dogs vs cats" or "find a pedestrian on the street" classifiers all over the Internet, coin classification doesn't look like a difficult task. At first. Unfortunately, it is an order of magnitude harder, a formidable challenge indeed. You can easily tell heads from tails? Great. Can you figure out if the number is shifted 1 mm to the left? See, from the classifier's point of view it is still the same head... while it can make the difference between a common coin priced according to the number on it and a rare one, 1000 times more expensive.

            Of course, we can do what we usually do in image classification: provide 10,000 sample images... No, wait, we cannot. Some types of coins are rare indeed: you need to sort through a BASKET (10 liters) of coins to find one. Simple arithmetic suggests that to get 10,000 images of DIFFERENT coins you will need 10,000 baskets of coins to start with. Well, and unlimited time.

            So it is not that easy.

            Anyway, we are going to begin by getting a large number of images and work from there. We will use Russian coins as an example, as Russia had a monetary reform in 1994, so the number of coins one can expect to find in a pocket is limited. Unlike the USA with its 200 years of monetary history. And yes, we are ONLY going to focus on current coins: the ultimate goal of our work is to write a smartphone program that classifies the coins you have received as change in a grocery store.

            Which makes things even worse, as we cannot count on good lighting and quality cameras anymore. But we'll still try.

            In addition to "only Russian coins, beginning from 1994", we are going to add an extra limitation: no special occasion coins. Those coins look distinctive, so anyone can figure out that such a coin is special. We focus on REGULAR coins. Which limits their number severely.

            Don't get me wrong: if we need to apply the same approach to a full list of coins... it will work. But I got 15 GB of images for that limited set; can you imagine how large the complete set would be?!

            To get images, I am going to scan one of the largest Russian coin sites, "meshok.ru". This site allows buyers and sellers to find each other; sellers can upload images... just what we need. Unfortunately, a business-oriented seller can easily upload his 1 rouble image to the 1, 2, 5 and 10 rouble topics, just to increase exposure.

            So we cannot count on the topic name; we have to determine which coin is in the photo ourselves.

            To scan the site, a simple scanner was written, based on Python's Beautiful Soup library. In just a few hours I got over 50,000 photos. Not a lot by Machine Learning standards, but definitely a start.

            After we got the images, we have to, unfortunately, revisit them by hand, looking for images we do not want in our training set, or for images that should be edited somehow. For example, someone could have uploaded a photo of his cat. We don't need a cat in our dataset.

            First, we delete all images that cannot be split into head/tail.

            Read more
          • Tarantool: an analyst's view

              Hi all! I'm Andrey Kapustin. I work as a system analyst at Mail.ru Group. Our products form a unified ecosystem. Many independent infrastructures generate data in it: taxi and food delivery services, email services, social networks, etc. The faster and more precisely we can predict a client's needs, the sooner and more accurately we can offer our products.

              Many system analysts and engineers are keen to know: 

              1. How to design the architecture of a trigger platform for real-time marketing?
              2. How to arrange a data structure that would be in line with the requirements of a marketing strategy for interacting with clients?
              3. How to ensure the stable operation of the system under very heavy workloads?

              Such systems are based on technologies of high-load processing and Big Data analysis. We have accumulated considerable experience in these areas. Our expertise is in high demand on the market. I'm going to show how we help our customers switch from offline to online in their interactions with clients using Real-Time Marketing solutions based on Tarantool.
              Read more →
            • Visualizing Network Topologies: Zero to Hero in Two Days

              • Translation

              Hey everyone! This is a follow-up article on a local Cisco Russia DevNet Marathon online event I attended in May 2020. It was a series of educational webinars on network automation followed by daily challenges based on the discussed topics.
              On the final day, the participants were challenged to automate the topology analysis and visualization of an arbitrary network segment and, optionally, track and visualize the changes.

              The task was definitely not trivial and not widely covered in public blog posts. In this article, I would like to break down my own solution that finally took first place and describe the selected toolset and considerations.

              Let's get started.

              Read more →
            • Big / Bug Data: Analyzing the Apache Flink Source Code


                Applications used in the field of Big Data process huge amounts of information, and this often happens in real time. Naturally, such applications must be highly reliable so that no error in the code can interfere with data processing. To achieve high reliability, one needs to keep a wary eye on the code quality of projects developed for this area. The PVS-Studio static analyzer is one of the solutions to this problem. Today, the Apache Flink project developed by the Apache Software Foundation, one of the leaders in the Big Data software market, was chosen as a test subject for the analyzer.
                Read more →
              • The Rules for Data Processing Pipeline Builders

                  "Come, let us make bricks, and burn them thoroughly."
                  – legendary builders

                  You may have noticed by 2020 that data is eating the world. And whenever any reasonable amount of data needs processing, a complicated multi-stage data processing pipeline will be involved.

                  At Bumble — the parent company operating Badoo and Bumble apps — we apply hundreds of data transforming steps while processing our data sources: a high volume of user-generated events, production databases and external systems. This all adds up to quite a complex system! And just as with any other engineering system, unless carefully maintained, pipelines tend to turn into a house of cards — failing daily, requiring manual data fixes and constant monitoring.

                  For this reason, I want to share certain good engineering practises with you, ones that make it possible to build scalable data processing pipelines from composable steps. While some engineers understand such rules intuitively, I had to learn them by doing, making mistakes, fixing, sweating and fixing things again…

                  So behold! I bring you my favourite Rules for Data Processing Pipeline Builders.

                  Read more →
                • Spring Boot app with Apache Kafka in Docker container

                  • Translation
                  • Tutorial

                  Privet, comrades!

                  In this article I’ll show how easy it is to set up a Spring Java app with the Kafka message broker. We will use Docker containers for the Kafka ZooKeeper and broker apps and configure plaintext authorization for access from both local and external networks.

                  A link to the final project on GitHub can be found at the end of the article.

                  Read more
                • Patroni cluster (with Zookeeper) in a docker swarm on a local machine

                  • Tutorial

                  Anyone who stores crucial data (in particular, in SQL databases) can hardly avoid thinking about building some kind of safe cluster, a distant guardian to protect consistency and availability at all times. Even if the main server with your precious database gets knocked out for good, the show must go on, right? This basically means the database must still be available and its data must be up to date with the data on the failed server.

                  As you might have noticed, there are dozens of ways to go, and Patroni is just one of them. There are plenty of articles providing a more or less detailed comparison of the available options, so I assume I'm free to skip the part where I lure you over to Patroni's side. Let's start from the point where you are already leaning towards Patroni and are willing to try it out in a more or less realistic setup.

                  I am not a DevOps engineer by background, so when the need for a high-availability cluster arose, I hit every single bump on the road. I hope this tutorial helps you get the job done with ease! If you don't want any more explanations, jump right in. Otherwise, you might want to read some more notes on the setup I went with.

                  Read more
                • OPPO, Huawei, Xiaomi. Chinese app stores join forces to take on Google

                    Major players in the Chinese app market are joining forces to take on the almighty Google Play store. Xiaomi, Oppo and Vivo are reported to launch the Global Developer Service Alliance (GDSA), a platform allowing Android developers to publish their apps in the partnering stores from one upload.

                    The GDSA is expected to launch in nine countries—including India, Indonesia, Malaysia, Russia, Spain, Thailand, the Philippines, and Vietnam—although paid app support may vary across the regions. Canalys’ Nicole Peng explains the wide reach of this alliance:

                    By forming this alliance each company will be looking to leverage the others’ advantages in different regions, with Xiaomi’s strong user base in India, Vivo and Oppo in Southeast Asia, and Huawei in Europe. 

                    Read more
                  • Linux Switchdev the Mellanox way

                      This is a transcription of a talk that was presented at CSNOG 2020 — video is at the end of the page

                      Greetings! My name is Alexander Zubkov. I work at Qrator Labs, where we protect our customers against DDoS attacks and provide BGP analytics.

                      We started using Mellanox switches around 2 or 3 years ago. That is when we got acquainted with Switchdev in Linux, and today I want to share our experience with you.
                      Read more →