• 100 most valuable GitHub repositories [according to the UOS algorithm]

    Hello, Habr! Here is a selection translated from the Hackernoon article «GitHub’s Top 100 Most Valuable Repositories Out of 96 Million». The article itself was written by people who used the U°OS Network algorithm to identify the most valuable open-source projects on GitHub.

  • Fancy Euclid's “Elements” in TeX

    • Translation


    In 2016, I came across Oliver Byrne's “The first six books of the Elements of Euclid.” The main feature of this book is that instead of ordinary letter designations such as “triangle ABC,” it embeds miniature pictures directly in the text, for example, an image of the triangle itself. Producing such a book was probably difficult in the 19th century, but with the right tools it should be easy nowadays. So I decided to find out for myself whether that's the case.
  • Long journey to Tox-rs. Part 1


      Hi everyone!


      I like Tox and respect the participants of this project and their work. In an effort to help Tox developers and users, I looked into the code and noticed potential problems that could lead to a false sense of security. Since I originally published this article in 2016 (in Russian), many improvements have been made to Tox, and I lead a team that rewrote secure Tox software from scratch in the Rust programming language (check out Tox-rs). I DO recommend using Tox in 2019. Let's take a look at what actually made us rewrite Tox in Rust.


      Original article of 2016


      There is an unhealthy tendency to overestimate the security of E2E systems only on the basis that they are E2E. I will present objective facts supplemented with my own comments for you to draw your own conclusions.


      Spoiler: The Tox developers agree with my points and my source code pull request was accepted.

      Here are the facts:
    • Top 10 Chat, Audio & Video Calling API & SDK Providers for Enterprise Business


        With the growing trend of digitalization, most enterprises have moved their communication from traditional channels to digital ones. To keep up with competitors, companies regularly upgrade their services, especially the way they relay information to their customers and employees. Today, seamless real-time networking plays a critical role in engaging with individuals and enterprises, and the best way to achieve it is to onboard a real-time chat, voice and video calling SDK/API provider.
      • ANPR using RoR & React Native

        Danny Krastev, Mirabbos Umarov, Ekaterina Menshenina, ITMO University, Info communication Systems, Computer Science. 2019


        Abstract


        Due to the never-ending increase in the volume of vehicles surrounding our daily lives, Automatic Number Plate Recognition (ANPR) has become an evolving solution for managing and monitoring vehicles worldwide to enforce rules and prevent criminal activities, such as parking violations, red light violations, speeding, and vehicle theft. Although a variety of public and private methods and libraries have already been developed and are used to recognize car license plate numbers automatically around the world, there has not been much focus on advancing toward a cross-platform ANPR solution that supports all vehicle license plates worldwide. This paper introduces the Plate Vision project, a web and mobile application built on Ruby on Rails and React Native, which aims to serve as an alternative ANPR platform that supports detection of all license plates worldwide by utilizing various open source optical character recognition (OCR) libraries and making efficiency optimizations.

        Key words and phrases: ruby, rails, react native, license plate recognition, plate region extraction, optical character recognition (OCR), ANPR.
      • How to conduct a Distributed Paperless quarterly planning and not screw it up?

          Given: A company which uses the Scaled Agile Framework (SAFe) to scale Agile development across the organization; 10 development teams combined into one big team (Agile Release Train, according to SAFe terminology) to deliver a common product; the need for a two-day quarterly planning (PI Planning) to determine the work plan of IT teams for the next 3 months *; three development offices with the distance between the most remote ones exceeding 6 thousand kilometers and corresponding working time difference of 5 hours; previous planning experience which implied usage of analogue boards / whiteboards / highlighters / sticky notes and respective physical presence of all key employees in the same room.

          * This heavyweight construct “The work plan of IT teams for the next 3 months” threatens to increase the size of the text significantly, so hereinafter I’m going to replace it with “the commitment”. Accordingly, to draw up and adopt a work plan will be “to commit”.

          Why do we need this?


          1) Fatigue with analog methods of work. While spaceships are plowing through space and Elon Musk is boring his tunnels, we, the IT guys, have persistently been writing on sticky notes with highlighters and sticking them on boards. There is really some kind of dissonance in this, isn’t there? That’s what our commitment looked like a while ago:

          image
        • From High Ceph Latency to Kernel Patch with eBPF/BCC



            There are a lot of tools for debugging kernel and userspace programs in Linux. Most of them have performance impact and cannot easily be run in production environments. A few years ago, eBPF was developed, which provides the ability to trace the kernel and userspace with low overhead, without needing to recompile programs or load kernel modules.

            There are now plenty of tools that use eBPF, and in this article we’ll explain how to write your own profiling tool using the Python BCC library. This article is based on a real issue from a production environment. We’ll walk you through solving the problem and show how existing bcc tools could be used in some cases.
          • Windows Terminal Build 2019 FAQ

              Last week, Microsoft held its Build 2019 conference at the Washington State Convention Center in Seattle. Build is a large event with several thousand people from around the world attending to learn all about the current, newest, and future developer-oriented tech coming from Microsoft.


              We had the pleasure of meeting so many of you at our booth and answering all your questions!


            • Microsoft Kaizala enables Indian Railways to connect its three million employees with healthcare services

                India’s largest employer, Indian Railways, will be using Microsoft Kaizala to connect its employees across the country with quality healthcare facilities. The Microsoft Kaizala app will enable serving and retired railway employees to use the healthcare services of 125 railway hospitals and 133 recognized private hospitals. The Kaizala group, managed by doctors from South Central Railways, will be complemented with focused groups of doctors, paramedical staff and nurses.


                After registering for the healthcare services, Indian Railways employees will be able to use Microsoft Kaizala to search for the nearest hospitals and doctors and for a list of empaneled diagnostic centers and health units. Employees can book doctor appointments, share diagnostic lab reports directly with their doctors and save digital records in ‘Me Chat’ of Microsoft Kaizala. They will also be able to access key announcements and share feedback to improve the quality of medical service using built-in action cards.


              • What is going to happen on February 1, 2020?

                  TL;DR: starting February 2020, DNS servers that don’t support DNS both over UDP and TCP may stop working.

                  Bangkok, in general, is a strange place to stay. Of course, it is warm there, rather cheap, and some might find the cuisine interesting, along with the fact that about half of the world’s population does not need to apply for a visa in advance to get there. However, you still need to get used to the smells, and the city streets evoke cyberpunk scenes more than anything else.

                  In particular, the photo on the left was taken not far from the center of Thailand’s capital city, one street away from the Shangri-La hotel, where the 30th meeting of DNS-OARC took place on May 12 and 13. DNS-OARC is a non-profit organization dedicated to the security, stability, and overall development of the DNS, the Domain Name System.

                  Slides from the DNS-OARC 30 meeting are recommended for everyone interested in how the DNS works, though perhaps the most interesting part is what is absent from those slides: a 45-minute round table discussing the results of DNS Flag Day 2019, which occurred on February 1, 2019.

                  The most impressive result of the round table is the decision to repeat DNS Flag Day.
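
The TL;DR's requirement to support DNS over both UDP and TCP comes down to a small framing difference: over TCP, each DNS message is preceded by a two-byte length field, while over UDP the message is sent as-is (RFC 1035). The sketch below is an illustration under stated assumptions, not code from the article: the helper names (`build_dns_query`, `frame_for_udp`, `frame_for_tcp`, `is_truncated`) are hypothetical, and no network traffic is sent.

```python
import struct


def build_dns_query(name: str, query_id: int = 0x1234, qtype: int = 1) -> bytes:
    """Build a minimal DNS query message (RFC 1035) for `name` (hypothetical helper)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record by default) and QCLASS (1 = IN)
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


def frame_for_udp(message: bytes) -> bytes:
    """Over UDP the DNS message is the whole datagram payload."""
    return message


def frame_for_tcp(message: bytes) -> bytes:
    """Over TCP the message is prefixed with its length as a 16-bit big-endian integer."""
    return struct.pack("!H", len(message)) + message


def is_truncated(response: bytes) -> bool:
    """Return True if the TC (truncated) flag is set, meaning the client should retry over TCP."""
    flags = struct.unpack("!H", response[2:4])[0]
    return bool(flags & 0x0200)


query = build_dns_query("example.com")
udp_payload = frame_for_udp(query)
tcp_payload = frame_for_tcp(query)
```

A client typically sends `udp_payload` first and falls back to `tcp_payload` when `is_truncated` reports a truncated answer, which is why a server that answers only over UDP can leave such clients without a usable response.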
                • A selection of Datasets for Machine learning

                    Hi guys,

                    This article is a guide to open datasets for machine learning. To start, I will collect a selection of interesting and relatively fresh datasets. As a bonus, at the end of the article I will add useful links for finding datasets on your own.

                    Fewer words, more data.


                    A selection of datasets for machine learning:


                  • How to make possible micro-payments in your app

                    I spent this week coding my very first public pet app: a Telegram chat bot that acts as a Bitcoin wallet and lets users send and receive tips between Telegram users and other so-called “Lightning Apps”. I assume that you are familiar with Bitcoin and Telegram in general, so I’ll try to keep this post short and avoid going deep into details. More resources about Bitcoin can be found here, and Telegram is simply an instant messenger that allows you to create your own custom apps (chat bots) on its platform.


                    What are the key points of such app?


                    • It lets users rate other users' ideas and answers with real value instead of
                      ‘virtual likes’. This brings online conversation to a completely new level.
                    • It is a real example of a working micro-payment app that can interact with
                      other entities over the internet using an open protocol.
                    • All the modules are open-source projects and can easily be reused and adjusted
                      for your own project. The app does not rely on third-party commercial services.
                      Even though it falls under the e-commerce field, which is currently almost
                      closed, the app is based on open solutions.

                    What are the use-cases?


                    something like this…

                    image
                  • Even more secret Telegrams

                      We are used to thinking of Telegram as a reliable and secure transmission medium for messages of any sort. But under the hood, it has a rather common combination of asymmetric and symmetric encryption. Where's the fun in that? And anyway, why would anyone trust their messages to a third party?
                      Spy vs Spy by Antonio Prohías
                      TL;DR — inventing a private covert channel over users blocking each other.

                    • Build tools in machine learning projects, an overview

                        I was wondering about machine learning/data science project structure and workflow and was reading different opinions on the subject. When people start to talk about workflows, they want them to be reproducible. There are a lot of posts out there that suggest using make to keep a workflow reproducible. Although make is very stable and widely used, I personally like cross-platform solutions. It is 2019 after all, not 1977. One can argue that make itself is cross-platform, but in reality you will run into trouble and spend time fixing your tool rather than doing the actual work. So I decided to have a look around and check out what other tools are available. Yes, I decided to spend some time on tools.


                        This post is more an invitation to a dialogue than a tutorial. Perhaps your solution is perfect. If it is, it will be interesting to hear about it.

                        In this post I will use a small Python project and will do the same automation tasks with different systems:


                        There will be a comparison table at the end of the post.
                      • Indexes in PostgreSQL — 8 (RUM)

                        • Translation
                        We have already discussed the PostgreSQL indexing engine, the interface of access methods, and the main access methods: hash indexes, B-trees, GiST, SP-GiST, and GIN. In this article, we will see how gin turns into rum.

                        RUM


                        Although the authors claim that gin is a powerful genie, the theme of drinks has eventually won: next-generation GIN has been called RUM.

                        This access method expands the concept that underlies GIN and enables us to perform full-text search even faster. In this series of articles, this is the only method that is not included in the standard PostgreSQL distribution and is an external extension. Several installation options are available for it:

                        • Take a «yum» or «apt» package from the PGDG repository. For example, if you installed PostgreSQL from the «postgresql-10» package, also install «postgresql-10-rum».
                        • Build from the source code on GitHub and install it on your own (the instructions are there as well).
                        • Use as a part of Postgres Pro Enterprise (or at least read the documentation from there).

                        Limitations of GIN


                        What limitations of GIN does RUM enable us to transcend?

                        First, the «tsvector» data type contains not only lexemes, but also information on their positions inside the document. As we observed last time, the GIN index does not store this information. For this reason, phrase search operations, which appeared in version 9.6, are supported by the GIN index inefficiently and have to access the original data for a recheck.

                        Second, search systems usually return the results sorted by relevance (whatever that means). We can use the ranking functions «ts_rank» and «ts_rank_cd» to this end, but they have to be computed for each row of the result, which is certainly slow.

                        To a first approximation, the RUM access method can be considered as GIN that additionally stores position information and can return the results in the needed order (like GiST can return nearest neighbors). Let's move step by step.
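
The first limitation can be illustrated with a toy model. The Python sketch below is not how PostgreSQL, GIN, or RUM are implemented: `GinLikeIndex` and `RumLikeIndex` are hypothetical classes, and lexemes are plain lowercased words with zero-based positions. It only shows why an index without positions can narrow a phrase query to candidates that still need a recheck against the documents, while an index that stores positions can answer the phrase query by itself.

```python
from collections import defaultdict


def tokenize(text):
    """Very rough stand-in for lexeme extraction: lowercase and split on whitespace."""
    return text.lower().split()


class GinLikeIndex:
    """Toy inverted index: lexeme -> set of document ids (no positions)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for lexeme in tokenize(text):
            self.postings[lexeme].add(doc_id)

    def candidates(self, phrase):
        """Documents containing every word of the phrase, in any order.

        Word order is unknown here, so each candidate would have to be
        rechecked against the original document.
        """
        words = tokenize(phrase)
        if not words:
            return set()
        return set.intersection(*(self.postings.get(w, set()) for w in words))


class RumLikeIndex:
    """Toy inverted index: lexeme -> {document id: [positions]}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        for position, lexeme in enumerate(tokenize(text)):
            self.postings[lexeme].setdefault(doc_id, []).append(position)

    def phrase_search(self, phrase):
        """Documents where the words occur consecutively, decided from the index alone."""
        words = tokenize(phrase)
        if not words:
            return []
        per_word = [self.postings.get(w, {}) for w in words]
        common = set.intersection(*(set(p) for p in per_word))
        matches = []
        for doc_id in sorted(common):
            starts = per_word[0][doc_id]
            # The phrase matches if word i appears exactly i positions after the first word
            if any(
                all(start + i in per_word[i][doc_id] for i in range(1, len(words)))
                for start in starts
            ):
                matches.append(doc_id)
        return matches


documents = {
    1: "the quick brown fox",
    2: "brown the quick fox",
    3: "lazy dog",
}
gin = GinLikeIndex()
rum = RumLikeIndex()
for doc_id, text in documents.items():
    gin.add(doc_id, text)
    rum.add(doc_id, text)

gin_candidates = gin.candidates("quick fox")   # both documents 1 and 2 contain the two words
rum_matches = rum.phrase_search("quick fox")   # only document 2 has them side by side
```

In this toy data, document 1 is a false candidate for the phrase: the position-less index cannot rule it out without reading the document, whereas the position-aware index can.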
                      • Introducing Windows Terminal

                          We are beyond excited to announce Windows Terminal! Windows Terminal is a new, modern, fast, efficient, powerful, and productive terminal application for users of command-line tools and shells like Command Prompt, PowerShell, and WSL.



                          Windows Terminal will be delivered via the Microsoft Store in Windows 10 and will be updated regularly, ensuring you are always up to date and able to enjoy the newest features and latest improvements with minimum effort.

