• A Selection of Datasets for Machine Learning

      Hi guys,

      This article is a guide to open datasets for machine learning. To start, I have collected a selection of interesting and relatively fresh datasets. As a bonus, the end of the article has useful links for finding datasets on your own.

      Fewer words, more data.


      A selection of datasets for machine learning:


      Read more →
    • How to make micro-payments possible in your app

      I spent this week coding my very first public pet app: a Telegram chat bot that acts as a Bitcoin wallet and lets Telegram users send and receive tips among themselves and with other so-called “Lightning Apps”. I assume you are familiar with Bitcoin and Telegram in general, so I’ll keep this post short and avoid going deep into details. More resources about Bitcoin can be found here, and Telegram is simply an instant messenger that lets you create custom apps (chat bots) on its platform.


      What are the key points of such an app?


      • Lets you rate other users' ideas and answers with real value instead of
        ‘virtual likes’. This brings online conversation to a completely new level
      • A real example of a working micro-payment app that can interact with other
        entities over the internet using an open protocol
      • All the modules are open-source projects and can be easily reused and adjusted
        for your own project. The app does not rely on third-party commercial services.
        Even though it falls within the e-commerce field, which is currently almost
        closed, the app is based on open solutions.

      What are the use-cases?


      something like this…

      image
      Read more →
    • Even more secret Telegrams

        We are used to thinking of Telegram as a reliable and secure transmission medium for messages of any sort. But under the hood, it uses a rather ordinary combination of asymmetric and symmetric encryption. Where's the fun in that? And anyway, why would anyone trust their messages to a third party?
        Spy vs Spy by Antonio Prohías
        TL;DR — inventing a private covert channel over users blocking each other.
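        The teaser gives only the idea, not the protocol, so the following is a minimal sketch under loud assumptions: the sender toggles a "blocked" flag once per time slot, and the receiver can observe that flag. The `BlockList` class and the `transmit` helper are hypothetical and simulate the channel in memory; nothing here calls a real Telegram API.

```python
from typing import List


class BlockList:
    """Hypothetical stand-in for the 'sender has blocked receiver' state.

    Assumption: the receiver can somehow observe this single flag.
    """

    def __init__(self) -> None:
        self.blocked = False


def text_to_bits(message: str) -> List[int]:
    """Encode text as UTF-8 and unpack each byte into 8 bits, MSB first."""
    bits = []
    for byte in message.encode("utf-8"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def bits_to_text(bits: List[int]) -> str:
    """Pack bits (MSB first) back into bytes and decode them as UTF-8."""
    data = bytearray()
    for start in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8")


def transmit(message: str, channel: BlockList) -> str:
    """Simulate one bit per time slot: block means 1, unblock means 0."""
    received = []
    for bit in text_to_bits(message):
        channel.blocked = bool(bit)            # sender sets the flag
        received.append(int(channel.blocked))  # receiver samples the flag
    return bits_to_text(received)
```

        A real channel would also need slot timing and error handling, which this simulation leaves out.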

        Read more →
      • Build tools in machine learning projects, an overview

          I was wondering about the structure and workflow of machine learning/data science projects and read various opinions on the subject. When people talk about workflows, they want them to be reproducible. Many posts suggest using make to keep a workflow reproducible. Although make is very stable and widely used, I personally prefer cross-platform solutions. It is 2019 after all, not 1977. One can argue that make itself is cross-platform, but in reality you will run into trouble and spend time fixing your tool rather than doing the actual work. So I decided to look around and see what other tools are available. Yes, I decided to spend some time on tools.


          This post is more an invitation to a dialogue than a tutorial. Perhaps your solution is perfect. If it is, it would be interesting to hear about it.

          In this post I will use a small Python project and will do the same automation tasks with different systems:


          There will be a comparison table at the end of the post.
          Read more →
        • Indexes in PostgreSQL — 8 (RUM)

          • Translation
          We have already discussed the PostgreSQL indexing engine, the interface of access methods, and the main access methods: hash indexes, B-trees, GiST, SP-GiST, and GIN. In this article, we will watch how gin turns into rum.

          RUM


          Although the authors claim that gin is a powerful genie, the theme of drinks eventually won: the next-generation GIN was named RUM.

          This access method expands the concept that underlies GIN and enables us to perform full-text search even faster. In this series of articles, this is the only method that is not included in the standard PostgreSQL distribution and is an external extension. Several installation options are available for it:

          • Install the «yum» or «apt» package from the PGDG repository. For example, if you installed PostgreSQL from the «postgresql-10» package, also install «postgresql-10-rum».
          • Build from the source code on GitHub and install it yourself (the instructions are there as well).
          • Use it as a part of Postgres Pro Enterprise (or at least read the documentation from there).

          Limitations of GIN


          What limitations of GIN does RUM enable us to transcend?

          First, the «tsvector» data type contains not only lexemes, but also information on their positions inside the document. As we observed last time, the GIN index does not store this information. For this reason, phrase search operations, which appeared in version 9.6, are supported by the GIN index inefficiently and have to access the original data for a recheck.

          Second, search systems usually return the results sorted by relevance (whatever that means). We can use the ranking functions «ts_rank» and «ts_rank_cd» for this, but they have to be computed for each row of the result, which is certainly slow.

          To a first approximation, the RUM access method can be considered a GIN that additionally stores position information and can return the results in the needed order (like GiST can return nearest neighbors). Let's move step by step.
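          The first limitation can be illustrated with a toy inverted index. This is only a sketch of the general idea, not the on-disk format of GIN or RUM: `build_index` and `phrase_search` are hypothetical helpers, and the documents are made up. Without stored positions, every candidate document has to be re-read to confirm a phrase; with positions, the index alone is enough.

```python
from collections import defaultdict


def build_index(docs, with_positions):
    """Map each word to {doc_id: [positions]}; positions stay empty if disabled."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            positions = index[word].setdefault(doc_id, [])
            if with_positions:
                positions.append(pos)
    return index


def phrase_search(index, docs, phrase, with_positions):
    """Return (matching doc ids, number of original documents re-read)."""
    words = phrase.lower().split()
    if not words:
        return [], 0
    # Candidate documents contain every word of the phrase, in any order.
    candidates = set(docs)
    for word in words:
        candidates &= set(index.get(word, {}))
    matches, rechecks = [], 0
    for doc_id in sorted(candidates):
        if with_positions:
            # Positions in the index are enough to verify word adjacency.
            found = any(
                all(start + i in index[w][doc_id] for i, w in enumerate(words))
                for start in index[words[0]][doc_id]
            )
        else:
            # No positions: fall back to reading the original text (recheck).
            rechecks += 1
            tokens = docs[doc_id].lower().split()
            found = any(
                tokens[i:i + len(words)] == words
                for i in range(len(tokens) - len(words) + 1)
            )
        if found:
            matches.append(doc_id)
    return matches, rechecks


docs = {
    1: "gin turns into rum",
    2: "rum turns gin sour",
    3: "no drinks here",
}
plain = build_index(docs, with_positions=False)
positional = build_index(docs, with_positions=True)
```

          Both indexes find the same documents for a phrase, but only the plain one has to touch the original text of each candidate.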
          Read more →
        • Introducing Windows Terminal

            We are beyond excited to announce Windows Terminal! Windows Terminal is a new, modern, fast, efficient, powerful, and productive terminal application for users of command-line tools and shells like Command Prompt, PowerShell, and WSL.



            Windows Terminal will be delivered via the Microsoft Store in Windows 10 and will be updated regularly, ensuring you are always up to date and able to enjoy the newest features and latest improvements with minimum effort.


            Read more →
          • Docker container for HP servers management with ILO

            • Tutorial
            Origin in Russian

            You may wonder: why would I use a Docker container for such a purpose? What's the problem with logging in to the ILO web interface and managing the server as usual?

            I thought the same when I got a few old servers that required reprovisioning. The servers are located on a different continent, and the only interface I had was the ILO web interface. And when I had to enter a few manual commands via the Virtual Console, I discovered that it was hardly possible.

            Virtual Consoles for servers (both HP and Dell) usually rely on Java web applets. But Firefox and Chrome don't support them anymore, and the newest IcedTea doesn't work with those old systems anyway. So I had a few options:
            Read more →
          • Setting up network sales channels for DO-RA gadgets


              Introduction


              In early March 2019, the Intersoft Eurasia team completed work on a test batch of DO-RA gadgets: personal, cross-platform dosimeter-radiometers that monitor the radiation situation at the measurement site and are compatible with iOS and Android smartphones and tablets.

              By buying such a device, the user receives the following: reliable electronics that have undergone radiation testing in the factory laboratory, a stylish colored case in the spirit of Malevich ;) for every taste, gift packaging, color instruction inserts in Russian and English, a special USB charging cable, and the free, updatable DO-RA.Pro application from the App Store and Google Play.

              The next step in our project implementation is to find the best sales channels for Made in Russia products in the challenging environment of stagnant purchasing power.
              Read more →
            • In-App Updates Flexible Flow: Speed Up the App Update Process on Android



                Among the variety of new tools and features announced at Android Dev Summit, the In-App Updates (IAUs) API deserves special attention: it lets developers deliver features, bug fixes and performance improvements to active users faster. Since this feature was finally released after Google I/O 2019, in this article I’ll take a deep dive into the IAUs API, describe the recommended user flows in detail and provide some code samples. I'll also share some experience of integrating IAUs into the Pandao app, a marketplace platform for Chinese goods.
                Read more →
              • Breaking UC Browser



                  Introduction


                  At the end of March we reported on UC Browser's hidden ability to download and run unverified code. Today we will examine in detail how this happens and how hackers can use it.

                  Some time ago, UC Browser was promoted and distributed quite aggressively. It was installed on devices by malware, distributed via websites under the guise of video files (i.e., users thought they were downloading pornography or something, but instead were getting APK files with this browser), and advertised with alarming banners about a user’s browser being outdated or vulnerable. The official UC Browser VK group had a topic where users could complain about false advertising, and many users provided examples. In 2016, there was even a commercial in Russian (yes, a commercial for a browser that blocks commercials).

                  As we write this article, UC Browser has been installed 500,000,000 times from Google Play. This is impressive, since only Google Chrome has managed to top that. Among the reviews, you can see a lot of user complaints about advertising and being redirected to other applications on Google Play. This was the reason for our study: we wanted to see if UC Browser is doing something wrong. And it is! The application is able to download and run executable code, which violates Google Play’s policy for app publishing. And UC Browser doesn’t only download executable code; it does this unsafely, which can be exploited in a MitM attack. Let's see if we can use it this way.
                  Read more →
                • Visual Studio C++ Template IntelliSense Populates Based on Instantiations in Your Code

                    Ever since we announced Template IntelliSense, you all have given us great suggestions. One very popular suggestion was to have the Template Bar auto-populate candidates based on instantiations in your code. In Visual Studio 2019 version 16.1 Preview 2, we’ve added this functionality via an “Add All Existing Instantiations” option in the Template Bar dropdown menu. The following examples are from the SuperTux codebase. 


                    Read more →
                  • Improvements to Visual Studio App Center Distribution

                      Here at Visual Studio App Center, we try to incorporate customer obsession in our day to day. Earlier this year we started an effort for widespread customer outreach to understand our users and guide product prioritization. The effort helped us gain a lot of insight and helped our prioritization last quarter. However, as we continue to grow, we unfortunately don’t have the capacity to reach out to as many customers as we would like.


                      To continue to engage with as many customers as possible, we created a GitHub repo specifically for this purpose. We’ve been using the repo to track monthly iterations from the team, feature requests, and community interest for certain features. We are making changes to align our priorities for the upcoming quarters based on what our customers are requesting.


                      I wanted to highlight some of the changes we’ve made to the Distribution service based off what we learned from customer outreach and feedback. All of these changes are available now:


                      • Distributing releases to multiple destinations
                      • Distributing releases to individual testers
                      • Turning off email notifications for releases
                      • Disabling a release
                      • Making releases sortable

                      Read more →
                    • How We Find Lambda Expressions in IntelliJ IDEA

                      • Translation

                      Type Hierarchy in IntelliJ IDEA
                      Code search and navigation are important features of any IDE. In Java, one of the commonly used search options is searching for all implementations of an interface. This feature is often called Type Hierarchy, and it looks just like the image on the right.


                      It's inefficient to iterate over all project classes when this feature is invoked. One option is to save the complete class hierarchy in the index during compilation since the compiler builds it anyway. We do this when the compilation is run by the IDE and not delegated, for example, to Gradle. But this works only if nothing has been changed in the module after the compilation. In general, the source code is the most up-to-date information provider, and indexes are based on the source code.


                      Finding immediate children is a simple task if we are not dealing with a functional interface. When searching for implementations of the Foo interface, we need to find all the classes that have implements Foo and interfaces that have extends Foo, as well as new Foo(...) {...} anonymous classes. To do this, it is enough to build a syntax tree of each project file in advance, find the corresponding constructs, and add them to an index.
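                      The three constructs named above can be collected with a rough sketch like the following. Note the swap: the text describes building a syntax tree per file, while this example uses regular expressions as a much cruder stand-in, so it will miss or mis-detect many real-world cases. The function name `find_direct_subtypes` and the sample file names are hypothetical.

```python
import re


def find_direct_subtypes(sources, interface):
    """Scan Java source texts for direct children of `interface`.

    `sources` maps a file path to its text. Returns a sorted list of
    (path, kind, name) tuples, a toy version of an index entry.
    """
    name = re.escape(interface)
    patterns = {
        # class Bar ... implements ... Foo ... {
        "implements": re.compile(
            r"\bclass\s+(\w+)[^{]*\bimplements\b[^{]*\b" + name + r"\b"
        ),
        # interface Baz ... extends ... Foo ... {
        "extends": re.compile(
            r"\binterface\s+(\w+)[^{]*\bextends\b[^{]*\b" + name + r"\b"
        ),
        # new Foo(...) { ... }  (anonymous class)
        "anonymous": re.compile(r"\bnew\s+" + name + r"\s*\([^)]*\)\s*\{"),
    }
    index = []
    for path, text in sources.items():
        for kind, pattern in patterns.items():
            for match in pattern.finditer(text):
                # Anonymous classes have no name to capture.
                label = match.group(1) if pattern.groups else "<anonymous>"
                index.append((path, kind, label))
    return sorted(index)


sources = {
    "A.java": "class Bar implements Foo { }",
    "B.java": "interface Baz extends Foo { }",
    "C.java": "class Main { Foo f = new Foo() { }; }",
    "D.java": "class Other implements Runnable { }",
}
subtypes = find_direct_subtypes(sources, "Foo")
```

                      Here `subtypes` holds one entry each for `Bar`, `Baz`, and the anonymous class in `C.java`; `D.java` contributes nothing because it never mentions `Foo`.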

                      Read more →
                    • How Many Developers You Need to Create a Service Like Airbnb

                         Back in 2007, Brian Chesky and Joe Gebbia shared a room in San Francisco and were unable to pay rent on time. As a way out, they decided to turn their living space into a simple bed-and-breakfast hotel to get some money from travelers. A year later, the venturers launched a website which evolved into the most famous peer-to-peer renting service called Airbnb.

                         Now, the company has 3,100 employees and generates insane revenues for its founders. The statistics say that Airbnb has 150 million registered users, 3 million hosts, and 4 million listed offers. The service covers 80,000 cities in 190 countries, and, interestingly, 50% of traffic comes from mobile applications.

                          These figures are so impressive that you may also want to create your own Airbnb clone and become successful. But slow down. This story is already written; do you really need to create a marketplace similar to Airbnb?
                        Read more →
                      • Citymobil — a manual for improving availability amid business growth for startups. Part 5



                          This is the final part of the series describing how we’re increasing our service availability in Citymobil (you can read the previous part here). Now I’m going to talk about one more type of outage, the conclusions we drew from it, how we modified the development process, and what automation we introduced.
                          Read more →