• High-Quality Text-to-Speech Made Accessible, Simple and Fast


      There is a lot of commotion in text-to-speech right now: a great variety of toolkits, a plethora of commercial APIs from GAFA companies (based on both new and older technologies), and many Silicon Valley startups trying to ship products akin to "deep fakes" in speech.

      But despite all this ruckus we have not yet seen open solutions that would fulfill all of these criteria:

      • Natural-sounding speech;
      • A large library of voices in many languages;
      • Support for 16 kHz and 8 kHz out of the box;
      • No GPUs / ML engineering team / training required;
      • Unique voices that do not infringe upon third-party licenses;
      • High throughput on slow hardware, with decent performance on one CPU thread;
      • Minimalism and lack of dependencies. One-line usage, no builds or coding in C++ required;
      • Positioned as a solution, not yet another toolkit / compilation of models developed by other people;
      • Not affiliated in any way with the ecosystems of Google / Yandex / Sberbank.

      We decided to share with the community our open non-commercial solution that fits all of these criteria. Since we have published the whole pipeline, we do not focus much on cherry-picked examples, and we encourage you to visit our project's GitHub repo to test our TTS for yourself.

      Read more →
    • Converting text into algebra

      • Translation

      Algebra and language (writing) are two different learning tools. When they are combined, we can expect new methods of machine understanding to emerge. To determine meaning (to understand) is to calculate how the part relates to the whole. Modern search algorithms already perform the task of meaning recognition, and Google’s tensor processors perform the matrix multiplications (convolutions) that an algebraic approach requires. At the same time, semantic analysis mainly relies on statistical methods. Using statistics in algebra, for instance when looking for divisibility rules for numbers, would simply be strange. An algebraic apparatus is also useful for interpreting the results of calculations when recognizing the meaning of a text.

      Read more →
    • Playing with Nvidia's New Ampere GPUs and Trying MIG

        Every time the essential question arises of whether to upgrade the cards in the server room, I look through similar articles and watch such videos.

        The channel with the aforementioned video is very underrated, but its author does not deal with ML. In general, when analyzing comparisons of accelerators for ML, several things usually catch your eye:

        • The authors usually take into account only the "adequacy" of new cards for the United States market;
        • The ratings are detached from everyday practice and are made on very standard networks (which is probably good overall) without details;
        • The popular mantra of training more and more gigantic models makes its own adjustments to the comparison.

        The answer to the question "which card is better?" is not rocket science: cards of the 20* series never became very popular, while 1080 Ti cards from Avito (the Russian Craigslist) are still very attractive (and, oddly enough, are not getting cheaper, probably for this reason).

        All this is fine and dandy, and the standard benchmarks are unlikely to lie too much. But recently I learned about Multi-Instance GPU (MIG) technology for A100 cards and native TF32 support on Ampere devices, so I decided to share my experience of actually testing cards based on the Ampere architecture (3090 and A100). In this short note, I will try to answer the following questions:

        • Is the upgrade to Ampere worth it? (spoiler for the impatient: yes);
        • Is the A100 worth the money? (spoiler: in general, no);
        • Are there any cases when the A100 is still interesting? (spoiler: yes);
        • Is MIG technology useful? (spoiler: yes, but for inference and, in training, only for very specific cases).
        Read more →
      • How to find an English teacher. Part 2


          This is a continuation of the story about using data science to find an English teacher. If you have not read the first part yet, now is a good opportunity to become familiar with it.

          In brief: we had information about language teachers and tried to apply some basic ideas using pandas and our own expectations. Unfortunately, we got stuck on the third step, because there was not enough information to satisfy our last requirement: we need no more than 3 candidates at the end.

          This approach is based on my own experience and may not suit your point of view, ideas, or principles.
          Read more →
        • How to find an English teacher. Part 1

            In the modern world, ideas about using data science for extra benefit keep popping up here and there. For instance, Google can use your history of watched videos to recommend new ones. Online shops use recommendation systems to increase your order total. However, if companies use data for their benefit, could we do the same for our own needs, such as looking for an online English teacher?


            This approach is based on my own experience and may not suit your point of view, ideas, or principles.

            Read more →
          • Keyword Tree: graph analysis for semantic extraction


              This post is a short summary of full-scale research focused on keyword recognition. The semantics-extraction technique was initially applied in social media research on depressive patterns. Here I focus on the NLP and math aspects, without psychological interpretation. It is clear that analysis of single-word frequencies is not enough. Randomly shuffling the collection any number of times does not affect the relative frequencies but totally destroys the information: the bag-of-words effect. We need a more accurate approach for mining semantic attractors.
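
The bag-of-words effect mentioned above can be illustrated with a minimal sketch. It is not the method from the full research: the sample sentence and the helper `relative_frequencies` are hypothetical and serve only to show that shuffling leaves word frequencies untouched.

```python
import random
from collections import Counter


def relative_frequencies(tokens):
    """Return each word's share of the total token count (hypothetical helper)."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}


# Illustrative sentence, not taken from the research data.
text = "the dog bites the man and the man calls the doctor"
tokens = text.split()

# Shuffle a copy of the tokens; the fixed seed only makes the run repeatable.
shuffled = tokens[:]
random.Random(0).shuffle(shuffled)

# The frequency table cannot tell the original from the shuffled text.
same_frequencies = relative_frequencies(tokens) == relative_frequencies(shuffled)
print(same_frequencies)
print(" ".join(shuffled))
```

Any frequency-only statistic sees the two token sequences as identical, even though only the original order says who bites whom. This is why an approach that keeps track of word relations is needed.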

              Read more →