
Machine learning

The basis of artificial intelligence


Building a GPT-like Model from Scratch with Detailed Theory and Code Implementation


Unlock the power of Transformer neural networks and learn how to build your own GPT-like model from scratch. In this in-depth guide, we will delve into the theory and provide a step-by-step code implementation to help you create your own miniGPT model. The final code is only 400 lines and works on both CPUs and GPUs. If you want to jump straight to the implementation, here is the GitHub repo.

Transformers are revolutionizing the world of artificial intelligence. This simple but very powerful neural network architecture, introduced in 2017, has quickly become the go-to choice for natural language processing, generative AI, and more. With the help of transformers, we've seen the creation of cutting-edge AI products like BERT, GPT-x, DALL-E, and AlphaFold, which are changing the way we interact with language and solve complex problems like protein folding. And the exciting possibilities don't stop there: transformers are also making waves in computer vision with the advent of Vision Transformers.
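
To make the idea concrete, here is a minimal sketch of the core building block of such a model in PyTorch: masked (causal) self-attention wrapped in a residual Transformer block. The hyperparameters and structure are illustrative, not the article's exact 400-line implementation.

```python
# A minimal sketch of a GPT-style block in PyTorch. Sizes and structure
# are illustrative, not the article's exact implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, d_model=128, n_heads=4, max_len=256):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused query/key/value projection
        self.proj = nn.Linear(d_model, d_model)
        # Lower-triangular mask: token i may only attend to tokens <= i.
        mask = torch.tril(torch.ones(max_len, max_len))
        self.register_buffer("mask", mask.view(1, 1, max_len, max_len))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # (B, T, C) -> (B, heads, T, head_dim)
        q = q.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
        k = k.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
        v = v.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / (k.size(-1) ** 0.5)
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        y = F.softmax(att, dim=-1) @ v              # weighted sum of values
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)

class Block(nn.Module):
    """One Transformer block: attention and MLP, each with residual + LayerNorm."""
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)
        self.attn = CausalSelfAttention(d_model, n_heads)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))
        return x + self.mlp(self.ln2(x))
```

A full miniGPT then stacks several such blocks on top of token and position embeddings and finishes with a linear head that predicts the next token.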


Our new public speech synthesis in super-high quality, 10x faster and more stable




In our last article we made a bunch of promises about our speech synthesis.


After a lot of hard work, we have finally delivered on these promises:


  • Model size reduced 2x;
  • New models are 10x faster;
  • We added flags to control stress;
  • Now the models can make proper pauses;
  • High quality voice added (and unlimited "random" voices);
  • All speakers squeezed into the same model;
  • Input length limitations lifted; the models can now work with paragraphs of text;
  • Pauses, speed and pitch can be controlled via SSML (see the sketch after this list);
  • Sampling rates of 8, 24 or 48 kHz are supported;
  • The models are much more stable and no longer omit words.
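
For illustration, this is roughly what SSML-controlled input could look like. A minimal sketch assuming the standard SSML tags for pauses and prosody; exactly which tags a given release supports, and how the string is passed to the models, is an assumption to check against the repo's documentation.

```python
# A minimal SSML sketch using the standard <break> and <prosody> tags.
# Assumption: the models accept an SSML string like this one; see the
# repo's documentation for the exact entry point and supported tags.
ssml_text = """
<speak>
  Hello!<break time="500ms"/> This sentence follows a proper pause.
  <prosody rate="slow">This part is spoken more slowly,</prosody>
  <prosody pitch="high">and this one with a higher pitch.</prosody>
</speak>
"""
```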

This is a truly breakthrough achievement for us, and we are not planning to stop anytime soon. We will be adding as many languages as possible shortly (the CIS languages, English, European languages, Indic languages). We are also still planning to make our models an additional 2-5x faster.


We are also planning to add phonemes and a new model for stress, as well as to reduce the minimum amount of audio required to train a high-quality voice to 5-15 minutes.


As usual, you can try our models in our repo or in Colab.


Mode on: Comparing the two best colorization AIs


This article continues a series of notes about colorization. In today's experiment, we'll be comparing a recent neural network with the good old DeOldify to gauge the rate at which the future is approaching.

This is a practical project, so we won't pay extra attention to the underlying philosophy of the Transformer architecture. Besides, any attempt to explain the principles of its operation to a wide audience in hand-waving terms would be misleading.

A lecturer: Mr. Petrov! How does a transformer work?
Petrov with a bass voice: Hum-m-m-m.


Google Colorizing Transformer vs DeOldify


Toxic Comments Detection in Russian


Currently, social networks tend to be among the major communication platforms in both offline and online space. Freedom of expression of various points of view, including toxic, aggressive, and abusive comments, might have a long-term negative impact on people's opinions and social cohesion. As a consequence, the ability to automatically identify and moderate toxic content on the Internet to eliminate its negative consequences is one of the necessary tasks for modern society. This paper aims at the automatic detection of toxic comments in the Russian language. As a source of data, we utilized an anonymously published Kaggle dataset and additionally validated its annotation quality. To build a classification model, we performed fine-tuning of two versions of the Multilingual Universal Sentence Encoder, Bidirectional Encoder Representations from Transformers (BERT), and ruBERT. Fine-tuned ruBERT achieved F1 = 92.20%, demonstrating the best classification score. We made the trained models and code samples publicly available to the research community.
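
As a rough illustration of the fine-tuning step, here is a hedged sketch using the Hugging Face Transformers API. The checkpoint name, hyperparameters, and dataset layout are illustrative assumptions, not the authors' exact setup.

```python
# A hedged fine-tuning sketch: ruBERT with a binary classification head.
# Checkpoint, hyperparameters and dataset layout are assumptions.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "DeepPavlov/rubert-base-cased"   # assumed ruBERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=2)                 # toxic vs. non-toxic

def encode(batch):
    # Truncate and pad comments to a fixed length for batching.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

# train_ds and val_ds are assumed to be datasets with "text" and "label"
# columns (e.g. loaded with the `datasets` library and mapped through encode).
args = TrainingArguments(output_dir="rubert-toxic",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=val_ds)
# trainer.train()
```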

More than a game: Mastering Mahjong with AI and machine learning



Microsoft researchers have developed an artificial intelligence (AI) system that has taught itself the intricacies of Mahjong and can now match the skills of some of the world’s top players.

The complex board game of chance, bluff, and strategy was invented in China thousands of years ago and remains a passionate pastime for millions of Asians today, with many dedicated competitors playing online.

Computers have learned to play Chess and another ancient Chinese game, Go, amid much fanfare in the past. But scientists at Microsoft Research (MSR) Asia see their achievement as far more than just a case of technology mastering yet another game.

The researchers – who named their system Super Phoenix, or Suphx for short – developed a series of AI algorithmic breakthroughs to navigate the uncertain nature of Mahjong. With more work, these could potentially be applied in real situations to solve problems thrown up by unknown factors and random events.

How we made landmark recognition in Cloud Mail.ru, and why



With the advent of mobile phones with high-quality cameras, we started making more and more pictures and videos of bright and memorable moments in our lives. Many of us have photo archives that extend back over decades and comprise thousands of pictures, which makes them increasingly difficult to navigate. Just remember how long it took to find a picture of interest just a few years ago.

One of Mail.ru Cloud’s objectives is to provide the handiest means for accessing and searching your own photo and video archives. For this purpose, we at Mail.ru Computer Vision Team have created and implemented systems for smart image processing: search by object, by scene, by face, etc. Another spectacular technology is landmark recognition. Today, I am going to tell you how we made this a reality using Deep Learning.

Internet of Things (IoT) is going to Change the World. Future of IoT

For the past two years, there's been a lot of buzz about the Internet of Things (IoT). This has led to the rapid adoption of connected devices across industries, a number that is set to pass the 11 billion mark by the end of the year. Major companies now include IoT software development among their core services.

All these “things” are now creating things of their own, namely lots and lots of data. This data will be at the core of commercial and industrial digital transformation (which is essentially the underlying force behind the fourth industrial revolution).

In other words, life as we know it is about to change forever! How is it going to change? Let’s take a look.

1. AI (Artificial Intelligence) Can Effectively Manage Oceans of Information

We can’t talk about IoT without AI as the latter has the power to make IoT a whole lot smarter and more efficient.

In fact, consultants believe that AI is the brains behind IoT systems, helping them run more smoothly.

For example, as more and more connected devices start communicating with each other, enterprises will need to leverage deep learning, image recognition, natural language processing, and neural-network-driven decision making to help devices understand each other (and us humans) better.

So far, IoT has felt like an isolated experience where it was just about simple data. Going forward, businesses will strive to achieve highly integrated experiences by using AI to better understand their employees, customers, and the general public living in smart cities.

Machine Learning and Theory of Constraints

Backlog prioritization requires simplifying and weighting tasks. Each task belongs to a strategy, such as ad acquisition or CRO. In retail, we may treat turnover, operational costs, and other metrics as inputs, and profit margin and ROI as outputs. The ideal goal is to find a 20/80 solution and focus resources on a single strategy at a time. The metrics tied to the strategies give the dimensionality of the model. Sometimes unit-economics relations are violated because of non-linearity; in practice this means low or insignificant correlation and poor regression. For example, it is impossible to separate acquisition from conversion: the quantity of acquisition affects its quality and vice versa. Decomposing tasks and strategies assumes a linear decomposition of a nonlinear system. Besides, a nonlinear statistical evaluation of strategies is required when the CJM can't be tracked or online and offline channels can't be separated.
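
To see why such non-linearity breaks a linear decomposition, here is a small self-contained illustration (my own toy example, not from the article): when acquisition volume degrades conversion quality, a linear regression of profit on volume fits poorly.

```python
# Toy illustration: quantity of acquisition affects its quality, so profit
# is a nonlinear (here, non-monotonic) function of volume, and a linear
# regression shows low explanatory power. All numbers are made up.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
volume = rng.uniform(100, 1000, size=200)        # acquisition volume
conversion = 0.2 * np.exp(-volume / 400)         # quality drops as quantity grows
profit = volume * conversion + rng.normal(0, 2, size=200)

X = volume.reshape(-1, 1)
r2 = LinearRegression().fit(X, profit).score(X, profit)
print(f"linear R^2 = {r2:.2f}")  # well below 1: the linear model misses the interaction
```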

Contextual Emotion Detection in Textual Conversations Using Neural Networks


Nowadays, talking to conversational agents is becoming a daily routine, and it is crucial for dialogue systems to generate responses that are as human-like as possible. As one of the main aspects, primary attention should be given to providing emotionally aware responses to users. In this article, we describe a recurrent neural network architecture for emotion detection in textual conversations that participated in SemEval-2019 Task 3 “EmoContext”, part of the annual workshop on semantic evaluation. The task objective is to classify emotions (happy, sad, angry, and others) in a 3-turn conversational data set.
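
As a rough sketch of what such a classifier can look like, here is a minimal bidirectional-LSTM model in PyTorch. The vocabulary size, dimensions, and the idea of feeding the three concatenated turns as one token sequence are illustrative assumptions, not the competition model itself.

```python
# A minimal emotion classifier sketch: embeddings -> BiLSTM -> linear head.
# All sizes are illustrative; the input is assumed to be the three turns of
# a conversation concatenated into one padded id sequence.
import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=100, hidden=128, n_classes=4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)  # happy / sad / angry / others

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer ids of the concatenated turns.
        _, (h, _) = self.lstm(self.emb(token_ids))
        h = torch.cat([h[-2], h[-1]], dim=-1)       # final states, both directions
        return self.fc(h)                           # (batch, n_classes) logits
```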

AI-Based Photo Restoration



Hi everybody! I'm a research engineer on the Mail.ru Group computer vision team. In this article, I'm going to tell the story of how we created an AI-based photo restoration project for old military photos. What is "photo restoration"? It consists of three steps:

  • we find all the image defects: fractures, scuffs, holes;
  • we inpaint the discovered defects, based on the pixel values around them;
  • we colorize the image.

Further, I’ll describe every step of photo restoration and tell you how we got our data, what nets we trained, what we accomplished, and what mistakes we made.
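
Put together, the pipeline boils down to the following sketch. Every function here is a hypothetical placeholder standing in for a trained network; this is the shape of the pipeline, not Mail.ru's actual code.

```python
# High-level sketch of the three-step restoration pipeline. The functions
# find_defects, inpaint and colorize are hypothetical placeholders, each
# standing in for a separately trained network.
def restore(photo):
    defect_mask = find_defects(photo)    # step 1: segment fractures, scuffs, holes
    clean = inpaint(photo, defect_mask)  # step 2: fill defects from surrounding pixels
    return colorize(clean)               # step 3: predict colors for the grayscale image
```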

Improve your mobile application using machine learning technology

Today, even mobile application development companies have begun to combine ML with other cutting-edge technologies, such as AI and predictive analytics. This is because ML enables mobile applications to learn, adapt, and improve over time.

It's an incredible accomplishment when you consider that, previously, devices would only execute a particular action on an explicit instruction from developers. When this was the standard, programmers had to anticipate and account for every conceivable scenario (and this was a formidable challenge).

With ML in mobile applications, however, we have taken the guessing game out of the equation. ML can also enhance the user experience (UX) by understanding user behavior. So you can bet that ML on mobile won't be limited to voice assistants and chatbots.

A selection of datasets for machine learning

Hi guys,

This article is a guide to open datasets for machine learning. To start, I will collect a selection of interesting and (relatively) fresh datasets. And as a bonus, at the end of the article, I will attach useful links for searching for datasets on your own.

Fewer words, more data.


A selection of datasets for machine learning:



Google News and Leo Tolstoy: visualizing Word2Vec word embeddings using t-SNE


Everyone perceives texts in their own way, whether they read news on the Internet or world-famous classic novels. This also applies to the variety of algorithms and machine learning techniques that understand texts in a more mathematical way, namely, through a high-dimensional vector space.

This article is devoted to visualizing high-dimensional Word2Vec word embeddings using t-SNE. The visualization can be useful to understand how Word2Vec works and how to interpret relations between vectors captured from your texts before using them in neural networks or other machine learning algorithms. As training data, we will use articles from Google News and classical literary works by Leo Tolstoy, the Russian writer who is regarded as one of the greatest authors of all time.

We will go through a brief overview of the t-SNE algorithm, then move on to computing word embeddings with Word2Vec, and finally proceed to visualizing the word vectors with t-SNE in 2D and 3D space. We will write our scripts in Python using Jupyter Notebook.
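
The whole pipeline condenses to a few lines. A minimal sketch assuming the gensim 4.x API and a toy corpus; the hyperparameters are illustrative, not the article's exact settings.

```python
# Word2Vec embeddings + 2-D t-SNE projection, condensed to a toy example.
# gensim 4.x API assumed; hyperparameters are illustrative.
import matplotlib.pyplot as plt
from gensim.models import Word2Vec
from sklearn.manifold import TSNE

sentences = [["anna", "karenina", "is", "a", "novel"],
             ["war", "and", "peace", "is", "a", "novel"]]   # toy corpus
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)

words = model.wv.index_to_key
vectors = model.wv[words]        # (n_words, 100) embedding matrix

# Project the 100-d vectors down to 2-d; perplexity must stay below n_words.
xy = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(vectors)

plt.scatter(xy[:, 0], xy[:, 1])
for word, (x, y) in zip(words, xy):
    plt.annotate(word, (x, y))
plt.show()
```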


Announcing ML.NET 1.0 RC – Machine Learning for .NET


ML.NET is an open-source and cross-platform machine learning framework (Windows, Linux, macOS) for .NET developers. Using ML.NET, developers can leverage their existing tools and skill sets to develop and infuse custom AI into their applications, creating custom machine learning models for common scenarios like sentiment analysis, recommendation, image classification, and more!


Today we're announcing the ML.NET 1.0 RC (Release Candidate, version 1.0.0-preview), which is the last preview release before the final ML.NET 1.0 RTM ships in Q2 2019.


Soon we will be closing out the first major milestone of a great journey in the open that started in May 2018 with the release of ML.NET 0.1 as open source. Since then we've been releasing monthly: 12 preview releases so far, as shown in the roadmap below:



In this release (ML.NET 1.0 RC) we have, for now, concluded our main API changes. For the next sprint, we are focusing on improving documentation and samples, and addressing major critical issues if needed.


The goal is to avoid any new breaking changes moving forward.


Developer’s Guide to Building AI Applications


Create your first intelligent bot with Microsoft AI


Artificial intelligence (AI) is accelerating the digital transformation for every industry, with examples spanning manufacturing, retail, finance, healthcare, and many others. At this rate, every industry will be able to use AI to amplify human ingenuity. In this e-book, Anand Raman and Wee Hyong Tok from Microsoft provide a comprehensive roadmap for developers to build their first AI-infused application.


Using a Conference Buddy as an example, you’ll learn the key ingredients needed to develop an intelligent chatbot that helps conference participants interact with speakers. This e-book provides a gentle introduction to the tools, infrastructure, and services on the Microsoft AI Platform, and teaches you how to create powerful, intelligent applications.


We're in UltraHD, Morty! How to watch any movie in 4K

You've probably heard about Yandex's DeepHD technology, which they once used to improve the quality of old Soviet cartoons. Unfortunately, it's not public yet, and we, regular programmers, don't have the dedication to write our own solution. But I personally really wanted to watch Rick and Morty on my 2880x1880 Retina display. And I was deeply disappointed: even 1080p video (the highest available for this series) looks really blurry on a Retina display! Don't get me wrong, 1080p is often good enough, but Retina is designed in such a way that animation, with its pronounced outlines, looks awfully blurry in 1080p, like 480p on a FullHD monitor.

I decided I want to see Rick and Morty in 4K, even though I can’t write neural networks. And, amazingly, I found a solution. You don’t even need to write any code: all you need is around 100GB of free space and a bit of patience. The result is a sharp 4K image that looks better than any interpolation.



Detecting Web Attacks with a Seq2Seq Autoencoder


Attack detection has been a part of information security for decades. The first known intrusion detection system (IDS) implementations date back to the early 1980s.

Nowadays, an entire attack detection industry exists. There are a number of kinds of products—such as IDS, IPS, WAF, and firewall solutions—most of which offer rule-based attack detection. The idea of using some kind of statistical anomaly detection to identify attacks in production doesn’t seem as realistic as it used to. But is that assumption justified?
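
The core idea is simple: train a sequence-to-sequence autoencoder on normal requests only, then flag requests it reconstructs poorly. Below is a minimal PyTorch sketch of that idea; the architecture, byte-level tokenization, and scoring are illustrative assumptions, not the authors' exact model.

```python
# A hedged sketch: an LSTM seq2seq autoencoder is trained to reconstruct
# normal HTTP requests; a high reconstruction error then flags a request
# as anomalous. Architecture and sizes are illustrative only.
import torch
import torch.nn as nn

class Seq2SeqAE(nn.Module):
    def __init__(self, vocab=256, emb=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)       # one id per byte/char
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x):
        _, state = self.encoder(self.emb(x))      # compress request to a state
        y, _ = self.decoder(self.emb(x), state)   # teacher-forced reconstruction
        return self.out(y)                        # (batch, seq_len, vocab) logits

def anomaly_score(model, x):
    # Mean cross-entropy of reconstructing the request from itself;
    # higher means the request looks less like the training traffic.
    with torch.no_grad():
        logits = model(x)
    return nn.functional.cross_entropy(
        logits.transpose(1, 2), x, reduction="mean").item()
```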

ML.NET Tutorial — Get started in 10 minutes

Last year we announced ML.NET, a cross-platform, open-source machine learning framework for .NET developers. Since then, it has evolved greatly and gone through many versions. Today we are sharing a guide on how to create your first ML.NET application in 10 minutes.


Progress and hype in AI research


The biggest issue with AI is not that it is stupid but the lack of a definition of intelligence and, hence, the lack of a formal measure for it [1a] [1b].


The Turing test is not a good measure, because the gorilla Koko [2a] and the bonobo Kanzi [2b] wouldn't pass it, though they could solve more problems than many disabled human beings.


It is quite possible that people in the future will wonder why people back in 2019 thought that an agent trained to play a fixed game in a simulated environment, such as Go, had any intelligence [3a] [3b] [3c] [3d] [3e] [3f] [3g] [3h].


Intelligence is more about applying and transferring old knowledge to new tasks (playing Quake Arena well enough, without any training, after mastering Doom) than about compressing an agent's experience into heuristics that predict the game score and determine the agent's action in a given game state so as to maximize the final score (playing Quake Arena well enough after a million games, after mastering Doom) [4].


Human intelligence is about the ability to adapt to the physical and social world. Playing Go is a particular adaptation performed by human intelligence; developing an algorithm to learn to play Go is a more performant one; and developing a mathematical theory of Go might be even more performant still.
