Machine learning

The basis of artificial intelligence

Vitaly_net Aug 23 2022 at 18:40

Color image capturing device with pseudorandom patterns sets

4 min

764

Matlab, Machine learning, Image processing

From sandbox

The present invention relates to analog signal capturing devices in general and to monochrome or color image capture sensors in particular, such as a scanner or a Charge-Coupled Device (“CCD”) for video and photo cameras, which are almost free from moiré and aliasing. The present invention also relates to methods for enhancing the resolution of an image capture device and to a device for displaying digital color/grey images.

burzzo Aug 3 2022 at 03:13

FL_PyTorch is publicly available on GitHub

2 min

1.3K

Mathematics, Machine learning, Artificial Intelligence

FL_PyTorch: Optimization Research Simulator for Federated Learning is publicly available on GitHub.

FL_PyTorch is a suite of open-source software written in Python that builds on top of PyTorch, one of the most popular research Deep Learning (DL) frameworks. We built FL_PyTorch as a research simulator for FL to enable fast development, prototyping, and experimentation with new and existing FL optimization algorithms. Our system supports abstractions that give researchers sufficient flexibility to experiment with existing and novel approaches to advance the state of the art. The work appears in the proceedings of the 2nd International Workshop on Distributed Machine Learning (DistributedML 2021). The paper, presentation, and appendix are available in the DistributedML’21 Proceedings (https://dl.acm.org/doi/abs/10.1145/3488659.3493775).

The project is distributed as open source under the Apache License, Version 2.0. Code Repository: https://github.com/burlachenkok/flpytorch.

To become familiar with that tool, I recommend the following sequence of steps:

alexandervolchek Jul 12 2022 at 15:43

Metaverses: hype or the future to come?

5 min

1.5K

AR and VR, Social networks and communities, Artificial Intelligence, Machine learning, Decentralized networks

Alexander Volchek, IT entrepreneur, CEO of the educational platform GeekBrains

Pretty much everyone in the IT community is talking about metaverses, NFTs, blockchain and cryptocurrency. This time we will discuss metaverses, and come back to everything else in the letters to follow. Entrepreneurs and founders of tech giants are passionate about this idea, and investors are allocating millions of dollars to projects dealing with metaverses. Let's start with the basics.

alexandervolchek Jul 11 2022 at 13:02

What are neural networks and what do we need them for?

4 min

4.5K

Data Engineering, Machine learning, Mathematics

Explaining through simple examples

For a long time, people have been thinking about how to create a computer that could think like a person. The advent of artificial neural networks is a significant step in this direction. Our brain consists of neurons that receive information from sensory organs and process it: we recognize people we know by their faces, and we feel hungry when we see delicious food. All of this is the result of brain neurons working and interacting with each other. This is also the principle that artificial neural networks are based on: they simulate the processes occurring in the human brain.

What are neural networks

An artificial neural network is software that imitates the work of a brain and is capable of learning on its own. Like a biological network, an artificial network also consists of neurons, but they have a simpler structure.

If you connect neurons into a sufficiently large network with controlled interaction, they will be able to perform quite complex tasks. For example, determining what is shown in a picture, or independently creating a photorealistic image based on a text description.
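As a minimal sketch of the idea above, an artificial neuron can be modeled as a weighted sum of its inputs passed through an activation function, and a network as neurons feeding other neurons. The sigmoid activation, the function names, and all weights below are assumptions chosen for illustration; the article itself does not specify them.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, then an activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def tiny_network(inputs):
    """Two hidden neurons feeding one output neuron.

    The weights are arbitrary illustrative values, not trained ones.
    """
    h1 = neuron(inputs, [0.5, -0.6], 0.1)
    h2 = neuron(inputs, [-0.3, 0.8], 0.0)
    return neuron([h1, h2], [1.2, -0.7], 0.05)


output = tiny_network([1.0, 0.0])
print(output)
```

Training a network means adjusting the weights and biases so that the outputs match known examples; this sketch only shows the forward computation.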

snakers4 Jun 30 2022 at 15:39

Multilingual Text-to-Speech Models for Indic Languages

5 min

2.6K

Machine learning, Natural Language Processing, Voice user interfaces

In this article, we shall provide some background on how multilingual multi-speaker models work and test an Indic TTS model that supports 9 languages and 17 speakers (Hindi, Malayalam, Manipuri, Bengali, Rajasthani, Tamil, Telugu, Gujarati, Kannada).

It seems a bit counter-intuitive at first that one model can support so many languages and speakers, given that each Indic language has its own alphabet, but we shall see how it was implemented.

Also, we shall list the specs of these models like supported sampling rates and try something cool – making speakers of different Indic languages speak Hindi. Please, if you are a native speaker of any of these languages, share your opinion on how these voices sound, both in their respective language and in Hindi.

andrey78910 Jun 9 2022 at 12:10

Text-based CAPTCHA in 2022

7 min

5.8K

Machine learning, Information Security, Artificial Intelligence

Translation

The first text-based CAPTCHA (we’ll call it just CAPTCHA for the sake of brevity) was used in 1997 by the AltaVista search engine. It prevented bots from adding Uniform Resource Locators (URLs) to their web search engine.

Back then it was a decent defense measure. However, progress can't be stopped, and this defense was bypassed using the OCR software available at the time (for example, FineReader).

CAPTCHA became more complex: noise and distortions were added so that popular OCRs couldn’t recognize the text. Then OCRs custom-made for this task appeared, which cost the attacking side extra money and expertise. The CAPTCHA developers had to understand the challenges the attackers faced and which distortions to add in order to make automated CAPTCHA recognition harder.

Because the principles that OCRs were based on were misunderstood, some CAPTCHAs were given distortions that were more of a hassle for regular users than for a machine.

OCRs for different types of CAPTCHAs were built using heuristics, and the most complicated part was segmenting the CAPTCHA into standalone symbols, which could then be easily recognized by a CNN (for example, LeNet-5). An SVM also showed good results, even on raw pixels.
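The article does not say which segmentation heuristic was used. One simple option, shown here purely as an illustrative sketch, is a vertical projection: count the ink pixels in every column and cut the image wherever a column is empty. The function name and the toy image are made up for this example, and the approach breaks down when symbols touch or overlap, which is exactly what CAPTCHA distortions aim for.

```python
def segment_columns(image):
    """Split a binary image (rows of 0/1, 1 = ink) into symbol column ranges.

    Returns a list of (start, end) column indices, with end exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: number of ink pixels in each column.
    ink = [sum(row[c] for row in image) for c in range(width)]

    segments = []
    start = None
    for c, count in enumerate(ink):
        if count > 0 and start is None:
            start = c                    # a symbol begins here
        elif count == 0 and start is not None:
            segments.append((start, c))  # an empty column ends the symbol
            start = None
    if start is not None:
        segments.append((start, width))  # symbol touches the right edge
    return segments


# Toy 3x7 image with two "symbols" separated by an empty column.
image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
]
segments = segment_columns(image)
print(segments)
```

Each resulting column range can then be cropped and passed to a per-symbol classifier such as the CNN or SVM mentioned above.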

In this article I’ll try to cover the whole history of CAPTCHA recognition, from heuristics to contemporary automated recognition systems. We’ll figure out whether CAPTCHA is still alive.

I’ll review the yandex.com CAPTCHA. The Russian version of the same CAPTCHA is more complex.

snakers4 Apr 12 2022 at 21:08

Our new public speech synthesis in super-high quality, 10x faster and more stable

3 min

4.8K

Natural Language Processing, Voice user interfaces, Machine learning

In our last article we made a bunch of promises about our speech synthesis.

After a lot of hard work we have finally delivered on these promises:

Model size reduced 2x;
New models are 10x faster;
We added flags to control stress;
Now the models can make proper pauses;
High quality voice added (and unlimited "random" voices);
All speakers squeezed into the same model;
Input length limitations lifted, now models can work with paragraphs of text;
Pauses, speed and pitch can be controlled via SSML;
Sampling rates of 8, 24 or 48 kHz are supported;
Models are much more stable — they do not omit words anymore;

This is a true breakthrough for us, and we are not planning to stop anytime soon. We will be adding as many languages as possible shortly (the CIS languages, English, European languages, Indic languages). We are also still planning to make our models an additional 2-5x faster.

We are also planning to add phonemes and a new model for stress, as well as to reduce the minimum amount of audio required to train a high-quality voice to 5 to 15 minutes.

As usual you can try our model in our repo or in colab.

ddimitrov Nov 17 2021 at 12:54

ruDALL-E: Generating Images from Text. Facing down the biggest computational challenge in Russia

11 min

10K

SberDevices corporate blog, Sber corporate blog, Artificial Intelligence, Machine learning, Image processing

Multimodality has led the pack in machine learning in 2021. Neural networks are wolfing down images, text, speech and music all at the same time. OpenAI is, as usual, top dog, but as if in defiance of their name, they are in no hurry to share their models openly. At the beginning of the year, the company presented the DALL-E neural network, which generates 256x256 pixel images in answer to a written request. Descriptions of it can be found as articles on arXiv and examples on their blog.

As soon as DALL-E flushed out of the bushes, Chinese researchers got on its tail. Their open-source CogView neural network does the same trick of generating images from text. But what about here in Russia? One might say that “investigate, master, and train” is our engineering motto. Well, we caught the scent, and today we can say that we created from scratch a complete pipeline for generating images from descriptive textual input written in Russian.

In this article we present the ruDALL-E XL model, an open-source text-to-image transformer with 1.3 billion parameters, as well as the ruDALL-E XXL model, a text-to-image transformer with 12.0 billion parameters that is available in DataHub SberCloud, and several other satellite models.

averkij Oct 31 2021 at 18:44

Lingtrain Aligner. How to make parallel books for language learning. Part 1. Python and Colab version

8 min

3.8K

Natural Language Processing, Open source, Learning languages, Machine learning, Programming

Tutorial

If you're interested in learning new languages or teaching them, then you probably know about parallel reading. It helps you immerse yourself in the context, expands your vocabulary, and lets you enjoy the learning process. When it comes to reading, you most likely want to choose your favorite author, theme, or something familiar, and this is often impossible if no one has published a parallel edition of that book. It gets even worse when you're learning some cool language like Hungarian or Japanese.

Today we are taking a big step toward changing this situation.

We will use the lingtrain_aligner tool. It's an open-source Python project that aims to help everyone eager to learn foreign languages. It's a part of the Lingtrain project; you can follow us on Telegram, Facebook and Instagram. Let's start!

Find the texts

First, we need to find two texts we want to align. Let's take two editions of "To Kill a Mockingbird" by Harper Lee: the Russian translation and the original.

Amrita123 Oct 20 2021 at 15:08

Using the Machine Learning model to detect credit card fraud

3 min

1.8K

Machine learning

From sandbox

As we move towards the digital world, we shouldn’t forget that cybersecurity plays a major role in our lives. Discussions about digital security have been intense. The main challenge we face is abnormal activity.

For online transactions, most shoppers prefer credit cards. The credit limit on a credit card allows us to make purchases even when our bank balance is insufficient. But this is great news for cyber attackers eyeing your money.

To tackle this problem, we need to rely on a system that makes hard-pressed transactions effortless.

This is where we need a system that tracks transaction patterns. With AI, we can abort any abnormal transaction, which is precisely the job of credit card fraud detection AI.

There are now a number of machine learning algorithms that can classify unusual transactions and detect fraud. For credit card fraud detection, we only need past data and the right algorithm to fit that data in the right form.

How do we make this happen? Let’s look into the process of credit card fraud detection AI:

Import the needed libraries

The first step in detecting credit card fraud with AI is to import the libraries. The best practice is to import all the necessary libraries in a single section so that they are quick to modify. To work with the credit card data, we can use the PCA-transformed version of the features, or use RFECV, RFE, VIF and SelectKBest to select the best model features.

Import Dataset

Machine learning helps with fraud detection. It’s quite simple to import the dataset when you use the pandas module in Python. A single command is enough to import your data.
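The two steps above can be sketched as follows, assuming the transactions are stored as CSV. The helper name `load_transactions`, the column names, and the three sample rows are invented for illustration; a real run would pass the path of the actual dataset file instead of the in-memory sample.

```python
# All imports kept in one section, as recommended above.
import io

import pandas as pd

# Hypothetical stand-in for a real CSV file on disk. The column names
# (V1, V2 as PCA-style features, Amount, Class) and values are made up.
SAMPLE_CSV = """V1,V2,Amount,Class
-1.36,0.07,149.62,0
1.19,0.27,2.69,0
-2.31,1.95,0.00,1
"""


def load_transactions(source):
    """Read transactions from a file path or file-like object."""
    return pd.read_csv(source)


df = load_transactions(io.StringIO(SAMPLE_CSV))

# Fraction of rows labeled as fraud (Class == 1) in the sample.
fraud_share = df["Class"].mean()
print(df.shape, fraud_share)
```

Checking the share of fraudulent rows right after loading is a useful habit, because fraud datasets are usually heavily imbalanced and that affects the choice of algorithm and metric.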

snakers4 Oct 6 2021 at 17:20

We have published a model for text repunctuation and recapitalization for four languages

7 min

Big Data, Natural Language Processing, Python, Machine learning

Working with speech recognition models, we often encounter misconceptions among potential customers and users (mostly related to the fact that people have a hard time distinguishing substance from form). People also tend to believe that punctuation marks and spaces are somehow obviously present in spoken speech, when in fact real spoken speech and written speech are entirely different beasts.

Of course you can just start each sentence with a capital letter and put a full stop at the end. But it is preferable to have some relatively simple and universal solution for "restoring" punctuation marks and capital letters in sentences that our speech recognition system generates. And it would be really nice if such a system worked with any texts in general.

For this reason, we would like to share a system that:

Inserts capital letters and basic punctuation marks (dot, comma, hyphen, question mark, exclamation mark, dash for Russian);
Works for 4 languages (Russian, English, German, Spanish) and can be extended;
By design is domain agnostic and is not based on any hard-coded rules;
Has non-trivial metrics and succeeds in the task of improving text readability;

To reiterate — the purpose of such a system is only to improve the readability of the text. It does not add information to the text that did not originally exist.

man_of_letters Jul 22 2021 at 09:03

Mode on: Comparing the two best colorization AI's

11 min

3.8K

RUVDS.com corporate blog, Python, TensorFlow, Machine learning, Image processing

This article continues a series of notes about colorization. During today's experiment, we’ll be comparing a recent neural network with the good old Deoldify to gauge the rate at which the future is approaching.

This is a practical project, so we won’t pay extra attention to the underlying philosophy of the Transformer architecture. Besides, any attempt to explain the principles of its operation to a wide audience in hand-waving terms would be misleading.

A lecturer: Mr. Petrov! How does a transformer work?
Petrov with a bass voice: Hum-m-m-m.

Google Colorizing Transformer vs Deoldify

m31 Jul 1 2021 at 16:40

Data Phoenix Digest — 01.07.2021

5 min

1.9K

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

We at Data Science Digest have always strived to ignite the fire of knowledge in the AI community. We’re proud to have helped thousands of people to learn something new and give you the tools to push ahead. And we’ve not been standing still, either.

Please meet Data Phoenix, a Data Science Digest rebranded and risen anew from our own flame. Our mission is to help everyone interested in Data Science and AI/ML to expand the frontiers of knowledge. More news, more updates, and webinars(!) are coming. Stay tuned!

The new issue of the new Data Phoenix Digest is here! AI that helps write code, EU’s ban on biometric surveillance, genetic algorithms for NLP, multivariate probabilistic regression with NGBoosting, alias-free GAN, MLOps toys, and more…

If you’re more used to getting updates every day, subscribe to our Telegram channel or follow us on social media: Twitter, Facebook.

m31 Jun 24 2021 at 13:09

DataScience Digest — 24.06.21

5 min

1.9K

Artificial Intelligence, Machine learning, Big Data, Algorithms, Python

The new issue of DataScienceDigest is here!

The impact of NLP and the growing budgets to drive AI transformations. How Airbnb standardized metric computation at scale. Cross-Validation, MASA-SR, AgileGAN, EfficientNetV2, and more.

If you’re more used to getting updates every day, subscribe to our Telegram channel or follow us on social media: Twitter, LinkedIn, Facebook.

m31 Jun 10 2021 at 12:48

DataScience Digest — 10.06.21

5 min

1.3K

Machine learning, Artificial Intelligence, Big Data, Algorithms, Python

The new issue of DataScienceDigest is here!

Machine learning in healthcare, the top 10 TED talks on AI, fraud detection in Uber, DatasetGAN, Text-to-Image generation via transformers, and more…

m31 Jun 2 2021 at 23:42

DataScience Digest — 02.06.21

5 min

1.1K

Artificial Intelligence, Machine learning, Big Data, Algorithms, Python

New issue of DataScienceDigest is here! OpenAI is launching a $100 million startup fund, Albumentations 1.0 has been released, lessons on ML platforms, image cropping on Twitter, and more.

m31 May 28 2021 at 14:29

DataScience Digest — 28.05.21

5 min

870

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

The new issue of Data Science Digest is here! Hop in to learn about the latest news, articles, tutorials, research papers, and event materials on DataScience, AI, ML, and BigData. All sections are prioritized for your convenience. Enjoy!

alexpatel Apr 26 2021 at 12:29

Flitter Your Business With AI Integrated Flutter App Development

5 min

3.1K

Software, Development of mobile applications, Machine learning, Artificial Intelligence, Flutter

From sandbox

As we all know, the digital market leans heavily towards a reliable UX-driven process, so app development has become quite complex, especially for mobile platforms.

For every organization, creating a product that meets its customers' needs always comes with a plethora of challenges.

From a technical point of view, there are various challenges that every business faces, including selecting the right platform for the app, choosing the right technology stack or framework, and creating an app that fulfills the needs and expectations of customers.

Similarly, there are more challenges that every business faces and needs to cope with while creating its dream product.

So, what to do?

Well, what if I say that the answer to all your questions is Flutter app development with Artificial Intelligence (AI) integration?

Surprised? Wondering how?

Well, AI in Flutter app development is one of the best advancements in the software market. The concept of AI was first introduced during the 20th century, along with loads of innovations and advancements that we are still integrating into our mobile app development.

But, what are Artificial Intelligence and Flutter app development?

m31 Apr 21 2021 at 12:38

Data Science Digest — 21.04.21

3 min

Artificial Intelligence, Machine learning, Big Data, Algorithms, Python

Hi All,

I’m pleased to invite you all to enroll in the Lviv Data Science Summer School, to delve into advanced methods and tools of Data Science and Machine Learning, including such domains as CV, NLP, Healthcare, Social Network Analysis, and Urban Data Science. The courses are practice-oriented and are geared towards undergraduates, Ph.D. students, and young professionals (intermediate level). The courses run July 19–30 and will be hosted online. Make sure to apply: spots are running out fast!

If you’re more used to getting updates every day, follow us on social media:

Telegram
Twitter
LinkedIn
Facebook

Regards,
Dmitry Spodarets.

gui_tar_gz Apr 19 2021 at 15:53

Neural network Telegram bot with StyleGAN and GPT-2

3 min

5.4K

Python, Artificial Intelligence, Machine learning, Social networks and communities

The Beginning

So we have already played with different neural networks. Cursed image generation using GANs, deep texts from GPT-2 — we have seen it all.

This time I wanted to create a neural entity that would act like a beauty blogger. This meant it would have to post pictures like Instagram influencers do and generate the same kind of narcissistic texts.

Initially I planned to post the neural content on Instagram, but using the Facebook Graph API, which is needed to go beyond read-only access, was too painful for me. So I reverted to Telegram, which is one of my favorite social products overall.

The name of the entity/channel (Aida Enelpi) is a bad neural-oriented pun mostly generated by the bot itself.

One of the first posts generated by Aida
