Image processing

Working with photos and videos

kapustinomm Aug 9 at 08:24

Docling in Working with Texts, Languages, and Knowledge

Medium

20 min

503

Python, Artificial Intelligence, PDF, Image processing

Review

Docling in Working with Texts, Languages, and Knowledge: an in-depth overview of the open-source Docling toolkit for extracting, structuring, and analyzing data from documents. The article covers approaches to processing multilingual texts, building language- and domain-specific knowledge models, and integrating Docling into AI and NLP projects. It includes practical examples and recommendations for developers working with large volumes of unstructured data.

Markus_automation Feb 6 at 10:42

Image Recognition – Why AI is Still Not the Perfect Assistant in This Task, and How image captcha solver Helped

Easy

7 min

1.1K

Image processing, Artificial Intelligence

Case

Translation

Up to a certain point, I sincerely believed that in today’s world manual CAPTCHA recognition was gradually becoming an anachronism, especially when it came to such simple CAPTCHAs as image-based ones—where one merely needs to read text off a photograph and input it as plain text. But as it turns out, things aren’t quite so straightforward (no matter how it may sound).

gfx_pro Oct 21 2023 at 12:21

Do smartphone cameras need 12-bit ADCs, or my failed experiment

Medium

3 min

1.1K

Photographic equipment, Gadgets, Smartphones, Image processing

Analytics

Translation

Among photographers, it is known that on "big" cameras the use of 14-bit readout compared to 12-bit can have a positive impact on shadow detail. How does this apply to small sensors in smartphone cameras?
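The bit-depth question can be illustrated with a minimal sketch. It assumes an idealized, noise-free linear ADC, which is not how a real sensor behaves, and the function names and the choice of "deep shadows" as the darkest 1/1024 of the range are illustrative assumptions rather than anything taken from the article.

```python
def quantize(signal, bits):
    """Map a normalized signal in [0, 1] to an integer ADC code with `bits` of resolution."""
    levels = (1 << bits) - 1
    clipped = min(max(signal, 0.0), 1.0)
    return round(clipped * levels)


def distinct_codes(lo, hi, bits, samples=10_000):
    """Count the distinct ADC codes produced by signals spread evenly over [lo, hi]."""
    codes = set()
    for i in range(samples):
        signal = lo + (hi - lo) * i / (samples - 1)
        codes.add(quantize(signal, bits))
    return len(codes)


# Hypothetical "deep shadow" band: the darkest 1/1024 of the range (10 stops below full scale).
shadow_lo, shadow_hi = 0.0, 1.0 / 1024
codes_12 = distinct_codes(shadow_lo, shadow_hi, bits=12)
codes_14 = distinct_codes(shadow_lo, shadow_hi, bits=14)
print(f"12-bit: {codes_12} codes, 14-bit: {codes_14} codes in the shadow band")
```

In this idealized model the shadow band gets 5 codes at 12 bits and 17 at 14 bits. Whether that extra resolution survives on a small sensor depends on its noise floor, which this sketch deliberately leaves out.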

netsvetaev Dec 2 2022 at 15:02

InvokeAI 2.2: UI Outpainting, Embedding Management and more

2 min

6.4K

Artificial Intelligence, Graphic design, Machine learning, Image processing, Python

InvokeAI 2.2 is now available to everyone. This update brings in exciting features, like UI Outpainting, Embedding Management and more. See highlighted updates below, or the full release notes for everything included in the release.

netsvetaev Nov 13 2022 at 00:16

I trained a neural network on my drawings and give the model for free (and teach you to create your own)

2 min

3.5K

Artificial Intelligence, Graphic design, Machine learning, Image processing, Python

Tutorial

Great for seamless patterns, abstract drawings, and watercolor-styled images. How do you use it and train a neural network on your own pictures?

Download the model here: https://huggingface.co/netsvetaev/netsvetaev-free

netsvetaev Nov 4 2022 at 08:24

InvokeAI 2.1 Release

2 min

1.5K

Artificial Intelligence, Python, Image processing

The InvokeAI team is excited to share our latest feature release, with a set of new features, UI enhancements, and CLI capabilities.

Vitaly_net Aug 23 2022 at 15:40

Color image capturing device with pseudorandom patterns sets

4 min

785

Matlab, Machine learning, Image processing

From sandbox

The present invention relates to analog signal capturing devices generally, and in particular to monochrome or color image capture sensors, such as a scanner or a charge-coupled device (“CCD”) for video and photo cameras, that are almost free from moiré and aliasing. It also relates to methods for enhancing the resolution of an image capture device and to devices for displaying digital color or grayscale images.

ddimitrov Nov 17 2021 at 09:54

ruDALL-E: Generating Images from Text. Facing down the biggest computational challenge in Russia

11 min

11K

Сбер corporate blog, SberDevices corporate blog, Image processing, Machine learning, Artificial Intelligence

Multimodality has led the pack in machine learning in 2021. Neural networks are wolfing down images, text, speech and music all at the same time. OpenAI is, as usual, top dog, but as if in defiance of their name, they are in no hurry to share their models openly. At the beginning of the year, the company presented the DALL-E neural network, which generates 256x256 pixel images in answer to a written request. Descriptions of it can be found as articles on arXiv and examples on their blog.

As soon as DALL-E was flushed out of the bushes, Chinese researchers got on its tail. Their open-source CogView neural network does the same trick of generating images from text. But what about here in Russia? One might say that “investigate, master, and train” is our engineering motto. Well, we caught the scent, and today we can say that we have created from scratch a complete pipeline for generating images from descriptive textual input written in Russian.

In this article we present the ruDALL-E XL model, an open-source text-to-image transformer with 1.3 billion parameters, as well as the ruDALL-E XXL model, a text-to-image transformer with 12.0 billion parameters that is available in DataHub SberCloud, and several other satellite models.

man_of_letters Jul 22 2021 at 06:03

Mode on: Comparing the two best colorization AI's

11 min

4.1K

RUVDS.com corporate blog, Python, TensorFlow, Machine learning, Image processing

This article continues a series of notes about colorization. During today's experiment, we’ll be comparing a recent neural network with the good old Deoldify to gauge the rate at which the future is approaching.

This is a practical project, so we won’t pay extra attention to the underlying philosophy of the Transformer architecture. Besides, any attempt to explain the principles of its operation to a wide audience in hand-waving terms would be misleading.

A lecturer: Mr. Petrov! How does a transformer work?
Petrov with a bass voice: Hum-m-m-m.

snakers4 Dec 5 2020 at 09:55

Playing with Nvidia's New Ampere GPUs and Trying MIG

11 min

4.4K

Big Data, Natural Language Processing, Computer hardware, Machine learning, Image processing

Every time the essential question arises of whether to upgrade the cards in the server room, I look through similar articles and watch similar videos.

The channel with the aforementioned video is very underrated, but its author does not deal with ML. In general, when analyzing comparisons of accelerators for ML, several things usually catch your eye:

The authors usually take into account only the "adequacy" of new cards for the United States market;
The ratings are out of touch with ordinary users and are made on very standard networks (which is probably good overall) without details;
The popular mantra of training ever more gigantic models makes its own adjustments to the comparison.

The answer to the question "which card is better?" is not rocket science: cards of the 20* series never became very popular, while the 1080 Ti from Avito (the Russian Craigslist) is still very attractive (and, oddly enough, is not getting cheaper, probably for this reason).

All this is fine and dandy, and the standard benchmarks are unlikely to lie too much, but recently I learned about Multi-Instance GPU (MIG) technology for A100 video cards and native TF32 support on Ampere devices, and I got the idea to share my experience of actually testing cards on the Ampere architecture (3090 and A100). In this short note, I will try to answer the following questions:

Is the upgrade to Ampere worth it? (spoiler for the impatient: yes);
Is the A100 worth the money? (spoiler: in general, no);
Are there any cases when the A100 is still interesting? (spoiler: yes);
Is MIG technology useful? (spoiler: yes, but for inference and for very specific training cases).

HMS-service Aug 11 2020 at 10:21

How to build a document photo applet yourself with HUAWEI ML Kit

5 min

2.1K

DIY, Development for Android, Image processing, Machine learning

Overview

In the previous article, we described how to build a smile-capturing camera with HUAWEI ML Kit. This time I am going to introduce another HUAWEI ML Kit feature.

Have you ever been asked at school or at work to bring a photo of a specific size with a colored background for documents? In most cases, you will not have a suitable photo at hand. Once, at my institute, it was decided to issue us personal passes, but the photo studio turned out to be closed. So I took a picture of myself with my phone, using a bed sheet as the background. And I got a reprimand from my instructor. But with HUAWEI ML Kit you can integrate the image segmentation SDK into your app and build an applet that produces document photos on its own, solving the problem of not having the right photo.

Most importantly, this SDK is completely free and works on all Android phones.

Building the document photo applet yourself

1. Preparation

1.1 Add the Huawei Maven repository to the project-level build.gradle file

Open the build.gradle file in the root directory of your Android Studio project.

Farwit Aug 10 2020 at 13:17

Neural networks in reality

2 min

Спецлаб corporate blog, Big Data, The future is here, Image processing, Industrial Programming

Recovery Mode

The mass of news and articles about artificial intelligence creates the illusion that we are living in a fantastic time. But when you start asking people which of these high technologies are actually useful in real life, the answers come down to a few Google features, mobile games, and a story about Chinese videos. Speaking of those Chinese videos: for some reason, the central mass media constantly show them when demonstrating Moscow's intelligent technologies.

On paper, it seems that all this «intelligence» is already installed everywhere and the whole country switched to neural networks long ago, but only in demonstration pictures, in diagrams, in hand-waving explanations. There is a cognitive dissonance: why not take a video camera and shoot at least a fragment of how Russia's super mega technologies work?

As Nikita Sergeevich said, «science ceases to be self-indulgence when its fruits are applied in the national economy.» And today's artificial intelligence is familiar to us only from games. Many people really want to see something useful in reality. So we took the trouble to record our own video of neural networks operating at real sites.

alegorov Dec 8 2019 at 22:08

The color of the Moon and the Sun from space in terms of RGB and color temperature

17 min

3.7K

Physics, Image processing, Popular science, Astronomy

It would seem that the question of the color of the Moon and the Sun as seen from space is so simple for modern science that in our century the answer should pose no problem at all. We are talking specifically about colors observed from space, since the atmosphere changes the color through Rayleigh scattering. «Surely some encyclopedia has long since described this in detail, with numbers,» you will say. Well, now try searching the Internet for that information. Did it work? Most likely not. The most you will find is a couple of words saying that the Moon has a brownish tint and the Sun a reddish one. But you will not find out whether these tints are visible to the human eye, let alone the color values in RGB or at least the color temperatures. What you will find is a bunch of photos and videos where the Moon from space is absolutely gray, mostly in photos from the American Apollo program, and where the Sun from space is depicted as white or even blue.

In my personal opinion, this is nothing but a consequence of politics interfering in science. After all, the colors of the Moon and the Sun from space relate directly to the Americans' flights to the Moon.

I went through many scientific articles and books looking for information about the color of the Moon and the Sun from space. Fortunately, it turned out that even though they give no direct answer in RGB, they contain complete information about the spectral density of solar radiation and the reflectivity of the Moon across the spectrum. This is quite enough to get accurate colors as RGB values. You just need to calculate carefully, which is in fact what I did. In this article I will share the results of the calculations with you and, of course, describe the calculations themselves in detail. And you will see the Moon and the Sun from space in their real colors!
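The teaser does not spell out the calculation pipeline, so the following is only a sketch of one standard final step in such work: converting CIE XYZ tristimulus values (which one would obtain by integrating a spectrum against color matching functions) into 8-bit sRGB. The use of sRGB with a D65 white point is an assumption here, not something stated by the author, and the function names are illustrative.

```python
def linear_to_srgb(channel):
    """Apply the sRGB transfer function to one linear channel in [0, 1]."""
    if channel <= 0.0031308:
        return 12.92 * channel
    return 1.055 * channel ** (1 / 2.4) - 0.055


def xyz_to_srgb8(x, y, z):
    """Convert CIE XYZ (Y normalized to 1 for white) to 8-bit sRGB, clipping out-of-gamut values."""
    # Standard XYZ -> linear sRGB matrix for the D65 white point.
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )
    return tuple(round(255 * linear_to_srgb(min(max(c, 0.0), 1.0))) for c in linear)


# Sanity check: the D65 white point itself should map to pure white.
white = xyz_to_srgb8(0.95047, 1.0, 1.08883)
black = xyz_to_srgb8(0.0, 0.0, 0.0)
print(white, black)
```

Clipping is the crudest way to handle out-of-gamut colors; a careful calculation would also have to choose a white point and a brightness normalization, and those choices directly affect the resulting tint.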

andrewbo29 Sep 18 2019 at 14:22

How we made landmark recognition in Cloud Mail.ru, and why

11 min

2.6K

VK corporate blog, Algorithms, Artificial Intelligence, Machine learning, Image processing

With the advent of mobile phones with high-quality cameras, we started taking more and more pictures and videos of bright and memorable moments in our lives. Many of us have photo archives that extend back over decades and comprise thousands of pictures, which makes them increasingly difficult to navigate. Just remember how long it took to find a picture of interest just a few years ago.

One of Mail.ru Cloud’s objectives is to provide the handiest means for accessing and searching your own photo and video archives. For this purpose, we at the Mail.ru Computer Vision Team have created and implemented systems for smart image processing: search by object, by scene, by face, etc. Another spectacular technology is landmark recognition. Today, I am going to tell you how we made this a reality using Deep Learning.

gregpost Jul 30 2019 at 11:49

Automatic respiratory organ segmentation

8 min

2.3K

Inobitec corporate blog, Algorithms, Data visualization, Image processing, 3D-graphics

Manual lung segmentation takes about 10 minutes and it requires a certain skill to get the same high-quality result as with automatic segmentation. Automatic segmentation takes about 15 seconds.

I assumed that without a neural network it would be possible to get an accuracy of no more than 70%. I also assumed that morphological operations only prepare an image for more complex algorithms. But on the 40 tomographic datasets I had on hand, few as they are, the algorithm segmented the lungs without errors. Moreover, after testing on the first five cases, the algorithm did not change significantly and worked correctly on the other 35 studies without any change to the settings.

Also, neural networks have a disadvantage: training them requires hundreds of lung samples that have to be labeled manually.
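To give a feel for what morphological operations do in this kind of task, here is a minimal sketch of thresholding followed by a binary opening (erosion, then dilation) on a tiny 2D grid. It is not the author's algorithm: the pixel values, the threshold of -400, and all function names are hypothetical, and a real pipeline would work on 3D tomographic volumes.

```python
def threshold(image, limit):
    """Mark pixels darker than `limit` as foreground (1); air-filled lungs are dark on CT."""
    return [[1 if value < limit else 0 for value in row] for row in image]


def _window(mask, r, c):
    """Yield the 3x3 neighborhood of (r, c), treating out-of-bounds pixels as background."""
    rows, cols = len(mask), len(mask[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighborhood is foreground."""
    return [[int(all(_window(mask, r, c))) for c in range(len(mask[0]))] for r in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighborhood is foreground."""
    return [[int(any(_window(mask, r, c))) for c in range(len(mask[0]))] for r in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the 3x3 window."""
    return dilate(erode(mask))


T, A = 40, -800  # hypothetical intensities for soft tissue and air
image = [
    [T, T, T, T, T, T, T],
    [T, A, A, A, T, T, T],
    [T, A, A, A, T, A, T],  # the lone A at column 5 is a noise speck
    [T, A, A, A, T, T, T],
    [T, T, T, T, T, T, T],
    [T, T, T, T, T, T, T],
]
mask = threshold(image, limit=-400)
cleaned = opening(mask)
```

Here the opening keeps the 3x3 dark region intact and discards the isolated dark pixel, which is the kind of cleanup that makes a simple threshold usable without any training data.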

kitashov Jul 12 2019 at 08:11

AI-Based Photo Restoration

7 min

18K

VK corporate blog, Algorithms, Machine learning, Image processing

Hi everybody! I’m a research engineer at the Mail.ru Group computer vision team. In this article, I’m going to tell the story of how we created an AI-based photo restoration project for old military photos. What is «photo restoration»? It consists of three steps:

we find all the image defects: fractures, scuffs, holes;
we inpaint the discovered defects, based on the pixel values around them;
we colorize the image.

Next, I’ll describe every step of photo restoration and tell you how we got our data, what nets we trained, what we accomplished, and what mistakes we made.
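The second step, filling defects from the pixel values around them, can be sketched with a simple classical stand-in. The project itself uses trained networks; the code below is not their method, only a minimal illustration that assumes a grayscale image stored as nested lists and an already known defect mask. The function name is hypothetical.

```python
def inpaint(image, defect_mask):
    """Fill masked pixels with the mean of known 4-neighbors, working inward from the defect's edge."""
    rows, cols = len(image), len(image[0])
    result = [[float(value) for value in row] for row in image]
    unknown = {(r, c) for r in range(rows) for c in range(cols) if defect_mask[r][c]}
    while unknown:
        filled = {}
        for r, c in unknown:
            known = [
                result[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols and (rr, cc) not in unknown
            ]
            if known:
                filled[(r, c)] = sum(known) / len(known)
        if not filled:
            break  # every pixel is masked, so there is nothing to propagate from
        for (r, c), value in filled.items():
            result[r][c] = value
        unknown.difference_update(filled)
    return result


# A flat gray patch with a two-pixel "scratch" in the middle row.
photo = [
    [10, 10, 10, 10],
    [10, 0, 0, 10],
    [10, 10, 10, 10],
]
scratch = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
restored = inpaint(photo, scratch)
```

Averaging neighbors works for thin scratches on smooth areas but blurs texture and edges, which is one reason learned inpainting is attractive for real photographs.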

FizpokPak Apr 11 2019 at 12:57

Dog Breed Identifier: Full Cycle Development from Keras Program to Android App. on Play Market

25 min

16K

Java, Python, Artificial Intelligence, Image processing, Development for Android

With the recent progress in neural networks in general and image recognition in particular, it might seem that creating an NN-based application for image recognition is a simple routine operation. Well, to some extent it is true: if you can imagine an application of image recognition, then most likely someone has already done something similar. All you need to do is Google it and repeat.

However, there are still countless little details that… no, they are not unsolvable. They simply take too much of your time, especially if you are a beginner. What would help is a step-by-step project, done right in front of you, from start to end. A project that does not contain «this part is obvious so let's skip it» statements. Well, almost :)

In this tutorial we are going to walk through a Dog Breed Identifier: we will create and teach a Neural Network, then we will port it to Java for Android and publish on Google Play.

For those of you who want to see the end result, here is the link to the NeuroDog App on Google Play.

Web site with my robotics: robotics.snowcron.com.
Web site with: NeuroDog User Guide.

