
Artificial Intelligence

AI, ANN, and other forms of artificial intelligence

Rating: 2,190.14

I Taught a Virtual Camera to Behave Like a Human Operator: How a Face Tracking Algorithm for Shorts/Reels Works

Level of difficulty: Hard
Reading time: 14 min
Reach and readers: 4.7K

In the previous article I described my “anime factory” in detail — a pipeline that automatically turns episodes into finished Shorts. But inside that system there is one especially important module that deserves a separate deep dive: a virtual camera for automatic reframing.

In this article, I will break down not just an “auto-crop function,” but a full virtual camera algorithm for vertical video. This is exactly the kind of task that looks simple at first glance: you have a horizontal video, you need to turn it into 9:16, keep a person in frame, and avoid making the result look like a jittery autofocus camera from the early 2010s.

But as soon as you try to build it not for a demo, but for a real pipeline, engineering problems immediately show up:


How I Built an “Anime Factory”: a System That Automatically Turns Episodes into YouTube Shorts

Level of difficulty: Medium
Reading time: 18 min
Reach and readers: 2.5K

Hi, Habr!

Over the past few months, I have been building a system that I internally call an “anime factory”: it takes a source episode as input and produces a ready-to-publish YouTube Short with dynamic reframing, subtitles, post-processing, and metadata.

What makes it interesting is not just the fact that editing can be automated, but that a significant part of this work can be decomposed into engineering stages: transcription, audio and scene analysis, strong-moment discovery, “virtual camera” control, and a feedback loop based on performance metrics.

In this article, I will show how this pipeline is structured, why I chose a modular architecture instead of an end-to-end black box, where the system broke, and which decisions eventually made it actually usable.


Can AI Help You Remember What You Learn? My Experience Testing Recall

Level of difficulty: Easy
Reading time: 4 min
Reach and readers: 11K

Every week, I watch tutorials, save articles, bookmark tools, and collect ideas I want to come back to later. But a few days later, the problem shows up: I may remember the topic, but not the details. I know I saw something useful, but I cannot explain it clearly or apply it with confidence.


Top 12 Best AI Video Generators (2026)

Level of difficulty: Easy
Reading time: 13 min
Reach and readers: 8K

AI video generators in 2026 allow anyone to create high-quality videos from text prompts, images, or scripts in just minutes. 

This guide explains how the technology works, compares the leading tools on the market, and highlights their strengths, limitations, and best use cases to help you choose the right solution for your creative or business needs.


Why LeCun's World Model Won't Save AI

Level of difficulty: Easy
Reading time: 19 min
Reach and readers: 6.6K

After the unexpected divorce between LeCun and Meta, there is a lot of talk that the dead end in LLM progress will be overcome through the physics of the world. That is, having a neural network work with physical data from the surrounding environment is supposed to let the model acquire meaning and an understanding of its actions. LeCun has a foundational paper that nobody is going to read, so I'll summarize it as best I can. Essentially, the idea is that the current trajectory of LLM development is doomed: as long as models are predicting the next token, real understanding, the emergence of real meaning, is impossible. LeCun proposes training neural networks on physical-world data, assuming that building a model of the world will allow the system to discard details and focus on meaning.

I agree with LeCun that using world data will partially solve the data scarcity problem. But here I see a problem that engineers might not understand. A physical model of the world is actually much poorer than human knowledge. Newton described the entire infinite number of possible falls with a few lines of formulas. I doubt LeCun wants to spend billions of dollars on this wonderful deduction.


Neurosymbolic AI: The Architecture of a Semantic Neural Network. How to Teach LLMs to Calculate

Level of difficulty: Easy
Reading time: 17 min
Reach and readers: 4.5K

LLMs fail at elementary math. Corporations spend billions, but ultimately are forced to attach calculators to computing machines of incredible power. All attempts to fix this via Chain-of-Thought, fine-tuning on arithmetic tasks, or context expansion have failed.

I conducted a series of experiments to understand why, and came to the conclusion that neural networks are simply not meant for discrete arithmetic. Their true purpose is continuous transformations.

This article describes the implementation of a novel neural network architecture that combines the precision of symbolic AI with the generalization capabilities of LLMs. As always, experiments and code are included.


How to access Claude (web/api) from Russia in 2024?

Level of difficulty: Easy
Reading time: 9 min
Reach and readers: 5.3K

Accessing Claude from Russia might seem like a daunting task due to the service's regional restrictions. In this article, I will explain in detail how to register for the web version and API of Claude, what tools are needed to bypass the restrictions, and how to use the service safely in the future. This guide is based on personal experience and includes registration methods that are current as of late 2024 and have been tested in practice.

After searching Russian-language resources, I realized that there is very little practically useful information, and what exists often does not account for changes in how the service operates. Ordinary users have to gather information piece by piece from various YouTube videos, forums, or superficial articles.

Therefore, I decided to create an up-to-date guide based on my personal experience and knowledge, taking into account all the pitfalls that an inexperienced Claude user might encounter.


Deepseek is not working: Bypassing the 'Deepseek service is busy' error with clever methods

Level of difficulty: Easy
Reading time: 3 min
Reach and readers: 2.7K

DeepSeek is increasingly unavailable due to server overload. In this article, we will solve the problem in an original way: by installing DeepSeek locally so that it works without any internet connection at all.


Top 10 Free Neural Networks for Photo Generation

Reading time: 4 min
Reach and readers: 988

Hi, I'm Dima and, like many, I've been using neural networks to create images for years. But over the last year and a half, the market for generative models has changed so much and grown so large that it's impossible to try every model. Still, the question of which neural network is best remains. Benchmarks are poorly suited to a real assessment of creativity and cost, so I've gathered 10 models in one place, with notes on what each is best suited for and what it costs (or rather, what the free limit is). I hope this helps you choose the right neural network for your tasks.

Below is an honest rating based on quality, realism, convenience, and price.


Ollama from A to Z: How to Choose a Model, Configure, and Integrate

Level of difficulty: Medium
Reading time: 9 min
Reach and readers: 1.5K

In this article, we take a detailed look at Ollama, a tool for running large language models (LLMs) locally. You will learn how to install the program, choose a suitable model, understand formats and quantization, configure the system for your hardware, and work via both the CLI and the API. Practical tips, configuration examples, and VRAM recommendations will help you use Ollama as effectively as possible for dialogues, text generation, code, and other tasks.


Local Chatbot Without Limits: A Guide to LM Studio and Open LLMs

Level of difficulty: Easy
Reading time: 11 min
Reach and readers: 1.7K

In this article, we will not only install a local (and free) alternative to ChatGPT, but also review several open LLMs, delve into the advanced settings of LM Studio, connect the chatbot to Visual Studio Code, and teach it to assist us with programming. We will also look at how to fine-tune the model's behavior using system prompts.


Roskomnadzor tries to block EVERYTHING, plus a red alert level at OpenAI

Reading time: 9 min
Reach and readers: 723

The most interesting finance and tech news from Russia and the world for the week: RKN blocked FaceTime, Snapchat, and Roblox, visa-free travel with China and Saudi Arabia, Russia was added to the EU's money laundering blacklist, home surveillance cameras were hacked in South Korea, Musk's Twitter was fined in Europe, and rumors of a 'garlic' model from OpenAI.


It's all been figured out for us: Libraries with thousands of ready-made GPT prompts for work, study, and leisure

Level of difficulty: Easy
Reading time: 3 min
Reach and readers: 454

A collection for those who have tried using neural networks for their tasks but were disappointed: it's unclear how a chatbot can help with anything serious.


TOP 6 Free GPT Chats on Telegram for Work, Study, and Other Purposes

Level of difficulty: Easy
Reading time: 3 min
Reach and readers: 1K

Many readers, especially those living in Russia, want to know the best way to use GPT chat. When I'm asked about this, I give a clear answer: if the volumes are not very large, it's better to use Telegram bots.

GPT chatbots on Telegram work much like applications that use GPT (Generative Pre-trained Transformer) technology to converse with users. Here are the secrets to their popularity:
