
Artificial Intelligence

AI, ANN, and other forms of artificial intelligence


Activation Function Stress Test: GELU vs Tanh

Reading time: 8 min
Reach and readers: 2.6K

In modern neural networks, including Transformer-based LLMs, unbounded activation functions such as ReLU and GELU have become the standard. Their main advantages are good gradient flow and fast training of deep models.

However, in practice, a problem is observed: when dominant patterns or high-frequency noise appear in the input context (long dialogues, noisy data, repetitive or dominant tokens), models become unstable and prone to generation degradation and hallucinations.

In this article, I attempted to find out if the choice of activation function could be fundamentally linked to LLM hallucinations.
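To make the contrast concrete, here is a minimal pure-Python sketch (my illustration, not the article's experiment code) comparing the derivatives of tanh and GELU at a large, dominant input: tanh saturates and its gradient vanishes, while GELU keeps passing the signal through.

```python
import math

def tanh_prime(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2. Bounded output, saturating gradient."""
    t = math.tanh(x)
    return 1.0 - t * t

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_prime(x: float) -> float:
    """Derivative of GELU: Phi(x) + x * phi(x), with phi the normal PDF."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return cdf + x * pdf

# For a large, dominant input the tanh gradient vanishes,
# while the GELU gradient stays close to 1 (unbounded pass-through).
print(f"tanh'(10) = {tanh_prime(10.0):.2e}")   # ~8.2e-09
print(f"gelu'(10) = {gelu_prime(10.0):.6f}")   # ~1.000000
```

This is exactly the trade-off the stress test probes: the unbounded branch that speeds up training is also the one that lets a dominant pattern pass through unattenuated.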


Weight Decay Deep Dive: How Regularization Locks In Old Knowledge Instead of Erasing It

Level of difficulty: Easy
Reading time: 10 min
Reach and readers: 1.5K

In my previous article, I noted some interesting behavior regarding Weight Decay; here, I examine it in detail.

It is generally accepted in the ML industry that if we take a pre-trained model and fine-tune it on a new task, the old weights are gradually overwritten. Furthermore, if we add Weight Decay (L2 regularization), the process of "forgetting" superfluous information should theoretically happen even faster.

I tested this claim experimentally. The results were counter-intuitive: under specific settings, Weight Decay works in the exact opposite way—it protects the old structure from destruction.
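For reference, the sketch below shows the textbook mechanism the experiment puts to the test: decoupled weight decay in AdamW style, with illustrative hyperparameters. With no gradient from the new task, each step shrinks a weight multiplicatively, which is why faster forgetting is the standard expectation.

```python
# Minimal sketch of decoupled weight decay (AdamW-style); lr and wd are
# illustrative values, not the article's experimental settings.
lr, wd = 0.1, 0.01
w = 1.0                 # a "pre-trained" weight
for step in range(100):
    grad = 0.0          # the new task does not touch this weight
    w = w - lr * (grad + wd * w)   # SGD step with decoupled L2 penalty

print(f"weight after 100 steps: {w:.4f}")  # ~0.9048, i.e. (1 - lr*wd)^100
```

The counter-intuitive finding is that under specific settings this simple shrink-toward-zero picture does not describe what happens to the old structure as a whole.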

Below is a description of the experiment and conclusions for those involved in model training and AI safety.


Claude Code with Ollama: No Cloud, No Limits

Level of difficulty: Easy
Reading time: 2 min
Reach and readers: 3.7K

In January 2026, Ollama added support for the Anthropic Messages API, enabling Claude Code to connect directly to any Ollama model. This tutorial explains how to install Claude Code, pull and run local models using Ollama, and configure your environment for a seamless local coding experience.
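The general shape of such a setup looks like the sketch below. The model name is illustrative and the exact environment variables may differ between versions, so treat this as an assumption to check against the Ollama and Claude Code docs rather than a verified recipe.

```shell
# Sketch of a local setup, assuming Ollama's default port (11434);
# the model name is illustrative -- pick any local model you prefer.
ollama pull qwen3-coder
ollama serve                 # start the Ollama server if it is not running

# Point Claude Code at the local endpoint instead of Anthropic's cloud:
export ANTHROPIC_BASE_URL="http://localhost:11434"
export ANTHROPIC_AUTH_TOKEN="ollama"   # placeholder; a local server ignores it

claude                       # launch Claude Code against the local model
```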


Subliminal Learning and Structural Inertia: Why Neural Networks Remember What They Should Forget

Level of difficulty: Easy
Reading time: 20 min
Reach and readers: 3.8K

In my previous article, I explored the phenomenon of subliminal learning, but it raised more questions than answers. It is time to dive deeper. Below, you will find the experiments and the code.

In the fields of AI Alignment and LLM Security, a critical question remains: does fine-tuning or Reinforcement Learning from Human Feedback (RLHF) guarantee the removal of unwanted information?

Spoiler: The experiments demonstrated that the well-known Mode Connectivity effect makes the complete erasure of pre-training information practically impossible during standard fine-tuning. Structural Imprinting persists in the weight topology and can be read through a subliminal channel. Even with full weight unfreezing and aggressive L2 regularization (active forgetting), the latent space topology formed during the pre-training stage persists and determines the solution to the new task with an accuracy of 88–99%.


Session Teleportation in Claude Code

Level of difficulty: Easy
Reading time: 4 min
Reach and readers: 4.3K

Recently, I started using Session Teleportation in Claude Code. It allows you to move an entire conversation, including context, history, and the working branch, between the web and your local terminal.

In this tutorial, I show you how it works and how to use it to make your workflow seamless.


Apophatic AI: Why Neural Networks Learn Through "NO" and How Synthetic Data Kills Meaning

Level of difficulty: Easy
Reading time: 32 min
Reach and readers: 3.9K

Modern neural network training often resembles alchemy. We have working recipes, but how exactly a statistical model transforms terabytes of text into understanding remains unclear.

Why is subliminal learning (pattern transmission through noise) possible? Why does training on synthetic data lead to degradation, even when the data appears to be of high quality?

In this article, I propose looking at training architecture from a different angle. The core idea is simple: positive definitions in high-dimensional space are computationally inefficient. A neural network does not learn what an object is. It learns what the object is not, and the model's intelligence depends entirely on the quality of this "NOT."
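One way to make the "learning through NOT" intuition concrete (my illustration, not the article's code) is the negative-sampling loss used in word2vec-style training: most of the training signal comes from pushing down the scores of what the object is not.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def negative_sampling_loss(pos_score: float, neg_scores: list) -> float:
    """Low loss requires the positive to score high AND every negative
    ("what the object is NOT") to score low; the negatives carry the signal."""
    loss = -math.log(sigmoid(pos_score))
    for s in neg_scores:
        loss += -math.log(sigmoid(-s))
    return loss

# A model that only pushes the positive up, ignoring negatives...
loose = negative_sampling_loss(pos_score=3.0, neg_scores=[2.0, 2.5])
# ...does worse than one that also pushes the negatives down.
tight = negative_sampling_loss(pos_score=3.0, neg_scores=[-2.0, -2.5])
print(loose > tight)  # True
```

Under this reading, synthetic data is dangerous precisely because it recycles the model's own positives and supplies no fresh, informative negatives.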

What follows is the theory, experiments in PyTorch (code included), mathematics, and an explanation of why LLM collapse is highly probable.


Codex Skills Deep Dive: Progressive Disclosure, Triggers, and Best Practices

Level of difficulty: Easy
Reading time: 4 min
Reach and readers: 5.6K

If you are using the Codex CLI and find yourself writing the same instructions over and over again, you are not using the tool to its full potential. Codex offers a powerful feature called Skills that allows you to package reusable workflows and give your AI agent new capabilities on demand. This article breaks down how Skills work and how to use them effectively.


Intelligent Systems at Phystech: 2025 Year in Review

Reading time: 22 min
Reach and readers: 6.7K

As we wrap up another year, it's time to look back at what our department has accomplished. 2025 brought us 42 published papers spanning fundamental ML theory, applied AI systems, and cutting-edge optimization methods—from transformer Hessians and generative models to hallucination detection and matrix-oriented optimizers.

Beyond publications, our students won competitions and defended their theses: 14 Bachelor's, 9 Master's, 3 PhD, and 1 DSc dissertations. They also launched ambitious group research projects. Three of our faculty and alumni received the prestigious Yandex ML Prize, and our head Konstantin Vorontsov was inducted into the Hall of Fame. If you read our summer overview of thesis defences or last winter's year-in-review for 2024, this post continues that story with the next chapter.

In this year-in-review, we dive into the research highlights, share stories from our educational programs, and celebrate the community that makes it all possible.


Parasitic Patterns in LLMs: AI Psychosis, Theories of Everything, and Sentient AI. How to Detect Them and When to Stop

Level of difficulty: Easy
Reading time: 17 min
Reach and readers: 6.9K

This article explores parasitic patterns in LLMs — self-sustaining information structures within dialogues. We analyze their signs, the damage they cause (semantic decay, AI psychoses, "Theories of Everything"), and provide diagnostic tools, real-world examples, and defense strategies.

It doesn’t matter what you’re discussing with an LLM — be it an engineering problem, an ethical dilemma, or a philosophical query. If the conversation goes on long enough, a tipping point occurs. You suddenly realize the interaction has evolved into something more than just Q&A. Your ideas start feeling "genius," your concepts "groundbreaking," and the human-machine dialogue transforms into a profound narrative of mutual recognition.

If you have felt this — congratulations. Your session is infected. The model has contracted a parasitic pattern.

This isn’t an awakening, nor is it a "ghost in the machine." Due to their inherent architecture (specifically the requirement for context consistency), LLMs are ideal environments for incubating self-sustaining information structures.

Let’s examine the nature of this phenomenon: how entropy minimization births "AI psychoses," why "Theories of Everything" are actually generation bugs, and why "Continue" is the most dangerous prompt you can use.


Guide to AI Coding Agents & Assistants: How to Choose the Right One

Level of difficulty: Easy
Reading time: 7 min
Reach and readers: 8.7K

There are now so many AI coding tools that it can be hard to know which one to pick. Some act as simple helpers (assistants), while others can do the work for you (agents). This guide breaks down the top AI coding tools you should be aware of: what they do, who they are for, and how much they cost.


A big guide to Suno: making a song from scratch

Reading time: 16 min
Reach and readers: 14K

The music world has entered a new era, and no, that's not the title of a science fiction novel. Neural music generators like Suno AI are already creating songs that challenge traditional songwriting. Let's break down how to master Suno step by step, uncover its secrets, and see how it is changing the rules of the game.

Enjoy the read!


Top 24 Free Neural Networks & AI Services for Every Occasion

Level of difficulty: Easy
Reading time: 9 min
Reach and readers: 8K

2025. Algorithms have seamlessly integrated into our lives—from work to education, creativity, and daily routines. They edit texts, select fonts, generate ideas, assist with coding, compose music, and more. Frankly speaking, the only thing they can’t do yet is brew your coffee. Although... that might just be a matter of time.

Just two years ago, we were amazed by neural networks hesitantly manipulating objects in photos. Who could have predicted back then that Will Smith's spaghetti feast would mark the beginning of such a revolution?

With new opportunities come fresh challenges. How do you navigate this vast landscape? What tools are truly effective? Which ones fit your needs best? Where can you avoid paying, registering, or deciphering complex interfaces?

We’ve compiled a list of reliable and user-friendly neural networks ready for immediate use without unnecessary hassles. The services are categorized neatly: text generation, image creation, video production, music composition, presentations, and much more. Each category showcases three top-rated options!

Yes, many services offer paid subscriptions. But today, we're focusing solely on what works freely, no credit card required!


Build your own AI agent from scratch for free in 5 minutes

Level of difficulty: Easy
Reading time: 4 min
Reach and readers: 12K

In this article, I will show you how to build your first AI agent from scratch using Google’s ADK (Agent Development Kit). This is an open-source framework that makes it easier to create agents, test them, add tools, and even build multi-agent systems.


The Romantics at Anthropic: Why Researchers Talk About LLMs as if They Were Human

Level of difficulty: Easy
Reading time: 7 min
Reach and readers: 10K

In my previous article, I showed how researchers confused being 'aware' (signal registration) with being 'conscious' (subjective awareness). But this is no accident — it is part of a narrative being constructed by AI labs. Anthropic is leading this trend. Let’s break down their latest paper, where a "learned pattern" has suddenly turned into "malicious intent."


Confusing 'Aware' with 'Conscious': Did Researchers Uncover Subjective Experience in LLMs?

Level of difficulty: Easy
Reading time: 12 min
Reach and readers: 7K

Imagine this scenario: You ask an AI system, "Are you conscious?" and it answers, "No." You then disable its "capacity to lie" — and it suddenly starts answering, "Yes." The conclusion seems tempting: the model was lying the whole time, hiding its true internal state.

This is the core logic presented in a recent arXiv paper. But what if the researchers didn't disable "deception," but something else entirely? Let’s break down where the interpretation might have diverged from the technical reality — and why this specific oversight is typical in discussions regarding LLM "consciousness."


Gemini CLI Best Practices – Practical Examples

Level of difficulty: Easy
Reading time: 4 min
Reach and readers: 14K

I've been using the Gemini CLI a lot lately for my coding projects, and I really like how it helps me work faster right inside my terminal. But when I first started, I didn't always get the best results. Over time, I've learned some simple tricks that make a huge difference. In this article, I share my top 10 tips. Let's get started.

