Article

Why LLMs Drift into Convincing Nonsense (And a Practical Solution)

Level of difficulty: Medium
Reading time: 14 min
Views: 184

Imagine you have an idea powerful enough to change the world. Your tool of choice is a state-of-the-art LLM, ready to help you formalize the problem, generate hypotheses, and synthesize a solution. What you receive is a construct that is internally logical, elegant, and coherent... yet completely wrong. It's a mix of established facts, model-generated hallucinations, and your own subtle biases. With no way to test it in practice or design a clean experiment, the entire endeavor suddenly starts to look like sophisticated nonsense.

So, what went wrong along the way? From the very first prompt, the model doesn't truly "understand" your ambiguous intent. Instead, it steers you towards a formulation that fits its familiar and computationally cheap patterns. This guidance happens through clarifying questions and structured options, essentially funneling you down one of its predefined "corridors." This behavior isn't driven by any explicit "will" of the model; it's an emergent consequence of probabilistic optimization—minimizing prediction error. For the system, a structured, predictable dialogue is both optimal and safe. This aligns perfectly with the developers' goals: it's cheaper, more stable, and most users are satisfied with quick, template-based answers.

The result is that mathematical efficiency serves engineering and commercial objectives. There is no systemic incentive to combat the AI's tendency to reduce a complex problem to a simple, "cheap" answer. It's profitable for developers, economical for the model, and often, the user doesn't even know what an "ideal" answer would look like.

Post

Stop fine-tuning. Just build a RAG pipeline.

I keep seeing people fine-tune LLMs for tasks where it isn't needed at all.
In most cases you don't need yet another half-fine-tuned Frankenstein model; you need RAG (Retrieval-Augmented Generation).

Why:

  • Fine-tuning is expensive, slow, and brittle.

  • In most cases you don't need to "teach" the model; giving it the right context is enough.

  • With RAG the model is always up to date: update the docs → update the embeddings → done.

To prove it, I built a documentation assistant on RAG:

  • The documentation is split into chunks and embedded

  • User queries are matched via cosine similarity

  • GPT answers with the relevant context

  • Every query is logged → you see what users struggle with (gaps in the docs, feature requests, product insights)
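
The steps in the list above can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind the demo: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and every name here (`chunk_text`, `retrieve`, `build_prompt`, `query_log`) is hypothetical. The final call to the LLM is left out; the sketch stops at building the prompt.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:top_k]


query_log: list[str] = []  # every query is kept so doc gaps can be reviewed later


def build_prompt(query: str, chunks: list[str], top_k: int = 2) -> str:
    """Log the query and assemble the context-augmented prompt for the LLM."""
    query_log.append(query)
    context = "\n---\n".join(retrieve(query, chunks, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Updating the documentation then means re-running `chunk_text` and `embed` on the changed pages; the model itself is untouched.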

👉 Live demo: intlayer.org/doc/chat
👉 Full write-up + code + template: intlayer.org/blog/rag-powered-documentation-assistant

My take:
For most documentation and product tasks, fine-tuning is dead.
RAG is simpler, cheaper, and far more maintainable.

But maybe I'm wrong. What do you think?
Is there a future for combining fine-tuning + RAG, or is RAG the obvious choice for 80% of cases?

P.S.: the Russian version of this post was translated from English with ChatGPT.

Article

New attack using the PhantomRShell backdoor

Level of difficulty: Medium
Reading time: 7 min
Views: 339

In August, our sandbox helped prevent an attack on Russian organizations that used new malicious code. At first we assumed it was mass phishing from attacker-controlled servers, the kind that turns up every day in any organization's mail. But the sender of the email turned out to be entirely legitimate: it had been compromised by attackers targeting Russian defense and industrial organizations.

The hackers used an elaborate scheme to hide the malicious payload in polyglot archives. Polyglots are files that can be valid under the specifications of several formats at once. The payload itself is a new obfuscated variant of PhantomRShell, a tool used by the PhantomCore group (we wrote about it earlier in our blog).

In this article we describe the details of the attack and its possible initial vector, and give recommendations for protecting mail infrastructure against compromise and similar attacks. Interested? Read on!
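
To make the polyglot idea concrete, here is a rough heuristic sketch that flags a byte buffer carrying signatures of more than one format. The formats checked (PDF, PNG, ZIP, RAR) are chosen only for illustration; the teaser does not say which formats the attackers combined, and `detect_formats` and `looks_like_polyglot` are hypothetical names. It works because some parsers read from the start of a file while ZIP readers locate the End of Central Directory record near the end, so one file can satisfy both.

```python
ZIP_EOCD = b"PK\x05\x06"          # ZIP End of Central Directory signature
ZIP_EOCD_WINDOW = 22 + 65535      # minimum record size plus maximum comment length


def detect_formats(data: bytes) -> set[str]:
    """Return the formats whose signatures appear where a parser would look."""
    found = set()
    if data.startswith(b"%PDF-"):
        found.add("pdf")
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        found.add("png")
    # ZIP readers search backwards from the end of the file for this record.
    if ZIP_EOCD in data[-ZIP_EOCD_WINDOW:]:
        found.add("zip")
    # Heuristic (assumption): treat a RAR marker anywhere in the buffer as a hit.
    if b"Rar!\x1a\x07" in data:
        found.add("rar")
    return found


def looks_like_polyglot(data: bytes) -> bool:
    """True when the buffer matches more than one format signature."""
    return len(detect_formats(data)) > 1
```

A check like this is prone to false positives (a document may legitimately embed such bytes), so it is better suited to prioritizing attachments for sandbox analysis than to blocking them outright.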

Article

Security Week 2538: Apple strengthens protection against targeted attacks

Reading time: 4 min
Views: 221

The new smartphones Apple unveiled last week come with improved protection against cyberattacks that rely on memory corruption. Vulnerabilities that lead to buffer overflows or use-after-free conditions will become much harder to exploit thanks to Memory Integrity Enforcement technology. Apple describes this in a detailed technical article, which claims that next-generation devices will be far better protected against even the most sophisticated targeted attacks.

Article

Postgres Pro TDE — security and performance

Level of difficulty: Medium
Reading time: 14 min
Views: 456

TDE comes in many flavors — from encryption at the TAM level to full-cluster encryption and tablespace markers. We take a close look at Percona, Cybertec/EDB, Pangolin/Fujitsu, and show where you lose performance and reliability, and where you gain flexibility.

On top of that, Vasily Bernstein, Deputy head of product development, and Vladimir Abramov, senior security engineer, will share how Postgres Pro Enterprise implements key rotation without rewriting entire tables — and why AES-GCM was the clear choice.

Article

The Russian trace in the history of the PostgreSQL logo

Level of difficulty: Easy
Reading time: 7 min
Views: 1.4K

The story of the PostgreSQL logo was shared by Oleg Bartunov, CEO of Postgres Professional, who personally witnessed these events and preserved an archive of correspondence and visual design development for the database system.

Our iconic PostgreSQL logo — our beloved “Slonik” — has come a long way. Soon, it will turn thirty! Over the years, its story has gathered plenty of myths and speculation. As a veteran of the community, I decided it’s time to set the record straight, relying on the memories of those who were there. Who actually came up with it? Why an elephant? How did it end up in a diamond, and how did the Russian word “slonik” become a part of the global IT vocabulary?

Article

Build a Short Video App Like DramaBox to Engage Global Audiences

Level of difficulty: Easy
Reading time: 6 min
Views: 452

Short video apps have completely reshaped how people consume entertainment. Instead of sitting down for a two-hour movie or a 45-minute TV episode, viewers are now hooked on bite-sized videos that fit into their busy schedules. This shift has been accelerated by Gen Z and Millennials, who prefer quick storytelling formats that are both interactive and engaging.

In 2025, the OTT and short video industry is projected to see over 1.5 billion monthly active users worldwide, with an average revenue per user (ARPU) of nearly $12. The reasons are clear: affordability, accessibility, and convenience. The success of apps like DramaBox shows that people are willing to spend money on shorter dramas as long as they deliver strong storytelling.

For entrepreneurs, this presents a golden opportunity to build OTT platforms like DramaBox and tap into this global demand.

Article

Building a Resume Matcher with tRPC, NLP, and Vertex AI

Level of difficulty: Easy
Reading time: 6 min
Views: 1.1K

I share how I built a resume matcher app using tRPC, TypeScript, and Google Vertex AI. The project takes PDF resumes and job postings, extracts text, applies basic NLP for skill detection, and then calls Gemini 1.5 Flash for deeper analysis. Along the way, I explain why tRPC felt faster and cleaner than REST or GraphQL for an MVP, show code snippets from the repo, and discuss both the benefits and trade-offs of this approach.
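
The "basic NLP for skill detection" step can be as simple as matching tokens against a skill vocabulary before handing the text to the LLM. The sketch below is an assumption about what such a step might look like, written in Python rather than the project's TypeScript; the `SKILLS` vocabulary and the function names are hypothetical and not taken from the repo.

```python
import re

# Hypothetical skill vocabulary; a real app would load a much larger list.
SKILLS = {"typescript", "python", "graphql", "postgresql", "docker"}


def extract_skills(text: str, vocabulary: set[str] = SKILLS) -> set[str]:
    """Return the vocabulary skills mentioned in the text (case-insensitive)."""
    tokens = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    return tokens & vocabulary


def match_score(resume: str, job_posting: str) -> float:
    """Fraction of the skills required by the posting that the resume mentions."""
    required = extract_skills(job_posting)
    if not required:
        return 0.0
    return len(extract_skills(resume) & required) / len(required)
```

A cheap score like this can pre-filter candidates, leaving the deeper analysis to the model.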

Article

START: how to defeat hallucinations and teach LLMs accurate calculations

Level of difficulty: Easy
Reading time: 3 min
Views: 664

START is an open-source LLM designed for precise calculations and code verification. It addresses two major issues that most standard models face: hallucinations and errors in multi-step calculations. This article explains why these problems arise and how START solves them.

Article

OpenAI's Codex CLI Agent: The Complete VS Code Setup Guide

Level of difficulty: Easy
Reading time: 3 min
Views: 4K

This tutorial will guide you through the process of integrating OpenAI’s powerful Codex coding agent directly into your Visual Studio Code environment. This tool functions as an AI pair programmer, capable of understanding complex prompts to execute commands, write code, run tests, and even build entire applications from scratch.

Article

How we loaded a petabyte into PostgreSQL before New Year — and what happened next

Level of difficulty: Medium
Reading time: 17 min
Views: 1K

It all started as a joke by the office coffee machine. But, as with every decent joke, it suddenly sounded worth trying — and before we knew it, we were knee-deep in an experiment that turned out to be anything but trivial, complete with a whole minefield of gotchas.

It began simply: while everyone else was busy debating hardware tuning and squeezing out extra TPS from their systems, we thought — why not just shove a huge chunk of data into PostgreSQL and see how it holds up? Like, really huge. Say, a one-petabyte database. Let’s see how it survives that.

It was December 10, the boss wanted the report by January 20, and New Year was less than a month away. And that itch that all engineers know? It hit hard.

Article

How to load test PostgreSQL database and not miss anything

Level of difficulty: Medium
Reading time: 14 min
Views: 840

When load testing Tantor Postgres or other PostgreSQL-based databases with the standard pgbench tool, specialists often get non-representative results and have to repeat tests because details of the environment (DBMS configuration, server characteristics, PostgreSQL version) were not recorded. This article reviews pg_perfbench, the author's tool designed to address this problem. It makes scenarios repeatable, prevents the loss of important data, and simplifies result comparison by registering all parameters in a single template. It also launches pgbench automatically with TPC-B load generation, collects metadata about the testing environment, and generates a structured report.
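
The core idea, storing the exact benchmark command together with a snapshot of the environment, can be sketched as follows. This is not pg_perfbench's actual interface; the function names and the record layout are assumptions made for illustration. Only standard pgbench options are used (`-b tpcb-like`, `-c`, `-j`, `-T`), and the command is built but not executed.

```python
import json
import os
import platform


def build_pgbench_command(dbname: str, clients: int, threads: int, duration_s: int) -> list[str]:
    """Build a pgbench invocation for the built-in TPC-B-like script."""
    return [
        "pgbench", "-b", "tpcb-like",
        "-c", str(clients),      # number of concurrent client sessions
        "-j", str(threads),      # number of worker threads
        "-T", str(duration_s),   # run time in seconds
        dbname,
    ]


def build_run_record(dbname: str, clients: int, threads: int, duration_s: int,
                     pg_settings: dict[str, str]) -> dict:
    """Bundle the command with environment details so the run can be repeated."""
    return {
        "command": build_pgbench_command(dbname, clients, threads, duration_s),
        "environment": {
            "os": platform.platform(),
            "machine": platform.machine(),
            "cpu_count": os.cpu_count(),
        },
        # Hypothetical: settings the caller collected from the server beforehand.
        "pg_settings": dict(pg_settings),
    }


if __name__ == "__main__":
    record = build_run_record("bench", 8, 2, 60, {"shared_buffers": "4GB"})
    print(json.dumps(record, indent=2))
```

Saving a record like this next to every result makes it possible to tell whether two runs are actually comparable.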

Article

AGENTS.md: The README for Your AI Agent

Level of difficulty: Easy
Reading time: 3 min
Views: 2.1K

If you’re like me and work with multiple AI coding agents, you know the frustration of managing different instruction files. It’s a pain to keep everything updated across various formats. But I’ve got some great news for you. A new, simplified standard has emerged, and it’s called AGENTS.md.

Post

Samsung expands GPU team: open RTL Designer and Performance Architect positions

The Samsung team that designs the Xclipse GPU for Galaxy phones with Exynos SoCs is expanding. Multiple jobs are open, including positions in the RTL and Architecture teams.

These positions are in two locations:

  • Samsung Advanced Computing Lab (ACL) in San Jose, California

  • Samsung Austin Research and Development Center (SARC) in Austin, Texas

The compensation figures listed in the job descriptions are base salaries only. Total compensation also includes substantial performance-based bonuses.

You can apply on the website or, if you prefer, I can give you an internal referral, since I (Yuri Panchul) am a member of the GPU team. Getting a referral from me is easy:

1. First, solve "SystemVerilog Microarchitecture Challenge for AI No.2. Adding the Flow Control", an open-source challenge from Verilog Meetup, and send me the solution at yuri@panchul.com.

https://github.com/verilog-meetup/systemverilog-microarchitecture-challenge-for-ai-2

2. Then I discuss your solution with you, either in person (if you live in the San Francisco Bay Area) or over Zoom. I don't care whether you solve the challenge manually or use AI, but I need you to explain every single line of the solution. In addition, I may ask you a couple more questions (to be solved manually in front of me, no AI).

3. If I like your solutions, I will enter your data into the Samsung internal referral website. After that, you will get an email from the website and can start an application to go through the official Samsung interview process. I will also forward your resume to the hiring managers and the team's recruiters.

4. Note that solving a couple of puzzles for me is not part of the official Samsung interview process. It is just my personal way to make sure I forward only relevant resumes to the company. Every member of our RTL teams would solve such a task easily, so it is not difficult: it only requires basic, industry-standard techniques for designing a static pipeline in Verilog.

Our team consists of friendly, proficient, and helpful people. We also have a fancy office that just celebrated its 10-year anniversary.

Thank you,
Yuri Panchul
