Artificial Intelligence

AI, ANN, and other forms of artificial intelligence

Open source, DevOps, Artificial Intelligence, Natural Language Processing

Back in July I wrote about Gaunt Sloth Assistant hitting 0.9.2. Today we finally get to say version 1.0.0 is out. This is the release where we upgraded our primary dependency, LangChain/LangGraph, to v1, moved the runtime baseline to Node 24/npm 11, and declared the tool ready for daily automation work.

What changed since the last post?

Reviews now conclude with a call to the built-in rating tool. By default the scale is out of 10, the pass threshold is 6/10, and ratings below 6 cause the review command to return a non-zero exit code. If you prefer warnings-only mode, set commands.review.rating.enabled (and/or commands.pr.rating.enabled) to false in .gsloth.config.*.
Identity profiles are now part of the core workflow, letting you swap prompts, models, and providers per folder with a simple -i profile-name flag.
Middleware is now first-class. You can stack built-ins such as anthropic-prompt-caching or summarization, or point at your own JS middleware objects, and the CLI shows what runs alongside every command.
Deep merging for command configs fixes the annoying situation when overriding the content provider deleted the rating settings. Defaults now survive partial overrides.
OAuth caching, documentation, and the README were refreshed so newcomers can get productive faster, and dependencies were hardened while we were here.
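
To make the deep-merge point concrete, here is a minimal Python sketch of the general technique. It is not Gaunt Sloth's actual implementation (the tool itself is TypeScript). Only the key names commands, review, contentProvider, and rating.enabled come from the post; passThreshold, maxRating, and the "file" provider value are hypothetical placeholders.

```python
from copy import deepcopy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Recursively merge `overrides` into a copy of `defaults`."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the default.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults (illustrative values, not the tool's real defaults).
DEFAULTS = {
    "commands": {
        "review": {
            "contentProvider": "file",
            "rating": {"enabled": True, "passThreshold": 6, "maxRating": 10},
        }
    }
}

# A partial override that only changes the content provider.
USER_CONFIG = {"commands": {"review": {"contentProvider": "github"}}}

# Shallow merge at the "commands" level: the whole "review" object is
# replaced, so the rating settings disappear.
shallow_commands = {**DEFAULTS["commands"], **USER_CONFIG["commands"]}

# Deep merge: the override is applied and the rating defaults survive.
merged = deep_merge(DEFAULTS, USER_CONFIG)

print("rating" in shallow_commands["review"])
print(merged["commands"]["review"]["rating"])
```

The shallow variant reproduces the problem described above, while the recursive variant keeps every default that the override does not mention.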

Identity profiles are the everyday quality-of-life feature in 1.0.0. They let me flip between system prompts, model presets, and tool chains per task. gth pr 555 PP-4242 still reads .gsloth/.gsloth-settings, but gth -i devops pr 555 PP-4242 automatically switches to .gsloth/.gsloth-settings/devops/ with whatever prompts and providers that folder declares.

Need to talk to Jira through MCP? Drop a profile such as jira-mcp that contains its own config and call gth -i jira-mcp chat. A trimmed example looks like this:

{
  "llm": {
    "type": "vertexai",
    "model": "gemini-2.5-pro"
  },
  "mcpServers": {
    "jira": {
      "url": "https://mcp.atlassian.com/v1/sse",
      "authProvider": "OAuth",
      "transport": "sse"
    }
  },
  "requirementsProviderConfig": {
    "jira": {
      "cloudId": "YOUR-JIRA-CLOUD-ID-UUID",
      "displayUrl": "https://YOUR-BUSINESS.atlassian.net/browse/"
    }
  },
  "commands": {
    "pr": {
      "contentProvider": "github",
      "requirementsProvider": "jira"
    }
  }
}

Switching between those folders is now just a flag, so I can keep separate personas for DevOps, documentation, or any remote MCP I need to reach.

The rater tool is the other big unlock. Reviews always included qualitative feedback, but 1.0.0 makes the score actionable: we share it with the review module through an artifact store and wire it to setExitCode, so CI can fail automatically when quality is below the goal. Setting guardrails for production services now takes seconds and no longer depends on custom scripts.
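
The gating rule is simple enough to sketch. The Python below is an illustration of the logic described in this post (scale of 10, pass threshold of 6, rating can be disabled), not the tool's real code. The names RatingConfig and exit_code_for are hypothetical, and the concrete failing exit code 1 is an assumption; the post only says the code is non-zero.

```python
from dataclasses import dataclass


@dataclass
class RatingConfig:
    # Defaults taken from the post; field names are hypothetical.
    enabled: bool = True
    pass_threshold: int = 6
    max_rating: int = 10


def exit_code_for(rating: int, config: RatingConfig) -> int:
    """Map a review rating to a process exit code for CI."""
    if not 0 <= rating <= config.max_rating:
        raise ValueError(f"rating must be between 0 and {config.max_rating}")
    if not config.enabled:
        # Warnings-only mode: never fail the build because of the score.
        return 0
    # Ratings below the threshold fail; the threshold itself passes.
    return 0 if rating >= config.pass_threshold else 1


print(exit_code_for(5, RatingConfig()))
print(exit_code_for(6, RatingConfig()))
print(exit_code_for(2, RatingConfig(enabled=False)))
```

In a CI job, any step that ends with a non-zero exit code normally fails the pipeline, which is why no custom script is needed around the review command.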

Finally, the middleware registry and artifact store give me nicer hooks for future automation. I can wrap model/tool calls, log exactly what ran, and still let Gaunt Sloth handle the chat, code, PR, or init commands it already mastered. The CLI remains a small TypeScript binary you can ship through npm or run via npx gth, but it now has the internal architecture to evolve without hacks.

If you want to try the release, the quickest path is still
npm install -g gaunt-sloth-assistant

The GitHub repo at https://github.com/Galvanized-Pukeko/gaunt-sloth-assistant is there for reference and issues. File an issue, drop feedback in Discussions, or wire the new rater tool into your CI and tell me how it behaves—I would love help pushing 1.1 features.

Huge thanks to all contributors for their PRs and testing.

@aymericzip

Sep 17 at 09:32

Artificial Intelligence, Data Engineering

Stop fine-tuning. Just build a RAG pipeline.

More and more often I see people fine-tuning LLMs for tasks where it is not needed at all.
In most cases you do not need yet another half-fine-tuned Frankenstein model; you need RAG (Retrieval-Augmented Generation).

Why:

Fine-tuning is expensive, slow, and fragile.
In most cases you do not need to "teach" the model; giving it the right context is enough.
With RAG the model is always up to date: update the docs → update the embeddings → done.

To prove it, I built a RAG-based documentation assistant:

The documentation is split into chunks and embedded
User queries are matched via cosine similarity
GPT answers with the relevant context
Every query is logged → you see what users struggle with (gaps in the docs, feature requests, product insights)
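
The four steps above can be sketched in a few lines of Python. This is a toy illustration, not the code behind the demo: the bag-of-words embed function stands in for a real embedding model, the LLM call is left out (only the prompt is built), and the sample documents and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 40) -> list[str]:
    """Split text into chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real pipeline would call a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Return the `top_k` chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the prompt that would be sent to the chat model."""
    joined = "\n---\n".join(context)
    return f"Answer using only this documentation:\n{joined}\n\nQuestion: {query}"


# Hypothetical documentation chunks.
DOCS = [
    "To install the package run the install command from your terminal.",
    "Locales are configured in the configuration file under the locales key.",
    "The dictionary file declares translated content for each component.",
]

query_log: list[str] = []  # every query is logged for later analysis

query = "How do I configure locales?"
query_log.append(query)
hits = retrieve(query, DOCS, top_k=1)
prompt = build_prompt(query, [text for _, text in hits])
print(prompt)
```

Updating the assistant then means re-chunking and re-embedding the changed documents; no model weights are touched.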

👉 Live demo: intlayer.org/doc/chat
👉 Full breakdown + code + template: intlayer.org/blog/rag-powered-documentation-assistant

My take:
For most documentation and product tasks, fine-tuning is dead.
RAG is simpler, cheaper, and far more maintainable.

But maybe I am wrong. What do you think?
Is there a future for combining fine-tuning + RAG, or is RAG the obvious choice for 80% of cases?

P.S.: this post was translated from English into Russian with ChatGPT.

@YuriPanchul

Aug 20 at 03:18

High performance, Machine learning, FPGA, Artificial Intelligence, CPU

Samsung expands GPU team: open RTL Designer and Performance Architect positions

The Samsung team that designs the Xclipse GPU for Galaxy phones with Exynos SoCs is expanding. Multiple jobs are available, including positions in the RTL and Architecture teams.

These positions are in two locations:

Samsung Advanced Computing Lab (ACL) in San Jose, California
Samsung Austin Research and Development Center (SARC) in Austin, Texas

The compensation figures listed in the job descriptions are base salaries only. Total compensation also includes substantial performance-based bonuses.

You can apply at the website – or, if you want, I can make you an internal referral since I (Yuri Panchul) am a member of the GPU team. Getting a referral from me is easy:

1. First, you solve the "SystemVerilog Microarchitecture Challenge for AI No. 2: Adding the Flow Control", an open-source challenge from Verilog Meetup, and send me the solution at yuri@panchul.com.

https://github.com/verilog-meetup/systemverilog-microarchitecture-challenge-for-ai-2

2. Then I discuss your solution with you, either in person (if you live in the San Francisco Bay Area) or over Zoom. I don't care whether you solve the challenge manually or use AI for it, but I need you to explain every single line of the solution. In addition, I may ask you a couple more questions (to be solved manually in front of me, no AI).

3. If I like your solutions, I will enter your data in the Samsung internal referral website. After that, you will get an email from the website and can start an application to go through the official Samsung interview process. I will also forward your resume to the hiring managers and the team’s recruiters.

4. Note that solving a couple of puzzles for me is not a part of the official Samsung interview process. This is just my personal way to ensure I forward only the relevant resumes to the company. Every single member of our RTL teams would solve such a thing easily, so this is not something difficult, just some basic techniques to design a static pipeline in Verilog, generic for the industry.

Our team consists of friendly, proficient, and helpful people. We also have a fancy office that just celebrated its 10-year anniversary.

Thank you,
Yuri Panchul

@YuriPanchul

Jun 22 at 15:33

Algorithms * Machine learning * FPGA * Programming microcontrollers * Artificial Intelligence

Some people, for example Oleg Chirukhin on Facebook, claim that LLMs write JavaScript / TypeScript well and Verilog / SystemVerilog poorly because there is more of the former in the world. However, Verilog has two critical factors that JS does not have at all. In JS, a program that runs for a long time is merely an inconvenience; in Verilog, if timing within a clock cycle is violated (delays measured in picoseconds), or a result arrives after 5 cycles when it was expected after 4, everything breaks. An LLM understands neither timing within a cycle nor timing across cycles (latency).

Nobody is going to, for the sake of this silly LLM, put latency-insensitive handshakes everywhere, cut a block's throughput by an order of magnitude, or slash the clock frequency, say by shipping a 20 MHz processor instead of a 2 GHz one, with per-clock performance of 2 CoreMark / MHz instead of 12 CoreMark / MHz, and on top of that several times larger and power-hungry. It is like selling cars with the speed and payload of a bicycle and the weight of a dump truck: nobody would buy that.

Timing within a cycle (delays in picoseconds) cannot be determined from the code alone. It requires static timing analysis (STA), which knows the delays of a specific ASIC library. An LLM cannot perform STA and does not know the delays of a particular library version, say a 2 nm low-power library from such-and-such vendor.

The LLM's inability to understand what happens in which cycle is more interesting. In principle this can be worked out, but it requires fairly thoughtful analysis of the specific code, and the LLM not only cannot do that, it brazenly writes "for illustration, assume the latency is 1" in a professorial tone. And what if we do not assume? With that assumption everything breaks.

Of course, you can write code with a handshake that does not depend on latency and simply waits for the result, but this fundamentally complicates the design and also requires introducing large FIFO queues of unclear size.
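
The latency problem can be shown with a toy cycle-level model in Python. This is only a sketch under simplifying assumptions: the subblock is modeled as a plain chain of registers, None stands for "no valid data this cycle", and picosecond timing, throughput cost, and FIFO sizing are not modeled at all. All names are hypothetical.

```python
from collections import deque


def pipeline(inputs: list, latency: int) -> list:
    """Model a subblock as `latency` register stages; one list entry per cycle."""
    regs = deque([None] * latency)
    outputs = []
    for value in inputs:
        regs.appendleft(value)       # new data enters the first stage
        outputs.append(regs.pop())   # data leaves after `latency` cycles
    return outputs


def sample_fixed(outputs: list, assumed_latency: int, count: int) -> list:
    """Consumer that blindly reads the output `assumed_latency` cycles after issue."""
    return [outputs[i + assumed_latency] for i in range(count)]


def sample_valid(outputs: list, count: int) -> list:
    """Consumer that waits for a valid flag (modeled here as 'not None')."""
    return [x for x in outputs if x is not None][:count]


data = [100, 101, 102, 103]
stream = data + [None] * 8           # idle cycles to flush the pipeline

out = pipeline(stream, latency=5)    # the subblock really takes 5 cycles

print(sample_fixed(out, assumed_latency=5, count=4))  # correct assumption
print(sample_fixed(out, assumed_latency=4, count=4))  # off by one cycle
print(sample_valid(out, count=4))                     # handshake-style consumer
```

With the wrong assumption every sample is shifted by one cycle, so the consumer pairs each request with the wrong result. The valid-based consumer is immune to that, which is exactly the design style that costs extra logic and buffering in real hardware.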

I wrote to Oleg:

There are two other factors here. 1. In real business tasks, the developer must be able to understand, for example, the latency of the subblocks' code: the number of cycles needed to get a result. The LLM does not understand this. From someone else's code it is often not obvious even to an experienced developer, and the LLM cannot run a simulator and look at the waveforms after simulation. 2. Verilog has a component that does not exist in programming at all: timing within a cycle, in picoseconds. The circuit synthesized from the code has to fit within this timing. And while latency (the number of cycles) can still be determined from someone else's code (by analyzing the chain of assignments between combinational logic and D flip-flops), timing is a real problem. A designer does develop an intuition about timing, for example that a combinational multiplication of 4-bit numbers will definitely fit into a 400-picosecond budget while a combinational division of 32-bit numbers definitely will not, but all of this has to be confirmed by running a static timing analysis tool, which (ta-da!) the LLM cannot do.

@YuriPanchul

Jun 19 at 15:24

FPGA, Programming microcontrollers, Education abroad, Computer hardware, Artificial Intelligence

I have a colleague (not at Samsung, but from educational programs) who is in love with AI. I am concerned that he may use AI to write a guide that covers the theoretical foundations of SystemVerilog. In my view this is a very bad idea, because an LLM does not follow the standard; it generates what intuitively "seems right" to people. As an illustration, I asked ChatGPT 4.0 how wire, reg, and logic differ. I caught 3 errors and 2 omissions:

1. Omission: the LLM (like most people, even experts) forgot to mention the difference in initialization context ("wire a = b" is a continuous assignment, that is, "wire a; assign a = b;", whereas "logic a = b" is an initialization at time 0, that is, "logic a; initial a = b;")

2. Error: for some reason the LLM thought that "wire a = 1'b0" is not synthesizable in Verilog but is synthesizable in SystemVerilog.

3. Error: the LLM thought that "always_ff" can be used to create a D latch.

4. Error: the LLM thought that "always_comb" can infer a latch.

5. Omission: the LLM forgot about "always_latch".

So if, say, an instructor is too lazy to read standards and books but decides to write a course handout with ChatGPT, the students will suffer badly (a bug caused by (1) is hard to debug) and will understand everything only "approximately".

@YuriPanchul

May 12 at 16:24

Machine learning * FPGA * Programming microcontrollers * Education abroadArtificial Intelligence

I have an acquaintance, an LLM enthusiast, who is also learning Verilog. I asked him to write instructions for an exercise with a certain sensor he had integrated. Naturally he offloaded it to an LLM. I read the result and concluded that LLMs should be banned like the distribution of Ecstasy and "bath salts" among young people. Just as "designer drugs" give a feeling of happiness and achievement without effort, LLM-generated documentation looks real, except that it will not help the reader.

What does the reader need? A picture of how to connect the sensor to the board, a timing diagram of the signals it outputs, and a few words about the problems the reader will run into (contact bounce) and how to solve them. In short, enough information to sit down and write the Verilog code.

What did the LLM produce? First, five paragraphs of murky verbal description saying that "the switch changes go through a certain sequence that makes it possible to determine the direction", with hallucinations about what moves and what stays still. Then irrelevant information about which materials these sensors are made of in different countries to keep them cheap for hobbyists and educational institutions. Next, various ways to solve the bounce problem, including ways that have nothing to do with this situation. And finally, fragments of pin definitions from QSF and XDC files taken from random examples on the internet, which have nothing to do with the described example: first, it does not use these files (different vendor, different way of assigning pins), and second, this part of the project is abstracted away (the user does not need to do this at all).

So the text simply leads the reader by the nose without giving any useful information for solving the problem. But even that does not matter, because the reader will not read this text: they will smell an LLM in the heading, confirm it by the third sentence, and stop reading. The text illustrates the expressions "done just to get you off my back" and "made of crap and sticks".

UPD: And the scariest part: it is 27 pages instead of the 1 page of useful instructions I expected. TWENTY-SEVEN PAGES OF DRIVEL!!!

I want to go back to the years when this horror did not exist. Hard times await our civilization. I have already seen LiveJournal posts advocating moving the entire porn industry to generative AI.

@Oksenija

May 3 at 09:33

Biotechnologies, Artificial Intelligence, Health, Cybersport, Biology

ChatGPT: LOOKING FOR A CAFFEINE SUBSTITUTE

Although caffeine stimulates mental activity and helps with eSports/games and late-night programming, it has many side effects, such as increased blood pressure, a racing heart rate, and a sharp rise followed by a quick drop in stimulation. Therefore, we need a high-quality alternative to caffeine, and we will search for it using AI. Potential candidates to replace caffeine are Theacrine (or TeaCrine) and N-Phenethyldimethylamine Citrate (the US FDA said OK).

ChatGPT successfully created a very complex table, even with a calculated column based on FUZZY criteria (if you can do this in SQL — you're a genius!), but it struggled with sorting the table. Attention: there is an image below, links are not clickable.

[Image: the table generated by ChatGPT; links are not clickable]

In the first numeric column, it failed to sort the numbers in descending order. I spent about 15-20 minutes trying. I experimented with various prompts and explanations. This is strange.

This tool (ChatGPT) understands table manipulation commands very well. In this example, I asked it to create a table based on data from large stores, specified which columns were needed and what information they should contain, indicated the order of the columns, including relative positioning — for instance, "insert a column with such-and-such data before this column" — and even more.

It was able to create a SUMMARY column based on previously generated columns. This is the column with weighted sums of substance weights from other columns, and it independently found the weighting coefficients quite accurately.

Moreover, for each product, it managed to identify the substance composition based on specific criteria and listed the substances in a separate column. It did not list all substances, only those filtered by certain criteria: those that are not caffeine but have an effect similar to caffeine. Try programming such a query in SQL manually without AI, taking into account the fuzzy criterion of similarity of effects, and also determining the similarity coefficient used for the weighted sum of substance masses per serving of the dietary supplement. And it even partially managed to sort by the weighted sum.
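
For comparison, the deterministic part of this task (the weighted-sum column and the descending sort) is easy to express in ordinary code once the coefficients are known. The Python sketch below uses entirely made-up products, doses, and similarity weights; none of these numbers come from the ChatGPT table. The fuzzy part, deciding which substances count as caffeine-like and how strongly, is exactly what the code takes as given.

```python
# Hypothetical similarity-to-caffeine coefficients (illustrative only).
WEIGHTS = {"theacrine_mg": 1.0, "other_stimulant_mg": 0.5}

# Hypothetical per-serving doses for made-up products.
PRODUCTS = [
    {"name": "Product A", "theacrine_mg": 100, "other_stimulant_mg": 50},
    {"name": "Product B", "theacrine_mg": 50, "other_stimulant_mg": 200},
    {"name": "Product C", "theacrine_mg": 125, "other_stimulant_mg": 0},
]


def weighted_sum(row: dict, weights: dict) -> float:
    """Sum each substance mass multiplied by its similarity coefficient."""
    return sum(row.get(column, 0) * weight for column, weight in weights.items())


def with_score_sorted(rows: list[dict], weights: dict) -> list[dict]:
    """Add a 'score' column and sort rows by it in descending order."""
    scored = [{**row, "score": weighted_sum(row, weights)} for row in rows]
    # sorted() is stable, so rows with equal scores keep their original order.
    return sorted(scored, key=lambda row: row["score"], reverse=True)


for row in with_score_sorted(PRODUCTS, WEIGHTS):
    print(row["name"], row["score"])
```

One practical workaround for the sorting trouble is to let the model produce the table and coefficients and then do the final sort in code like this or in a spreadsheet.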

But despite completing so much complex work, it still made a small mistake with sorting.