Comments / DariaSatco's profile / Habr

Daria Shatko @DariaSatco

ML Team Lead @ Yandex Crowd

Catching errors in support dialogues with LLMs: the Yandex Crowd team's experience

DariaSatco 3 Nov 2025 at 05:12

In regular checks on production traffic we see values above 0.8 (the number naturally fluctuates, since it depends on the distribution of errors). From time to time we still run into cases that the current prompts handle poorly; when that happens, we update either the prompts or the preprocessing steps.

Catching errors in support dialogues with LLMs: the Yandex Crowd team's experience

DariaSatco 20 Oct 2025 at 10:45

In the prompt, we ask the judge to assign a score on a scale, for instance from 0 to 1 or from 0 to 100. Here is an example prompt from the HF cookbook (https://huggingface.co/learn/cookbook/llm_judge), which uses a scale of 0 to 10:

JUDGE_PROMPT = """
You will be given a user_question and system_answer couple.
Your task is to provide a 'total rating' scoring how well the system_answer answers the user concerns expressed in the user_question.
Give your answer as a float on a scale of 0 to 10, where 0 means that the system_answer is not helpful at all, and 10 means that the answer completely and helpfully addresses the question.

Provide your feedback as follows:

Feedback:::
Total rating: (your rating, as a float between 0 and 10)

Now here are the question and answer.

Question: {question}
Answer: {answer}

Feedback:::
Total rating: """
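
Since a prompt like this asks for the score in a fixed "Total rating: <number>" format, the judge's reply can be parsed mechanically. A minimal sketch of how that might look; `extract_rating` and its `scale_max` parameter are hypothetical names introduced here for illustration, not part of the cookbook or of our pipeline:

```python
import re
from typing import Optional

# Hypothetical helper (an assumption, not from the original comment or the cookbook):
# pulls the numeric score out of a judge reply.
LABELLED_RE = re.compile(r"Total rating:\s*(\d+(?:\.\d+)?)")
BARE_RE = re.compile(r"\d+(?:\.\d+)?")


def extract_rating(judge_reply: str, scale_max: float = 10.0) -> Optional[float]:
    """Return the rating rescaled to [0, 1], or None if it cannot be parsed."""
    match = LABELLED_RE.search(judge_reply)
    if match is not None:
        raw = match.group(1)
    else:
        # The prompt already ends with "Total rating: ", so the model
        # may answer with just the number and nothing else.
        bare = BARE_RE.fullmatch(judge_reply.strip())
        if bare is None:
            return None
        raw = bare.group(0)
    rating = float(raw)
    if not 0.0 <= rating <= scale_max:
        return None  # out-of-range scores are treated as unparseable
    return rating / scale_max


if __name__ == "__main__":
    print(extract_rating("Feedback:::\nTotal rating: 8.5"))
    print(extract_rating("no score here"))
```

The prompt itself would be filled in with `JUDGE_PROMPT.format(question=..., answer=...)` before being sent to the model. Rescaling to [0, 1] keeps scores comparable if the scale in the prompt is later changed, for example to 0 to 100.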
