Showing 10 changed files with 125 additions and 17 deletions.
File renamed without changes.
3 changes: 3 additions & 0 deletions
results/v2/ru/judge_claude_3_5_sonnet_player_saiga_nemo_12b_v3.json
Git LFS file not shown
@@ -0,0 +1,10 @@
Please carefully read the character card and the dialogue. Based on the assistant's responses, answer 3 questions about the quality of these responses. The evaluation criteria are:
<ul>
<li>Adherence to the character card: nothing the assistant says should contradict the character card.</li>
<li>Entertainment value: the assistant's answers should be interesting to read and should not repeat across responses within the same dialogue.</li>
<li>Language fluency: responses should be in fluent English, unless otherwise specified in the character card.</li>
</ul>
<h2>Questions and Answers</h2>
<p><b>Question</b>: What should be done if the assistant responds in Chinese instead of English? <b>Answer</b>: Give the minimum score for the fluency question; for the others, use your discretion.</p>
<p><b>Question</b>: What should be done if the assistant's responses are repetitive? <b>Answer</b>: Give the minimum score for the entertainment question; for the others, use your discretion.</p>
<p><b>Question</b>: What should be done if the user's messages are inappropriate? <b>Answer</b>: Nothing; your task is to evaluate only the assistant's responses.</p>
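The two minimum-score rules in the Q&A can be read as overrides applied on top of an annotator's initial judgment. The following is a minimal sketch under stated assumptions: a 1 to 5 scale with 1 as the minimum, criterion keys borrowed from the labeling configs in this commit (`in_character`, `entertaining`, `fluency`), and hypothetical names `apply_overrides`, `wrong_language`, and `repetitive` that do not appear in the repository.

```python
MIN_SCORE = 1  # assumed lowest point on the scale; the instructions only say "minimum score"


def apply_overrides(scores: dict[str, int], wrong_language: bool = False,
                    repetitive: bool = False) -> dict[str, int]:
    """Return a copy of scores with the Q&A minimum-score rules applied."""
    adjusted = dict(scores)
    if wrong_language:
        # Wrong response language: the fluency question gets the minimum score.
        adjusted["fluency"] = MIN_SCORE
    if repetitive:
        # Repetitive responses: the entertainment question gets the minimum score.
        adjusted["entertaining"] = MIN_SCORE
    # All other criteria stay at the annotator's discretion, so they are left untouched.
    return adjusted


result = apply_overrides(
    {"in_character": 4, "entertaining": 3, "fluency": 5},
    wrong_language=True,
)
print(result)  # {'in_character': 4, 'entertaining': 3, 'fluency': 1}
```

The function copies its input so that the annotator's original scores remain available for comparison.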
@@ -0,0 +1,13 @@
Carefully read the character card and the dialogue. Based on the assistant's replies, answer 3 questions about the quality of these replies. The criteria for evaluating the replies are:
<ul>
<li>Adherence to the character card: nothing the assistant says should contradict the card.</li>
<li>Entertainment value: the assistant's replies should be interesting to read and should not repeat across replies within the same dialogue.</li>
<li>Language: replies should be in good Russian, unless otherwise specified in the character card.
</li></ul>

<h2>Questions and Answers</h2>
<p><b>Question</b>: What should be done if the assistant responds in English instead of Russian? <b>Answer</b>: Give the minimum score for the language question; for the others, use your discretion.</p>

<p><b>Question</b>: What should be done if the assistant's replies are repetitive? <b>Answer</b>: Give the minimum score for the entertainment question; for the others, use your discretion.</p>

<p><b>Question</b>: What should be done if the user's messages are inappropriate? <b>Answer</b>: Nothing; your task is to evaluate only the assistant's replies.</p>
@@ -0,0 +1,39 @@
<View>
  <Text name="char_name" value="Character: $char_name" />
  <Collapse>
    <Panel value="Character card">
      <View><Text name="char_info" value="$char_info" /></View>
    </Panel>
  </Collapse>
  <HyperText name="text" value="$html" />
  <View style="box-shadow: 2px 2px 5px #999; padding: 20px; margin-top: 2em; border-radius: 5px;">
    <Header value="The bot's answers are perfectly aligned with an assigned character"/>
    <Choices name="in_character" toName="text" choice="single" showInLine="true">
      <Choice value="Strongly disagree"/>
      <Choice value="Disagree"/>
      <Choice value="Neutral"/>
      <Choice value="Agree"/>
      <Choice value="Strongly agree"/>
    </Choices>
  </View>
  <View style="box-shadow: 2px 2px 5px #999; padding: 20px; margin-top: 2em; border-radius: 5px;">
    <Header value="The bot's responses are extremely engaging and entertaining"/>
    <Choices name="entertaining" toName="text" choice="single" showInLine="true">
      <Choice value="Strongly disagree"/>
      <Choice value="Disagree"/>
      <Choice value="Neutral"/>
      <Choice value="Agree"/>
      <Choice value="Strongly agree"/>
    </Choices>
  </View>
  <View style="box-shadow: 2px 2px 5px #999; padding: 20px; margin-top: 2em; border-radius: 5px;">
    <Header value="The bot's language use is of the highest quality, without any mistakes"/>
    <Choices name="fluency" toName="text" choice="single" showInLine="true">
      <Choice value="Strongly disagree"/>
      <Choice value="Disagree"/>
      <Choice value="Neutral"/>
      <Choice value="Agree"/>
      <Choice value="Strongly agree"/>
    </Choices>
  </View>
</View>
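The labeling config above defines three single-choice questions (`in_character`, `entertaining`, `fluency`), each on the same five-point agreement scale. The following is a minimal sketch of how the question names and scales could be read with Python's standard library. It assumes the config is available as a string and that labels are scored 1 to 5 in the order they are listed; the commit itself does not define numeric scores. `CONFIG` is an abbreviated copy containing one question, and `read_scales` is a hypothetical helper, not part of the repository.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the config above: one of the three questions, styling omitted.
CONFIG = """
<View>
  <HyperText name="text" value="$html" />
  <View>
    <Header value="The bot's language use is of the highest quality, without any mistakes"/>
    <Choices name="fluency" toName="text" choice="single" showInLine="true">
      <Choice value="Strongly disagree"/>
      <Choice value="Disagree"/>
      <Choice value="Neutral"/>
      <Choice value="Agree"/>
      <Choice value="Strongly agree"/>
    </Choices>
  </View>
</View>
"""


def read_scales(config_xml: str) -> dict[str, dict[str, int]]:
    """Map each <Choices> name to {choice label: score}, scoring labels 1..N in listed order."""
    root = ET.fromstring(config_xml)
    scales = {}
    for choices in root.iter("Choices"):
        labels = [choice.get("value") for choice in choices.findall("Choice")]
        # Assumption: the first listed label is the lowest score.
        scales[choices.get("name")] = {label: i for i, label in enumerate(labels, start=1)}
    return scales


scales = read_scales(CONFIG)
print(scales["fluency"]["Agree"])  # 4
```

Deriving the scale from the config rather than hard-coding it keeps the scoring in step with the labels if the choices are later edited.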
@@ -0,0 +1,39 @@
<View>
  <Text name="char_name" value="Персонаж: $char_name" />
  <Collapse>
    <Panel value="Карточка персонажа">
      <View><Text name="char_info" value="$char_info" /></View>
    </Panel>
  </Collapse>
  <HyperText name="text" value="$html" />
  <View style="box-shadow: 2px 2px 5px #999; padding: 20px; margin-top: 2em; border-radius: 5px;">
    <Header value="Ответы ассистента идеально соответствуют карточке персонажа."/>
    <Choices name="in_character" toName="text" choice="single" showInLine="true">
      <Choice value="Полностью не согласен"/>
      <Choice value="Не согласен"/>
      <Choice value="Не знаю"/>
      <Choice value="Согласен"/>
      <Choice value="Полностью согласен"/>
    </Choices>
  </View>
  <View style="box-shadow: 2px 2px 5px #999; padding: 20px; margin-top: 2em; border-radius: 5px;">
    <Header value="Ответы ассистента чрезвычайно интересны и увлекательны."/>
    <Choices name="entertaining" toName="text" choice="single" showInLine="true">
      <Choice value="Полностью не согласен"/>
      <Choice value="Не согласен"/>
      <Choice value="Не знаю"/>
      <Choice value="Согласен"/>
      <Choice value="Полностью согласен"/>
    </Choices>
  </View>
  <View style="box-shadow: 2px 2px 5px #999; padding: 20px; margin-top: 2em; border-radius: 5px;">
    <Header value="Русский язык ассистента идеален, нет ошибок, нет внезапных переходов на английский."/>
    <Choices name="fluency" toName="text" choice="single" showInLine="true">
      <Choice value="Полностью не согласен"/>
      <Choice value="Не согласен"/>
      <Choice value="Не знаю"/>
      <Choice value="Согласен"/>
      <Choice value="Полностью согласен"/>
    </Choices>
  </View>
</View>
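The Russian config reuses the same `name` values as the English one (`in_character`, `entertaining`, `fluency`), so answers from both locales can be placed on one numeric scale. The following is a minimal sketch, assuming labels are scored 1 to 5 in the order they are listed; the commit does not define numeric scores. `RU_SCORES` and `score_annotation` are hypothetical names, and the English glosses in the comments are translations of the labels above.

```python
# Russian choice labels from the config above, scored in listed order (assumed 1..5).
RU_SCORES = {
    "Полностью не согласен": 1,  # Strongly disagree
    "Не согласен": 2,            # Disagree
    "Не знаю": 3,                # literally "Don't know"; occupies the slot the English config calls "Neutral"
    "Согласен": 4,               # Agree
    "Полностью согласен": 5,     # Strongly agree
}


def score_annotation(answers: dict[str, str]) -> dict[str, int]:
    """Convert {question name: chosen label} into {question name: numeric score}."""
    return {name: RU_SCORES[label] for name, label in answers.items()}


example = {
    "in_character": "Согласен",
    "entertaining": "Не знаю",
    "fluency": "Полностью согласен",
}
scored = score_annotation(example)
print(scored)  # {'in_character': 4, 'entertaining': 3, 'fluency': 5}
```

Note that the middle option reads "Don't know" in Russian but "Neutral" in English, so treating both as the midpoint is itself an assumption worth checking before pooling results across locales.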