From 639db919a1342576b2b5cd9eedb23eb00242186d Mon Sep 17 00:00:00 2001
From: KairuiHu
Date: Wed, 27 Nov 2024 16:44:12 +0800
Subject: [PATCH] correct hyperlink errors

---
 docs/lmms-eval-0.3.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/lmms-eval-0.3.md b/docs/lmms-eval-0.3.md
index b12bea0d..165e624d 100644
--- a/docs/lmms-eval-0.3.md
+++ b/docs/lmms-eval-0.3.md
@@ -159,7 +159,7 @@ AIF refers to Audio Instruction Following, and ASR refers to Audio Speech Recogn
 The result might be inconsistent with the reported result as we do not have the original prompt and we have to maintain the fair environment for all the models. For the base model, we do not test on the Chat Benchmarks.
-Certain datasets face alignment challenge: Datasets with WER, CIDEr, BLEU as metrics cannot accurately align due to their rigid output formats. Model responses are sensitive to prompt, we will investigate more deeply in Section [Robustness of the model](https://www.notion.so/Robustness-of-the-model-b89c005d3e044cb6aff51165929cea45?pvs=21) .
+Certain datasets face alignment challenge: Datasets with WER, CIDEr, BLEU as metrics cannot accurately align due to their rigid output formats. Model responses are sensitive to prompt, we will investigate more deeply in the section [Robustness of the model](#robustness-of-the-model).
 ## Evaluation Analysis and Thinking: