diff --git a/evals/evaluation/HELMET/README.md b/evals/evaluation/HELMET/README.md
index 4cb23e49..245cf0b2 100644
--- a/evals/evaluation/HELMET/README.md
+++ b/evals/evaluation/HELMET/README.md
@@ -1,5 +1,6 @@
 # HELMET: How to Evaluate Long-context Language Models Effectively and Thoroughly HELMET
+
 [[Paper](https://arxiv.org/abs/2410.02694)]
 
 HELMET HELMET (How to Evaluate Long-context Models Effectively and Thoroughly) is a comprehensive benchmark for long-context language models covering seven diverse categories of tasks.