Few Shot configuration #12

Nkluge-correa · 2024-08-09T12:45:15Z

Hello!

Is there a way to control how many few-shot examples are used when evaluating the models? Also, how are the evaluations currently set up? Are all benchmarks (ARC, MMLU, HellaSwag) run zero-shot? If not, what configuration is used?
