AI2D: official GPT and Claude 3.5 scores are very high #577

Open
Violettttee opened this issue Nov 5, 2024 · 1 comment
Comments

@Violettttee
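Hello,
Do you have any suggestions or thoughts on the exceptionally high AI2D scores reported for OpenAI and Claude 3.5? I changed my evaluation setup and prompt (adding CoT) and evaluated GPT several times, but I could not reproduce the very high score of 0.942; even with CoT, my best score was only 0.83. What do you think explains this gap? I also noticed that none of the AI2D scores in your own evaluations exceed 0.9, so I am curious how Claude and GPT were measured at near-perfect scores.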

@kennymckormick
Member

Hi @Violettttee,
You can try the AI2D_TEST_NO_MASK dataset we provide, which generally yields better performance than AI2D_TEST because of its different setting. However, we still cannot reproduce the numbers reported by OpenAI or Anthropic.
