Hi, I ran into a reproducibility problem when evaluating gsm8k with this framework. Using the minicpm-2b-sft-bf16 model, I only get 38.13 on the gsm8k task, with the following configuration:
{ "task_name": "gsm8k_gsm8k_gen", "path": "datasets/gsm8k/data/gsm8k.jsonl", "description": "", "transform": "datasets/gsm8k/transform_gen_v0.py", "fewshot": 8, "batch_size": 1, "generate": { "method": "generate", "params": "models/model_params/vllm_sample_v1.json", "args": { "temperature": 0.1, "top_p": 0.95, "max_tokens": 300, "sampling_num": 1 } }, "model_postprocess": "general_torch", "task_postprocess": "gsm8k_post", "metric": { "accuracy": { "evaluation": { "type": "exact_match" } } }, "log_dir": "logs/2024-06-14_11-07-57" }
The MiniCPM paper reports a gsm8k score of 53.83, and that evaluation was also run with the UltraEval framework. Why is there such a large gap? Thanks in advance for your reply.