Evaluations and Prompts #3
Comments
Hi @prince14322. Thank you for your attention. The prompts are included in MainframeBench. We call each model 3 times and average the results.
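For anyone trying to reproduce this, here is a minimal sketch of what that protocol could look like, assuming the prompts are exposed as a column of the MainframeBench dataset on Hugging Face. The dataset path, config name, column names, and the `query_model` helper are illustrative assumptions, not the authors' actual evaluation script:

```python
# Hedged sketch only: the dataset path, config name, column names, and query_model()
# are assumptions for illustration, not the authors' evaluation script.
from datasets import load_dataset

N_RUNS = 3  # the maintainers report calling each model 3 times and averaging


def query_model(prompt: str) -> str:
    """Placeholder for the actual model call (API request or local inference)."""
    raise NotImplementedError


def evaluate_multiple_choice() -> float:
    # Assumed Hugging Face location/config of the MainframeBench QA split.
    ds = load_dataset("Fsoft-AIC/MainframeBench", "multiple_choice_question", split="train")
    run_accuracies = []
    for _ in range(N_RUNS):
        correct = 0
        for row in ds:
            prediction = query_model(row["prompt"])               # assumed column name
            correct += int(prediction.strip() == row["answer"])   # assumed column name
        run_accuracies.append(correct / len(ds))
    return sum(run_accuracies) / N_RUNS  # average over the 3 runs
```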
Thank you for the prompts.
Model: (see attached screenshot)
Output: (see attached screenshot)
Attaching the screenshot for the same.
Also tried another prompt variation, taking inspiration from here.
Model: (see attachment)
Here are the results: (see attachment)
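For readers following along, prompt variations of this kind typically differ only in instruction wording and answer formatting. Below is a hedged sketch of two such templates; these are illustrative only, not the exact variants tested above or used in the paper:

```python
# Illustrative templates only; the exact prompt variants tested above were not posted in full.
PROMPT_V1 = (
    "Answer the following mainframe question by choosing one of the options.\n"
    "Question: {question}\n"
    "Options:\n{options}\n"
    "Answer:"
)

PROMPT_V2 = (
    "You are a mainframe and COBOL expert. Reply with only the letter of the "
    "correct option.\n"
    "{question}\n"
    "{options}\n"
    "Your answer (one letter):"
)


def render(template: str, question: str, options: list[str]) -> str:
    """Fill a template with a question and lettered options (A., B., ...)."""
    lettered = "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(options))
    return template.format(question=question, options=lettered)
```

Even wording differences this small can shift the reported scores, which is exactly why access to the original prompts matters for replication.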
Could you please share the evaluation scripts and prompts that were used to generate the reported results in the paper?
Various parameters are involved in generating outputs, and it is crucial to get these prompts correct, as large language models (LLMs) are highly sensitive to even minor changes in input.
Having access to these scripts and prompts would be invaluable for replicating the experiments accurately and for exploring variations in the evaluation process. This would enable more precise fine-tuning of models and methodologies, leading to a deeper understanding and potentially novel insights.