Adjust mean_std_column to accept both pandas and numpy stdev #126

ian-coccimiglio · 2024-09-09T06:50:27Z

This PR contains:

a new test-case for the benchmark
- I hereby confirm that NO LLM-based technology (such as GitHub Copilot) was used while writing this benchmark
new dependencies in requirements.txt
- The environment.yml file was updated using the command conda env export > environment.yml
new generator functions that allow sampling from other LLMs
new samples (sample_....jsonl files)
new benchmarking results (..._results.jsonl files)
documentation update
bug fixes

Related github issue (if relevant): closes #124

Short description:
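
The title suggests that the check for `mean_std_column` should accept a standard deviation computed with either the pandas default (sample, `ddof=1`) or the NumPy default (population, `ddof=0`). Below is a minimal sketch of that idea, not the code merged in this PR: the body of `mean_std_column`, the helper `std_is_accepted`, and the tolerance `rel_tol` are all hypothetical and serve only to illustrate the distinction.

```python
import numpy as np
import pandas as pd


def mean_std_column(dataframe, column):
    """Hypothetical reference solution: mean and standard deviation of a column."""
    values = dataframe[column]
    # pandas Series.std() defaults to ddof=1 (sample standard deviation)
    return values.mean(), values.std()


def std_is_accepted(candidate_std, values, rel_tol=1e-6):
    """Hypothetical check: accept either the pandas or the NumPy default."""
    values = np.asarray(values, dtype=float)
    sample_std = values.std(ddof=1)      # matches the pandas default
    population_std = values.std(ddof=0)  # matches the np.std default
    return bool(
        np.isclose(candidate_std, sample_std, rtol=rel_tol)
        or np.isclose(candidate_std, population_std, rtol=rel_tol)
    )


df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0]})
mean, std = mean_std_column(df, "a")

# Both conventions pass the check; an unrelated value does not.
pandas_ok = std_is_accepted(df["a"].std(), df["a"])
numpy_ok = std_is_accepted(np.std(df["a"].to_numpy()), df["a"])
wrong_ok = std_is_accepted(0.0, df["a"])
```

Comparing against both conventions keeps the test from rejecting a generated solution merely because it called `np.std` on the raw array instead of the pandas `std` method.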

How do you think this will influence the benchmark results?

Why do you think it makes sense to merge this PR?

haesleinhuepf

I just reran the notebook. Otherwise, LGTM.

ian-coccimiglio and others added 2 commits September 8, 2024 23:46

More flexible standard deviation acceptance

c2dac0a

clean rerun notebook

128fe25

haesleinhuepf approved these changes Sep 13, 2024

haesleinhuepf changed the base branch from main to development-collecting-new-test-cases September 13, 2024 08:43

haesleinhuepf merged commit 4b1cc18 into haesleinhuepf:development-collecting-new-test-cases Sep 13, 2024

haesleinhuepf mentioned this pull request Sep 13, 2024

Collection of new use-cases and bug-fixes #93 (open, 9 tasks)
