Specifying numpy dtype in test functions #122

ian-coccimiglio · 2024-09-09T02:16:21Z

This PR contains:

a new test-case for the benchmark
- I hereby confirm that NO LLM-based technology (such as github copilot) was used while writing this benchmark
new dependencies in requirements.txt
- The environment.yml file was updated using the command conda env export > environment.yml
new generator-functions allowing to sample from other LLMs
new samples (sample_....jsonl files)
new benchmarking results (..._results.jsonl files)
documentation update
bug fixes

Related github issue (if relevant): would close #115

Short description:

I think it would be best to specify a plausible numpy dtype for the images/arrays used in the test check, rather than relying on whatever dtype numpy infers.

How do you think this will influence the benchmark results?

For Otsu's threshold + positive pixel counting, overall model pass-rate goes from 10/230 to 75/230.
I haven't checked other tasks yet. I'd expect a generally positive effect wherever OpenCV is type-sensitive, which applies to at least 2-3 more tests as far as I've seen.
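
As a rough illustration of the dtype point above (not the actual test case from this PR), the sketch below shows how the dtype of a test input changes when it is given explicitly. The pixel values and variable names are hypothetical, and `np.uint8` is an assumed choice; a 16-bit type would be specified the same way.

```python
import numpy as np

# Hypothetical 2x3 test image; the pixel values are made up for illustration.
pixels = [[0, 0, 200], [0, 255, 255]]

# Without an explicit dtype, NumPy infers a default signed integer type
# (commonly int64), which is not a typical image pixel type.
inferred_image = np.array(pixels)

# With an explicit dtype, the input resembles an 8-bit grayscale image.
# np.uint8 is an assumption here; a 16-bit type would be set the same way.
typed_image = np.array(pixels, dtype=np.uint8)

print(inferred_image.dtype, typed_image.dtype)
```

The values are identical in both arrays; only the dtype differs, which is what a type-sensitive library function would react to.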

Why do you think it makes sense to merge this PR?

I'm waiting to see whether we decide to use this approach, go for prompt editing, or do both, so this is not to be merged yet.
It also fixes the spelling of the test case, which I'm not sure we want to merge, as I can't tell what it might break.

ian-coccimiglio · 2024-09-09T02:39:41Z

My bad, GitHub and I don't always cooperate.

ian-coccimiglio and others added 5 commits September 7, 2024 02:03

fix sum images notebook to add two images rather than lists

5208a15

Fixed otsu threshold to test on a 16-bit numpy array rather than 64

deee20e

Corrected the function name

495c930

specified dtype

4199355

Delete test_cases/sum_images.ipynb (Addressed in other PR)

81677e5

ian-coccimiglio closed this Sep 9, 2024
