New test: Filtering tracks by length. Fixing a typo. #138

Open
wants to merge 6 commits into
base: main

hinderling

This PR contains:

  • a new test-case for the benchmark
    • I hereby confirm that NO LLM-based technology (such as github copilot) was used while writing this benchmark
  • new dependencies in requirements.txt
    • The environment.yml file was updated using the command conda env export > environment.yml
  • new generator-functions allowing to sample from other LLMs
  • new samples (sample_....jsonl files)
  • new benchmarking results (..._results.jsonl files)
  • documentation update
  • bug fixes

Related github issue (if relevant): ---

Short description:

  • Added a new test case, filter_tracks, for filtering tracks by length.
  • Fixed a typo in the name of an existing test: apply_otsu_threshold_and_count_postiive_pixels -> apply_otsu_threshold_and_count_positive_pixels.

How do you think this will influence the benchmark results?

  • Adding more diversity to the tasks.
  • Fixing the typo in the prompt might slightly improve the performance of some models.

Why do you think it makes sense to merge this PR?

  • This is a task I use often in my workflows. I usually rely on the trackpy library for it (as in the reference implementation I provided), and I noticed that Copilot seems to struggle both to use this library and to recreate its functionality, which is why I think it's an interesting test case. More generally, I think processing tracking data is an integral part of bioimage analysis and is not yet represented in this study. I can try to add more examples if this is of interest!
  • Fixing the typo increases legibility in result figures, and is more "fair" for model testing because there is no typo that could be confusing.
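
For readers unfamiliar with the task, here is a minimal sketch of what "filtering tracks by length" can look like on a pandas table with one row per detection. It is not the reference implementation from this PR and does not use trackpy (which offers a comparable helper, filter_stubs). The function name filter_tracks_by_length, the min_length parameter, and the example columns are assumptions made for illustration; only the id_label='id' default is taken from the discussion in this thread.

```python
import pandas as pd


def filter_tracks_by_length(tracks, min_length, id_label="id"):
    """Keep only the tracks that have at least `min_length` detections.

    Hypothetical helper: `tracks` is assumed to hold one row per detection,
    with the track identifier stored in the column named by `id_label`.
    """
    # Number of rows (detections) belonging to each track id
    counts = tracks[id_label].value_counts()
    # Look up, for every row, the length of the track it belongs to
    lengths = tracks[id_label].map(counts)
    # Keep rows of sufficiently long tracks and renumber the index
    return tracks[lengths >= min_length].reset_index(drop=True)


# Illustrative data: track 1 has 3 detections, track 2 has 2, track 3 has 1
tracks = pd.DataFrame({
    "frame": [0, 1, 2, 0, 1, 0],
    "id": [1, 1, 1, 2, 2, 3],
    "x": [0.0, 0.5, 1.0, 5.0, 5.2, 9.0],
})

filtered = filter_tracks_by_length(tracks, min_length=2)
print(sorted(filtered["id"].unique().tolist()))  # [1, 2]
```

Here length is counted as the number of rows per track. A variant that measures the spanned frame range instead (last frame minus first frame) would treat tracks with gaps differently, so a test prompt should state which definition is meant.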

@haesleinhuepf
Owner

Hi @hinderling ,

just a quick question regarding the notebook you sent. In the function definition there is a parameter "id_label='id'", but the docstring says "id for each track (default: 'particle')". Isn't that a contradiction? Shouldn't the default in the docstring be 'id' instead of 'particle'?

Thanks!

Best,
Robert

@hinderling
Author

Oh no, that's a mistake, sorry! It should be fixed now. I also changed the name of the test so that other track-filtering tasks (e.g. filter_tracks_by_value) can be added later without naming conflicts.
