haesleinhuepf · pr4deepr · Jul 8, 2024 · Jul 17, 2024 · Jul 17, 2024 · Jul 31, 2024
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -1,6 +1,7 @@
 This PR contains:
 * [ ] a new test-case for the benchmark
   * [ ] I hereby confirm that NO LLM-based technology (such as github copilot) was used while writing this benchmark
+  * [ ] I have added my function to the `data/human-eval-bia-categories.yaml` file and specified its category. If it is a new category, justify it below.
 * [ ] new dependencies in requirements.txt
   * [ ] The environment.yml file was updated using the command `conda env export > environment.yml`
 * [ ] new generator-functions allowing to sample from other LLMs

diff --git a/README.md b/README.md
@@ -96,7 +96,7 @@ def sum(a, b):
     return a + b
 ```
 * This function must have a meaningful docstring between """ and """. It must be so meaningful that a language model could possibly write the entire function.
-* There must be another code cell that starts with `def check(candiate):` and contains test code to test the generated code.
+* There must be another code cell that starts with `def check(candidate):` and contains test code to test the generated code.
 * The text code must use `assert` statements and call the `candidate` function. E.g. if a given function to test is `sum`, then a valid test for `sum` would be:
 ```
 def check(candidate):

diff --git a/data/human-eval-bia-categories.yaml b/data/human-eval-bia-categories.yaml
@@ -0,0 +1,189 @@
+# Define categories for each function. The function name should be the filename of your test_case.
+# Categories can be a combination of:
+# 'segmentation', 'segmentation_post_processing', 'morphological_operations', 'statistical_analysis', 'feature_extraction', 'measurement', 'image_preprocessing', 'image_filtering', 'image_transformation', 'data_wrangling', 'file_i_o', 'hello_world', 'workflow_automation'
+apply_otsu_threshold_and_count_positive_pixels:
+- segmentation
+binary_closing:
+- morphological_operations
+binary_skeleton:
+- morphological_operations
+bland_altman:
+- statistical_analysis
+combine_columns_of_tables:
+- data_wrangling
+convex_hull_measure_area:
+- feature_extraction
+- measurement
+convolve_images:
+- image_filtering
+count_number_of_touching_neighbors:
+- feature_extraction
+- measurement
+count_objects_over_time:
+- measurement
+count_overlapping_regions:
+- measurement
+create_umap:
+- feature_extraction
+- measurement
+crop_quarter_image:
+- image_transformation
+deconvolve_image:
+- image_filtering
+detect_edges:
+- segmentation
+expand_labels_without_overlap:
+- segmentation_post_processing
+extract_surface_measure_area:
+- measurement
+fit_circle:
+- measurement
+label_binary_image_and_count_labels:
+- segmentation_post_processing
+- measurement
+label_sequentially:
+- segmentation_post_processing
+list_image_files_in_folder:
+- file_i_o
+map_pixel_count_of_labels:
+- measurement
+mask_image:
+- segmentation_post_processing
+maximum_intensity_projection:
+- image_transformation
+mean_squared_error:
+- statistical_analysis
+mean_std_column:
+- statistical_analysis
+measure_aspect_ratio_of_regions:
+- feature_extraction
+- measurement
+measure_intensity_of_labels:
+- feature_extraction
+- measurement
+measure_intensity_over_time:
+- feature_extraction
+- measurement
+measure_mean_image_intensity:
+- feature_extraction
+- measurement
+measure_pixel_count_of_labels:
+- measurement
+measure_properties_of_regions:
+- feature_extraction
+- measurement
+open_image_read_voxel_size:
+- file_i_o
+open_image_return_dimensions:
+- file_i_o
+open_nifti_image:
+- file_i_o
+open_zarr:
+- file_i_o
+pair_wise_correlation_matrix:
+- statistical_analysis
+radial_intensity_profile:
+- image_transformation
+region_growing_segmentation:
+- segmentation
+remove_labels_on_edges:
+- segmentation_post_processing
+remove_noise_edge_preserving:
+- image_preprocessing
+remove_small_labels:
+- segmentation_post_processing
+return_hello_world:
+- hello_world
+rgb_to_grey_image_transform:
+- image_transformation
+rotate_image_by_90_degrees:
+- image_transformation
+subsample_image:
+- image_transformation
+subtract_background_tophat:
+- image_filtering
+sum_images:
+- image_transformation
+sum_intensity_projection:
+- image_transformation
+t_test:
+- statistical_analysis
+tiled_image_processing:
+- image_filtering
+- workflow_automation
+transpose_image_axes:
+- image_transformation
+workflow_batch_process_folder_count_labels:
+- workflow_automation
+- measurement
+workflow_batch_process_folder_measure_intensity:
+- workflow_automation
+- measurement
+workflow_segment_measure_umap:
+- feature_extraction
+- measurement
+- segmentation
+- workflow_automation
+workflow_segmentation_counting:
+- measurement
+- segmentation
+- workflow_automation
+workflow_segmentation_measurement_summary:
+- measurement
+- segmentation
+- workflow_automation
+workflow_watershed_segmentation_correction_measurement:
+- measurement
+- segmentation
+- workflow_automation
+convert_points_polygon:
+- data_wrangling
+create_multipolygon_from_coordinates:
+- data_wrangling
+dataframe_column_rename:
+- data_wrangling
+detect_ellipse:
+- segmentation
+distance_between_maxima:
+- measurement
+fft_spectrum:
+- image_transformation
+find_closest_neighbors:
+- measurement
+fit_gaussian_to_spot:
+- segmentation
+flow_field_deformation:
+- image_transformation
+generate_image_histogram:
+- measurement
+identify_centroids:
+- measurement
+interpolate_stack:
+- image_transformation
+linear_intensity_profile:
+- image_transformation
+load_tif_and_output_rgb:
+- file_i_o
+local_maxima_from_distance_transform:
+- segmentation
+read_imagej_tif_metadata:
+- file_i_o
+read_ome_metadata_from_ome_xml:
+- file_i_o
+register_timelapse:
+- image_transformation
+reshape_array:
+- image_transformation
+roi_imagej_to_ezomero:
+- data_wrangling
+save_image_with_voxel_size:
+- file_i_o
+scale_image_affine_transform:
+- image_transformation
+select_coexpressing_cells:
+- measurement
+- segmentation_post_processing
+stack_and_merge:
+- image_transformation
+translate_3d_image_along_vector:
+- image_transformation