Should we represent images using default numpy.asarray()? #115
Hey @ian-coccimiglio, great point! I must admit, in daily practice I never use OpenCV (because most of the image data I work with is not compatible). We also expressed that in the paper:
If you think this statement does not reflect reality, we need to formulate it differently. We could also introduce new test-cases specifically for OpenCV. I'm just certainly the wrong person for writing them ;-)
Also, I hardly ever convert image dtypes as part of my routine. There was a similar discussion in #111, where we concluded that we want to try to modify all/many prompts so that variable types are clearly specified, e.g. "image provided as numpy array" instead of just "image". Please note, I made some of your proposed modifications in #118. It would be great if you could express your opinion there!
I wanted to answer this particular point here. I also rarely convert image datatypes. I think that's because I normally start processing with images and end up with images; numpy arrays just act as intermediates. My problem is that I think our test-cases sometimes conflate "all images are really just numpy arrays" with "all numpy arrays are really just images". This statement is 'usually true', but when it fails, it dramatically lowers the success rate. A similar thing happens with "lists of lists of ints" being treated as images, which is why I think …
In many (possibly most) of our test cases, we consider an image generated with np.asarray([...]) to be our representation of an image. However, this generates numpy arrays of 64-bit precision (int64 for Python ints, float64 for Python floats, on typical platforms).
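For illustration, here is a quick check of the default dtypes (assuming a typical 64-bit platform):

```python
import numpy as np

# When no dtype is given, plain Python ints become int64
# and plain Python floats become float64.
print(np.asarray([[0, 128, 255]]).dtype)    # int64
print(np.asarray([[0.0, 0.5, 1.0]]).dtype)  # float64
```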
As I was continuing work on #76, I noticed some problems emerging from this abstraction. I think most computer-vision algorithms are written expecting 8- to 32-bit images (I really don't know of many images specified at 64-bit precision). I'll focus on one question which surprised me: "Why are so many models failing to perform an Otsu threshold?"
The following error is super common across many LLMs.
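Here is a minimal sketch that reproduces the failure (assuming OpenCV's cv2.threshold with THRESH_OTSU, which the failing answers typically call; the exact exception type varies across OpenCV versions):

```python
import cv2
import numpy as np

# np.asarray on a plain list of Python ints defaults to int64,
# a dtype OpenCV's Otsu threshold does not accept.
image = np.asarray([[0, 50, 200], [10, 150, 255]])

try:
    cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
except (cv2.error, TypeError) as e:
    # Depending on the OpenCV version, the 64-bit input is rejected
    # with cv2.error or TypeError.
    print("64-bit input rejected:", e)

# Converting to 8-bit first succeeds.
_, binary = cv2.threshold(image.astype(np.uint8), 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print(binary.dtype)  # uint8
```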
Essentially, these models fail because Otsu thresholding (at least OpenCV's implementation) only allows for 8- or 16-bit inputs. Perhaps we expect the LLMs to perform this type-checking, or perhaps we don't. But in my experience, it's uncommon for image analysts to work on 64-bit images, so perhaps we should avoid failing LLMs that don't assume this either.
My idea: for every test-case that generates a numpy array, specify a plausible data-type. I'd vote for unsigned 16-bit as the default. What do others think?
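Concretely, the change would be as small as adding an explicit dtype argument (the values here are illustrative):

```python
import numpy as np

# Current style: dtype silently defaults to 64 bits.
image = np.asarray([[0, 50, 200], [10, 150, 255]])

# Proposed style: unsigned 16-bit, which most computer-vision
# libraries accept out of the box.
image = np.asarray([[0, 50, 200], [10, 150, 255]], dtype=np.uint16)
print(image.dtype)  # uint16
```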