Datatype Conversion Bug #1

Closed
jmdelahanty opened this issue May 4, 2022 · 2 comments

@jmdelahanty

Gogolla Lab member @StoyoKaramihalev discovered a bug with the dask implementation of HOG calculations! The data written to disk turned out to be entirely zeros! Not very helpful for doing facial expression analysis.

When you looked at the output of the script, you ended up with things like this:
[image: failed_hog]

@ParticularMiner and I spent some time this evening hacking out what the cause could be.

It turns out that the reason for this was in the current version's code on line 194: The dtype was specified to be the same dtype as the gray_frames block.

We initially thought that since we were getting images out of the hog() function, we might as well stay consistent with the data type we were already using, but we did this without double-checking the dtype of the objects hog() returns! hog() returns float64 objects and the original images are uint8. For reasons I don't quite understand yet, converting from float64 to uint8 yields a value of 0 for all the data in the returned objects! Perhaps hog() returns values that are all decimals or something?
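A minimal sketch of why that conversion can produce all zeros, assuming (as guessed above) that the hog() values are all fractions below 1. The array here is a hypothetical stand-in, not real hog() output. NumPy's astype drops the fractional part when casting floats to an integer type, so any value in [0, 1) becomes 0:

```python
import numpy as np

# Hypothetical stand-in for hog() output: float64 values all below 1.0.
hog_like = np.array([0.0, 0.12, 0.5, 0.99], dtype=np.float64)

# Casting float64 to uint8 discards the fractional part (no rounding).
as_uint8 = hog_like.astype(np.uint8)
print(as_uint8)  # [0 0 0 0]

# Values of 1.0 or more keep their integer part, so this is truncation,
# not a wholesale zeroing of the data.
larger = np.array([1.7, 2.2], dtype=np.float64)
print(larger.astype(np.uint8))  # [1 2]
```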

If we instead specify the datatype to be float64 by stating dtype = first_hog_image.dtype, we retain the correct data type when all the downstream computations are performed by Dask!

If we specify dtype as the value from gray_frames, the dask array is declared as that type, and the returned values end up converted to it when they are written out. See here in the docs.

This was tested by @ParticularMiner through the use of compute() on a small test video. The values of the data were all correct before being written to disk via to_zarr! That clued us in that the problem was not in the make_hogs function itself but somewhere on the way to our to_zarr call. I compared the datatype of the written zarr with that of the compute() result @ParticularMiner found and noticed they were different! When we tried keeping the datatype as float64 from the output of make_hogs, everything worked like it was supposed to!
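Here is a small sketch of that mismatch, under some loud assumptions: fake_hog is a hypothetical stand-in for make_hogs, the array shapes are made up, and the real script may call dask differently. It shows that the dtype passed to map_blocks is only what the dask array declares; compute() still returns whatever the function actually produced. The final cast mimics what we assume happens when float64 blocks land in a store created with the declared uint8 dtype.

```python
import numpy as np
import dask.array as da


def fake_hog(block):
    # Hypothetical stand-in for make_hogs: float64 values below 1.0.
    return block.astype(np.float64) / 512.0


# Made-up "video": 4 uint8 frames of 8x8 pixels, 2 frames per chunk.
gray_frames = da.from_array(
    np.full((4, 8, 8), 200, dtype=np.uint8), chunks=(2, 8, 8)
)

# The bug: declare the output dtype to be the input's dtype (uint8).
wrong = da.map_blocks(fake_hog, gray_frames, dtype=gray_frames.dtype)

# The fix: take the dtype from a sample output of the function.
first_hog_image = fake_hog(np.zeros((1, 8, 8), dtype=np.uint8))
right = da.map_blocks(fake_hog, gray_frames, dtype=first_hog_image.dtype)

# Declared dtype vs. dtype of the computed result.
print(wrong.dtype, wrong.compute().dtype)
print(right.dtype, right.compute().dtype)

# Casting the computed values to the wrongly declared dtype zeroes them.
cast_like_store = wrong.compute().astype(wrong.dtype)
print(cast_like_store.any())
```

Comparing the declared dtype with the dtype of a small compute() result, as done above, is a cheap check to run before writing anything large to disk.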

Here's the kind of result you can get now:
[image: bacon]

@jmdelahanty jmdelahanty self-assigned this May 4, 2022
@jmdelahanty jmdelahanty added the bug Something isn't working label May 4, 2022
@ParticularMiner

Hi @jmdelahanty !

See @jakirkham’s suggestion here about how to broadcast ImageIOReader's priority to all processes. This will enable you to run the script quietly without the annoying intervening error messages appearing, as it did for me!

@jmdelahanty

I'll be testing this out for myself on some videos this weekend! I think it would be an excellent PR to the repo if you want to make it!
