Speed up mesoscope TIFF splitting #473

danielsf · 2022-03-23T18:12:23Z

The new mesoscope TIFF splitting code merged in #464 (ticket #460) appears to be a factor of ~ 2 slower than the legacy code. This is probably not a show-stopper, but it would be nice to speed it up.

The likeliest way to speed it up would be to refactor the TimeSeriesSplitter class defined here

https://github.com/AllenInstitute/ophys_etl_pipelines/blob/main/src/ophys_etl/modules/mesoscope_splitting/tiff_splitter.py#L372

so that, instead of generating the HDF5 file for one (ROI, z) pair at a time, it can just loop through all of the (ROI, z) pairs associated with the ophys_session, writing all of the expected videos at once. The advantage is that, instead of having to loop through the time series TIFF file once for each video (i.e. for each call of write_output_file), it could loop through the TIFF file once, writing frames to the appropriate video files.

Tasks

Replace TimeSeriesSplitter.write_output_file with a method that accepts a dict mapping (i_roi, z) pairs to output file paths and then loops through the TIFF file once, writing frames from the TIFF to the appropriate output paths).

Validation

Make sure unit tests still pass (the unittests in tests/modules/mesoscope_splitting_cli simulate different real world use cases and check the actual output against expected output, so if those tests pass, everything should still work).
Verify that refactoring TimeSeriesSplitter.write_output_file to loop through the time series TIFF once for the entire ophys_session would, in fact, be faster than the current implementation. @danielsf has some real datasets that can be used for this test. If the code is not faster, it is probably not worth merging this ticket.

The text was updated successfully, but these errors were encountered:

danielsf · 2022-03-31T23:04:13Z

this might also be a good opportunity to experiment with the HDF5 compression options mentioned in #389

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up mesoscope TIFF splitting #473

Speed up mesoscope TIFF splitting #473

danielsf commented Mar 23, 2022

danielsf commented Mar 31, 2022

Speed up mesoscope TIFF splitting #473

Speed up mesoscope TIFF splitting #473

Comments

danielsf commented Mar 23, 2022

danielsf commented Mar 31, 2022