
Write Pika module for denoising movies #374

danielsf opened this issue Nov 18, 2021 · 0 comments

The command line interface provided in AllenInstitute/deepinterpolation is too unstable to be used reliably for denoising videos in production. Among its deficiencies:

  1. Rather than reading the input movie into memory once at the start of inference, it reads data from the input HDF5 one chunk at a time, opening and closing a new file handle as needed.

https://github.com/AllenInstitute/deepinterpolation/blob/master/deepinterpolation/generator_collection.py#L934-L998

This can cause the job to fail with an OSError if the connection to the NFS system becomes stale while the job is running (as often happens on our SLURM cluster).

  2. Rather than storing output in a cache and periodically flushing it to disk as results build up in memory, the CLI opens a handle to the output file that must remain open for the entire compute job.

https://github.com/AllenInstitute/deepinterpolation/blob/master/deepinterpolation/generator_collection.py#L934-L998

This, again, can cause failures due to the slow connection between our NFS file system and our SLURM compute nodes.

  3. It relies on tensorflow's native parallelization, which leaves many CPUs idle during I/O. Inference could be sped up by reading in all of the data at once, intelligently chunking it, and farming the work for each chunk out to a different CPU (see the sketch after this list).
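Here is a minimal sketch of the pattern I have in mind, combining all three fixes. The `"data"` dataset key, the chunk size, and the `denoise_chunk` worker are all placeholders; a real worker would load the tensorflow model once per process and run inference on its chunk:

```python
import multiprocessing

import h5py
import numpy as np


def denoise_chunk(chunk: np.ndarray) -> np.ndarray:
    # placeholder: a real worker would load the tensorflow model once
    # per process and run inference on its chunk of frames
    return chunk.astype(np.float32)


def denoise_movie(input_path: str, output_path: str,
                  n_workers: int = 8, chunk_size: int = 512) -> None:
    # (1) read the whole movie with a single, short-lived file handle,
    # so a stale NFS mount cannot kill the job partway through
    with h5py.File(input_path, "r") as in_file:
        movie = in_file["data"][()]

    chunks = [movie[i:i + chunk_size]
              for i in range(0, movie.shape[0], chunk_size)]

    # (3) farm the chunks out to worker processes rather than relying
    # on tensorflow's internal threading to keep the CPUs busy
    with multiprocessing.Pool(n_workers) as pool:
        results = pool.map(denoise_chunk, chunks)

    # (2) accumulate results in memory and write them out in one pass;
    # no output handle stays open for the duration of the compute job
    with h5py.File(output_path, "w") as out_file:
        out_file.create_dataset("data", data=np.concatenate(results))
```

In a production version the results could be flushed to disk every N chunks instead of held until the end, but the key property is the same: every file handle is opened, used, and closed quickly.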

This branch of a fork of the deepinterpolation repository implements fixes to these problems:

https://github.com/danielsf/deepinterpolation/commits/danielsf/by_hand_parallelization

However, beyond all of this, AllenInstitute/deepinterpolation tries to support many use cases (fMRI, ecephys traces) that we do not need to support yet. Since, for denoising, all we need to do is take a precomputed tensorflow model and apply it to a 2-photon movie, I propose that Pika write our own denoising module to live in ophys_etl_pipelines. This will give us ownership of and control over the code and will decouple our pipeline from the still-rapid development going on in AllenInstitute/deepinterpolation.
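For scale, the core of such a module is small. The following is a rough sketch, not deepinterpolation's actual API: the `"data"` dataset key, the window layout (pre/post frames stacked along the channel axis), and the omission of deepinterpolation's mean/std normalization are all assumptions that would have to be checked against its generator code:

```python
import h5py
import numpy as np
import tensorflow as tf


def apply_model(model_path: str, movie_path: str, output_path: str,
                pre_frame: int = 30, post_frame: int = 30) -> None:
    # compile=False sidesteps any custom loss objects baked into the
    # HDF5 model file; we only need the forward pass for inference
    model = tf.keras.models.load_model(model_path, compile=False)

    with h5py.File(movie_path, "r") as in_file:
        movie = in_file["data"][()].astype(np.float32)

    # border frames that lack a full pre/post window are left as zeros
    denoised = np.zeros_like(movie)
    for i in range(pre_frame, movie.shape[0] - post_frame):
        # stack the surrounding frames (excluding frame i itself) into
        # a (1, height, width, pre_frame + post_frame) feature tensor
        window = np.concatenate([movie[i - pre_frame:i],
                                 movie[i + 1:i + 1 + post_frame]])
        features = np.moveaxis(window, 0, -1)[np.newaxis, ...]
        denoised[i] = model.predict(features, verbose=0)[0, ..., 0]

    with h5py.File(output_path, "w") as out_file:
        out_file.create_dataset("data", data=denoised)
```

Batching multiple windows per `predict` call and adding the chunked multiprocessing from the sketch above would be the obvious next steps.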

Tasks

  • Write an ophys_etl_pipelines module that accepts the HDF5-stored models trained with AllenInstitute/deepinterpolation and applies them to a motion-corrected 2-photon movie using tensorflow

Validation

  • Verify that the results of running inference with deepinterpolation's CLI and the new module are equivalent
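Something along these lines would cover the check (the file names and the `"data"` key are assumptions, and the border frames that neither path can denoise may need to be excluded from the comparison):

```python
import h5py
import numpy as np

with h5py.File("cli_output.h5", "r") as cli_file, \
        h5py.File("module_output.h5", "r") as new_file:
    cli_movie = cli_file["data"][()]
    new_movie = new_file["data"][()]

# tolerate float round-off between the two code paths
np.testing.assert_allclose(new_movie, cli_movie, rtol=1e-5, atol=1e-6)
```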