Improve top camera ID tracking using SLEAP on the quadrant cameras #440

Open
4 of 14 tasks
anayapouget opened this issue Oct 20, 2024 · 4 comments
@anayapouget

anayapouget commented Oct 20, 2024

After tests on Aeon 3 social 0.2, we found that a SLEAP ID model trained on the quadrant cameras performs better than the top-camera SLEAP ID model we currently use. This makes sense: the quadrant cameras are more zoomed in, which makes the difference between the tattooed and non-tattooed mice easier to pick up. Below is a comparison of their performance on unseen videos (i.e., videos neither model was trained on, specifically 2024-02-25T18-00-00 and 2024-02-28T15-00-00):

Top camera SLEAP ID model:
BAA-1104045 Accuracy: 0.917
BAA-1104047 Accuracy: 0.776
ID accuracy: 0.844
Total tracks: 33536
Tracks identified: 33513
Tracks correctly identified: 28286

Quadrant camera SLEAP ID model:
BAA-1104045 Accuracy: 0.965
BAA-1104047 Accuracy: 0.837
ID accuracy: 0.897
Total tracks: 33519
Tracks identified: 28051
Tracks correctly identified: 22523
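
For reference, a minimal sketch of how summary figures like the ones above could be computed from per-track labels. It assumes accuracy means the fraction of identified tracks that received the correct ID; the issue does not state how the reported numbers were calculated, and the function name `id_accuracy` and the input format are hypothetical.

```python
from collections import defaultdict


def id_accuracy(tracks):
    """Summarise ID performance.

    tracks: iterable of (true_id, predicted_id) pairs, where predicted_id
    is None when the model assigned no identity to that track.
    Assumption: accuracy = correct / identified (not stated in the issue).
    """
    total = identified = correct = 0
    per_id = defaultdict(lambda: [0, 0])  # true_id -> [correct, identified]
    for true_id, pred_id in tracks:
        total += 1
        if pred_id is None:
            continue  # unidentified tracks count towards the total only
        identified += 1
        per_id[true_id][1] += 1
        if pred_id == true_id:
            correct += 1
            per_id[true_id][0] += 1
    return {
        "total": total,
        "identified": identified,
        "correct": correct,
        "accuracy": correct / identified if identified else float("nan"),
        "per_id": {k: c / n for k, (c, n) in per_id.items()},
    }
```

Reporting "identified" separately from "total" matters here because the two models leave different numbers of tracks unidentified, so accuracy alone does not capture coverage.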

Maybe with some extra work on the model parameters we could make it even better? @lochhh it would be great to discuss this at some point if you have time!

As a result, we have decided to generate a new set of full pose ID data using quadrant camera SLEAP models for all arenas and social experiments. The steps are outlined below:

  • Calculate homography
  • Generate composite videos to be used for training and evaluating the SLEAP models.
    • We need to dynamically update the number of mice for the blob tracking so that we can avoid running SLEAP on the top cameras entirely.
  • Make SLEAP files and do manual labelling. If possible it would be super helpful if those who are familiar with SLEAP (just @jkbhagatio and @lochhh I think?) could assist in the labelling bit. Note that the individual sessions should already be labelled thanks to the Bonsai blob tracking, but still need to be checked because the transformation of the top camera coordinates to the quadrant cameras isn't exact and some points are off to the side of the mice.
  • Train the models using the same or similar set of parameters determined from my tests on Aeon 3 social 0.2.
  • Select frames from the composite videos set aside for evaluation, along with their corresponding top camera videos. Generate ground truth SLEAP files using the Bonsai blob tracking, and run inference on the frames using the newly trained quadrant camera ID models as well as the original top camera ID model. Evaluate their performance and compare.
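
As an illustration of the coordinate transformation mentioned in the labelling step, here is a minimal sketch of mapping points through a 3x3 homography. This is not the project's actual code: the function names are hypothetical, and the matrix in the usage note is a made-up scale-and-shift, not a calibrated homography.

```python
def apply_homography(H, points):
    """Map (x, y) points through a 3x3 homography H (nested lists, row-major)."""
    mapped = []
    for x, y in points:
        # Homogeneous coordinates: [u, v, w]^T = H @ [x, y, 1]^T
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if abs(w) < 1e-12:
            raise ValueError(f"point ({x}, {y}) maps to infinity")
        mapped.append((u / w, v / w))
    return mapped


def in_frame(point, width, height):
    """True if a mapped point lands inside a width x height image."""
    x, y = point
    return 0 <= x < width and 0 <= y < height
```

For example, with the hypothetical matrix `[[2, 0, -100], [0, 2, -50], [0, 0, 1]]`, the top camera point `(60, 40)` maps to `(20.0, 30.0)`. Checking `in_frame` on mapped labels is one way to flag points that need manual correction. Going from quadrant coordinates back to top camera coordinates would use the inverse matrix.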

If the performance of the quadrant camera ID model is consistently better than that of the top camera ID model as expected, continue on to the next steps.

  • Generate composite videos for the entirety of the social sessions. (COMPUTE HEAVY)
  • Create a new Bonsai workflow. Aeon 4 social 0.2 2024-02-17 12:00:00 is an example chunk that can be used for testing.
    It will need to:
    • Take in as inputs: the path to a composite video (and/or to the corresponding top camera video? We will need both, so either both are provided or one is inferred from the other), the path to the csv that relates the composite video frames to the top camera video frames (again, this could maybe be inferred), the path to the relevant quadrant camera ID model, the path to my top camera full pose model, the path to the relevant homographies, and the path to the output directory.
    • Run full pose inference on the top camera, one frame at a time.
    • Run ID inference on the corresponding composite camera frames (it can be 1 or 2, depending on whether or not both mice are in the same quadrant camera frame).
    • Convert the quadrant camera ID coordinates to top camera coordinates using the homographies. If no mice are detected for a frame or a quadrant camera frame is missing (because in some epochs some quadrant cameras failed to start up), the ID can possibly be pulled from the existing top camera full pose ID data.
    • Match the identities to the poses (we can re-use my existing C# code for this).
    • Save the pose data with the matched identities to a binary file in the output directory.
  • Create a Python file that generates the correct SLURM scripts to run the Bonsai workflow (we can re-use a lot of the code from my existing social Bonsai SLEAP Python file) and complete inference on the entirety of the social sessions. (COMPUTE HEAVY)
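
To make the identity-to-pose matching step concrete, here is a rough Python sketch. It is not the existing C# code mentioned above, and its approach is an assumption: each identity (already converted to top camera coordinates) is assigned to the pose whose keypoint centroid is closest, choosing the assignment with the smallest total distance. All names are hypothetical. Brute force is used because only a handful of mice are involved.

```python
from itertools import combinations, permutations
from math import hypot


def centroid(keypoints):
    """Mean (x, y) of a pose's keypoints."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def match_ids_to_poses(id_points, poses):
    """Assign identities to poses by minimum total centroid distance.

    id_points: dict mapping identity -> (x, y) in top camera coordinates.
    poses: list of poses, each a non-empty list of (x, y) keypoints.
    Returns a dict mapping pose index -> identity. If there are fewer
    identities than poses (or vice versa), the extras stay unmatched.
    """
    ids = list(id_points)
    centers = [centroid(p) for p in poses]
    k = min(len(ids), len(centers))
    best, best_cost = {}, float("inf")
    # Enumerate every one-to-one matching of size k.
    for pose_subset in combinations(range(len(centers)), k):
        for id_subset in permutations(ids, k):
            cost = sum(
                hypot(centers[p][0] - id_points[i][0],
                      centers[p][1] - id_points[i][1])
                for p, i in zip(pose_subset, id_subset)
            )
            if cost < best_cost:
                best_cost = cost
                best = dict(zip(pose_subset, id_subset))
    return best
```

Minimising the total distance, rather than matching each identity greedily, avoids two identities claiming the same pose when the mice are close together.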
@glopesdev
Contributor

glopesdev commented Oct 20, 2024

@anayapouget this looks great, happy to help with this. As a side note, to investigate whether we could potentially even run this online: have you ever tried running the distributed version, where you run a SLEAP model on each camera separately and then stitch the resulting tracks (as opposed to stitching the video)?

That would be amenable to GPU parallelism, and we could even try running the entire batch of 4 cameras in a single GPU call.

@lochhh
Contributor

lochhh commented Oct 21, 2024

Thanks for putting this together @anayapouget !

> Maybe with some extra work on the model parameters we could make it even better? @lochhh it would be great to discuss this at some point if you have time!

Yep. Automated hyperparameter tuning (e.g. Optuna, Ray tune) is something we can look into as well.

> Make SLEAP files and do manual labelling. If possible it would be super helpful if those who are familiar with SLEAP (just @jkbhagatio and @lochhh I think?) could assist in the labelling bit. Note that the individual sessions should already be labelled thanks to the Bonsai blob tracking, but still need to be checked because the transformation of the top camera coordinates to the quadrant cameras isn't exact and some points are off to the side of the mice.

We should be able to use existing CameraTop ID models to automatically label social session frames, which we can proofread, and then apply the same CameraTop-to-Quad transformation - this will be easier than manual labelling.

@anayapouget
Author

Yes @lochhh good point - I was planning on modifying the code for generating the labelled SLEAP files to use the DJ full pose ID SLEAP data soon. However, we have hit some setbacks in generating the composite videos (#442), which we'll have to fix before moving forward. In the meantime, we can already test automated hyperparameter tuning on my Aeon 3 social 02 dataset if you'd like? That definitely sounds like it would be good to explore!

@lochhh
Contributor

lochhh commented Nov 20, 2024

@anayapouget:

I wouldn't use Aeon 3 social 02, because that was the first one I did and it's not done in exactly the same way as the others. You can use Aeon 3 social 03 or 04 though, and Aeon 4 social 02 - those are all good.


4 participants