Quality check not being completed correctly on images #588

Open
AntoniaR opened this issue Jul 29, 2021 · 4 comments

@AntoniaR
Contributor

RMS quality checks that take into account the distribution of image rms values in the dataset have been implemented: a Gaussian function is fitted to the rms distribution obtained from a specified number of images (given by the rms_est_history option in the job parameters file).

rms_est_history = 100 ; how many images used for calculating rms histogram

This was to address Issue #512
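
As a rough illustration of the check described above, the sketch below fits a Gaussian to a list of rms values and tests whether a new image falls within a sigma threshold. It takes the mean and standard deviation of the values as the fit, which is an assumption; the fitting in TraP may be done differently (for example on a binned histogram). The function names and example values are hypothetical.

```python
import statistics


def fit_rms_gaussian(rms_values):
    """Return (mean, sigma) of a Gaussian fitted to the rms values.

    Assumption: the maximum-likelihood fit (mean and population standard
    deviation) stands in for whatever fitting routine TraP uses.
    """
    if len(rms_values) < 2:
        raise ValueError("need at least two rms values to fit a Gaussian")
    return statistics.fmean(rms_values), statistics.pstdev(rms_values)


def within_sigma(rms, mean, sigma, rej_sigma):
    """True if rms lies within rej_sigma standard deviations of the mean."""
    return abs(rms - mean) <= rej_sigma * sigma


# Hypothetical rms values from previously processed images.
history = [1.0, 1.1, 0.9, 1.05, 0.95]
mean, sigma = fit_rms_gaussian(history)
print(within_sigma(1.02, mean, sigma, rej_sigma=3))  # a typical image
print(within_sigma(5.0, mean, sigma, rej_sigma=3))   # a much noisier image
```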

However, while testing the implementation of a new parameter to threshold the fitted rms values, it became clear that no quality checks are conducted until the number of images reaches the value of the rms_est_history parameter. This is not correct at all for batch mode and is only partially correct for streaming mode.

Batch mode:

  • all images should be used for the histogram
  • all images should have all of the quality checks conducted on them
  • the rms_est_max/rms_est_min check should be for all images
  • the sigma thresholds from the histogram fit should be used for all images
  • the beam parameters should be tested for all images

Streaming mode:

  • the first x images should be used for the histogram, where x is the value of the rms_est_history parameter
  • the rms_est_max/rms_est_min check should be for all images
  • the sigma thresholds from the histogram fit should be used after the first x images
  • the beam parameters should be tested for all images

This issue requires some redesign of the quality check implementation. Namely, the histogram-fitted thresholds need separate versions for batch and streaming mode. In addition, the checks that do not depend on the histogram (rms_est_max/rms_est_min and the beam parameters) should be separated out so that they run on all images.
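
One possible way to structure this separation is sketched below: the checks that apply to every image live in one function, and the histogram-based check is a separate step that can only reject once fit parameters are available. All function names, the reject-reason strings and the `beam_ok` flag are hypothetical; the real beam test in TraP is not reproduced here.

```python
def basic_checks(rms, rms_est_min, rms_est_max, beam_ok):
    """Checks that apply to every image; returns a reject reason or None.

    `beam_ok` is a placeholder for the outcome of the beam parameter test.
    """
    if not rms_est_min <= rms <= rms_est_max:
        return "rms outside rms_est_min/rms_est_max"
    if not beam_ok:
        return "beam parameters failed"
    return None


def histogram_check(rms, fit, rms_rej_sigma):
    """Sigma-threshold check; `fit` is (mean, sigma), or None if no fit yet."""
    if fit is None:
        return None  # no fit available yet, so this check cannot reject
    mean, sigma = fit
    if abs(rms - mean) > rms_rej_sigma * sigma:
        return "rms outside fitted sigma threshold"
    return None


def quality_check(rms, beam_ok, fit, rms_est_min, rms_est_max, rms_rej_sigma):
    """Run the basic checks first, then the histogram check."""
    return (basic_checks(rms, rms_est_min, rms_est_max, beam_ok)
            or histogram_check(rms, fit, rms_rej_sigma))
```

With this split, an image that arrives before any fit exists is still tested against rms_est_min/rms_est_max and the beam parameters.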

@AntoniaR
Contributor Author

AntoniaR commented Aug 2, 2021

The beam parameters and rms_est_max and rms_est_min checks are now incorporated for all images on branch Issue588 https://github.com/transientskp/tkp/tree/Issue588

The histogram fit is correct for streaming data but needs to be adapted for batch data. For batch data we need to measure all of the rms values, fit them, and then reassess all of the images to reject those outside the allowed range.

Current situation:

  • if n_images >= rms_est_history then
  • fit a gaussian to the rms values of the last n images
  • reject image if outside of the sigma threshold given by rms_rej_sigma

Requirement for batch mode:

  • fit a gaussian to the rms values of all the images
  • reject all images outside of the sigma threshold given by rms_rej_sigma
    This probably needs to happen after all of the other quality checks are complete, with the reject reason in the database then updated as required.
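
The batch requirement amounts to a two-pass procedure, sketched below under simplifying assumptions: rms values are held in a plain dictionary keyed by image id, the Gaussian fit is taken to be the mean and standard deviation, and the returned ids stand in for the database update of the reject reason. None of these names come from TraP.

```python
import statistics


def batch_rms_rejects(rms_by_image, rms_rej_sigma):
    """Return the ids of images whose rms lies outside the fitted threshold.

    Pass 1 fits a Gaussian (mean and standard deviation, an assumption)
    to the rms values of all images; pass 2 reassesses every image.
    """
    values = list(rms_by_image.values())
    if len(values) < 2:
        return []  # too few images to fit a distribution
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [image_id for image_id, rms in rms_by_image.items()
            if abs(rms - mean) > rms_rej_sigma * sigma]


# Hypothetical dataset: nine typical images and one much noisier one.
rms_by_image = {i: 1.0 + 0.01 * i for i in range(9)}
rms_by_image[9] = 10.0
rejected = batch_rms_rejects(rms_by_image, rms_rej_sigma=2)
print(rejected)
```

Because the outlier is itself part of the fit, it inflates the fitted sigma; with only a few images a large rms_rej_sigma may therefore never reject anything.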

n.b. the streaming mode is likely to be inefficient and to need speeding up. I propose that we don't refit for every image, but only refit every n images, where n is the rms_est_history number.
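
A sketch of this proposal, with hypothetical class and attribute names: rms values are kept in a bounded history, no sigma rejection happens until the history is full, and the Gaussian is refitted only once every `rms_est_history` accepted images rather than for every image. The fit is again assumed to be a simple mean and standard deviation, and leaving rejected images out of the history is also an assumption.

```python
import statistics
from collections import deque


class StreamingRmsCheck:
    """Sigma-threshold check that refits only every `rms_est_history` images."""

    def __init__(self, rms_est_history, rms_rej_sigma):
        self.n = rms_est_history
        self.rej_sigma = rms_rej_sigma
        self.history = deque(maxlen=rms_est_history)  # last n accepted rms values
        self.since_fit = 0  # accepted images seen since the last fit
        self.fit = None     # (mean, sigma) once a fit exists
        self.n_fits = 0     # counter, only here to show how often we refit

    def accept(self, rms):
        """Return True if the image passes the sigma-threshold check."""
        ok = True
        if self.fit is not None:
            mean, sigma = self.fit
            ok = abs(rms - mean) <= self.rej_sigma * sigma
        if ok:
            # Assumption: rejected images are not added to the history.
            self.history.append(rms)
            self.since_fit += 1
        history_full = len(self.history) == self.n
        if history_full and (self.fit is None or self.since_fit >= self.n):
            self.fit = (statistics.fmean(self.history),
                        statistics.pstdev(self.history))
            self.since_fit = 0
            self.n_fits += 1
        return ok


check = StreamingRmsCheck(rms_est_history=4, rms_rej_sigma=3)
stream = [1.0, 1.1, 0.9, 1.0, 1.05, 9.0, 0.95, 1.0, 1.02]
results = [check.accept(rms) for rms in stream]
print(results, check.n_fits)
```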

@AntoniaR
Contributor Author

AntoniaR commented Aug 2, 2021

Doing the Gaussian fit on all the images will require changing the current logic flow in TraP for batch mode. The current setup works fine for streaming mode.

Current setup:

  • The dataset is split into individual time steps.
  • For each timestep the processing conducts: 1. get images, 2. store images (if used), 3. quality control filtering, 4. source extraction, 5. source association, 6. forced fitting, 7. update variability metrics

Setup needed for batch mode:

  • All images are loaded and checked, either per timestep or in whatever order is most efficient, i.e. 1. get images, 2. store images (if used), 3. basic quality control filtering (n.b. need to remove the historical rms part)
  • Fit Gaussian to rms values for all images (as a function of their observing band) and conduct rejection based on this. This is step 4.
  • For each timestep now run the following: 5. source extraction, 6. source association, 7. forced fitting, 8. update variability metrics
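
The proposed batch flow from step 4 onwards could look roughly like the sketch below. It assumes steps 1 to 3 have already run and that each image is a dictionary with hypothetical keys `id`, `band`, `timestep` and `rms`; `process_timestep` is a placeholder callable for steps 5 to 8, and the Gaussian fit is taken to be the mean and standard deviation per band.

```python
import statistics
from collections import defaultdict


def reject_by_band(images, rms_rej_sigma):
    """Step 4: fit a Gaussian per observing band and return ids to reject."""
    by_band = defaultdict(list)
    for image in images:
        by_band[image["band"]].append(image)

    rejected = set()
    for band_images in by_band.values():
        values = [image["rms"] for image in band_images]
        if len(values) < 2:
            continue  # cannot fit a distribution to a single image
        mean = statistics.fmean(values)
        sigma = statistics.pstdev(values)
        for image in band_images:
            if abs(image["rms"] - mean) > rms_rej_sigma * sigma:
                rejected.add(image["id"])
    return rejected


def run_batch(images, rms_rej_sigma, process_timestep):
    """Steps 4 to 8: reject on the per-band fit, then process each timestep."""
    rejected = reject_by_band(images, rms_rej_sigma)
    by_timestep = defaultdict(list)
    for image in images:
        if image["id"] not in rejected:
            by_timestep[image["timestep"]].append(image)
    for timestep in sorted(by_timestep):
        process_timestep(timestep, by_timestep[timestep])
    return rejected
```

Fitting per band keeps images from a noisier band from being rejected simply because another band has lower rms values.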

As this is a change in the main logic of TraP, I propose we leave it for now and just ensure that the basic quality control steps are included for all images.

@AntoniaR
Contributor Author

AntoniaR commented Aug 2, 2021

This issue is partially resolved by pull request #591, but there remains a problem with the historical rms check in batch mode.

@AntoniaR
Contributor Author

This code is currently very inefficient and not working correctly. This is a very useful option for R7 so we should assess how easy it would be to implement correctly.
