Improvements to dealing with masked/nonfinite data in Specreduce operations #192

Closed
cshanahan1 opened this issue Oct 26, 2023 · 3 comments · Fixed by #216

Comments

@cshanahan1
Contributor

cshanahan1 commented Oct 26, 2023

This issue is meant to be a catchall issue to summarize meetings and internal tickets at Space Telescope on the topic of improving masks / treatment of NaNs in data across the package. Link any related issues here, and the floor is open for discussion on this!

Current issue(s) with masking

  • In both Horne and Boxcar extract, 2D masks are allowed as input on NDData. However, these masks are then collapsed to 1D, and an entire column is excluded if it contains even one NaN. This has been creating issues for JWST data, which are littered with NaNs, so there should be an option to handle this and avoid holes in extractions/traces. (Improper use of mask in HorneExtract #167)

  • In FitTrace, fits to individual columns within a bin filter out masked values, and fall back to the 'all-bin fit' when a bin is fully masked (which is a different default behavior than Extract). This can produce strange results, and would benefit from additional masking options (interpolation, or setting to 0).
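
To make the first point concrete, here is a minimal NumPy sketch of what collapsing a 2D mask to 1D does. It is an illustration only, not specreduce's internal code; the array shape and the bad-pixel positions are made up.

```python
import numpy as np

# Made-up 5x6 cutout: rows are cross-dispersion, columns are dispersion.
image = np.ones((5, 6))
image[2, 1] = np.nan   # one bad pixel in column 1
image[0, 4] = np.nan   # one bad pixel in column 4

mask_2d = ~np.isfinite(image)    # True marks a bad pixel
mask_1d = mask_2d.any(axis=0)    # a column is flagged if any pixel in it is bad

bad_pixel_fraction = mask_2d.mean()       # 2 of 30 pixels
dropped_column_fraction = mask_1d.mean()  # 2 of 6 columns
```

Two bad pixels out of thirty remove a third of the columns here, which is the kind of hole described above for NaN-heavy data.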

Options for treating NaNs

In all operations where masking is relevant (trace / extract at least), provide options for treatment of NaNs:

  1. When possible, filter nonfinite values before computation (e.g. in FitTrace).
  2. Omit columns with nonfinite values when the mask is collapsed from 2D to 1D (current behavior for Extract).
  3. Have a fill value of 0 for nonfinite values.
  4. Interpolate between good values.
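
As a rough sketch of how the four treatments differ, the helper below applies each one to a small array. `treat_nonfinite` is a hypothetical name, not a specreduce function, and interpolating along the cross-dispersion axis (axis 0) is an assumption; the list above does not say which axis to use.

```python
import numpy as np

def treat_nonfinite(image, treatment):
    """Hypothetical helper: return (data, mask), where True in mask means 'exclude'."""
    data = np.array(image, dtype=float)
    bad = ~np.isfinite(data)
    if treatment == "filter":
        # Keep the per-pixel mask so later steps skip only the bad pixels.
        return data, bad
    if treatment == "omit":
        # Exclude every column that contains at least one bad pixel.
        column_bad = bad.any(axis=0)
        return data, np.broadcast_to(column_bad, data.shape).copy()
    if treatment == "zero-fill":
        data[bad] = 0.0
        return data, np.zeros_like(bad)
    if treatment == "interpolate":
        # Assumption: linear interpolation along each column (axis 0).
        rows = np.arange(data.shape[0])
        for j in range(data.shape[1]):
            good = ~bad[:, j]
            if good.any():
                data[:, j] = np.interp(rows, rows[good], data[good, j])
        # Columns with no good pixels stay nonfinite and remain masked.
        return data, ~np.isfinite(data)
    raise ValueError(f"unknown treatment: {treatment!r}")

example = np.array([[1.0, 2.0], [np.nan, 4.0], [3.0, 6.0]])
filtered, filter_mask = treat_nonfinite(example, "filter")
omitted, omit_mask = treat_nonfinite(example, "omit")
zeroed, zero_mask = treat_nonfinite(example, "zero-fill")
interped, interp_mask = treat_nonfinite(example, "interpolate")
```

Returning a mask alongside the data keeps the four options comparable: 'filter' and 'omit' change only the mask, while 'zero-fill' and 'interpolate' change the data and clear the mask wherever a value could be filled.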

Proposal:

  • New arg mask_treatment on all operations (at least extractions, fit trace)
  • options = ['filter', 'omit', 'interpolate', 'zero-fill'], set to either 'filter' or 'omit' to maintain current behavior as the default for each operation
    extract = specreduce.extract.HorneExtract(image-bg, trace, variance=var_array, mask_treatment='zero-fill')

@tepickering
Contributor

my take is that tracing and extraction are different use cases that should handle NaN's differently, or at least use different defaults. for extraction, being conservative and throwing out any column with any NaN's is an appropriate default. otherwise columns with less valid data can create artificial absorption features. doing anything other than that at best complicates determining uncertainties.
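
A small illustration of the artificial-absorption point, using a made-up flat image and a plain column sum as a stand-in for a boxcar extraction (this is not specreduce code):

```python
import numpy as np

# Made-up flat source: every column of this 5x6 cutout should sum to 10.
image = np.full((5, 6), 2.0)
image[1:3, 3] = np.nan   # two bad pixels in column 3

# Summing only the valid pixels makes column 3 look fainter than it is.
lenient = np.nansum(image, axis=0)

# Conservative default: a column with any bad pixel yields no value at all.
column_ok = np.isfinite(image).all(axis=0)
conservative = np.where(column_ok, image.sum(axis=0), np.nan)
```

Here `lenient` drops from 10 to 6 in column 3, which would read as an absorption feature in the extracted spectrum, while `conservative` leaves a gap at that column instead.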

tracing, however, is often defined independent of any given data. FlatTrace being one example. future examples will include edge detection in flat-field images for multi-slit or multi-order data.

FitTrace is kind of a special case that uses the data itself. since it's a process that already involves interpolation/extrapolation, it can be a lot more lenient in how it handles masked data. either interpolating NaNs or using np.nansum to bin along the dispersion axis seem appropriate.
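
A minimal sketch of the np.nansum binning idea, assuming rows are the cross-dispersion axis and columns the dispersion axis. The image, the bin count, and the use of `argmax` as a stand-in for a real peak fit are all illustrative choices, not FitTrace's implementation.

```python
import numpy as np

# Made-up image with a trace along row 2 and one NaN on the trace.
image = np.zeros((5, 8))
image[2, :] = 10.0
image[2, 3] = np.nan

# Split the dispersion axis into bins and sum each bin, ignoring NaNs.
n_bins = 2
bins = np.array_split(np.arange(image.shape[1]), n_bins)
profiles = np.stack(
    [np.nansum(image[:, idx], axis=1) for idx in bins], axis=1
)

# Peak row of each binned cross-dispersion profile.
peak_rows = profiles.argmax(axis=0)

# For comparison, a plain sum propagates the NaN into the first bin's profile.
plain_first_bin = image[:, bins[0]].sum(axis=1)
```

The bin containing the NaN has a lower summed flux (30 instead of 40), but the peak still lands on row 2, which is why this is more forgiving for tracing than for extraction.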

i should also note that saturated values probably shouldn't always be masked as NaN's. they are numbers and you know the lower limit to their actual values. so there is information there that can be used up to a point. in the case of FitTrace, including saturated values when centroiding can often lead to better results than leaving them masked.

@cshanahan1
Contributor Author

I am currently working on a PR to add a new argument, mask_treatment, to all specreduce operations, with two implemented options: 'omit' and 'zero-fill'. A follow-up effort to add an 'interpolate' option will be next. Thoughts?

cshanahan1 changed the title from "Improvements to dealing with masks/NaNs in Specreduce operations" to "Improvements to dealing with masked/nonfinite data in Specreduce operations" on Jan 9, 2024