Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference pixels set to DO_NOT_USE in DQ is fragile #7015

Open
jdavies-st opened this issue Sep 2, 2022 · 19 comments · May be fixed by #7017
Open

Reference pixels set to DO_NOT_USE in DQ is fragile #7015

jdavies-st opened this issue Sep 2, 2022 · 19 comments · May be fixed by #7017

Comments

@jdavies-st
Copy link
Collaborator

Currently, the way that reference pixels get set to DO_NOT_USE in calwebb_detector1 is by relying on them being set to zero in the GAIN reference file. If they are set to zero there (and there's not a strong reason they should be), then ramp_fit flags any gain values <0 or NaN as NO_GAIN_VALUE and DO_NOT_USE here

https://github.com/spacetelescope/stcal/blob/8e00460560eea757f6959ef17b5877583bc60de8/src/stcal/ramp_fitting/utils.py#L981-L1017

If they are not set to zero in the GAIN reference file, as was the case briefly with NIRCam gain reference files this week, then the reference pixels never get set to DO_NOT_USE, ramp_fitting fits them, and they end up in the output _i2d mosaic produced by calwebb_image3, producing unwanted artifacts. See right side below.

Screen Shot 2022-09-01 at 12 00 15

Setting GAIN reffile ref pix to zero so that they get flagged as DO_NOT_USE in ramp_fitting is a fragile and roundabout way to get that job done.

@hbushouse is there a good reason to not set the reference pixels DQ values to DO_NOT_USE at the end of the refpix step? They've been corrected and now can be ignored from here on out.

@hbushouse
Copy link
Collaborator

This is another one of those little details who's reasons are lost to the mists of time (at least it's pretty misty in my mind). I recall at least one Cal WG meeting in which the question was asked whether we should bother doing ramp fitting to reference pixels and the answer was along the lines of "Sure, why not; you never know what kind of interesting info that may turn up. The slopes don't get used for anything later anyway, so what can it hurt?" So they were not flagged and left as "normal" pixels.

And then of course some years later it was discovered that they were not getting removed when doing mosaics and I'm sure we implemented some kind of fix back then (when it was discovered the first time). Will have to dig around to see what the method was back then. Perhaps that's when we added the "REFERENCE PIXEL" DQ flag label and assumed that steps like resample would reject based on that flag alone (i.e. not also requiring DO_NOT_USE). Will need to go back and try to find the history of this. I'm 99% sure we supposedly fixed this already.

@nden
Copy link
Collaborator

nden commented Sep 2, 2022

That's my recollection too. It would be good to start a kind of algorithms subsection to each step in RTD to document things like this so the knowledge is not lost.

@stscirij
Copy link
Contributor

stscirij commented Sep 2, 2022

I vaguely remember the addition of the REFERENCE_PIXELS DQ bit was to make the reference pixel processing more straightforward, but it could certainly be used to exclude regions from subsequent processing. Perhaps the NON_SCIENCE DQ bit would be more appropriate to set after the refpix calculations are done?

@hbushouse
Copy link
Collaborator

But, but, but ... here in the resample step it sure looks to me like pixels flagged with REFERENCE_PIXEL are set to be rejected: https://github.com/spacetelescope/jwst/blob/master/jwst/resample/resample.py#L284

Is this somehow not working as intended?

@hbushouse
Copy link
Collaborator

But at the same time the top level resample_step module has https://github.com/spacetelescope/jwst/blob/master/jwst/resample/resample_step.py#L20 which does not pay attention to REFERENCE_PIXEL and gets stored in the kwargs passed around to lower-level routines: https://github.com/spacetelescope/jwst/blob/master/jwst/resample/resample_step.py#L212

So perhaps what we have here is failure to communicate between the various levels of the resample step?

@jdavies-st
Copy link
Collaborator Author

jdavies-st commented Sep 5, 2022

I think the thing to remember is that DO_NOT_USE needs to be set if the pixels are to not be used in subsequent (or the current) step, previous hacks aside. All the other higher bits are informational only. So we can set many bits, but in the end, resample should be using all the bits except DO_NOT_USE. This is the design architecture at least. The question then becomes which is the best place to set the reference pixels to DO_NOT_USE. At the end of refpix? Or later? ramp_fitting? Later still?

Of course resample has special problems because the code was pulled from drizzlepac which has a very different bit usage architecture, and it's never fully transitioned to the correct jwst one. I.e. the whole goodbits architecture. It should be jettisoned to be consistent with the rest of the pipeline. There used to be an open issue on this which I cannot find that I know @stscieisenhamer worked on.

The behavior up until this past week for the pipeline for NIRCam was that the GAIN reffile had no gain value in the reference pixels, and therefore there was no ramp fit in those areas - they were set to zero and flagged as DO_NOT_USE and NO_GAIN_VALUE. The newest GAIN reffiles that were delivered Friday, the current bestrefs, also have this feature. But it is fragile. Hence the current issue. And other instruments, such as MIRI do not set reference pixels in the GAIN reffile to zero. So for those, reference pixels will show up in the output mosaics.

@tapastro
Copy link
Contributor

tapastro commented Aug 7, 2024

@stscijgbot-jp

1 similar comment
@nden
Copy link
Collaborator

nden commented Aug 7, 2024

@stscijgbot-jp

@stscijgbot-jp
Copy link
Collaborator

This issue is tracked on JIRA as JP-3703.

@stscijgbot-jp
Copy link
Collaborator

Comment by Melanie Clarke on JIRA:

Thanks, Nadia Dencheva and Tyler Pauly!

David Law - I'm reviving this issue originally filed on GitHub a couple years ago because I think it might still be relevant.  

It looks to me like most of the time, reference pixels are set to DO_NOT_USE by the end of detector1, but it might be a good idea to explicitly make sure they are NaN/DO_NOT_USE after the reference pixel step so they don't accidentally get used for science.

There is currently an option for the reference_pixel step that allows preserving reference pixel values specifically for NIRSpec IRS2 data (preserve_irs2_refpix), for calibration purposes.  If desired, we could modify that argument to be preserve_refpix, and set reference pixels for all instruments to DO_NOT_USE only if it is set to False.

What do you think?

@jdavies-st
Copy link
Collaborator Author

I don't think setting the reference pixels to NaN during or at the end of refpix is needed. I don't see the purpose or use case, and that is not done currently. Only setting them to DO_NOT_USE is important, so that for ramp_fit, the result will be NaN if all groups are DO_NOT_USE. The result would be the same as currently done, but also allowing IRS2 data to preserve the values in the reference pixels.

@melanieclarke
Copy link
Collaborator

There is a separate effort (JP-3570, #8345) to ensure that whenever DO_NOT_USE is set, the data is also set to NaN. This is intended to help data integrity and consistency, and reduce similar fragility issues across the code base.

This is why I suggested adapting the optional argument to preserve the reference pixels if desired -- that way you can have the values if you need them for calibration or diagnostic purposes, but they won't be there to interfere with science by default.

@stscijgbot-jp
Copy link
Collaborator

Comment by Howard Bushouse on JIRA:

In a JWST Calibration meeting about 10 years ago, it was suggested that reference pixels retain their values all the way through processing and even be processed in the ramp fit step, based on the assertion that "you never know what kind of interesting or useful info they might provide". I don't know if they're processed (i.e. have slopes computed) in the ramp fit step, but I agree that they should not have their values wiped out after being used. DQ flags were implemented for a purpose, which is to be used in all situations and at all times. We shouldn't have to set pixel values to NaN in order to reduce the risk of bad data getting used. Software that pays attention to DQ flags should do that for us. Getting users to pay attention to DQ flags, on the other hand, is out of our control.

@stscijgbot-jp
Copy link
Collaborator

Comment by Howard Bushouse on JIRA:

Another option would be to take the approach used in other STScI cal pipelines for IR data (calnica, calwf3), which is to strip off the reference pixels from calibrated images once they've been used. In this case that would imply having them stripped away in rate/rateints files, which is already done for NIRSpec IRS2 data in terms of removing the extra interleaved ref pixels.

@stscijgbot-jp
Copy link
Collaborator

Comment by David Law on JIRA:

Melanie Clarke Picking up this thread.

I'm not sure what instrument James was using above (NIRCam?) but for MIRI at least the rate files contain finite values with DQ=REFERENCE_PIXEL.  The flatfield indicates that these have no valid flat, setting NO_FLAT_FIELD, NON_SCIENCE, and DO_NOT_USE, ensuring that they don't get further in the pipeline.

The specific issue to address really seems to be the ability of reference pixels to propagate to science data products.  The breakpoint to my mind in where it makes sense to propagate these any further is the flatfield step, after which they no longer have much if any meaning.  One potential solution may therefore be for the flatfield step to set DO_NOT_USE for anything with DQ=REFERENCE_PIXEL.  Most of the time this shouldn't make any difference since REFERENCE_PIXEL is typically also flagged as DO_NOT_USE in the flat reference files.

We can bounce off the teams at a JP call.

@stscijgbot-jp
Copy link
Collaborator

Comment by David Law on JIRA:

Melanie Clarke Digging into this a little further after the JP meeting, I'm no longer sure whether we actually need to change anything.

For NIRSpec IRS2 data, it looks like irs2_subtract_refpix explicitly strips out reference pixels if preserve_refpix=False, so it should be impossible for them to propagate.

For all other instruments and modes, the base class in reference_pixels calls get_pixeldq which sets DO_NOT_USE for all reference pixels.  These would then be caught by the new code in the flatfield step setting SCI=NaN for things with DQ=DO_NOT_USE.

As such I don't see a path for things to propagate accidentally any longer, and thus no need to make any changes?  Let me know if I'm missing something.

@stscijgbot-jp
Copy link
Collaborator

Comment by Melanie Clarke on JIRA:

David Law - it's been a minute since I thought about this issue, so I'll need to refresh my memory on why I thought this was still a potential problem.  I'll look into it in more detail, but a couple quick responses now...

For IRS2, the interleaved reference pixels are stripped by default, but the border pixels are left in place.  I don't know for sure if it's possible for them to have a non-empty value, though.  I'll look into it.

For the other instruments and modes, it looks like get_pixeldq sets DO_NOT_USE only for subarray data. For full frame data, the pixeldq array is copied as is.

@stscijgbot-jp
Copy link
Collaborator

Comment by Melanie Clarke on JIRA:

I reduced a variety of uncal files from the jwst regression test data set with detector1, saving the refpix and ramp fit products to survey what flags and values the reference pixels get after those pipeline steps.  I attached a quick table with a few different data points: number of reference pixels with DO_NOT_USE set, flags for the first pixel in the array (at 0,0 - always a border reference pixel for the files I checked), and the value at that pixel, in both refpix and rate files.

Results vary a bit by mode, but none of them have DO_NOT_USE set for all reference pixels after the refpix step.  Most modes have DO_NOT_USE set for all reference pixels after the ramp fit step, but not all.  All modes that have DO_NOT_USE set for the reference pixels also have those values set to NaN in the rate file.

In particular, for the NIRCam image jw01538046001_03105_00001_nrcalong, it looks like it is still true that the DO_NOT_USE flag is set for the reference pixels only because the NO_GAIN_VALUE flag is also set.

In short, I think the inconsistency by mode shown here is still a potential concern.  I don't know of any existing problems with reference pixels leaking into science calculations with current reference files, but it would be less fragile to explicitly exclude them somewhere along the line.  The flat field step is fine by me, if some modes like having reference pixel values in the rate files.

@stscijgbot-jp
Copy link
Collaborator

Comment by David Law on JIRA:

Got it- thanks for that example, that's helpful.  Can see now that after the refpix step only the NO_SAT_CHECK and REFERENCE_PIXEL flags are set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants