
JP-3757: Use squared drizzle coeffs for variance and error arrays #8997

Open · mcara wants to merge 5 commits into main from err-propagation-resample

Conversation

@mcara (Member) commented Dec 9, 2024

Resolves JP-3757

This changes the way variance and error arrays in the output (resampled) images are computed: they are now resampled directly using squared drizzle coefficients, which enables standard error propagation.
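
For intuition, a minimal single-pixel sketch of this kind of propagation (not the actual drizzle API; all values hypothetical):

```python
import numpy as np

# Sketch only: drizzle one output pixel from four input pixels.
# The data are combined with coefficients (overlap areas) and weights;
# the variance is combined with the squares of the same coefficients,
# which is standard error propagation for independent inputs.
def drizzle_pixel(data, var, coeffs, weights):
    wsum = np.sum(weights * coeffs)
    out_data = np.sum(weights * coeffs * data) / wsum
    out_var = np.sum((weights * coeffs) ** 2 * var) / wsum ** 2
    return out_data, out_var

a = np.array([0.4, 0.3, 0.2, 0.1])   # overlap areas, sum == 1
d = np.array([1.0, 1.2, 0.9, 1.1])   # input data values
v = np.full(4, 0.013)                # constant input variance
print(drizzle_pixel(d, v, a, np.ones(4)))  # (1.05, 0.0039)
```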

Tasks

  • request a review from someone specific, to avoid making the maintainers review every PR
  • add a build milestone, e.g., Build 11.3 (use the latest build if not sure)
  • Does this PR change user-facing code / API? (if not, label with no-changelog-entry-needed)
    • write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see below for change types)
    • update or add relevant tests
    • update relevant docstrings and / or docs/ page
    • start a regression test and include a link to the running job (click here for instructions)
      • Do truth files need to be updated ("okified")?
        • after the reviewer has approved these changes, run okify_regtests to update the truth files
  • if a JIRA ticket exists, make sure it is resolved properly

Regression test(s):
https://github.com/spacetelescope/RegressionTests/actions/runs/12239324062
https://github.com/spacetelescope/RegressionTests/actions/runs/12293444575 (after likely final revision of drizzle PR)
https://github.com/spacetelescope/RegressionTests/actions/runs/12300631913 (post 7531f55)

news fragment change types...
  • changes/<PR#>.general.rst: infrastructure or miscellaneous change
  • changes/<PR#>.docs.rst
  • changes/<PR#>.stpipe.rst
  • changes/<PR#>.datamodels.rst
  • changes/<PR#>.scripts.rst
  • changes/<PR#>.fits_generator.rst
  • changes/<PR#>.set_telescope_pointing.rst
  • changes/<PR#>.pipeline.rst

stage 1

  • changes/<PR#>.group_scale.rst
  • changes/<PR#>.dq_init.rst
  • changes/<PR#>.emicorr.rst
  • changes/<PR#>.saturation.rst
  • changes/<PR#>.ipc.rst
  • changes/<PR#>.firstframe.rst
  • changes/<PR#>.lastframe.rst
  • changes/<PR#>.reset.rst
  • changes/<PR#>.superbias.rst
  • changes/<PR#>.refpix.rst
  • changes/<PR#>.linearity.rst
  • changes/<PR#>.rscd.rst
  • changes/<PR#>.persistence.rst
  • changes/<PR#>.dark_current.rst
  • changes/<PR#>.charge_migration.rst
  • changes/<PR#>.jump.rst
  • changes/<PR#>.clean_flicker_noise.rst
  • changes/<PR#>.ramp_fitting.rst
  • changes/<PR#>.gain_scale.rst

stage 2

  • changes/<PR#>.assign_wcs.rst
  • changes/<PR#>.badpix_selfcal.rst
  • changes/<PR#>.msaflagopen.rst
  • changes/<PR#>.nsclean.rst
  • changes/<PR#>.imprint.rst
  • changes/<PR#>.background.rst
  • changes/<PR#>.extract_2d.rst
  • changes/<PR#>.master_background.rst
  • changes/<PR#>.wavecorr.rst
  • changes/<PR#>.srctype.rst
  • changes/<PR#>.straylight.rst
  • changes/<PR#>.wfss_contam.rst
  • changes/<PR#>.flatfield.rst
  • changes/<PR#>.fringe.rst
  • changes/<PR#>.pathloss.rst
  • changes/<PR#>.barshadow.rst
  • changes/<PR#>.photom.rst
  • changes/<PR#>.pixel_replace.rst
  • changes/<PR#>.resample_spec.rst
  • changes/<PR#>.residual_fringe.rst
  • changes/<PR#>.cube_build.rst
  • changes/<PR#>.extract_1d.rst
  • changes/<PR#>.resample.rst

stage 3

  • changes/<PR#>.assign_mtwcs.rst
  • changes/<PR#>.mrs_imatch.rst
  • changes/<PR#>.tweakreg.rst
  • changes/<PR#>.skymatch.rst
  • changes/<PR#>.exp_to_source.rst
  • changes/<PR#>.outlier_detection.rst
  • changes/<PR#>.tso_photometry.rst
  • changes/<PR#>.stack_refs.rst
  • changes/<PR#>.align_refs.rst
  • changes/<PR#>.klip.rst
  • changes/<PR#>.spectral_leak.rst
  • changes/<PR#>.source_catalog.rst
  • changes/<PR#>.combine_1d.rst
  • changes/<PR#>.ami.rst

other

  • changes/<PR#>.wfs_combine.rst
  • changes/<PR#>.white_light.rst
  • changes/<PR#>.cube_skymatch.rst
  • changes/<PR#>.engdb_tools.rst
  • changes/<PR#>.guider_cds.rst

@schlafly commented Dec 9, 2024

Hi Mihai,

FYI, I'm at a conference this week and so won't be prompt to review here. But more generally, what part of this PR are you most interested in having reviewed? This PR, or something more like this bit on drizzle?
https://github.com/spacetelescope/drizzle/pull/163/files#diff-7406b81a9ebbfb8c8f4914f6a9f5090e60f96b9c977c1ee1a156fd65ab5de9d1R52-R70

@mcara (Member, Author) commented Dec 9, 2024

> Hi Mihai,
>
> FYI, I'm at a conference this week and so won't be prompt to review here. But more generally, what part of this PR are you most interested in having reviewed? This PR, or something more like this bit on drizzle? https://github.com/spacetelescope/drizzle/pull/163/files#diff-7406b81a9ebbfb8c8f4914f6a9f5090e60f96b9c977c1ee1a156fd65ab5de9d1R52-R70

That is exactly the part I had in mind for you to review in this PR.

codecov bot commented Dec 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 66.60%. Comparing base (e38bf84) to head (a29c7b2).
Report is 96 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8997      +/-   ##
==========================================
+ Coverage   64.56%   66.60%   +2.03%     
==========================================
  Files         375      375              
  Lines       38890    37898     -992     
==========================================
+ Hits        25109    25241     +132     
+ Misses      13781    12657    -1124     


@melanieclarke (Collaborator) left a comment

I like these changes a lot! Thank you for working on this!

I think it would be a good idea to add some unit tests for the output error values when a pixel scale is applied - it looks like currently only the data values are tested in that case.

Documentation will need to be updated, in docs/jwst/resample/main.rst -- there is a note there about error propagation that will need replacing.

Two review comment threads on jwst/resample/resample.py (outdated, resolved).
@drlaw1558 (Collaborator) commented

Thanks @mcara! I'm seeing some very strange results in testing, though; here's the VAR_RNOISE extension from b11.1 (left) and the revised drizzle (right) for resampling a single MIRI input frame. Perhaps related to @melanieclarke's comments on pixel scale?

[Image: var_rnoise — VAR_RNOISE extension, b11.1 vs. revised drizzle]

@mcara (Member, Author) commented Dec 10, 2024

@drlaw1558 I addressed @melanieclarke's comments about pixel scale last night, so if your tests are recent, this should not be an issue.

@mcara (Member, Author) commented Dec 10, 2024

The image on the right is gorgeous though (or fascinating).

@drlaw1558 (Collaborator) commented

@mcara Indeed! I've confirmed that the structure persists in a hacked input with uniform variances, and that's even more impressive. This is, however, with what should be the latest versions (as of 30 minutes ago) of both your jwst and drizzle branches. My best guess, without digging into the code more closely yet, is that it is something to do with the input pixel scale, as the pattern shows up even when the input variance values are completely uniform.
[Image: imager — resampled variance with uniform input variances]

@mcara (Member, Author) commented Dec 10, 2024

May I ask what the variation is in the resampled (variance) image when the input variance is constant?

@drlaw1558 (Collaborator) commented

In this case I set VAR_POISSON = 0.013 everywhere in the input cal file, and the output VAR_POISSON values vary from about 0.003 to 0.013.

@mcara (Member, Author) commented Dec 11, 2024

The scale that @melanieclarke refers to is a constant factor common to all pixels, so it cannot be responsible for the observed pattern. I therefore investigated what is causing it, using a MIRI image that I had available locally.

Please refer to equations (4) and (5) in the Fruchter & Hook paper on drizzle. For simplicity, let's ignore the total weight W, that is, assume w_{xi,yi} = 1:

F_{xo,yo} = (s**2) * sum_{xi, yi}(  a_{xi,yi,xo,yo} * d_{xi,yi} )

where a_{xi,yi,xo,yo} is the area of overlap between the input pixel at (xi, yi) and the output pixel at (xo, yo). If the pixel scale ratio is close to 1, normally anywhere between 1 and 4 input pixels contribute to one output pixel. The sum of all a_{xi,yi,xo,yo} for a given output pixel (xo, yo fixed) should be close to 1, the area of an output pixel in the pixel coordinate system. It may not be exactly 1 due to approximations in computing overlaps, but it should be pretty close. I have verified that this is indeed the case for my test image.

Using propagation of errors and assuming independent uncertainties for input pixels, we get:

Var(F_{xo,yo}) = (s**4) * sum_{xi, yi}(  (a_{xi,yi,xo,yo})**2 * Var(d_{xi,yi}) )

Assuming a constant variance for the input pixels (just as you did in your simulation, @drlaw1558), the variance of a resampled (output) pixel is proportional to the sum of the squared overlap areas. Computing this for all output pixels of my MIRI test image, I got this image:

[Image: pattern — map of the sum of squared overlap areas over the output grid]

Basically, this is caused by the varying fractions of overlap (weights) between each output pixel and the input pixels that contribute to it; these weights get squared in the variance computation (uncertainty propagation). To illustrate this, refer to Fig. 2 in the above paper, which I modified to label the overlaps from different input pixels:

[Image: pixoverlap_modified — Fig. 2 of Fruchter & Hook with the overlap areas labeled a1–a4]

So, as I mentioned, a1 + a2 + a3 + a4 = 1 while a1**2 + a2**2 + a3**2 + a4**2 != 1. For example, assume the pixel scale ratio (output scale / input scale) is 1, there are no distortions, and the only difference between the input and output grids is a simple shift (sx, sy) along the axes. Generating several random shifts:

         (sx, sy)             a1         a2          a3           a4     sum(a)  sum(a**2)

(0.96820787, 0.85810016)  0.13738854, 0.0045113 , 0.02728083, 0.83081933   1     0.70990097
(0.31189278, 0.6741815 )  0.10162044, 0.22419806, 0.46390915, 0.21027234   1     0.32001765
(0.6763692,  0.31097321)  0.4660365 , 0.22299029, 0.10064051, 0.2103327    1     0.32128305
(0.54457195, 0.09407183)  0.49334307, 0.4125851 , 0.04284295, 0.05122888   1     0.41807377
(0.58139802, 0.73719368)  0.15279507, 0.11001124, 0.30859074, 0.42860295   1     0.31437754

So you can see that sum_{xi,yi}( (a_{xi,yi,xo,yo})**2 ) is not 1 and that it varies quite a bit. The pattern you are seeing in the variance image is caused by the variable overlap (due to WCS distortion) between input pixels and output pixels, combined with the summing of squared overlap areas. I don't know what to do about it: it is what it is.
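
A minimal sketch reproducing the table above, assuming pixfrac = 1, a unit pixel scale ratio, and no distortion, so that each overlap area factors into x and y fractions of the shift (the a1–a4 labeling follows the table above, not necessarily the figure):

```python
import numpy as np

rng = np.random.default_rng()

def overlap_areas(sx, sy):
    # The fractional parts of the shift determine the four areas of
    # overlap between one output pixel and its 2x2 input neighborhood.
    fx, fy = sx % 1.0, sy % 1.0
    return np.array([fx * (1 - fy),        # a1
                     (1 - fx) * (1 - fy),  # a2
                     (1 - fx) * fy,        # a3
                     fx * fy])             # a4

for _ in range(5):
    sx, sy = rng.random(2)
    a = overlap_areas(sx, sy)
    print(f"({sx:.8f}, {sy:.8f})  sum(a) = {a.sum():.6f}  "
          f"sum(a**2) = {(a**2).sum():.8f}")
```

By construction sum(a) is exactly 1, while sum(a**2) ranges from 1/4 (all four overlaps equal to 1/4) up to 1 (perfect alignment).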

@melanieclarke (Collaborator) commented

> In this case I set VAR_POISSON = 0.013 everywhere in the input cal file, and the output VAR_POISSON values vary from about 0.003 to 0.013.

I didn't anticipate the large-scale pattern, but these values do make sense: we would expect the variance to be at most 0.013, for output pixels with only one input pixel contributing, and less for pixels with up to four input pixels contributing. I'll test on some NIRSpec spectral data; I bet these patterns will align nicely with some residual resampling effects we sometimes see.

@schlafly commented

I am still at a conference and haven't gotten a chance to dig in. But I'd mention:

  • For pixfrac = 1, my expectation is that the output variance varies by a factor of 4: equal to the input variance when the input and output pixels line up exactly (a single overlap of area 1, so sum(a**2) = 1), and equal to the input variance divided by 4 when the input pixels exactly hit the corners of the output pixels (four overlaps of area 1/4, so sum(a**2) = 1/4).
  • If you drizzle random noise, the expectation is that output image / sqrt(output variance) is Gaussian with sigma = 1, i.e., the RMS of the output image should neatly match these very interesting shapes (which do look plausibly like distortion plus moire-like effects as the pixel grids slowly shift relative to one another). That would seem like a good way to test whether this code is behaving as expected; a toy version of this check is sketched after this list. The low-variance regions will have higher pixel-to-pixel correlations, so the variance of the sum of groups of pixels isn't actually decreasing, but we aren't tracking that.
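
A toy Monte Carlo version of that check, using the same shift-only model as in the sketch above (all values hypothetical; a real test would drizzle noise images through the pipeline itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# One output pixel overlapping four input pixels (shift-only model).
fx, fy = 0.3, 0.7
a = np.array([fx * (1 - fy), (1 - fx) * (1 - fy), (1 - fx) * fy, fx * fy])

var_in = 1.0
d = rng.normal(0.0, np.sqrt(var_in), size=(100_000, 4))  # noise realizations
f = d @ a                              # drizzled output pixel (sum(a) == 1)
var_pred = var_in * np.sum(a**2)       # squared-coefficient propagation

print("empirical Var(F):", f.var())                            # ~ var_pred
print("RMS of F / sqrt(Var):", (f / np.sqrt(var_pred)).std())  # ~ 1.0
```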

@melanieclarke (Collaborator) commented

The NIRSpec example I had in mind is not as clear as I imagined it would be, but I think it does still illustrate Eddie's point.

Here's an image of a very low-signal MOS slitlet, for which a user asked about the cyclic striping and blurring that appears in the s2d SCI extension. I highlighted a striped region in the image; it is really just alternating detector-column noise in the rate image that propagates through resampling in regions where fewer input pixels are combined into each output pixel. Reduced with the old variance propagation, you can see a slightly higher Poisson variance in the striped regions of the s2d, compared to the blurred regions, if you squint. Reduced with the new drizzle variance propagation, the difference is a little more pronounced, and you can see more of the pattern in the overall error image as well. Overall, error levels are a little lower than before, which is as expected.

[Image: jw02136001001_0311n_00001_nrs2_slit45 — s2d SCI and error comparison]

@mcara force-pushed the err-propagation-resample branch from bf553f5 to 6cf0b3a on December 12, 2024 00:46
@mcara force-pushed the err-propagation-resample branch from db155c0 to bf24e67 on December 12, 2024 15:00
@mcara force-pushed the err-propagation-resample branch from bf24e67 to 64aad44 on December 12, 2024 15:25
@mcara force-pushed the err-propagation-resample branch 2 times, most recently from 3bd0f8e to b86e6ce on December 12, 2024 15:46
@mcara force-pushed the err-propagation-resample branch from b86e6ce to 7531f55 on December 12, 2024 15:52
@drlaw1558 (Collaborator) commented

I've had a chance to test this properly now, and I'm convinced that the structures are indeed real, if rather more impressive than I'm used to from the tiny IFU fields of view.

In order to test this, I set up a mock MIRI imaging input in which read and flat noise were zero everywhere, the Poisson variance was 0.05*0.05 everywhere, and the SCI array was drawn from a Gaussian random distribution about 1.0 with sigma equal to 0.05. In order to eliminate covariance effects from pixel-to-pixel comparisons, I then ran 100 Monte Carlo realizations of the noise in the input cal file and ran all 100 individually through the resampling code. It is then possible to measure the actual RMS noise in the SCI extension for each resampled pixel across all 100 noise realizations (a sketch of the comparison step follows below).
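
A sketch of that comparison step, assuming the 100 resampled outputs were saved as i2d files (the file names here are hypothetical):

```python
import numpy as np
from astropy.io import fits

# Stack the SCI planes of the 100 Monte Carlo i2d outputs.
stack = np.stack([fits.getdata(f"mc_{i:03d}_i2d.fits", ext=("SCI", 1))
                  for i in range(100)])
mc_rms = stack.std(axis=0)              # empirical per-pixel noise

# Compare against the pipeline's propagated error plane.
pipe_err = fits.getdata("mc_000_i2d.fits", ext=("ERR", 1))
print(np.nanmedian(mc_rms / pipe_err))  # ~ 1 if ERR is propagated correctly
```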

Attached are the results. As expected, the current pipeline gives a constant ERR = 0.05 everywhere. The new drizzle code in this PR gives the plot in the middle, with ERR values ranging from about 0.025 to 0.05. The plot on the right is what I get from the Monte Carlo test, which agrees dead-on with the new pipeline.

[Image: mctest — current pipeline ERR, new drizzle ERR, and Monte Carlo RMS]

@mcara (Member, Author) commented Dec 12, 2024

@drlaw1558 Thanks for validating the math.

@mcara (Member, Author) commented Dec 12, 2024

Can this PR now be formally approved?

@melanieclarke (Collaborator) commented

> Can this PR now be formally approved?

I'll review again for any last clean-up details and check out the regtest results. We should merge and release the drizzle changes first, though, and update the pin in pyproject.toml before this is merged.

@drlaw1558 (Collaborator) commented

I think I'd like to understand the implications of this a little more before we merge it.

@mcara (Member, Author) commented Dec 12, 2024

@melanieclarke Regression tests are listed in the top message. One is running right now.

@melanieclarke (Collaborator) commented

@drlaw1558 - should we defer this change to the next build, then?

@drlaw1558 (Collaborator) commented

I think it's clear that the variance information provided by this change is indeed more correct; my concern is the covariance. I.e., most people tend to ignore the covariance between resampled pixels when doing photometry or other kinds of analyses. Given the larger-than-anticipated (in hindsight, naively) difference made by doing the mathematically correct thing for variances, it would be good to make sure that the improved variances don't interact in unexpected ways with the lack of covariance treatment when looking at the statistics derived for actual extracted sources. That's not too onerous a task, but nonetheless probably longer than the next few days. I'd thus be inclined to defer merging this until 11.3 so that we can study the implications more fully now that we have a working implementation.

@melanieclarke (Collaborator) commented

> I'd thus be inclined to defer merging this until 11.3 so that we can study the implications more fully now that we have a working implementation.

Okay, sounds good. I'll move it to the next build milestone then.

@melanieclarke modified the milestones: Build 11.2 → Build 11.3, Dec 12, 2024
@mcara (Member, Author) commented Dec 12, 2024

> I think it's clear that the variance information provided by this change is indeed more correct; my concern is the covariance. I.e., most people tend to ignore the covariance between resampled pixels when doing photometry or other kinds of analyses. Given the larger-than-anticipated (in hindsight, naively) difference made by doing the mathematically correct thing for variances, it would be good to make sure that the improved variances don't interact in unexpected ways with the lack of covariance treatment when looking at the statistics derived for actual extracted sources. That's not too onerous a task, but nonetheless probably longer than the next few days. I'd thus be inclined to defer merging this until 11.3 so that we can study the implications more fully now that we have a working implementation.

I agree with these concerns; however, the proposed changes are mathematically correct, and the current variance computation is simply wrong (and still correlated, just not as visibly). IMO, we should at least get the better variance estimation in now and address covariance in a separate PR. Resampled variance (and data) are known to be correlated.

@melanieclarke (Collaborator) left a comment

I reviewed the last set of regression tests, and changes look appropriate to me. Error and variance values for resampled and downstream products are generally a little smaller. There are also some unrelated changes to NIRSpec spec3 products in the results, due to some outlier detection parameter updates, so we will need to run again before we merge this.

There are a couple of irrelevant comments in the new unit test that can be removed, but otherwise this all looks good to me.

I officially approve, from my perspective, but I will mark this as "Request changes" to make sure this doesn't get merged before the drizzle update is complete and INS is comfortable with the change.

Two review comment threads on jwst/resample/tests/test_resample_step.py (outdated, resolved).
@drlaw1558 (Collaborator) commented

Ok, I have had a chance to do some more simulations, this time introducing two point sources: one atop a ridge in the moire pattern and one atop a trough, i.e., one for which the output grid is well matched to the input grid and one for which it is almost completely out of phase with the sampling. The inputs have Poisson and read noise, and I confirmed that the nominal noise in the Monte Carlo cal.fits files matches the actual noise in the cal files. I ran these Monte Carlo tests through the resampling and source_catalog steps and extracted the photometry.


With the current pipeline:
Star 1 Pipeline SNR: 159
Star 1 Monte-Carlo SNR: 156
Star 2 Pipeline SNR: 162
Star 2 Monte-Carlo SNR: 194

All pretty close to correct, although some possible minor gains for Star 2 that the pipeline isn't recognizing.


The reason we got into this effort was to fix the scaling when drizzling to large pixels. With large (0.2 arcsec) pixels in the current pipeline:
Star 1 Pipeline SNR: 88
Star 1 Monte-Carlo SNR: 154
Star 2 Pipeline SNR: 89
Star 2 Monte-Carlo SNR: 158

Pipeline SNR estimates on both extracted point sources are off by ~ a factor of two because variance isn't propagated correctly.


The new code fixes the issue when drizzling to large pixels. With large (0.2 arcsec) pixels in this PR:
Star 1 Pipeline SNR: 197
Star 1 Monte-Carlo SNR: 165
Star 2 Pipeline SNR: 191
Star 2 Monte-Carlo SNR: 184

Some small disagreements, but much closer than the existing code.


The problem, however, is applying the new code to the default pixel scale. With regular pixels in this PR:
Star 1 Pipeline SNR: 161
Star 1 Monte-Carlo SNR: 165
Star 2 Pipeline SNR: 319
Star 2 Monte-Carlo SNR: 189

I.e., the SNR of extracted sources is off by ~ a factor of 2 for the source where the output drizzle grid didn't closely match the input grid.


The reason why is that, for output drizzled pixel scales close to the input scale, the mistake in variance propagation in the drizzle code very nearly cancelled out the mistake of ignoring the covariance. If we fix only the first one, we are going to see the problem of the second more clearly in our normal pipeline photometry. This will require some further discussion on the best path forward. (A toy illustration of the covariance effect is sketched below.)
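
A toy 1-D illustration of that covariance effect, using the shift-only model from earlier in the thread (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
fx = 0.5                            # half-pixel shift: the worst case in 1-D
w = np.array([1 - fx, fx])          # overlap weights for each output pixel

d = rng.normal(0.0, 1.0, size=(200_000, 3))  # three input pixels, unit variance
f1 = d[:, 0] * w[0] + d[:, 1] * w[1]         # output pixel 1
f2 = d[:, 1] * w[0] + d[:, 2] * w[1]         # output pixel 2 shares d[:, 1]

# Summing the two output pixels, as in aperture photometry:
print(f1.var() + f2.var())  # ~ 1.0 -- naive sum of per-pixel variances
print((f1 + f2).var())      # ~ 1.5 -- true variance, covariance included
```

Ignoring the covariance understates the aperture noise and therefore inflates the extracted SNR, exactly as in the Star 2 case above.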

@schlafly commented

Have you talked to Stefano about this? He has pretty strong feelings about how the variances "should" be computed in order to get the right variances of sums over large-ish areas (i.e., when the covariances matter). I have not tried to work that out myself, unfortunately.

@drlaw1558 (Collaborator) commented

Agreed, Stefano would be good to include in this discussion; likely early in the new year.
