Comparison to ngmix.example.metacal.metacal.py #36

Closed · EiffL opened this issue Nov 6, 2021 · 21 comments

@EiffL commented Nov 6, 2021

This PR #37 adds a script that runs autometacal within the ngmix example script for direct comparison against ngmix's finite differences.

With default parameters I'm getting:

1000/1000 100%

S/N: 80161
-------------------
ngmix results:
R11: 0.351127
m: -0.000182258 +/- 0.000409339 (99.7% conf)
c: 9.31003e-07 +/- 4.1336e-06 (99.7% conf)
-------------------
autometacal results:
R11: 0.355601
m: -0.0126295 +/- 0.00031753 (99.7% conf)
c: 1.71161e-07 +/- 3.14146e-06 (99.7% conf)

So we see a discrepancy in m. I would attribute the difference in multiplicative bias to the difference in the response matrix, and ultimately, at this stage, to our faulty interpolation gradients.

@andrevitorelli could you review the code in this PR #37 and try to run this script with your cubic interpolation code?
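
For reference, the idea being tested: instead of ngmix's finite differences over counter-sheared images, we get R directly from autodiff. Schematically (the function names here are just stand-ins, not the exact API):

```python
import tensorflow as tf

def autodiff_response(obs, psf, reconv_psf, generate_mcal_image, get_ellipticities):
    """Shear response R_ij = d e_i / d g_j at g = 0, via autodiff.

    generate_mcal_image(obs, psf, reconv_psf, g) and get_ellipticities(im)
    are stand-ins for the actual autometacal functions.
    """
    g = tf.zeros([obs.shape[0], 2])  # evaluate the response at g = 0
    with tf.GradientTape() as tape:
        tape.watch(g)
        mcal = generate_mcal_image(obs, psf, reconv_psf, g)
        e = get_ellipticities(mcal)
    return tape.batch_jacobian(e, g)  # shape [batch, 2, 2]; R11 = R[:, 0, 0]
```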

EiffL linked a pull request (Nov 6, 2021) that will close this issue.
@EiffL commented Nov 7, 2021

And for fun, I did a timing comparison :-) autometacal is about 65x faster than ngmix

@andrevitorelli commented:

@EiffL, if I understood correctly, the nb should reproduce these exact results? I'm not getting them; I might be missing something.

@EiffL commented Nov 10, 2021

hummmm which notebook?

@EiffL commented Nov 10, 2021

If you mean running the script in this PR (#37): you should get different results from mine, because my R matrix is wrong without the cubic interpolation code.

@andrevitorelli commented:

Hey @EiffL, do you think we could have a consistent definition of ResamplingType in galflow? The way I usually do this is to define a hidden variable that's used everywhere throughout the package and, if need be, set it externally when doing experiments. ResamplingType is used in many places throughout the code, and at times it's tough to keep track of whether I've set them all up properly.

In any case, with our cubic interp, the results are:

autometacal results:
R11: 0.34922
m: 0.00541301 +/- 0.000323341 (99.7% conf)
c: 1.74688e-07 +/- 3.19889e-06 (99.7% conf)

...which are closer but not there yet.

@EiffL commented Nov 12, 2021

Hmm, the simple way you would typically do this is just to choose a convention and make it the default in all functions, so that you never have to think about it (see the sketch below).
In which part of the code do you need to switch ResamplingType? In GalSim, for instance, the default is Quintic and you never have to worry about it.
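
Schematically, something like this (illustrative signatures, not galflow's actual ones):

```python
DEFAULT_INTERPOLATION = "quintic"  # one convention, chosen once

def convolve(image, kernel, interpolation=DEFAULT_INTERPOLATION):
    """Callers never pass `interpolation` unless deliberately experimenting."""
    ...

def perspective_transform(image, matrix, interpolation=DEFAULT_INTERPOLATION):
    ...
```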

@EiffL commented Nov 12, 2021

Hmmm, interesting... and slightly worrying ^^' I don't know why we would be getting differing results...

@EiffL commented Nov 12, 2021

This would mean that either our shear values and/or our gradients are wrong :-/

@andrevitorelli commented Nov 12, 2021

Yes, exactly... So: in many functions, such as gf.convolve.convolve and gf.python.tfutils.transformer.perspective_transform, they are defined separately. My idea was to have a single variable, say gf._ResamplingType_, that would be initialised once and used throughout the code. This way you'd never need to worry about whether they are all set consistently. And if you ever come across something you want to do with a different interpolator, as an experiment you just set gf._ResamplingType_ = tf.image.ResamplingType.YOURINTERPOLATOR and run with it. A sketch of the pattern is below.
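
Something like this (names hypothetical, not current galflow API):

```python
# galflow/config.py: single source of truth for the interpolator used by
# every resampling call in the package.
_resampling_type = "quintic"

def get_resampling_type():
    return _resampling_type

def set_resampling_type(kind):
    """Override globally, e.g. set_resampling_type('cubic') at the top of
    an experiment; every internal call picks it up from then on."""
    global _resampling_type
    _resampling_type = kind
```

Internal functions would then default to `get_resampling_type()` instead of hard-coding their own interpolation argument.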

@andrevitorelli commented Nov 12, 2021

> This would mean that either our shear values and/or our gradients are wrong :-/

Is this it, or do we just actually need quintic? Because we have improved by an order of magnitude...

@EiffL commented Nov 12, 2021

Yeah, it could be that we need quintic...

There is one way to know, I guess: we can use our code but compute a finite-difference response instead of the autodiff one (like what you are doing in your comparison notebooks). If the bias stays roughly the same, it most likely comes from the interpolation order. If the bias disappears, it means that our gradients are wrong.
If it becomes larger, well, we can't conclude much... A sketch of the finite-difference version is below.
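
Schematically, with measure(g) standing in for the full shear-then-remeasure pipeline (assumed to return the measured ellipticity as a length-2 array):

```python
import numpy as np

def finitediff_response(measure, step=0.01):
    """Central-difference response R_ij = d e_i / d g_j at g = 0."""
    R = np.zeros((2, 2))
    for j in range(2):
        dg = np.zeros(2)
        dg[j] = step
        R[:, j] = (measure(dg) - measure(-dg)) / (2.0 * step)
    return R
```

If this converges to the autodiff R11 as the step shrinks, the gradients themselves are probably fine.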

@andrevitorelli commented:

Thus spake The Code:

-------------------
ngmix results:
R11: 0.351127
m: -0.000182258 +/- 0.000409339 (99.7% conf)
c: 9.31003e-07 +/- 4.1336e-06 (99.7% conf)
-------------------
autometacal results:
R11: 0.34922
m: 0.00541301 +/- 0.000323341 (99.7% conf)
c: 1.74688e-07 +/- 3.19889e-06 (99.7% conf)
finitediff results:
R11: 0.349435
m: 0.00479424 +/- 0.000323172 (99.7% conf)
c: 1.74447e-07 +/- 3.19686e-06 (99.7% conf)

Stepsize in finitediff was 0.01.

@andrevitorelli commented:

With stepsize = 0.002 we get closer to autometacal, as we expected from previous investigations:

autometacal results:
R11: 0.34922
m: 0.00541301 +/- 0.000323341 (99.7% conf)
c: 1.74688e-07 +/- 3.19889e-06 (99.7% conf)
finitediff results:
R11: 0.349269
m: 0.00527089 +/- 0.000323326 (99.7% conf)
c: 1.74529e-07 +/- 3.19838e-06 (99.7% conf)

@EiffL commented Nov 14, 2021

OK, so this is great. What can we conclude from this test?

  1. The gradients of the cubic interpolation and of the whole autometacal chain seem to be correct, because R11 converges to the same value.

  2. Something is still not quite right in our implementation of the metacal images and/or the shape measurement. It's not a huge effect, because otherwise it would also show up in our tests test_metacal.py and/or tests/test_tf_ngmix.py. It could be due to the interpolation kernel not being accurate enough.

So, the residual m bias can come from two things:

  1. Misestimation of the R11 factor. The evidence would seem to point towards this, because we find a significantly different R11 value compared to ngmix.
  2. Misestimation of the g values (i.e. the measured ellipticity before correction). We can try to rule this second point out by printing the "pre-correction" g value as measured by ngmix and autometacal (see the sketch below). If they agree, we can rule out issues in the FFT, inverse FFT, and moments measurements.
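
The check in point 2 is just a side-by-side of the uncalibrated means, something like this (g_ngmix and g_amc being hypothetical per-object pre-correction g1 arrays from each pipeline):

```python
import numpy as np

def compare_precorrection(g_ngmix, g_amc):
    """Agreement here rules out the FFT, inverse FFT, and moments
    measurement as the source of the residual bias."""
    print(f"ngmix       <g1> = {np.mean(g_ngmix):.8f}")
    print(f"autometacal <g1> = {np.mean(g_amc):.8f}")
    print(f"relative diff    = {np.mean(g_amc) / np.mean(g_ngmix) - 1.0:.2e}")
```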

@andrevitorelli commented Nov 22, 2021

Using Bernstein & Gruen 2014 quintic interpolation we have:

ngmix results:
R11: 0.351127
m: -0.000182258 +/- 0.000409339 (99.7% conf)
c: 9.31003e-07 +/- 4.1336e-06 (99.7% conf)
-------------------
autometacal results:
R11: 0.351393
m: -0.000804964 +/- 0.000321342 (99.7% conf)
c: 1.73607e-07 +/- 3.17911e-06 (99.7% conf)
finitediff results:
R11: 0.351884
m: -0.00219971 +/- 0.000320923 (99.7% conf)
c: 1.73232e-07 +/- 3.17461e-06 (99.7% conf)

@EiffL commented Nov 22, 2021

Oh oh oh, interesting :-) Does that mean quintic in TensorFlow?

It's just a bit surprising that the finite diff doesn't converge to the same value :-/ hmmm

@andrevitorelli commented Dec 9, 2021

After a more careful check of the interpolations, the current result is (for an unrealistically high SNR (~1e7) galaxy, just for the sake of unit testing!):

-------------------
ngmix results:
R11: 0.34145
m: 0.000118535 +/- 0.000112539 (99.7% conf)
c: -1.37714e-07 +/- 1.19257e-06 (99.7% conf)
-------------------
finitediff results:
R11: 0.341446
m: 0.00016892 +/- 0.000142665 (99.7% conf)
c: -4.46461e-07 +/- 1.46364e-06 (99.7% conf)

autometacal results:
R11: 0.341706
m: -0.000592995 +/- 0.000142565 (99.7% conf)
c: -4.45364e-07 +/- 1.46246e-06 (99.7% conf)

Finite differences in autometacal are now really close to ngmix (all of the above using Bernstein & Gruen 2014 'quintic' interpolation).

Now, into the devilish details...

The R11 value differs between ngmix and finite differences in autometacal by 0.0012%. However, this corresponds to a difference of 42.5% in m! Why? And is this a problem? Let's check the values of the measured shear before calibration:

ngmix: 
g1 = 0.003414903118253962
autometacal finitediff:
g1 = 0.0034150328  

They differ by 0.0038% (note the truncation due to single precision).
After calibration we have:

ngmix: 
g1/R11 = 0.010001186106481666
autometacal finitediff:
g1/R11 = 0.010001689

...corresponding to a difference of 0.005%.
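
The amplification is just m dividing out the true shear (0.01 in this example, as the m values imply); a quick check with the numbers above:

```python
g_true  = 0.01
g_ngmix = 0.010001186106481666  # ngmix g1/R11 from above
g_amc   = 0.010001689           # autometacal finitediff g1/R11 from above

m_ngmix = g_ngmix / g_true - 1.0    # ~1.19e-4
m_amc   = g_amc / g_true - 1.0      # ~1.69e-4
print((m_amc - m_ngmix) / m_ngmix)  # ~0.42, the ~42% quoted above

# A 0.005% offset in the calibrated shear is ~5e-7 in absolute terms;
# dividing by g_true = 0.01 turns it into ~5e-5 in m, which is ~42% of
# m itself, simply because m is so close to zero.
```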

However, the remaining m with autometacal finitediff is still further from zero than 3 sigma.

Now, with realistic noisy (SNR ~ 100, galsim def) galaxies:

-------------------
ngmix results:
R11: 0.343779
m: 0.0188038 +/- 0.265366 (99.7% conf)
c: 0.000885843 +/- 0.00271103 (99.7% conf)
-------------------
finitediff results:
R11: 0.343759
m: 0.0189779 +/- 0.265386 (99.7% conf)
c: 0.00088561 +/- 0.00271115 (99.7% conf)

autometacal results:
R11: 0.344023
m: 0.0181959 +/- 0.265183 (99.7% conf)
c: 0.000884929 +/- 0.00270907 (99.7% conf)

Now R11 differs by 0.006%, but m differs by only 0.9%: the unrealistically high SNR example makes m very sensitive.

Finally, as it stands, autometacal gives marginally better results (~3%) in the realistic-SNR example (the residual m is slightly smaller).

@EiffL commented Dec 9, 2021

Nice results @andrevitorelli! So, the thing to look at is not necessarily the % difference between the m values, but whether or not you detect a significant non-zero m.

In the ngmix high-SNR case, you have m: 0.000118535 +/- 0.000112539, which means that for that sample size the multiplicative bias is consistent with zero at 1 sigma. However, in the autometacal high-SNR case, you have m: -0.000592995 +/- 0.000142565 (99.7% conf), so you detect a non-zero m at over the 5 sigma level.

For the SNR=100 sample you just cannot conclude anything: the m measurement is one order of magnitude smaller than the error bar, so it's all consistent with zero m, but only at the 10^-1 level. You need a lot more samples to reduce the error bars in this case.

@EiffL commented Dec 9, 2021

Given that your finitediff implementation seems consistent with ngmix, this would seem to indicate that there is a small problem somewhere in a gradient.

To make sure, you can further reduce the error bars in your high-SNR test by increasing the sample size.

@EiffL commented Mar 29, 2022

This is solved, right? We can close it?

@andrevitorelli commented:

Yes, def.
