Finite differences converge to a different value than automatic differentiation. #26
Comments
can you illustrate this with a plot :-) Better yet, a unit test?
but these are 2 different questions: the question of whether we get the correct gradients, and the question of whether we apply the correct shearing.
Yes, indeed they are. I'm still collecting all the behaviours I observe (I wanted to post it earlier today, but had a lot to do here). What I know by now is:
@andrevitorelli great investigative work :-) But can you either upload your notebook or upload the plots that support your conclusions? It's very useful for other contributors (i.e. me) to be able to look into problems quickly, from a starting point ;-) Also, the most constructive thing would be to build unit tests that can fail, highlighting the potential issues with the code. Then we can work on fixes to these unit tests.
@EiffL Thanks, and yes, I am in the process of creating a unit test that covers the problem precisely. It will test just two things: 1) does the finite-difference derivative converge? (it always does, actually, but still) 2) when it does, does it differ from the autodiff value by a large (i.e. > 1e-3) amount? I was checking what to expect from different examples.
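A minimal sketch of what such a test could look like, with a placeholder function `f` standing in for the actual shear/resampling pipeline and placeholder tolerances:

```python
import tensorflow as tf


def finite_difference(f, x, eps):
    """Central finite-difference estimate of df/dx at scalar x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def test_autodiff_matches_finite_difference():
    # Placeholder scalar function standing in for the shear/resampling pipeline.
    f = lambda x: tf.reduce_sum(tf.sin(x) * x)

    x = tf.constant(1.3, dtype=tf.float64)
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = f(x)
    grad_autodiff = tape.gradient(y, x).numpy()

    # 1) Does the finite-difference estimate converge as the step shrinks?
    estimates = [float(finite_difference(f, x, eps)) for eps in (1e-2, 1e-3, 1e-4)]
    assert abs(estimates[-1] - estimates[-2]) < 1e-4

    # 2) Does the converged value agree with autodiff to better than 1e-3?
    assert abs(estimates[-1] - grad_autodiff) < 1e-3
```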
ok, so, a tiny bit of experimenting reveals the following. The TensorFlow Addons resampler seems to struggle to get derivatives around integer resampling points. I've added a test notebook here: https://github.com/CosmoStat/autometacal/blob/u/EiffL/test_grads/notebooks/MetacalGradients.ipynb When the resampling points are at half pixels, the gradients of the resampling step are perfect, but at integer pixel values they are not (the notebook shows the residual plots). Ok, so, several steps from here:
Please let me know which option you want to pursue @andrevitorelli. I won't be very available this week, but don't hesitate to ask if you have any questions on how to proceed :-)
Made a small demo showing the difference in gradients when resample points are at integer vs half-integer positions. You can use something similar to create a minimal working example to open an issue on tensorflow_addons.
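For reference, a minimal example along those lines might look like the following (the toy image and warp grid are made up for illustration; this is not the demo's exact code):

```python
import tensorflow as tf
import tensorflow_addons as tfa


def resampler_grad(offset):
    """Gradient of tfa.image.resampler w.r.t. warp points placed at
    (integer pixel position + offset)."""
    # Toy 8x8 single-channel image standing in for a galaxy stamp.
    data = tf.reshape(tf.range(64, dtype=tf.float32), [1, 8, 8, 1])

    # Warp grid over the interior pixels, shape [1, N, 2], (x, y) ordering.
    xs, ys = tf.meshgrid(tf.range(1., 7.), tf.range(1., 7.))
    warp = tf.reshape(tf.stack([xs, ys], axis=-1), [1, -1, 2]) + offset

    with tf.GradientTape() as tape:
        tape.watch(warp)
        out = tfa.image.resampler(data, warp)
    return tape.gradient(tf.reduce_sum(out), warp)


print(resampler_grad(0.5))  # half-pixel positions: well-behaved gradients
print(resampler_grad(0.0))  # integer positions: where the discrepancy appears
```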
sorry ^^' one last comment before I end up offline in the countryside. This is also closely related to what @Dr-Zero was working on. We are going to replace the interpolation code in tfa anyway. Are we close to a working implementation? Because this would be good to mention as well in a TFA issue: that we will have a fix in the form of a revamp of the interpolation implementation in resampler.
@EiffL I'll talk to him during this week and work closely with him. I want to close this whole issue ASAP, preferably during this week, but, in a worst-case scenario, by the next one.
Related to: tensorflow/addons#2535
How are we doing with a PR for TF addons ^^'?
Bonjour Francois,
Not much progress, I'm afraid. I had minor surgery and have been recovering for the last two weeks.
Here's something I want to discuss: our code internally uses the SamplingKernelType enumeration from TensorFlow to communicate the selected kernel type. Vitorelli asked me to add a new quintic kernel. That is simple enough, except that since SamplingKernelType is part of the main TensorFlow code, I can't really add a new element to it, not without creating a patch to TensorFlow itself. However, the use of SamplingKernelType is purely internal; the addons package uses strings to identify the kernel type in its public APIs. As such, I believe we should drop the internal use of SamplingKernelType. It wouldn't change any public interface, and we would be decoupled from TensorFlow in our kernel selection.
Thiago.
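As an illustration of the proposal, a string-keyed kernel registry could look roughly like this (the names and kernels below are placeholders, not the actual addons code):

```python
import tensorflow as tf


def _triangle(x):
    # Linear (bilinear in 2D) kernel: 1 - |x| on [-1, 1], zero elsewhere.
    return tf.maximum(1.0 - tf.abs(x), 0.0)


def _quintic(x):
    # Placeholder for the proposed quintic kernel; the actual coefficients
    # would live in the revamped resampler implementation.
    raise NotImplementedError("quintic kernel to be added in the PR")


# String-keyed registry: since the public API already identifies kernels by
# string, adding "quintic" here requires no change to TensorFlow's
# SamplingKernelType enum.
_KERNELS = {"triangle": _triangle, "quintic": _quintic}


def get_kernel(name):
    """Return the interpolation kernel registered under `name`."""
    if name not in _KERNELS:
        raise ValueError(f"Unknown kernel '{name}', expected one of {sorted(_KERNELS)}")
    return _KERNELS[name]
```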
Thanks for the update Thiago, and I hope you are recovering well! I took a look at your copy of the code, but it's not a fork of tensorflow/addons? I'm not sure you can open a PR if you didn't fork the repo on GitHub.
@Dr-Zero @EiffL I have set up a fork at my own account here: addons, and checked the diff results locally:
Some of them are updates from upstream, and others are our work. I'm parsing through to see which is which so I can restructure the repo. I'll use git's --author option to fix the authorship of the changes.
Interestingly, Mike Jarvis found the same thing @Dr-Zero commented on in one of our meetings. Quote:
This is not, however, the source of our troubles, as we only use the first derivative.
Ok this is great! Here is what I recommend you do:
This way you won't have to deal with differences that are due to new modifications in tf addons.
...or, maybe easier, you just update your local clone of Thiago's repo to make it up to date with upstream.
Ok, I think this worked, and it kept @Dr-Zero's signed edits. Can you check, @EiffL, here: addons/new_kernels?
@Dr-Zero we need to come up with a PR for the addons repo ASAP for the code to be useful.
Just to inform that the feature request was posted as an issue in the tfa repo here: addons#2779. They ask for people to file a feature request before submitting new features.
As it is right now, when doing finite differences, it's clear that the derivative converges as we go to smaller step sizes. However, when using an odd number of pixels (which at the moment is the correct choice, as shear is applied around an integer central pixel, and we cannot properly shift the image without better interpolation), the value it converges to is consistently lower than the one given by automatic differentiation.
I'll post more here illustrating what we know about this bug, but in general, it seems that when generating the metacalibration image, the current code creates an image with less shear than what we ask of it.
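For context, a rough sketch of the kind of operation being differentiated, i.e. a shear applied around the central pixel of an odd-sized stamp via resampling (this is an illustration using plain bilinear resampling via tfa.image.resampler, not the actual autometacal code):

```python
import tensorflow as tf
import tensorflow_addons as tfa


def shear_image(image, g1, g2):
    """Shear an odd-sized stamp of shape [1, n, n, 1] around its central
    (integer) pixel by resampling. Illustration only."""
    n = image.shape[1]
    centre = (n - 1) / 2.0  # integer for odd n

    # Output pixel grid relative to the central pixel, (x, y) ordering.
    xs, ys = tf.meshgrid(tf.range(n, dtype=tf.float32),
                         tf.range(n, dtype=tf.float32))
    coords = tf.reshape(tf.stack([xs - centre, ys - centre], axis=-1), [-1, 2])

    # Map output coordinates back into the unsheared image with the
    # inverse of the shear matrix A = [[1 + g1, g2], [g2, 1 - g1]].
    a = tf.stack([tf.stack([1.0 + g1, g2]), tf.stack([g2, 1.0 - g1])])
    warp = tf.reshape(coords @ tf.transpose(tf.linalg.inv(a)) + centre, [1, -1, 2])

    sheared = tfa.image.resampler(image, warp)
    return tf.reshape(sheared, image.shape)
```

The gradient of any statistic of `shear_image(image, g1, g2)` with respect to `g1`/`g2` is where the finite-difference and autodiff values should, but currently do not, agree when the resampling points land on integer pixels.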