-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove _get_ν
#452
Remove _get_ν
#452
Conversation
Co-authored-by: willtebbutt <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Is it possible to test this? |
Codecov Report
@@ Coverage Diff @@
## master #452 +/- ##
==========================================
- Coverage 93.18% 93.16% -0.02%
==========================================
Files 52 52
Lines 1261 1259 -2
==========================================
- Hits 1175 1173 -2
Misses 86 86
Continue to review full report at Codecov.
|
The rule was wrong, so it seems it's not covered by the tests? Why was this workaround introduced? |
It seems in #425 the commented out |
Yes, that was intended, as discussed on that PR (#425 (comment)), where @theogf wrote
And while the change -test_ADs(x->MaternKernel(nu=first(x)),[ν])
+test_ADs(() -> MaternKernel(; nu=ν)) worked for Forward- and ReverseDiff, the additional workaround in #425 was required for Zygote to also pass the AD test (w.r.t. the input features, not the kernel hyperparameters!). So the workaround is tested by virtue of the Zygote AD test on MaternKernel now also passing. |
The main problem is that tests passed successfully for 08437cc even though the pullback is clearly incorrect. So I was wondering if the pullback was evaluated at all. |
src/basekernels/matern.jl
Outdated
@@ -37,8 +37,18 @@ MaternKernel(; nu::Real=1.5, ν::Real=nu, metric=Euclidean()) = MaternKernel(ν, | |||
|
|||
@functor MaternKernel | |||
|
|||
# workaround for Zygote | |||
# unclear why it's needed but it is fine since it's stated officially that we don't support differentiation with respect to ν |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this workaround, the Zygote AD test fails. You can see this in 30385ac and the corresponding build
What was incorrect about the pullback?
It was, I only added it because without Zygote errored, see #452 (comment) |
It returned the wrong number of outputs (was missing
The problem is just that https://github.com/FluxML/Zygote.jl/blob/95e10218f0187a839ac0539fefcbf39c727bf068/src/lib/array.jl#L30 does not support |
Yeah, I completely agree, but unfortunately it hasn't been fixed in Zygote, and I think "workaround + AD is working" is still preferable over "no workaround + AD is broken". |
Instead of making a PR to KernelFunctions introducing a workaround you could have made one for Zygote that fixes the issue 🤔 |
I managed to figure out how to fix the issue with a workaround in KernelFunctions. That doesn't mean I would immediately know how to identify & fix the underlying issue upstream. If I did know how to fix it there, I would have, as I have in various other cases. You seem to have an amazing overview as well as deep understanding of a lot of different packages. The ability and time for that is unfortunately not something that everyone has available in the same amount. I'm assuming you're not intending to, but you give the impression of being rather disparaging of other people who're not thinking the same way you are. |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
IMO usually it is very helpful to open at least Github issues in cases where it is not immediately clear how to fix the issue. Workarounds can be acceptable if they are fixed upstream eventually, but I assume it is unlikely that they are fixed if nobody makes the Zygote developers aware of it. Generally, I would have given you this advice in your PR but as desired I don't review your PRs anymore. |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@devmotion I've pared it down to a minimal reproducible example: FluxML/Zygote.jl#1204 but I can't comment on anything else, if you've got further thoughts on what the issue is / what's needed to fix it, please comment there. |
Test errors seem unrelated? I didn't change anything related to them, I think. |
Agreed. Maybe we need to relax a tolerance somewhere? |
Yes, I think it's due to the changes in #447; the tests are stochastic (if you just re-run the failed tests, they might pass the next time - I've seen that in #454 as well which has zero code changes). In #447 I reduced the size of the kernel matrix evaluations; the isapprox() call (set with atol=1e-9, rtol=1e-9) is on the arrays as a whole, which does not look at individual elements, but instead checks whether the norm of the difference is within the tolerances, relative to the max of the norm of the individual matrices (see discussion). For the random inputs we're using in the tests, the norm of the kernel matrix is linear in the dimensions. So by reducing the size from ~1000 to ~10 we're reducing the norm by a factor ~100, so yes that suggests to me we should increase the rtol correspondingly. Alternatively, we could change it to element-wise isapprox comparison? then it's independent of the matrix norm. |
Shouldn't the scale be irrelevant if we use KernelFunctions.jl/src/test_utils.jl Lines 75 to 76 in 8e805ef
rtol per se but that the default already was already too low even before the recent changes. I completely agree with https://discourse.julialang.org/t/unit-test-equality-for-arrays/8138/12?u=devmotion that in general you want rtol since it is completely unclear what a meaningful atol is in general. Specifying a small atol as well mainly helps with comparisons that involve exact 0s which would fail otherwise.
|
Yes, I agree. Probably an oversight in the original test code of not realizing that by only setting atol you end up setting rtol=0... |
Apparently #455 was not sufficient. |
Bump. |
1 similar comment
Bump. |
Could we keep the global atol and rtol variables? |
I don't mind. It just seemed unnecessary to have global variables that are used only in a single place, and clearer (ie easier to spot) if the values are just used as default keyword arguments in the function definition. |
@theogf Did you see my last comment? Would you like it to be reverted? |
I think keeping the global const is a bit more elegant and easily findable since it's in the same file. But again I do not mind, I'll just accept the PR and do what you want 😅 |
Fix the bug introduced in #425 (see #450).
Removes the workaround.