make 3-arg dot rrule partially lazy #796
base: main
This is also technically a breaking change if we don't project.
We may need to teach LazyArrays about projections, so that they project lazily?
I like the idea of lazy projections. Maybe that would also provide a way to opt out of the projection at the gradient level by calling a
Hello, are there any updates on this? I really need to compute the gradient of
I suppose you could just define your own
Maybe even create a package
What makes it so difficult to directly apply the rrule on the standard
The problem is that the tangents generally have to be projected back down onto the tangent space once they are computed. Anyway, that operation also needs to be done lazily if we want to return a lazy array from the 3-arg dot rrule. That can be done, but it requires overloading
This addresses #788. I had to remove the projection to make it work; otherwise I get the following error due to a missing projection method. Projecting the lazy array to a dense array when `A` is dense partially defeats the purpose of this PR, so I am leaving it up to the review process to decide what to do here. I can define a projection method if that's preferred.

This is related to the discussion in FluxML/Zygote.jl#1507.
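For readers following the trade-off, here is a rough, language-agnostic sketch of why the cotangent for `A` is a natural candidate for laziness. It is written in plain Python rather than Julia and does not use the ChainRules or LazyArrays APIs; `LazyOuter`, `dot3`, and `dot3_pullback` are hypothetical names made up for illustration, and only real-valued inputs are assumed. For `dot(x, A, y)`, the cotangent with respect to `A` is a scaled outer product of `x` and `y`, which can be stored as two vectors and a scalar instead of a full matrix.

```python
class LazyOuter:
    """Hypothetical lazy stand-in for scale * outer(x, y); entries are computed on demand."""

    def __init__(self, scale, x, y):
        self.scale = scale
        self.x = x
        self.y = y

    @property
    def shape(self):
        return (len(self.x), len(self.y))

    def __getitem__(self, idx):
        i, j = idx
        return self.scale * self.x[i] * self.y[j]

    def materialize(self):
        # Roughly what projecting onto a dense array would force: O(n*m) storage.
        n, m = self.shape
        return [[self[i, j] for j in range(m)] for i in range(n)]


def dot3(x, A, y):
    """Real-valued three-argument dot: sum over i, j of x[i] * A[i][j] * y[j]."""
    return sum(x[i] * A[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def dot3_pullback(x, A, y, delta):
    """Cotangents of dot3 for real inputs, keeping the cotangent of A lazy."""
    n, m = len(x), len(y)
    x_bar = [delta * sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]  # delta * (A y)
    y_bar = [delta * sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]  # delta * (A' x)
    A_bar = LazyOuter(delta, x, y)  # delta * x y', stored as O(n + m) data
    return x_bar, A_bar, y_bar
```

In this sketch, calling `materialize()` plays the role of a dense projection: the result is numerically the same, but the memory saving from the lazy representation is lost, which is the concern raised above.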