-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POEM 093: linear solution caching #190
Conversation
@kanekosh In attempting to implement this, I've run into some test failures in cases where complex step is active. Is the example code valid if the b and x arrays are complex? If not, is there some way to modify it to make it valid? My current plan is to disable the caching when complex step is active. Is that an acceptable solution? |
@naylor-b I didn't think of complex numbers when implementing this prototype. I intended this linear solution caching to help accelerate the hierarchical linear solution process in I haven't tested, but I suspect the following lines do not work under the complex step. There should be a workaround, but I don't have a good solution off the top of my head...
Is it correct to say that the linear solver gets the complex |
@kanekosh I think that's correct that the only time the linear solver gets complex b and x is when computing derivatives using complex step at a level at or above a system that has a Newton solver, so it sounds like just disabling the caching when under complex step will be ok. |
Sounds good to me. Thank you for working on this! |
I have a branch here: https://github.com/naylor-b/OpenMDAO/tree/linear_cache that has the caching added to DirectSolver, PetscKrylov, and ScipyKrylov solvers if you want to try it out. BTW I added another option called |
I tried your branch on the OAS examples I included in this POEM, and it turned out the caching is not activated in I'm wondering if there's a more robust and inexpensive way of checking if two numpy arrays are parallel. I didn't think much about it when implementing the current way, which checks the dot product to the product of norms. |
Also, this is not directly related, but do you think it'd make sense to include the RHS=0 solver skip here along with the caching? The caching indirectly does that (in a slightly inefficient way), so it might not worth adding additional code for RHS=0 check, though. |
Sorry, I must have forgotten to re-run example2 after I tweaked the tolerance. I have an idea for a parallel array checker that's faster (on average) than the current way. I'll try it out and see if it works and is faster. I'll look into checking for RHS=0 as well. I thought we had already added an RHS=0 check earlier but I can't seem to find it. |
@kanekosh I changed the 'use_cache' option to 'rhs_checking' and got rid of the separate 'max_cache_entries' option. You can set the 'rhs_checking' option to True, False, or a dict that allows you to set specific attributes of the LinearRHSChecker object if you don't want the default values. The allowed dict entries are 'use_cache', 'check_zero', 'rtol', 'atol', 'max_cache_entries', and 'collect_stats'. By setting these attributes, caching and zero checking can be turned on and off independently. My idea for determining if two arrays are parallel did end up being much faster in the best case and 2 to 3x slower in the worst case (worst case was when the arrays were actually parallel). For now I'm just using your original algorithm. I also added display of cache hits and zero check hits so we can get a feel for how often our rhs checks actually help. I added a relevance check too that warns if caching has been turned on but there is no dependency between responses that would cause the same adjoint to be computed more than once. For example1 and example2, I get this warning:
these are the stats for example1:
and these are for example2:
What do you think about the warning related to the adjoint dependencies? Does it make sense to check that, or do you see problems with doing it that way? |
@naylor-b I think both the warning and stats are helpful. |
Another possibility would be to add an entry to the dict called 'auto', and if you set that to True then it would automatically activate the caching if it was needed based on the relevance. So something like:
Also, is it ok that we're using the same reltol and abstol for both the equality check (and negative check) and for the parallel check, or do those need different tolerances? |
Yes, I think that's ok. I don't think separate tolerances would be necessary. |
@kanekosh Is the implementation PR good with you? If so I'll consider this POEM accepted. |
Added implementation PR
The PR looks good to me, thank you! |
This POEM proposes to cache the subsystem-level linear solutions to avoid redundant linear solves with the same (or parallel) right-hand-side vector.