Improve the speed of RENs and LBDNs evaluated on the GPU. At the moment, there is basic support allowing the models to be trained and evaluated on the GPU, but much of the code has been optimised for the CPU.
One particular area for improvement is the backwards pass of _N_lip and _N_gen in the case ny > nu. For some reason, writing A / B instead of A * inv(B) triggers scalar operations under the hood, which CUDA.jl is not a fan of. Investigate this further (see the sketch after the examples below). This behaviour was first identified in 567f801.
Examples of where it is a problem:

- In norm_cayley when calculating B_T for DenseLBDN
- In _N_lip and _N_gen when calculating N for the D22 term in LipschitzRENParams and GeneralRENParams
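As a rough illustration (not the package code), the sketch below contrasts the two forms on generic CuArrays. The function names, matrix sizes, and use of Zygote here are assumptions for the example; exact behaviour depends on the CUDA.jl and Zygote versions in use.

```julia
# Hedged sketch only: illustrates the A / B vs A * inv(B) pattern described
# above on generic CuArrays, not the actual _N_lip / _N_gen code.
using CUDA, LinearAlgebra, Zygote

A = CUDA.rand(Float32, 8, 8)
B = CUDA.rand(Float32, 8, 8) + 8f0 * cu(Matrix{Float32}(I, 8, 8))  # keep B well-conditioned

f_div(A, B) = sum(abs2, A / B)        # right-division form
f_inv(A, B) = sum(abs2, A * inv(B))   # explicit-inverse alternative

# The explicit-inverse pullback stays in dense GPU linear algebra
# (the reverse rule for inv is built from matrix products).
∇A, ∇B = Zygote.gradient(f_inv, A, B)

# The right-division pullback was observed (above) to hit scalar operations
# on the GPU; @allowscalar lets it run for a timing comparison.
∇A_div, ∇B_div = CUDA.@allowscalar(Zygote.gradient(f_div, A, B))
```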
The above #119 (comment) has been resolved in d5bfd02. It was caused by adding the identity operator I to a CUDA array before back-propagating through A / (I + B).
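For reference, here is a minimal sketch of that pattern and one possible workaround (materialising the identity on the device before the division). This is illustrative only and is not the actual change made in d5bfd02.

```julia
# Hedged sketch of the I + B pattern described above; not the d5bfd02 diff.
using CUDA, LinearAlgebra, Zygote

A = CUDA.rand(Float32, 8, 8)
B = 0.1f0 * CUDA.rand(Float32, 8, 8)

# Reported problem: mixing the UniformScaling I with a CuArray before
# back-propagating through the right division.
g_mixed(A, B) = sum(abs2, A / (I + B))

# One workaround: build the identity explicitly on the device so the
# pullback only ever sees CuArrays.
Id = cu(Matrix{Float32}(I, size(B)...))
g_device(A, B) = sum(abs2, A / (Id + B))

∇A, ∇B = Zygote.gradient(g_device, A, B)
```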