Improve the speed of RENs and LBDNs evaluated on the GPU. At the moment, there is basic support allowing the models to be trained and evaluated on the GPU, but much of the code has been optimised for the CPU.
One particular area for improvement is the backwards pass of _N_lip and _N_gen in the case ny > nu. For some reason, writing A / B instead of A * inv(B) triggers scalar operations under the hood, which CUDA.jl is not a fan of. Investigate this further (see the sketch after the examples below). This behaviour was first identified in 567f801.
Examples of where it is a problem:

- In norm_cayley when calculating B_T for DenseLBDN
- In _N_lip and _N_gen when calculating N for the D22 term in LipschitzRENParams and GeneralRENParams
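As a rough illustration (not the package code), the sketch below contrasts the two forms on generic CuArrays. The function names, matrix sizes, and use of Zygote here are assumptions for the example; exact behaviour depends on the CUDA.jl and Zygote versions in use.

```julia
# Hedged sketch only: illustrates the A / B vs A * inv(B) pattern described
# above on generic CuArrays, not the actual _N_lip / _N_gen code.
using CUDA, LinearAlgebra, Zygote

A = CUDA.rand(Float32, 8, 8)
B = CUDA.rand(Float32, 8, 8) + 8f0 * cu(Matrix{Float32}(I, 8, 8))  # keep B well-conditioned

f_div(A, B) = sum(abs2, A / B)        # right-division form
f_inv(A, B) = sum(abs2, A * inv(B))   # explicit-inverse alternative

# The explicit-inverse pullback stays in dense GPU linear algebra
# (the reverse rule for inv is built from matrix products).
∇A, ∇B = Zygote.gradient(f_inv, A, B)

# The right-division pullback was observed (above) to hit scalar operations
# on the GPU; @allowscalar lets it run for a timing comparison.
∇A_div, ∇B_div = CUDA.@allowscalar(Zygote.gradient(f_div, A, B))
```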
The above #119 (comment) has been resolved in d5bfd02. It was caused by adding the identity operator I to a CUDA array before back-propagating through A / (I + B).
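For reference, here is a minimal sketch of that pattern and one possible workaround (materialising the identity on the device before the division). This is illustrative only and is not the actual change made in d5bfd02.

```julia
# Hedged sketch of the I + B pattern described above; not the d5bfd02 diff.
using CUDA, LinearAlgebra, Zygote

A = CUDA.rand(Float32, 8, 8)
B = 0.1f0 * CUDA.rand(Float32, 8, 8)

# Reported problem: mixing the UniformScaling I with a CuArray before
# back-propagating through the right division.
g_mixed(A, B) = sum(abs2, A / (I + B))

# One workaround: build the identity explicitly on the device so the
# pullback only ever sees CuArrays.
Id = cu(Matrix{Float32}(I, size(B)...))
g_device(A, B) = sum(abs2, A / (Id + B))

∇A, ∇B = Zygote.gradient(g_device, A, B)
```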