- Split the memory management (`CudaMatrix`) from the CUBLAS invocation (`CudaPipeline`)
- Moved all the allocations to the smart pointers inside `CudaMatrix`
- Removed unused headers
- Smart pointers to handle CUDA resources (see the smart-pointer sketch after this list)
- New `CudaMatrix` class
- Use `Eigen::MatrixXd`
- Check the available GPU memory before computing (see the memory-check sketch after this list)
- Template class; only the `double` implementation is available
- Triple tensor product
- Shapes struct
- Tensor-matrix multiplication using `gemmBatched` (see the batched-product sketch after this list).
- Asynchronous memory copies.
- Properly free memory after the tensor operation is done.
- Use a template function to perform matrix-matrix multiplication with CUBLAS (see the templated `gemm` sketch after this list).
- Use either pinned (default) or pageable memory, see CUDA optimizations (pinned-memory sketch after this list).
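
For the smart-pointer handling of CUDA resources, a minimal sketch of the idea is below: a `std::unique_ptr` with a custom deleter owns the device buffer, so `CudaMatrix` never calls `cudaFree` by hand. The `CudaDeleter` and `allocate_device` names are illustrative, not the PR's exact code.

```cpp
#include <cuda_runtime.h>

#include <memory>
#include <stdexcept>

// Deleter that returns device memory to the driver when the owner goes out of scope
struct CudaDeleter {
  void operator()(double *ptr) const { cudaFree(ptr); }
};
using DevicePtr = std::unique_ptr<double, CudaDeleter>;

// Allocate a rows x cols block of doubles on the GPU, owned by a smart pointer
inline DevicePtr allocate_device(std::size_t rows, std::size_t cols) {
  double *raw = nullptr;
  cudaError_t err =
      cudaMalloc(reinterpret_cast<void **>(&raw), rows * cols * sizeof(double));
  if (err != cudaSuccess) {
    throw std::runtime_error(cudaGetErrorString(err));
  }
  return DevicePtr(raw);
}
```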
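The pre-computation memory check can be done with `cudaMemGetInfo`; a sketch follows, where the throw-on-insufficient-memory policy and the function name are assumptions.

```cpp
#include <cuda_runtime.h>

#include <cstddef>
#include <stdexcept>

// Query the free device memory and bail out before allocating if it is not enough
inline void throw_if_not_enough_gpu_memory(std::size_t bytes_needed) {
  std::size_t free_bytes = 0;
  std::size_t total_bytes = 0;
  cudaMemGetInfo(&free_bytes, &total_bytes);
  if (bytes_needed > free_bytes) {
    throw std::runtime_error("Not enough memory available on the GPU");
  }
}
```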
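The batched tensor-matrix multiplication maps naturally onto `cublasDgemmBatched`: each slice of the rank-3 tensor is multiplied by the same matrix in a single call. The sketch below assumes column-major storage and illustrative names; error checking is omitted for brevity.

```cpp
#include <cublas_v2.h>
#include <cuda_runtime.h>

#include <vector>

// dA: device pointers to the m x k tensor slices, dB: a single k x n device matrix,
// dC: device pointers to the m x n result slices (all column-major).
void batched_tensor_matrix_product(cublasHandle_t handle,
                                   const std::vector<const double *> &dA,
                                   const double *dB,
                                   const std::vector<double *> &dC,
                                   int m, int n, int k) {
  int batch = static_cast<int>(dA.size());
  std::vector<const double *> hB(batch, dB);  // the same B is reused for every slice

  // gemmBatched expects the arrays of pointers themselves to live on the device
  const double **dA_array = nullptr;
  const double **dB_array = nullptr;
  double **dC_array = nullptr;
  cudaMalloc(reinterpret_cast<void **>(&dA_array), batch * sizeof(double *));
  cudaMalloc(reinterpret_cast<void **>(&dB_array), batch * sizeof(double *));
  cudaMalloc(reinterpret_cast<void **>(&dC_array), batch * sizeof(double *));
  cudaMemcpy(dA_array, dA.data(), batch * sizeof(double *), cudaMemcpyHostToDevice);
  cudaMemcpy(dB_array, hB.data(), batch * sizeof(double *), cudaMemcpyHostToDevice);
  cudaMemcpy(dC_array, dC.data(), batch * sizeof(double *), cudaMemcpyHostToDevice);

  const double alpha = 1.0;
  const double beta = 0.0;
  cublasDgemmBatched(handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k, &alpha, dA_array, m,
                     dB_array, k, &beta, dC_array, m, batch);

  // Wait for the batched product to finish, then free the temporary pointer arrays
  cudaDeviceSynchronize();
  cudaFree(dA_array);
  cudaFree(dB_array);
  cudaFree(dC_array);
}
```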
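For the templated matrix-matrix multiplication, one way to keep the interface generic while providing only the `double` path is an explicit specialization that forwards to `cublasDgemm`; this is a sketch with assumed names, not necessarily the PR's implementation. `Eigen::MatrixXd` is column-major by default, which matches the layout CUBLAS expects, so `mat.data()` can be copied straight into the device buffers.

```cpp
#include <cublas_v2.h>

// Generic template is declared but not defined, so only double instantiations link
template <typename T>
cublasStatus_t gemm(cublasHandle_t handle, int m, int n, int k, const T *A,
                    const T *B, T *C);

// double specialization: C = A * B with column-major, non-transposed operands
template <>
cublasStatus_t gemm<double>(cublasHandle_t handle, int m, int n, int k,
                            const double *A, const double *B, double *C) {
  const double alpha = 1.0;
  const double beta = 0.0;
  return cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k, &alpha, A, m, B, k,
                     &beta, C, m);
}
```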
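Pinned host memory is what lets `cudaMemcpyAsync` actually overlap with computation; with pageable memory the transfer typically goes through a staging buffer and behaves synchronously with respect to the host. A sketch of the pinned/pageable choice and an asynchronous copy, with illustrative function names:

```cpp
#include <cuda_runtime.h>

#include <cstddef>

// Allocate a host staging buffer; pinned (page-locked) memory is the default
inline double *allocate_host(std::size_t count, bool pinned = true) {
  double *ptr = nullptr;
  if (pinned) {
    cudaMallocHost(reinterpret_cast<void **>(&ptr), count * sizeof(double));
  } else {
    ptr = new double[count];  // pageable fallback
  }
  return ptr;
}

// Enqueue a host-to-device copy on a stream without blocking the host thread
inline void copy_to_device_async(double *device, const double *host,
                                 std::size_t count, cudaStream_t stream) {
  cudaMemcpyAsync(device, host, count * sizeof(double), cudaMemcpyHostToDevice,
                  stream);
}
```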