Measuring the unexploited efficiency of breaking down a block-diagonal matrix #6

aGotelli opened this issue May 4, 2022 · 0 comments

Comments

@aGotelli
Copy link
Owner

aGotelli commented May 4, 2022

In this context it will be very interesting to measure the speedup obtained by breaking down block matrices. For most dense matrix operations, working on several smaller blocks is faster than working on one large matrix, because the cost grows super-linearly with the matrix size.
The gain is even more significant when inverting the matrix, especially on the GPU.
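As a rough illustration of this idea, here is a minimal sketch (assuming Eigen and a square block-diagonal matrix whose block size divides its dimension) of inverting such a matrix block by block instead of as one dense matrix:

```cpp
#include <Eigen/Dense>

// Invert a block-diagonal matrix block by block instead of as one dense matrix.
// Each diagonal block is inverted independently, so the cost scales with the
// number of blocks times the cost of a small inverse, rather than with the
// cube of the full matrix size.
Eigen::MatrixXd invertBlockDiagonal(const Eigen::MatrixXd& M, int block_size)
{
    const int n_blocks = static_cast<int>(M.rows()) / block_size;
    Eigen::MatrixXd M_inv = Eigen::MatrixXd::Zero(M.rows(), M.cols());

    for (int i = 0; i < n_blocks; ++i) {
        const int offset = i * block_size;
        // Invert only the i-th diagonal block; the off-diagonal blocks are zero.
        M_inv.block(offset, offset, block_size, block_size) =
            M.block(offset, offset, block_size, block_size).inverse();
    }
    return M_inv;
}
```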

So we can apply this concept to a couple of things:

When we multiply by the adjoint matrix ad_xi, this matrix can be divided into four 3×3 blocks, and the upper-right one is zero. When we take the transpose, the zero block moves to the lower-left part.
So we can split the computation into two parts: one multiplication for the upper part and another for the lower-right block.
For the wrench this means first integrating the forces and then the couples, which saves some operations.
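A minimal sketch of this splitting, assuming Eigen, the (angular, linear) ordering xi = (omega, v) and the wrench W = (m, f); the function name adTransposeTimesWrench and the ordering convention are assumptions, not necessarily the library's actual API:

```cpp
#include <Eigen/Dense>

// Skew-symmetric (hat) operator for a 3-vector.
Eigen::Matrix3d hat(const Eigen::Vector3d& u)
{
    Eigen::Matrix3d S;
    S <<     0, -u.z(),  u.y(),
         u.z(),      0, -u.x(),
        -u.y(),  u.x(),      0;
    return S;
}

// Apply ad_xi^T to a wrench W = (m, f) without building the full 6x6 matrix.
// With xi = (omega, v), ad_xi = [[hat(omega), 0], [hat(v), hat(omega)]],
// so its transpose has a zero lower-left block and the product splits into
// two 3x3 operations: the force part uses only f, the moment part uses m and f.
Eigen::Matrix<double, 6, 1> adTransposeTimesWrench(const Eigen::Vector3d& omega,
                                                   const Eigen::Vector3d& v,
                                                   const Eigen::Matrix<double, 6, 1>& W)
{
    const Eigen::Vector3d m = W.head<3>();  // couples (moments)
    const Eigen::Vector3d f = W.tail<3>();  // forces

    Eigen::Matrix<double, 6, 1> result;
    result.head<3>() = hat(omega).transpose() * m + hat(v).transpose() * f;
    result.tail<3>() = hat(omega).transpose() * f;  // decoupled: forces only
    return result;
}
```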

When you test it, you can also compare the different sizes of the corresponding A matrix and b vector.

To compute the generalized forces, we multiply the wrench by the matrices Phi and B.
These matrices are already known, so we can precompute their values at the Chebyshev points.
Moreover, since they are block matrices, we can break the computation into as many parts as there are blocks.
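A hypothetical sketch of this precomputation, assuming Eigen; the struct PrecomputedBasis, the function generalizedForces, and the exact shapes of Phi and B are illustrative assumptions rather than the library's definitions:

```cpp
#include <Eigen/Dense>
#include <vector>

// Precompute Phi and B at the Chebyshev points once, then reuse the cached
// blocks every time the generalized forces are evaluated.
struct PrecomputedBasis
{
    std::vector<Eigen::MatrixXd> Phi_at_cheb;  // one Phi block per Chebyshev point
    std::vector<Eigen::MatrixXd> B_at_cheb;    // one B block per Chebyshev point
};

// Generalized forces evaluated block by block: each Chebyshev point contributes
// Phi_k^T * B_k * wrench_k, so the small products can be computed (and
// parallelized) independently before being accumulated.
Eigen::VectorXd generalizedForces(const PrecomputedBasis& basis,
                                  const std::vector<Eigen::Matrix<double, 6, 1>>& wrenches)
{
    Eigen::VectorXd Q = Eigen::VectorXd::Zero(basis.Phi_at_cheb.front().cols());
    for (std::size_t k = 0; k < wrenches.size(); ++k)
        Q += basis.Phi_at_cheb[k].transpose() * (basis.B_at_cheb[k] * wrenches[k]);
    return Q;
}
```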

For these two cases, it will be interesting to measure the speedup both in the classical CPU code and in the parallelized GPU code.
So once the library is completed, we can take a few measurements to establish the current computation time.
Then we will compare it with the optimized version.
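A minimal CPU timing sketch for that comparison, using std::chrono; the names baselineIntegration and blockwiseIntegration in the usage comment are placeholders for the actual implementations:

```cpp
#include <chrono>
#include <cstdio>
#include <functional>

// Run a candidate implementation many times and report the average wall-clock
// time, so the baseline and the block-optimized version can be compared under
// the same conditions.
double averageRuntimeMs(const std::function<void()>& candidate, int repetitions = 1000)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i)
        candidate();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / repetitions;
}

// Usage sketch:
//   std::printf("baseline : %f ms\n", averageRuntimeMs([&] { baselineIntegration(); }));
//   std::printf("optimized: %f ms\n", averageRuntimeMs([&] { blockwiseIntegration(); }));
```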

Similarly on the GPU: we will first perform a standard spectral integration and then break the terms down.

@aGotelli aGotelli changed the title Optimized version for the adjoint computations Measuring the unexploited efficiency of breaking down a block-diagonal matrix May 9, 2022