cutlass/include/cutlass/gemm/threadblock/mma_singlestage.h, lines 198 to 201 in 5c447dd: as the comment says, there is some overlap between loading from shared memory and the MMA. But in the loop at cutlass/include/cutlass/gemm/threadblock/mma_singlestage.h, lines 223 to 239 in 5c447dd, I do not see such behavior. Can you help me understand it?
Answered by
hwu36
Jul 10, 2024
Replies: 1 comment
There is no overlap in the single-stage mainloop.
It saves space at the cost of efficiency.
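To make "no overlap" concrete, here is a rough host-side C++ sketch. It is not CUTLASS code: `single_stage_dot`, `SingleStageTrace`, and the step names are hypothetical, and a dot product stands in for the tile MMA. The assumption is that a single staging buffer plays the role of shared memory, so each K-tile has to be fully stored before it is consumed, and fully consumed before the next one overwrites it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a single-stage mainloop (illustrative names only).
struct SingleStageTrace {
    float accum = 0.0f;              // running accumulator
    std::vector<std::string> steps;  // order in which the phases ran
};

SingleStageTrace single_stage_dot(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t tile_k) {
    assert(a.size() == b.size() && tile_k > 0);
    SingleStageTrace trace;

    // The only staging buffers: one K-tile of A and one of B.
    std::vector<float> smem_a(tile_k), smem_b(tile_k);

    for (std::size_t k0 = 0; k0 < a.size(); k0 += tile_k) {
        const std::size_t n = std::min(tile_k, a.size() - k0);

        // Phase 1: copy the current K-tile into the staging buffers.
        for (std::size_t i = 0; i < n; ++i) {
            smem_a[i] = a[k0 + i];
            smem_b[i] = b[k0 + i];
        }
        trace.steps.push_back("store_smem");
        trace.steps.push_back("sync");  // stands in for __syncthreads()

        // Phase 2: read the staged tile and multiply-accumulate.
        for (std::size_t i = 0; i < n; ++i) {
            trace.accum += smem_a[i] * smem_b[i];
        }
        trace.steps.push_back("mma");
        trace.steps.push_back("sync");  // buffer is free to overwrite after this
    }
    return trace;
}
```

With four elements and `tile_k = 2`, the recorded steps are `store_smem, sync, mma, sync` twice in a row: the math for a tile never runs while the next tile is being staged. A pipelined variant would need a second buffer to overlap the two phases, which is the space-versus-efficiency trade-off mentioned above.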
Answer selected by
wzhcz8902