Parallelism in Modern C++; from CPU to GPU

Exercise 5: Transpose


In this exercise you will learn:

  • How to create a simple matrix transpose kernel.
  • How to allocate and use local memory.
  • How to synchronize the work-items within a work-group.
  • How different work-group sizes affect performance.
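
Before writing the kernel, it can help to see the tiling idea behind the local-memory step in plain host C++. The sketch below is not SYCL code and is not the exercise solution: the small `tile` array stands in for work-group local memory, the gap between the two loop phases stands in for the work-group barrier, and the names `transpose_tiled` and `TILE` are hypothetical. It assumes a row-major matrix whose dimensions are multiples of `TILE`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tile edge; plays the role of a TILE x TILE work-group.
constexpr std::size_t TILE = 4;

// Transposes a rows x cols row-major matrix into a cols x rows row-major matrix.
// Assumption: rows and cols are multiples of TILE, to keep the sketch short.
std::vector<float> transpose_tiled(const std::vector<float>& in,
                                   std::size_t rows, std::size_t cols) {
    std::vector<float> out(in.size());
    float tile[TILE][TILE];  // stands in for work-group local memory

    for (std::size_t by = 0; by < rows; by += TILE) {      // one "work-group" per tile
        for (std::size_t bx = 0; bx < cols; bx += TILE) {
            // Phase 1: each "work-item" (ly, lx) loads one element.
            // Consecutive lx values read consecutive input addresses.
            for (std::size_t ly = 0; ly < TILE; ++ly)
                for (std::size_t lx = 0; lx < TILE; ++lx)
                    tile[ly][lx] = in[(by + ly) * cols + (bx + lx)];

            // In a kernel, a work-group barrier belongs here: every load into
            // the tile must finish before any work-item reads it back.

            // Phase 2: each "work-item" reads the tile transposed, so
            // consecutive lx values also write consecutive output addresses.
            for (std::size_t ly = 0; ly < TILE; ++ly)
                for (std::size_t lx = 0; lx < TILE; ++lx)
                    out[(bx + ly) * rows + (by + lx)] = tile[lx][ly];
        }
    }
    return out;
}
```

A host routine like this can also serve as a reference result to compare the kernel's output against.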

TODO

1.) Write a SYCL kernel for transposing matrices.

2.) Use local memory to improve global memory coalescing.

3.) Try different work-group sizes.
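
When trying different work-group sizes, keep in mind that a SYCL `nd_range` requires the global range to be divisible by the local range in each dimension. One way to prepare a sweep is sketched below in plain C++; the helper names `round_up`, `LaunchShape`, and `candidate_shapes` are hypothetical, and the sketch assumes a square matrix of edge `n` with power-of-two work-group edges.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record for one configuration to try (per dimension).
struct LaunchShape {
    std::size_t local;   // work-group edge
    std::size_t global;  // padded global edge, a multiple of `local`
    std::size_t groups;  // number of work-groups along this dimension
};

// Rounds n up to the next multiple of `multiple` (multiple must be > 0).
constexpr std::size_t round_up(std::size_t n, std::size_t multiple) {
    return ((n + multiple - 1) / multiple) * multiple;
}

// Lists power-of-two work-group edges from 1 up to max_edge for an n x n matrix.
std::vector<LaunchShape> candidate_shapes(std::size_t n, std::size_t max_edge) {
    std::vector<LaunchShape> shapes;
    for (std::size_t edge = 1; edge <= max_edge; edge *= 2) {
        const std::size_t global = round_up(n, edge);
        shapes.push_back({edge, global, global / edge});
    }
    return shapes;
}
```

If the padded global edge is larger than `n`, the kernel needs a bounds check so that the extra work-items do not touch memory outside the matrix. The largest usable work-group size also depends on the device, so `max_edge` should be chosen with the device's reported limit in mind.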