Parallel calculation of π

Starting from the MPI example of computing π and using this Intel example, construct a code that computes the value of π using two or more GPUs, with one GPU device per MPI task.

Computing π

An approximation to the value of π can be calculated from the following expression

[Equation image not reproduced: a sum over i = 1, ..., N that approximates π]

where the answer becomes more accurate with increasing N. As each term is independent, the summation over i can be parallelized nearly trivially. The work is divided among ntasks MPI tasks so that rank 0 handles i = 1, 2, ..., N / ntasks, rank 1 handles i = N / ntasks + 1, N / ntasks + 2, ..., 2N / ntasks, and so on (we assume that N is evenly divisible by the number of tasks). Each task computes its own partial sum. Once the calculation is finished, all ranks except rank 0 send their partial sums to rank 0, which then calculates the final result and prints it out.
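The index ranges described above can be sketched in plain C++ without MPI, looping over the ranks serially. This is a minimal sketch, not the course's pi.cpp: the function names `partial_sum` and `pi_from_chunks` are hypothetical, and the summand is an assumption. It uses the common midpoint-rule form π ≈ (1/N) Σ 4 / (1 + x_i²) with x_i = (i - 1/2) / N, which may differ from the expression intended here.

```cpp
#include <cassert>

// Partial sum over i = istart, ..., istop (inclusive).
// ASSUMPTION: the summand is the midpoint-rule integrand 4 / (1 + x_i^2)
// with x_i = (i - 0.5) / n; the original expression may differ.
double partial_sum(long istart, long istop, long n) {
    double sum = 0.0;
    for (long i = istart; i <= istop; ++i) {
        const double x = (i - 0.5) / n;
        sum += 4.0 / (1.0 + x * x);
    }
    return sum;
}

// Serial stand-in for the MPI version (hypothetical helper): loops over
// the "ranks" instead of running separate processes, then combines the
// partial sums the way rank 0 would after receiving them.
double pi_from_chunks(long n, int ntasks) {
    assert(n % ntasks == 0);  // the text assumes N is divisible by ntasks
    const long chunk = n / ntasks;
    double total = 0.0;
    for (int rank = 0; rank < ntasks; ++rank) {
        const long istart = rank * chunk + 1;   // rank 0 starts at i = 1
        const long istop = (rank + 1) * chunk;  // rank 0 stops at N / ntasks
        total += partial_sum(istart, istop, n);
    }
    return total / n;  // scale the full sum by the step width 1/N
}
```

Calling `pi_from_chunks(840, 4)` and `pi_from_chunks(840, 8)` splits the same 840 terms into 4 or 8 contiguous chunks. The two results agree up to floating-point rounding, because only the order of the additions changes.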

Task

Starting from the MPI parallel code pi.cpp, make a version that performs the local reduction with SYCL, similar to the reduction with buffer or reduction with USM examples. Remember to assign one GPU to each MPI task, as in the MPI examples, taking into account that each Mahti GPU node has 4 GPUs and each LUMI-G node has 8 GPUs.
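One possible way to map tasks to GPUs is sketched below in plain C++. Both helpers are hypothetical and are not taken from the MPI examples. `local_rank_block` assumes that consecutive ranks fill one node before the next (block placement), which depends on how the job is launched. In an MPI program the node-local rank is more commonly obtained by splitting the communicator with `MPI_Comm_split_type` and `MPI_COMM_TYPE_SHARED`.

```cpp
#include <cassert>

// Hypothetical helper: node-local rank of a task.
// ASSUMPTION: ranks are placed on nodes in contiguous blocks of
// tasks_per_node, so ranks 0..tasks_per_node-1 share the first node.
int local_rank_block(int rank, int tasks_per_node) {
    assert(rank >= 0 && tasks_per_node > 0);
    return rank % tasks_per_node;
}

// Hypothetical helper: index of the GPU a task should use on its node.
// With at most gpus_per_node tasks per node, each task gets its own GPU.
int device_index(int local_rank, int gpus_per_node) {
    assert(local_rank >= 0 && gpus_per_node > 0);
    return local_rank % gpus_per_node;
}
```

With 4 tasks per node on Mahti, global rank 5 would get `device_index(local_rank_block(5, 4), 4) == 1`, that is, the second GPU of the second node. In a SYCL code the index could then select an entry from the list returned by `sycl::device::get_devices(sycl::info::device_type::gpu)`, assuming all GPUs of the node are visible to every task.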