Optimal parallelization on small cluster #206
-
In general, yes: you would run one Julia worker per node and use multi-threading (OpenMP) within each worker. Your test case has two nodes, so you should start two workers, one per node. Just remember that if you want good performance, you may want to use ThreadPinning on each worker for optimal core allocation.
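As a rough illustration of that layout, here is a minimal sketch using only Julia's standard `Distributed` library. It assumes the 2 nodes with 2 cores each described in the question, and it starts the workers on the local machine as a stand-in; on the real cluster the workers would be launched through a SLURM-aware cluster manager instead. The variable names `n_nodes` and `threads_per_worker` are made up for the example. `DEVITO_LANGUAGE` and `OMP_NUM_THREADS` are the environment variables Devito and OpenMP read.

```julia
using Distributed

# Assumed layout (from the question): 2 compute nodes, 2 cores per node.
n_nodes = 2              # hypothetical name: one Julia worker per node
threads_per_worker = 2   # hypothetical name: OpenMP threads per worker

# Set the OpenMP-related variables before adding workers,
# so that locally started workers inherit them from this process.
ENV["DEVITO_LANGUAGE"] = "openmp"
ENV["OMP_NUM_THREADS"] = string(threads_per_worker)

# Stand-in for the cluster launch: plain addprocs starts local workers.
# On SLURM, a cluster manager package would place one worker on each node.
addprocs(n_nodes)

# Ask every worker which thread count it sees.
for w in workers()
    seen = remotecall_fetch(() -> ENV["OMP_NUM_THREADS"], w)
    println("worker ", w, ": OMP_NUM_THREADS = ", seen)
end
```

The point of the check at the end is only to confirm that the number of workers times the threads per worker matches the cores you actually have.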
-
@mloubout I tried running on the master node without workers and with OpenMP enabled, and got warnings:
It seems I do have the OpenMP library:
I think Devito doesn't pass the right compiler flag. The OS is CentOS 7.
-
@mloubout Hi, you were right about the GCC version.
-
Hi,
I started using JUDI with a small test cluster in the cloud.
For that purpose I have one task manager node and two compute nodes.
Each node has (as reported by the lscpu command):
The cluster manager is SLURM.
I think the best parallelization would be to use each of the two compute nodes for one task (shot) and enable OpenMP. That way 2 shots would be processed at a time, with each node using its two CPUs to process one shot.
How should the configuration look?
For example, I use fwi_example_NLopt.jl and my settings for now are:
but I believe this is not correct, as there would probably be 4 workers with OpenMP enabled.
That means 4 shots are processed at a time and each process tries to use OpenMP with 2 cores, which is not optimal as I have only 4 cores (not 8).
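To make the arithmetic explicit, here is a small sketch that compares the two layouts. The numbers come from the description above (2 compute nodes, 2 cores each); the helper name `demand` is made up for the example.

```julia
# Numbers from the description above: 2 compute nodes with 2 cores each.
nodes, cores_per_node = 2, 2
total_cores = nodes * cores_per_node

# Threads a layout asks for = workers * OpenMP threads per worker.
# `demand` is a hypothetical helper, not part of JUDI or Devito.
demand(workers, threads) = workers * threads

current  = demand(4, 2)   # 4 workers, each using 2 OpenMP threads
proposed = demand(2, 2)   # 1 worker per node, each using 2 OpenMP threads

println("cores available:       ", total_cores)
println("current layout needs:  ", current, " threads")
println("proposed layout needs: ", proposed, " threads")
```

With these numbers the current layout asks for twice as many threads as there are cores, while one worker per node matches the hardware exactly.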