EGSnrc parallelization using MPI #511

Open
edoerner wants to merge 8 commits into develop from feature-mpi-support

Conversation

edoerner

The purpose of this contribution is to present a parallel solution for the EGSnrc Monte Carlo code system using the MPI programming model, as an alternative to the provided implementation based on the use of a batch-queueing system.

For details of this implementation please check the README_MPI.md file available in the EGSnrc main folder.

Currently, the BEAMnrc and DOSXYZnrc user codes support this MPI-based parallel solution. In the case of DOSXYZnrc, both the use of a phase-space file and of a BEAM shared library as particle sources has been tested. Both codes can be used as models to introduce MPI features into other EGSnrc user codes.

By default, OpenMPI is expected to be installed on the system. If another MPI implementation is desired, change the F77 macro inside the 'mpi' target to the desired MPI compiler wrapper in the following Makefiles:

$HEN_HOUSE/makefiles/standard_makefile (line 169)
$HEN_HOUSE/makefiles/beam_makefile (line 165)

By default, F77 = mpif90 in both files.
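For example, switching to Intel MPI would be a one-line change inside the 'mpi' target of each of those two makefiles (an illustrative sketch; the surrounding makefile content is omitted, and mpiifort is Intel MPI's Fortran wrapper, not something this branch configures for you):

# default: the OpenMPI Fortran compiler wrapper
F77 = mpif90
# for Intel MPI, use its Fortran wrapper instead, e.g.:
# F77 = mpiifort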

To compile a user code $(user_code) with MPI support, go to the $EGS_HOME/$(user_code) folder and type:

make mpi

This will enable the MPI features in the user code and create an executable called $(user_code)_mpi (i.e., the normal user-code executable name with the '_mpi' suffix appended). Then use mpirun or a similar launcher to execute it with MPI support:

mpirun -np #NUM_PROCS $(user_code)_mpi -i input_file -p pegs_file
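For instance, a complete session for DOSXYZnrc might look like this (the input file name my_input and the pegs data name 521icru below are placeholders; substitute your own):

cd $EGS_HOME/dosxyznrc
make mpi
mpirun -np 8 dosxyznrc_mpi -i my_input -p 521icru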

@edoerner edoerner force-pushed the feature-mpi-support branch from e9260d9 to 96e31b4 on January 30, 2019 21:39
@ftessier
Member

This is fantastic @edoerner! Similar to what I wished for. Question: is it possible, along the same lines, to write an MPI "wrapper" program that would simply dispatch N parallel jobs, wait for completion, and let the master thread combine everything at the end? Essentially the same thing exb does when it dispatches jobs, but with MPI dispatch instead of a lock file. That would allow MPI execution across the board without modifying the code of existing applications. I think I will give it a try, based on what you did here!
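As a minimal, untested sketch of that idea: assuming OpenMPI (which exports OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_SIZE to each launched process) and the standard EGSnrc parallel-run options -b -P njobs -j ijob, each MPI rank could run one ordinary non-MPI parallel job; the final combining step is not shown here:

#!/bin/sh
# run_one_job.sh (hypothetical): each MPI rank runs one regular parallel job,
# identified by its rank; launch with e.g. "mpirun -np 8 ./run_one_job.sh"
rank=$OMPI_COMM_WORLD_RANK
size=$OMPI_COMM_WORLD_SIZE
dosxyznrc -i input_file -p pegs_file -b -P $size -j $(( rank + 1 ))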

@crcrewso
Contributor

crcrewso commented Jan 31, 2019

I'm noticing a few modifications to implicit none in some of the changes. Shouldn't we be keeping implicit none for all Fortran code?

MPI is added for all compilation. Does GCC behave nicely if MPI is not installed?

@edoerner
Author

@crcrewso Yes! I just realized that. It is simply the replacement of implicit none by the macro $IMPLICIT-NONE, which is defined by default as implicit none. I do not remember why I did that, but of course any Fortran program should use implicit none by default.

About MPI: every call to MPI functionality is 'protected' by a preprocessor conditional that is active only if the _MPI macro is defined. This macro is defined only when the user code (DOSXYZnrc or a BEAM accelerator) is compiled with the 'mpi' target. All the required definitions (such as the use of 'mpif90' as the compiler) are made inside this target in the relevant Makefiles, so as not to affect the $EGS_CONFIG file and therefore the rest of the platform.

For example, to gain MPI functionality in the dosxyznrc user code I compile it by typing 'make mpi' in its folder inside $EGS_HOME. This creates a dedicated executable 'dosxyznrc_mpi', which I then execute through mpirun. The idea was to isolate the MPI functionality inside EGSnrc so as not to affect the rest of the platform, especially if MPI is not installed.

@ftessier It would be interesting to look into that, so that the rest of the codes would not have to be modified. It would be nice if you could give it a try.

@edoerner
Author

edoerner commented Jun 7, 2019

FYI, I just updated my branch to the latest version of the develop branch within nrc-cnrc. Let me know if you have comments or ideas about this implementation.

@ealemany

ealemany commented Oct 9, 2019

Hi edoerner,

I am new to EGSnrc. I am helping post-docs run EGSnrc on an Ubuntu 18.04 server with OpenMPI and with a GUI. I followed the instructions to install EGSnrc at https://github.com/nrc-cnrc/EGSnrc/wiki/Install-EGSnrc-on-Linux#1-install-prerequisite-software
I followed Option 1 via the GUI. All went well. Icons were created on the desktop and each of them launches its application just fine. I have a question, though, and I hope you can help me.

  1. When using the GUI to launch an EGSnrc application, how can the user allocate all the cores of the server?
    I read your comment:
    "For example, to gain MPI functionality in the dosxyznrc user code I compile it typing 'make mpi' in its folder inside $EGS_HOME. This creates a dedicated executable 'dosxyznrc_mpi' and then I execute it through mpirun. The idea was to isolate the MPI functionality inside EGSnrc in order to not affect the rest of the platform, specially if MPI is not installed."
    I understand the process, but I am not sure how to do the same with the GUI option.

Thank you for your help.

Best,
Eric

@rtownson
Collaborator

rtownson commented Oct 9, 2019

Hi @ealemany,

There's no option in the GUI at the moment to support this. It's good that you installed the GUIs, as they will still be useful, but to run jobs using MPI you'll have to launch them from the command line using the commands suggested above.

Also keep in mind that you will have to use 'git checkout' and direct it to grab this pull request, and run the installer after you have done so in order to compile these new MPI features.

@ealemany

ealemany commented Oct 9, 2019

Hi @rtownson.

Thank you!

I am not sure I understand your comment:
"Also keep in mind that you will have to use 'git checkout' and direct it to grab this pull request, and run the installer after you have done so in order to compile these new MPI features."

I understand the 'git checkout' concept, but I don't understand the rest: "and direct it to grab........
....MPI features"

Could you give me a command-line example, if there is such a thing?

Thanks

@rtownson
Collaborator

@ealemany here is an example. Doing this means you are using an experimental branch of EGSnrc, so you may want to re-install when the official 2020 release comes out early next year, if this pull request has been included in it. The following fetches this PR into a new local branch called feature-mpi-support, then checks it out. After this is when you should run the installation. It's always safest to do a clean install (delete/rename the EGSnrc directory); I can't guarantee that re-running the installer over the existing installation will work (but it might).

git fetch origin pull/511/head:feature-mpi-support
git checkout feature-mpi-support

@ealemany

Great! Thank you @rtownson for the explanation and the example, very helpful and good to know

@ealemany

Hello,

Are there specific steps to install EGSnrc for multiple users with the GUI option? I read through the instructions at https://github.com/nrc-cnrc/EGSnrc/wiki/Install-EGSnrc-on-Linux#1-install-prerequisite-software
but I do not see any install configurations for multiple users.

I ran a test on an Ubuntu server and came across permission issues with the shortcut icons on the desktop. To resolve this, I created a group called "egsnrc", had the users join the "egsnrc" group, and changed the permissions on the top directory (EGSnrc-2019) to 774 so that users in the egsnrc group can launch the shortcut icons on the desktop.

Is this the correct way to approach a multi-user desktop environment?

Thank you

@rtownson
Collaborator

Hi @ealemany, I'm not sure what the optimal configuration is for a multi-user MPI setup. It's OK to share the whole EGSnrc directory, but usually each user will have their own egs_home directory, which is like the working area (all input and output files go there), while they might share the HEN_HOUSE (all the source code is there). That means each user would have their environment variables set with a different EGS_HOME. If you want to discuss this further, we could do it on the reddit page, since it is not related to the implementation of this pull request.
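For example, a shared setup might give each user something like this in their shell profile (the paths below are illustrative, not a prescribed layout):

# shared source tree, private per-user working area
export HEN_HOUSE=/opt/EGSnrc/HEN_HOUSE/
export EGS_HOME=$HOME/egs_home/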

@ealemany

Hi @rtownson,
I'll see you (talk to you) on the reddit page.
Thanks

@ealemany

ealemany commented Nov 1, 2019

Hi,

We installed and configured EGSnrc on the master node of a 12-node cluster with OpenMPI. We ran our first job and it doesn't seem to be working. We followed the instructions as described above in @edoerner's comment:
we ran "make mpi" and are running the job with "mpirun -np 20 BEAM_FLASH_SCAN_mpi -i wb_right_block.egsinp -p FLASH60MeV"

Is there something wrong in our mpirun command?

I hope this is the right place to post this kind of issue.

Thank you for your help

(screenshot attached: Screen Shot 2019-11-01 at 3.34.49 PM)

@edoerner
Author

Hi @ealemany, I am not an expert in server configuration, but I suppose you have 12 nodes, each with one or more cores. I think the problem is essentially the MPI configuration of your cluster.

On some systems I have had the problem that MPI thinks the machine has fewer cores than are really available (for example, on an 8-core system I was not able to launch more than 4 MPI processes). In that case, I use the --oversubscribe flag for mpirun/mpiexec. Of course, you should check that all the cores are effectively being used.
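For example, taking the command from your earlier comment:

mpirun --oversubscribe -np 20 BEAM_FLASH_SCAN_mpi -i wb_right_block.egsinp -p FLASH60MeV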

@ealemany

ealemany commented Nov 12, 2019 via email

@ftessier
Member

This branch needs a good cleanup to synchronize with the current develop tip. @edoerner are you ok with me pushing the update to your feature-mpi-support branch to bring it in sync, and finally merge this into the EGSnrc trunk?

@ftessier ftessier added this to the Release 2022 milestone Mar 25, 2021
@edoerner
Author

edoerner commented Mar 26, 2021 via email

@edoerner edoerner requested a review from a team as a code owner April 15, 2021 13:28
edoerner added 3 commits July 11, 2022 15:08
Add Message Passing Interface (MPI) parallel support for DOSXYZnrc.
Phase-space sources are supported; a BEAMnrc source as a shared library is
partially supported: it fails at the end due to the pjob functionality; still
looking for the best way to solve this problem.
@ftessier ftessier force-pushed the feature-mpi-support branch from 0cb437b to cee2fb2 on July 11, 2022 19:14
@ftessier
Member

Rebased on develop and fixed a few conflicts accrued over the last 3 years. Adjusted commit messages and removed EOL whitespace. This branch needs to be tested more thoroughly before merging.

@blakewalters
Contributor

@ftessier and @edoerner: I'm just wondering if this has been tested on Mac. I've been trying to get "make mpi" to work, but I'm running into a strange "clang (LLVM option parsing): Unknown command line argument '-x86-pad-for-align=false'" error. Now, I'm using a wonky combination of gcc and clang, so I don't know if this error is just peculiar to my system. @ftessier, have you had a chance to try it on OSX?

@ftessier ftessier modified the milestones: Release 2022, Release 2023 Jul 16, 2022
@ftessier
Member

ftessier commented Jul 16, 2022

I have not tested it on macOS. This needs more testing; it will be merged into develop just after the 2022 release.

@edoerner
Author

Hi everyone!
@blakewalters as Frederic stated, this solution has not been tested on macOS. Unfortunately I have lost touch with the code since I left academia a couple of years ago. Back then I remember being able to use it on my iMac, but currently I do not have access to that OS to give it a try.
@ftessier Although I have been out of research for a couple of years, it would be nice for this contribution to finally see the light, hehehe. Do you need any input or help from my side?

I am glad to see that you are still looking at this; I always liked EGSnrc as my primary research tool :)
