EGSnrc parallelization using MPI #511

Open
edoerner wants to merge 8 commits into develop from feature-mpi-support

Conversation

edoerner

The purpose of this contribution is to present a parallel solution for the EGSnrc Monte Carlo code system using the MPI programming model, as an alternative to the provided implementation based on the use of a batch-queueing system.

For details of this implementation please check the README_MPI.md file available in the EGSnrc main folder.

Currently, the BEAMnrc and DOSXYZnrc user codes support this MPI-based parallel solution. In the case of DOSXYZnrc, both the use of a phase-space file and of a BEAM shared library as particle sources has been tested. Both codes can be used as models to introduce MPI features into other EGSnrc user codes.

By default, OpenMPI is expected to be installed on the system. If another MPI implementation is desired, change the F77 macro inside the 'mpi' target to the desired MPI compiler wrapper in the following Makefiles:

$HEN_HOUSE/makefiles/standard_makefile (line 169)
$HEN_HOUSE/makefiles/beam_makefile (line 165)

By default, F77 = mpif90 in both files.
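For example, switching to Intel MPI would be a one-line change inside the 'mpi' target of each of those two makefiles (an illustrative sketch; the surrounding makefile content is omitted, and mpiifort is Intel MPI's Fortran wrapper, not something this branch configures for you):

# default: the OpenMPI Fortran compiler wrapper
F77 = mpif90
# for Intel MPI, use its Fortran wrapper instead, e.g.:
# F77 = mpiifort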

To compile a user code $(user_code) with MPI support, go to the $EGS_HOME/$(user_code) folder and type:

make mpi

This will enable the MPI features in the user code and create an executable called $(user_code)_mpi (i.e., the normal user-code executable name with the '_mpi' suffix appended). Then use mpirun or a similar launcher to execute it with MPI support:

mpirun -np #NUM_PROCS $(user_code)_mpi -i input_file -p pegs_file
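For instance, a complete session for DOSXYZnrc might look like this (the input file name my_input and the pegs data name 521icru below are placeholders; substitute your own):

cd $EGS_HOME/dosxyznrc
make mpi
mpirun -np 8 dosxyznrc_mpi -i my_input -p 521icru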

@edoerner edoerner force-pushed the feature-mpi-support branch from e9260d9 to 96e31b4 on January 30, 2019 21:39
@ftessier
Member

This is fantastic @edoerner! Similar to what I wished for. Question: is it possible, along the same lines, to write an MPI "wrapper" program that would simply dispatch N parallel jobs, wait for completion, and let the master thread combine everything at the end? Essentially the same thing exb does when it dispatches jobs, but with MPI dispatch instead of a lock file. That would allow MPI execution across the board without modifying the code of existing applications. I think I will give it a try, based on what you did here!
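As a minimal, untested sketch of that idea: assuming OpenMPI (which exports OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_SIZE to each launched process) and the standard EGSnrc parallel-run options -b -P njobs -j ijob, each MPI rank could run one ordinary non-MPI parallel job; the final combining step is not shown here:

#!/bin/sh
# run_one_job.sh (hypothetical): each MPI rank runs one regular parallel job,
# identified by its rank; launch with e.g. "mpirun -np 8 ./run_one_job.sh"
rank=$OMPI_COMM_WORLD_RANK
size=$OMPI_COMM_WORLD_SIZE
dosxyznrc -i input_file -p pegs_file -b -P $size -j $(( rank + 1 ))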

@crcrewso
Contributor

crcrewso commented Jan 31, 2019

I'm noticing a few modifications to implicit none in some of the changes. Shouldn't we be keeping implicit none for all Fortran code?

MPI is added for all compilation. Does GCC behave nicely if MPI is not installed?

@edoerner
Author

@crcrewso Yes! I just realized that. It is simply the replacement of implicit none by the macro $IMPLICIT-NONE, which is defined by default as implicit none. I do not remember why I did that, but of course any Fortran program should use implicit none by default.

About MPI: every call to MPI functionality is 'protected' by a preprocessor conditional that is active only if the _MPI macro is defined. This macro is defined only when the user code (DOSXYZnrc or a BEAM accelerator) is compiled with the 'mpi' target. All the required definitions (such as the use of 'mpif90' as the compiler) are made inside this target in the relevant Makefiles, so as not to affect the $EGS_CONFIG file and therefore the rest of the platform.

For example, to gain MPI functionality in the dosxyznrc user code I compile it by typing 'make mpi' in its folder inside $EGS_HOME. This creates a dedicated executable 'dosxyznrc_mpi', which I then execute through mpirun. The idea was to isolate the MPI functionality inside EGSnrc so as not to affect the rest of the platform, especially if MPI is not installed.

@ftessier It would be interesting to look into that, so that the rest of the codes would not have to be modified. It would be nice if you could give it a try.

@edoerner
Author

edoerner commented Jun 7, 2019

FYI, I just updated my branch to the latest version of the develop branch within nrc-cnrc. Let me know if you have comments or ideas about this implementation.

@ealemany

ealemany commented Oct 9, 2019

Hi edoerner,

I am new to EGSnrc. I am helping post-docs run EGSnrc on an Ubuntu 18.04 server with OpenMPI and with a GUI. I followed the instructions to install EGSnrc at https://github.com/nrc-cnrc/EGSnrc/wiki/Install-EGSnrc-on-Linux#1-install-prerequisite-software
I followed Option 1 via the GUI. All went well. Icons were created on the desktop and each of them launches its application just fine. I have a question, though, and I hope you can help me.

  1. When using the GUI to launch an EGSnrc application, how can the user allocate all the cores of the server?
    I read your comment:
    "For example, to gain MPI functionality in the dosxyznrc user code I compile it typing 'make mpi' in its folder inside $EGS_HOME. This creates a dedicated executable 'dosxyznrc_mpi' and then I execute it through mpirun. The idea was to isolate the MPI functionality inside EGSnrc in order to not affect the rest of the platform, specially if MPI is not installed."
    I understand the process, but I am not sure how to do the same with the GUI option.

Thank you for your help.

Best,
Eric

@rtownson
Collaborator

rtownson commented Oct 9, 2019

Hi @ealemany,

There's no option in the GUI at the moment to support this. It's good that you installed the GUIs, as they will still be useful, but to run jobs using MPI you'll have to launch them from the command line using the commands suggested above.

Also keep in mind that you will have to use 'git checkout' and direct it to grab this pull request, and run the installer after you have done so in order to compile these new MPI features.

@ealemany

ealemany commented Oct 9, 2019

Hi @rtownson.

Thank you!

I am not sure I understand your comment:
"Also keep in mind that you will have to use 'git checkout' and direct it to grab this pull request, and run the installer after you have done so in order to compile these new MPI features."

I understand the 'git checkout' concept, but I don't understand the rest: "and direct it to grab........
....MPI features"

Could you give me a command-line example, if there is such a thing?

Thanks

@rtownson
Collaborator

@ealemany here is an example. Doing this means you are using an experimental branch of EGSnrc, so you may want to re-install when the official 2020 release comes out early next year, if this pull request has been included in it. The following fetches this PR into a new local branch called feature-mpi-support, then checks it out. After this is when you should run the installation. It's always safest to do a clean install (delete/rename the EGSnrc directory); I can't guarantee that re-running the installer over the existing installation will work (but it might).

git fetch origin pull/511/head:feature-mpi-support
git checkout feature-mpi-support

@ealemany

Great! Thank you @rtownson for the explanation and the example, very helpful and good to know

@ealemany

Hello,

Are there specific steps to install EGSnrc for multiple users with the GUI option? I read through the instructions at https://github.com/nrc-cnrc/EGSnrc/wiki/Install-EGSnrc-on-Linux#1-install-prerequisite-software
but I do not see any install configurations for multiple users.

I ran a test on an Ubuntu server and came across permission issues with the shortcut icons on the desktop. To resolve this, I created a group called "egsnrc", had the users join the "egsnrc" group, and changed the permissions on the top directory (EGSnrc-2019) to 774 so that users in the egsnrc group can launch the shortcut icons on the desktop.

Is this the correct way to approach a multi-user desktop environment?

Thank you

@rtownson
Collaborator

Hi @ealemany, I'm not sure what the optimal configuration is for a multi-user MPI setup. It's OK to share the whole EGSnrc directory, but usually each user will have their own egs_home directory, which is like the working area (all input and output files go there), while they might share the HEN_HOUSE (all the source code is there). That means each user would have their environment variables set with a different EGS_HOME. If you want to discuss this further, we could do it on the reddit page, since it is not related to the implementation of this pull request.
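For example, a shared setup might give each user something like this in their shell profile (the paths below are illustrative, not a prescribed layout):

# shared source tree, private per-user working area
export HEN_HOUSE=/opt/EGSnrc/HEN_HOUSE/
export EGS_HOME=$HOME/egs_home/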

@ealemany

Hi @rtownson,
I'll see you (talk to you) on the reddit page.
Thanks

@ealemany

ealemany commented Nov 1, 2019

Hi,

We installed and configured EGSnrc on the master node of a 12-node cluster with OpenMPI. We ran our first job and it doesn't seem to be working. We followed the instructions as described above in @edoerner's comment:
we ran "make mpi" and are running the job with "mpirun -np 20 BEAM_FLASH_SCAN_mpi -i wb_right_block.egsinp -p FLASH60MeV"

Is there something wrong in our mpirun command?

I hope this is the right place to post this kind of issue.

Thank you for your help

(screenshot attached: Screen Shot 2019-11-01 at 3.34.49 PM)

@edoerner
Author

Hi @ealemany, I am not an expert in server configuration, but I suppose you have 12 nodes, each with one or more cores. I think the problem is essentially the MPI configuration of your cluster.

On some systems I have had the problem that MPI thinks the machine has fewer cores than are really available (for example, on an 8-core system I was not able to launch more than 4 MPI processes). In that case, I use the --oversubscribe flag for mpirun/mpiexec. Of course, you should check that all the cores are effectively being used.
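For example, taking the command from your earlier comment:

mpirun --oversubscribe -np 20 BEAM_FLASH_SCAN_mpi -i wb_right_block.egsinp -p FLASH60MeV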

@ealemany

ealemany commented Nov 12, 2019 via email

@ftessier
Member

This branch needs a good cleanup to synchronize with the current develop tip. @edoerner are you ok with me pushing the update to your feature-mpi-support branch to bring it in sync, and finally merge this into the EGSnrc trunk?

@ftessier ftessier added this to the Release 2022 milestone Mar 25, 2021
@edoerner
Author

edoerner commented Mar 26, 2021 via email

@edoerner edoerner requested a review from a team as a code owner April 15, 2021 13:28
edoerner added 3 commits July 11, 2022 15:08
Add Message Passing Interface (MPI) parallel support for DOSXYZnrc.
Phase-space sources are supported; a BEAMnrc source as a shared library is
partially supported: it fails at the end due to the pjob functionality; still
looking for the best way to solve this problem.
@ftessier ftessier force-pushed the feature-mpi-support branch from 0cb437b to cee2fb2 on July 11, 2022 19:14
@ftessier
Member

Rebased on develop and fixed a few conflicts accrued over the last 3 years. Adjusted commit messages and removed EOL whitespace. This branch needs to be tested more thoroughly before merging.

@blakewalters
Contributor

@ftessier and @edoerner: I'm just wondering if this has been tested on Mac. I've been trying to get "make mpi" to work, but I'm running into a strange "clang (LLVM option parsing): Unknown command line argument '-x86-pad-for-align=false'" error. Now, I'm using a wonky combination of gcc and clang, so I don't know if this error is just peculiar to my system. @ftessier, have you had a chance to try it on OSX?

@ftessier ftessier modified the milestones: Release 2022, Release 2023 Jul 16, 2022
@ftessier
Member

ftessier commented Jul 16, 2022

I have not tested it on macOS. This needs more testing; it will be merged into develop just after the 2022 release.

@edoerner
Author

Hi everyone!
@blakewalters as Frederic stated, this solution has not been tested on macOS. Unfortunately I have lost touch with the code since I left academia a couple of years ago. Back then I remember being able to use it on my iMac, but currently I do not have access to that OS to give it a try.
@ftessier Although I have been out of research for a couple of years, it would be nice for this contribution to finally see the light, hehehe. Do you need any input or help from my side?

I am glad to see that you are still looking at this; I always liked EGSnrc as my primary research tool :)
