added README file for MPI support
edoerner committed Jan 30, 2019
1 parent 4a75b78 commit 96e31b4
Showing 1 changed file with 132 additions and 0 deletions.
132 changes: 132 additions & 0 deletions README_MPI.txt
@@ -0,0 +1,132 @@
## An MPI parallel implementation for EGSnrc

This contribution presents a parallel solution for the EGSnrc Monte Carlo code
system using the MPI programming model, as an alternative to the provided
implementation, which relies on a batch-queueing system.

Currently, the BEAMnrc and DOSXYZnrc user codes support this parallel solution
based on MPI. In the case of DOSXYZnrc, both a phase space file and a BEAM
shared library have been tested as sources. Both codes can be used as models
to introduce MPI features to other EGSnrc user codes.

## Authors

Edgardo Doerner (edoerner at fis.puc.cl)
Paola Caprile

Institute of Physics
Pontificia Universidad Catolica de Chile

## Method

This work incorporates MPI features to distribute the simulation among the
available compute units. These features are introduced through macros, which
are enabled depending on the compilation flags given by the user. By default,
the workload balance is controlled via a 'job control file', as in the
original implementation.

To ease the integration of MPI in EGSnrc, the compilation flags needed to
enable the MPI parallelization are added through the 'mpi' target, defined in
both the EGSnrc standard makefile (standard_makefile) and the BEAMnrc makefile
(beam_makefile) inside the $HEN_HOUSE/makefiles folder.

The result is a dedicated executable with MPI support that can be called by the
user. The advantage is that there is no need to modify or add the needed
compilation flags to the *.conf file during the installation process.

By default, it is expected that OpenMPI is installed on the system. If another
MPI implementation is desired, change the F77 macro inside the 'mpi' target to
the desired MPI compiler in the following makefiles:

    $HEN_HOUSE/makefiles/standard_makefile (line 169)
    $HEN_HOUSE/makefiles/beam_makefile (line 165)

By default, F77=mpif90 in both files.
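
Before running the 'mpi' target, it can help to confirm that the compiler
wrapper named by F77 is actually on the PATH. The following is a minimal
sketch, assuming a POSIX shell; the variable names `wrapper` and `status` are
illustrative only and are not part of EGSnrc.

```shell
# Name of the MPI compiler wrapper expected by the 'mpi' target.
# mpif90 is the default stated above; change it if F77 was edited.
wrapper="mpif90"

# 'command -v' is the POSIX way to test whether a command can be found.
if command -v "$wrapper" >/dev/null 2>&1; then
    status="found"
else
    status="missing"
fi

echo "$wrapper: $status"
```

If the wrapper is reported as missing, either install an MPI implementation
that provides it or point F77 to the wrapper that is available.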

## Usage

To compile a user code $(user_code) with MPI support, go to the
$EGS_HOME/$(user_code) folder and type:

    make mpi

This will enable the MPI features in the user code and will create an executable
called $(user_code)_mpi (i.e. the normal user code executable name with the
'_mpi' suffix attached to it). Then, use mpirun or similar to execute it with
MPI support:

    mpirun -np #NUM_PROCS $(user_code)_mpi -i input_file -p pegs_file
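
As a concrete illustration of the two steps above, the sketch below assembles
the build and run commands for one case and prints them as a dry run. All
values are hypothetical placeholders (the input file name `my_input`, the PEGS
data name `my_pegs_data`, and the process count 4); only the command layout is
taken from this README.

```shell
# Hypothetical values; replace them with your own user code, files and core count.
user_code="dosxyznrc"
input_file="my_input"       # assumed input file name
pegs_file="my_pegs_data"    # assumed PEGS data file name
num_procs=4                 # assumed number of MPI processes

# Step 1: build the MPI executable inside the user code folder.
build_cmd="make mpi"

# Step 2: launch the resulting <user_code>_mpi executable under mpirun.
run_cmd="mpirun -np ${num_procs} ${user_code}_mpi -i ${input_file} -p ${pegs_file}"

# Dry run: print the commands instead of executing them.
echo "cd \$EGS_HOME/${user_code} && ${build_cmd}"
echo "${run_cmd}"
```

Printing the commands first makes it easy to check the executable name and
arguments before committing cluster time to the run.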

## Issues

Two small issues remain in this contribution:

1. When the DOSXYZnrc user code is used with a BEAM shared library as a source,
an error is reported about a failure to lock the *.lock file used to distribute
the workload in the parallel simulation. This is triggered by the shared
library, not by the dosxyznrc jobs or the MPI processes involved.

This issue is inherited from the develop branch of EGSnrc, and causes all the
MPI processes to halt. As a temporary workaround, the file locking mechanism is
disabled in the BEAM shared library at compile time through a compilation
flag.

2. When the BEAMnrc user code is compiled with MPI support or DOSXYZnrc is
used with a BEAM shared library as a source, a console message
appears at the end of the simulation:

ls: ~/$EGS_HOME/BEAM_myaccel/myaccel_w*.egsdat: No such file or directory

No egsdat files are expected if data arrays are not stored during the
simulation. The presence of such files is tested in the egs_combine_runs
subroutine at the end of the simulation. When MPI support is enabled, and in
contrast with the standard compilation, the output of the system command is
echoed to the terminal. This does not affect the results and is only a minor
annoyance during execution.

## Disabling Parallel Jobs functionality

Some systems (such as the National Laboratory of High Performance Computing
(NLHPC) in Chile) do not allow the lock-file mechanism that both the EGSnrc
parallel implementation and this MPI contribution need to control access to
the job control file.

In such a case, the user can define the _NOPJOB macro during compilation to
disable the use of the control file. For example, in the following Makefiles add
the _NOPJOB macro definition as:

standard_makefile (line 169): FCFLAGS="$(FCFLAGS) -D_MPI -D_NOPJOB"
beam_makefile (line 165): FCFLAGS="$(FCFLAGS) -D_MPI -D_NOPJOB"

The result is that the simulation workload is now evenly distributed among
computing units, and therefore no job control file is needed.
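
To make the idea of an even split concrete, the sketch below divides a total
number of histories among a number of processes. It is an illustration only:
the values of `ncase` and `nprocs` are hypothetical, and giving the remainder
to the lowest ranks is an assumption made for this example, since this README
does not state how the user codes handle a remainder.

```shell
# Hypothetical values, for illustration only.
ncase=1000003   # total number of histories requested
nprocs=4        # number of MPI processes

rank=0
total=0
while [ "$rank" -lt "$nprocs" ]; do
    # Every rank gets the integer share of the histories.
    share=$(( ncase / nprocs ))
    # Assumption: the first (ncase mod nprocs) ranks take one extra history.
    if [ "$rank" -lt $(( ncase % nprocs )) ]; then
        share=$(( share + 1 ))
    fi
    echo "rank $rank: $share histories"
    total=$(( total + share ))
    rank=$(( rank + 1 ))
done

# The shares must add up to the requested number of histories.
echo "total: $total"
```

Because each share depends only on the rank and the totals, no shared state is
needed, which is why this mode can run without a job control file or lock file.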

## OpenMP features

The introduction of OpenMP features has been tested in the DOSXYZnrc user
code, with interesting results (see references below). However, some important
issues remain unsolved and will need more extensive modification of the user
codes, namely:

1. The support of shared library sources in DOSXYZnrc when OpenMP is enabled.
2. The OpenMP implementation in BEAMnrc, considering the output to phase space
files.

A hybrid implementation combining MPI and OpenMP features can be seen in
pull request #341 (https://github.com/nrc-cnrc/EGSnrc/pull/341). Because of the
open issues listed above, it was decided to first offer a polished MPI-only
implementation to the EGSnrc community.

## References

This MPI implementation is contained in the following work:

Doerner E, Caprile P. An hybrid parallel implementation for EGSnrc Monte Carlo
user codes. Med. Phys. 45 (8), August 2018.

The aforementioned publication was based on an OpenMP-only solution which can be
reviewed in:

Doerner E, Caprile P. Parallel implementation of the EGSnrc Monte Carlo
simulation of ionizing radiation transport using OpenMP. Med. Phys. 44 (12),
December 2017.
