Commit

Merge pull request #118 from siddanib/mpmd_examples

MPMD cases

ajnonaka authored May 17, 2024
2 parents ccbff96 + 7aaa8b8 commit 80b6d2c

Showing 22 changed files with 655 additions and 0 deletions.
180 changes: 180 additions & 0 deletions Docs/source/MPMD_Tutorials.rst
@@ -0,0 +1,180 @@
.. _tutorials_mpmd:

AMReX-MPMD
==========

AMReX-MPMD utilizes the Multiple Program Multiple Data (MPMD) feature of MPI to provide cross-functionality for AMReX-based codes.
The framework enables data transfer between two different applications through the **MPMD::Copier** class, which typically takes the **BoxArray** of its application as an argument.
The **Copier** instances created in the two applications together identify the overlapping cells for which data transfer must occur.
The **Copier::send** and **Copier::recv** functions, which take a **MultiFab** as an argument, transfer the desired data for the overlapping regions.
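
As a quick reference, the following minimal sketch (condensed from the ``main_1.cpp`` source added in this commit, with an illustrative grid setup) shows the typical calling sequence; the numeric arguments of ``send`` and ``recv`` are the starting component and the number of components.

.. code-block:: cpp

   #include <AMReX.H>
   #include <AMReX_MultiFab.H>
   #include <AMReX_MPMD.H>
   #include <mpi.h>

   int main (int argc, char* argv[])
   {
       // Split the global communicator across the two programs, then initialize AMReX
       MPI_Comm comm = amrex::MPMD::Initialize(argc, argv);
       amrex::Initialize(argc, argv, true, comm);
       {
           // Grids owned by this application (illustrative setup)
           amrex::Box domain(amrex::IntVect(0,0,0), amrex::IntVect(31,31,31));
           amrex::BoxArray ba(domain);
           ba.maxSize(16);
           amrex::DistributionMapping dm(ba);
           // The Copier matches this BoxArray against the partner application's BoxArray
           amrex::MPMD::Copier copr(ba, dm, false);
           // Two-component MultiFab: send component 0 and receive into component 1
           amrex::MultiFab mf(ba, dm, 2, 0);
           mf.setVal(1.0, 0, 1);   // fill component 0 with placeholder data
           copr.send(mf, 0, 1);    // (MultiFab, starting component, number of components)
           copr.recv(mf, 1, 1);
       }
       amrex::Finalize();
       amrex::MPMD::Finalize();
   }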

Case-1
------

This case demonstrates the MPMD capability across two C++ applications.

Contents
^^^^^^^^

The **Source_1** subfolder contains ``main_1.cpp``, which will be treated as the first application.
Similarly, the **Source_2** subfolder contains ``main_2.cpp``, which will be treated as the second application.

Overview
^^^^^^^^

The domain in ``main_1.cpp`` is set to ``lo = {0, 0, 0}`` and ``hi = {31, 31, 31}``, while the domain in ``main_2.cpp`` is set to ``lo = {16, 16, 16}`` and ``hi = {31, 31, 31}``.
Hence, the data transfer will occur for the overlapping region from ``lo = {16, 16, 16}`` to ``hi = {31, 31, 31}``.
Furthermore, the domain in ``main_1.cpp`` is split into boxes using ``max_grid_size=16``, while the domain in ``main_2.cpp`` is split using ``max_grid_size=8``.
Therefore, the **BoxArray**, and consequently the number of boxes covering the overlapping region, differ between the two applications.
The data transfer demonstration is performed using a two-component **MultiFab**.
The first component is populated in ``main_1.cpp`` before it is transferred to ``main_2.cpp``.
The second component is populated in ``main_2.cpp`` based on the received first component.
Finally, the second component is transferred from ``main_2.cpp`` to ``main_1.cpp``.
It can be seen from the plotfile generated by ``main_1.cpp`` that the second component is non-zero only for the overlapping region.
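
Since the two problem domains overlap exactly where their index ranges intersect, the overlapping region handled by the two **Copier** instances can be reproduced with a small, illustrative box-intersection sketch (not part of the tutorial sources):

.. code-block:: cpp

   #include <AMReX_Box.H>

   // Illustrative only: the overlap identified by the two Copier instances is the
   // intersection of the problem domains set up in main_1.cpp and main_2.cpp.
   amrex::Box overlap_region ()
   {
       amrex::Box domain_1(amrex::IntVect(0,0,0),    amrex::IntVect(31,31,31));
       amrex::Box domain_2(amrex::IntVect(16,16,16), amrex::IntVect(31,31,31));
       return domain_1 & domain_2;   // lo = {16,16,16}, hi = {31,31,31}
   }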

Compile
^^^^^^^

The compile process here assumes that the current working directory is ``ExampleCodes/MPMD/Case-1/``.

.. code-block:: bash

   # cd into Source_1 to compile the first application
   cd Source_1/
   # Include USE_CUDA=TRUE for CUDA GPUs
   make USE_MPI=TRUE
   # cd into Source_2 to compile the second application
   cd ../Source_2/
   # Include USE_CUDA=TRUE for CUDA GPUs
   make USE_MPI=TRUE

Run
^^^

Here, the case is run using a total of 12 MPI ranks, with 8 allocated to ``main_1.cpp`` and the rest to ``main_2.cpp``.
Please note that the MPI ranks assigned to each application/code need to be contiguous, i.e., MPI ranks 0-7 are for ``main_1.cpp`` and 8-11 are for ``main_2.cpp``.
This may be the default behaviour on several systems.
Furthermore, the run process here assumes that the current working directory is ``ExampleCodes/MPMD/Case-1/``.

.. code-block:: bash

   # Running the MPMD process with 12 ranks
   mpirun -np 8 Source_1/main3d.gnu.DEBUG.MPI.ex : -np 4 Source_2/main3d.gnu.DEBUG.MPI.ex

Running on Perlmutter (NERSC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section describes sample scripts that can be used to run this case on `Perlmutter (NERSC) <https://docs.nersc.gov/systems/perlmutter/>`_.
The scripts ``mpmd_cpu.sh`` and ``mpmd_cpu.conf`` can be used to run the CPU version.
Similarly, ``mpmd_gpu.sh`` and ``mpmd_gpu.conf`` can be used to run the GPU version.
Please note that ``perlmutter_gpu.profile`` must be sourced to compile the GPU version of the applications.

The content presented here is based on the following references:

* `NERSC documentation <https://docs.nersc.gov/jobs/examples/#mpmd-multiple-program-multiple-data-jobs>`_
* `WarpX documentation <https://warpx.readthedocs.io/en/latest/install/hpc/perlmutter.html>`_

Case-2
------

This case demonstrates the MPMD capability across C++ and Python applications.
This language interoperability is achieved through the Python bindings of AMReX, `pyAMReX <https://github.com/AMReX-Codes/pyamrex>`_.

Contents
^^^^^^^^

``main.cpp`` is the C++ application and ``main.py`` is the Python application.

Overview
^^^^^^^^

In the previous case (Case-1), each application has its own domain and, therefore, a different **BoxArray**.
However, there exist scenarios where both applications work with the same **BoxArray**.
The current case presents such a scenario: the **BoxArray** is defined only in the ``main.cpp`` application, and this information is relayed to the ``main.py`` application through the **MPMD::Copier**.
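
The Case-2 sources are not reproduced in this excerpt. As a rough, hypothetical sketch of the C++ side only, assuming that the boolean constructor argument of **MPMD::Copier** (set to ``false`` in Case-1) controls whether the box/domain information is relayed to the partner program, the relevant portion of ``main.cpp`` might look like the following.

.. code-block:: cpp

   // Hypothetical sketch of the Case-2 C++ side (not the actual main.cpp).
   // Assumption: passing 'true' as the third argument asks the Copier to relay
   // the BoxArray/DistributionMapping information to the partner application.
   amrex::Box domain(amrex::IntVect(0,0,0), amrex::IntVect(31,31,31));
   amrex::BoxArray ba(domain);
   ba.maxSize(16);
   amrex::DistributionMapping dm(ba);
   amrex::MPMD::Copier copr(ba, dm, true);  // relay ba & dm to main.py (assumed)
   amrex::MultiFab mf(ba, dm, 2, 0);
   copr.send(mf, 0, 1);   // main.py receives component 0, fills component 1, ...
   copr.recv(mf, 1, 1);   // ... and sends it back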

**Please ensure that the same AMReX source code is used to compile both the applications.**

Compiling and Running on a local system
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The compile process for pyAMReX is only briefly described here.
Please refer to the `pyAMReX documentation <https://pyamrex.readthedocs.io/en/latest/install/cmake.html#>`_ for more details.
**Please note that mpi4py is an important dependency that must be available in the environment being used.**
For a pyAMReX conda environment, it can be installed using ``conda install -c conda-forge mpi4py`` (see the `mpi4py installation documentation <https://mpi4py.readthedocs.io/en/latest/install.html#using-conda>`_).

.. code-block:: bash

   # find dependencies & configure
   # Include -DAMReX_GPU_BACKEND=CUDA for the GPU version
   cmake -S . -B build -DAMReX_SPACEDIM="1;2;3" -DAMReX_MPI=ON -DpyAMReX_amrex_src=/path/to/amrex
   # compile & install, here we use four threads
   cmake --build build -j 4 --target pip_install

main.cpp compile
^^^^^^^^^^^^^^^^

The compile process here assumes that the current working directory is ``ExampleCodes/MPMD/Case-2/``.

.. code-block:: bash

   # Include USE_CUDA=TRUE for CUDA GPUs
   make USE_MPI=TRUE

Run
^^^

Here, the case is run using a total of 12 MPI ranks, with 8 allocated to ``main.cpp`` and the rest to ``main.py``.
As mentioned earlier, the MPI ranks assigned to each application/code need to be contiguous, i.e., MPI ranks 0-7 are for ``main.cpp`` and 8-11 are for ``main.py``.
This may be the default behaviour on several systems.
Furthermore, the run process here assumes that the current working directory is ``ExampleCodes/MPMD/Case-2/``.

.. code-block:: bash

   # Running the MPMD process with 12 ranks
   mpirun -np 8 ./main3d.gnu.DEBUG.MPI.ex : -np 4 python main.py

Compiling and Running on Perlmutter (NERSC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Running this case on Perlmutter involves creating a Python virtual environment.
pyAMReX must be compiled and installed into this virtual environment after it is created.
Similar to the previous case, this case also provides supporting scripts to run on CPUs and GPUs.

The process detailed below assumes that the current working directory is ``ExampleCodes/MPMD/Case-2/``.

Creating a virtual environment
""""""""""""""""""""""""""""""

.. code-block:: bash

   # Setup the required environment variables
   source perlmutter_gpu.profile
   # BEFORE PERFORMING THE FOLLOWING COMMANDS
   # MOVE TO A DIRECTORY WHERE THE PYTHON VIRTUAL ENVIRONMENT MUST EXIST
   python3 -m pip install --upgrade pip
   python3 -m pip install --upgrade virtualenv
   python3 -m pip cache purge
   python3 -m venv pyamrex-gpu
   source pyamrex-gpu/bin/activate
   python3 -m pip install --upgrade pip
   python3 -m pip install --upgrade build
   python3 -m pip install --upgrade packaging
   python3 -m pip install --upgrade wheel
   python3 -m pip install --upgrade setuptools
   python3 -m pip install --upgrade cython
   python3 -m pip install --upgrade numpy
   python3 -m pip install --upgrade pandas
   python3 -m pip install --upgrade scipy
   MPICC="cc -target-accel=nvidia80 -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
   python3 -m pip install --upgrade openpmd-api
   python3 -m pip install --upgrade matplotlib
   python3 -m pip install --upgrade yt
   python3 -m pip install --upgrade cupy-cuda12x # CUDA 12 compatible wheel

The content presented here is based on the following reference:

* `WarpX documentation <https://warpx.readthedocs.io/en/latest/install/hpc/perlmutter.html>`_
4 changes: 4 additions & 0 deletions Docs/source/index.rst
@@ -51,6 +51,7 @@ sorted by the following categories:
- :ref:`heFFTe<tutorials_heffte>` -- heFFTe distributed tutorials.
- :ref:`Linear Solvers<tutorials_linearsolvers>` -- Examples of several linear solvers.
- :ref:`ML/PYTORCH<tutorials_ml>` -- Use of pytorch models to replace point-wise computational kernels.
- :ref:`MPMD<tutorials_mpmd>` -- Usage of AMReX-MPMD (Multiple Program Multiple Data) framework.
- :ref:`MUI<tutorials_mui>` -- Incorporates the MxUI/MUI (Multiscale Universal interface) frame into AMReX.
- :ref:`Particles<tutorials_particles>` -- Basic usage of AMReX's particle data structures.
- :ref:`Python<tutorials_python>` -- Using AMReX and interfacing with AMReX applications from Python - via `pyAMReX <https://github.com/AMReX-Codes/pyamrex/>`__
@@ -75,6 +76,7 @@ sorted by the following categories:
heFFTe_Tutorial
LinearSolvers_Tutorial
ML_Tutorial
MPMD_Tutorials
MUI_Tutorial
Particles_Tutorial
Python_Tutorial
@@ -102,6 +104,8 @@ sorted by the following categories:

.. _`Linear Solvers`: LinearSolvers_Tutorial.html

.. _`MPMD`: MPMD_Tutorials.html

.. _`MUI`: MUI_Tutorial.html

.. _`Particles`: Particles_Tutorial.html
20 changes: 20 additions & 0 deletions ExampleCodes/MPMD/Case-1/Source_1/GNUmakefile
@@ -0,0 +1,20 @@
AMREX_HOME ?= ../../../../../amrex

DEBUG = TRUE

DIM = 3

COMP = gcc

USE_MPI = TRUE

USE_OMP = FALSE
USE_CUDA = FALSE
USE_HIP = FALSE

include $(AMREX_HOME)/Tools/GNUMake/Make.defs

include ./Make.package
include $(AMREX_HOME)/Src/Base/Make.package

include $(AMREX_HOME)/Tools/GNUMake/Make.rules
1 change: 1 addition & 0 deletions ExampleCodes/MPMD/Case-1/Source_1/Make.package
@@ -0,0 +1 @@
CEXE_sources += main_1.cpp
69 changes: 69 additions & 0 deletions ExampleCodes/MPMD/Case-1/Source_1/main_1.cpp
@@ -0,0 +1,69 @@

#include <AMReX.H>
#include <AMReX_Print.H>
#include <AMReX_MultiFab.H>
#include <AMReX_PlotFileUtil.H>
#include <mpi.h>
#include <AMReX_MPMD.H>

int main(int argc, char* argv[])
{
    // Initialize amrex::MPMD to establish communication across the two apps
    MPI_Comm comm = amrex::MPMD::Initialize(argc, argv);
    amrex::Initialize(argc,argv,true,comm);
    {
        amrex::Print() << "Hello world from AMReX version " << amrex::Version() << "\n";
        // Number of data components at each grid point in the MultiFab
        int ncomp = 2;
        // how many grid cells in each direction over the problem domain
        int n_cell = 32;
        // how many grid cells are allowed in each direction over each box
        int max_grid_size = 16;
        // BoxArray -- Abstract Domain Setup
        // integer vector indicating the lower coordinate bounds
        amrex::IntVect dom_lo(0,0,0);
        // integer vector indicating the upper coordinate bounds
        amrex::IntVect dom_hi(n_cell-1, n_cell-1, n_cell-1);
        // box containing the coordinates of this domain
        amrex::Box domain(dom_lo, dom_hi);
        // will contain a list of boxes describing the problem domain
        amrex::BoxArray ba(domain);
        // chop the single grid into many small boxes
        ba.maxSize(max_grid_size);
        // Distribution Mapping
        amrex::DistributionMapping dm(ba);
        // Create an MPMD Copier based on current ba & dm
        auto copr = amrex::MPMD::Copier(ba,dm,false);
        // Define MultiFab
        amrex::MultiFab mf(ba, dm, ncomp, 0);
        // Geometry -- Physical Properties for data on our domain
        amrex::RealBox real_box ({0., 0., 0.}, {1. , 1., 1.});
        amrex::Geometry geom(domain, &real_box);
        // Calculate Cell Sizes
        amrex::GpuArray<amrex::Real,3> dx = geom.CellSizeArray(); //dx[0] = dx dx[1] = dy dx[2] = dz
        // Fill only the first component of the MultiFab
        for (amrex::MFIter mfi(mf); mfi.isValid(); ++mfi) {
            const amrex::Box& bx = mfi.validbox();
            const amrex::Array4<amrex::Real>& mf_array = mf.array(mfi);

            amrex::ParallelFor(bx, [=] AMREX_GPU_DEVICE(int i, int j, int k){

                amrex::Real x = (i+0.5) * dx[0];
                amrex::Real y = (j+0.5) * dx[1];
                amrex::Real z = (k+0.5) * dx[2];
                amrex::Real r_squared = ((x-0.5)*(x-0.5)+(y-0.5)*(y-0.5)+(z-0.5)*(z-0.5))/0.01;

                mf_array(i,j,k,0) = 1.0 + std::exp(-r_squared);

            });
        }
        // Send ONLY the first populated MultiFab component to main_2.cpp
        copr.send(mf,0,1);
        // Receive ONLY the second MultiFab component from main_2.cpp
        copr.recv(mf,1,1);
        // Plot MultiFab Data
        WriteSingleLevelPlotfile("plt_cpp_1", mf, {"comp0","comp1"}, geom, 0., 0);
    }
    amrex::Finalize();
    amrex::MPMD::Finalize();
}
20 changes: 20 additions & 0 deletions ExampleCodes/MPMD/Case-1/Source_2/GNUmakefile
@@ -0,0 +1,20 @@
AMREX_HOME ?= ../../../../../amrex

DEBUG = TRUE

DIM = 3

COMP = gcc

USE_MPI = TRUE

USE_OMP = FALSE
USE_CUDA = FALSE
USE_HIP = FALSE

include $(AMREX_HOME)/Tools/GNUMake/Make.defs

include ./Make.package
include $(AMREX_HOME)/Src/Base/Make.package

include $(AMREX_HOME)/Tools/GNUMake/Make.rules
1 change: 1 addition & 0 deletions ExampleCodes/MPMD/Case-1/Source_2/Make.package
@@ -0,0 +1 @@
CEXE_sources += main_2.cpp
62 changes: 62 additions & 0 deletions ExampleCodes/MPMD/Case-1/Source_2/main_2.cpp
@@ -0,0 +1,62 @@

#include <AMReX.H>
#include <AMReX_Print.H>
#include <AMReX_MultiFab.H>
#include <AMReX_PlotFileUtil.H>
#include <mpi.h>
#include <AMReX_MPMD.H>

int main(int argc, char* argv[])
{
    // Initialize amrex::MPMD to establish communication across the two apps
    MPI_Comm comm = amrex::MPMD::Initialize(argc, argv);
    amrex::Initialize(argc,argv,true,comm);
    {
        amrex::Print() << "Hello world from AMReX version " << amrex::Version() << "\n";
        // Number of data components at each grid point in the MultiFab
        int ncomp = 2;
        // how many grid cells in each direction over the problem domain
        int n_cell = 32;
        // how many grid cells are allowed in each direction over each box
        int max_grid_size = 8;
        // BoxArray -- Abstract Domain Setup
        // integer vector indicating the lower coordinate bounds
        amrex::IntVect dom_lo(n_cell/2, n_cell/2, n_cell/2);
        // integer vector indicating the upper coordinate bounds
        amrex::IntVect dom_hi(n_cell-1, n_cell-1, n_cell-1);
        // box containing the coordinates of this domain
        amrex::Box domain(dom_lo, dom_hi);
        // will contain a list of boxes describing the problem domain
        amrex::BoxArray ba(domain);
        // chop the single grid into many small boxes
        ba.maxSize(max_grid_size);
        // Distribution Mapping
        amrex::DistributionMapping dm(ba);
        // Create an MPMD Copier based on current ba & dm
        auto copr = amrex::MPMD::Copier(ba,dm,false);
        // Define MultiFab
        amrex::MultiFab mf(ba, dm, ncomp, 0);
        // Geometry -- Physical Properties for data on our domain
        amrex::RealBox real_box ({0.5, 0.5, 0.5}, {1. , 1., 1.});
        amrex::Geometry geom(domain, &real_box);
        // Receive ONLY the first populated MultiFab component from main_1.cpp
        copr.recv(mf,0,1);
        // Fill the second component of the MultiFab
        for (amrex::MFIter mfi(mf); mfi.isValid(); ++mfi) {
            const amrex::Box& bx = mfi.validbox();
            const amrex::Array4<amrex::Real>& mf_array = mf.array(mfi);

            amrex::ParallelFor(bx, [=] AMREX_GPU_DEVICE(int i, int j, int k){

                mf_array(i,j,k,1) = amrex::Real(10.)*mf_array(i,j,k,0);

            });
        }
        // Send ONLY the second MultiFab component (populated here) to main_1.cpp
        copr.send(mf,1,1);
        // Plot MultiFab Data
        WriteSingleLevelPlotfile("plt_cpp_2", mf, {"comp0","comp1"}, geom, 0., 0);
    }
    amrex::Finalize();
    amrex::MPMD::Finalize();
}
2 changes: 2 additions & 0 deletions ExampleCodes/MPMD/Case-1/mpmd_cpu.conf
@@ -0,0 +1,2 @@
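# Format: <MPI rank range> <executable launched on those ranks>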
0-7 ./Source_1/main3d.gnu.x86-milan.DEBUG.MPI.ex
8-11 ./Source_2/main3d.gnu.x86-milan.DEBUG.MPI.ex