Merge pull request #96 from ecmwf-ifs/je-field-api-view-updates
Updated Field API variants of the clouds dwarf (CPU and GPU)
Showing 19 changed files with 1,154 additions and 344 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -28,6 +28,9 @@ Balthasar Reuter ([email protected]) | |
prototype that validates runs against platform and language-agnostic | ||
off-line reference data via HDF5 or the Serialbox package. The kernel code | ||
is also slightly cleaner than the original version. | ||
- **dwarf-cloudsc-fortran-field**: A Fortran version of CLOUDSC that uses Field API | ||
for the data structures. The intent of this version is to show how | ||
Field API is used in newer versions of the IFS. | ||
- **dwarf-cloudsc-c**: Standalone C version of the kernel that has | ||
been generated by ECMWF tools. This relies exclusively on the Serialbox | ||
validation mechanism. | ||
|
@@ -81,13 +84,18 @@ Balthasar Reuter ([email protected]) | |
- **dwarf-cloudsc-gpu-scc-field**: GPU-enabled and optimized version of | ||
CLOUDSC that uses the SCC loop layout, and uses [FIELD API](https://github.com/ecmwf-ifs/field_api) (a Fortran library purpose-built for IFS data-structures that facilitates the | ||
creation and management of field objects in scientific code) to perform device offload | ||
and copyback. The intent is to demonstrate the explicit use of pinned host memory to speed up | ||
data transfers, as provided by the shipped prototype implementation, and | ||
investigate the effect of different data storage allocation layouts. | ||
and copyback. | ||
The Field API variant supports modern features of FIELD API such as *field gangs*, which group | ||
multiple fields and allocate them in one larger field in order to reduce allocations and | ||
data transfers. Field gang support can be enabled at runtime by setting the environment | ||
variable `CLOUDSC_PACKED_STORAGE=ON`. If CUDA is available, the Field API variant also supports | ||
allocating fields in pinned memory. This is enabled by setting the | ||
environment variable `CLOUDSC_FIELD_API_PINNED=ON` and speeds up data transfers between host and device. | ||
To enable this variant, a suitable CUDA installation is required and the | ||
`--with-cuda` flag needs to be passed at the build stage. This variant lets the CUDA runtime | ||
manage temporary arrays and needs a large `NV_ACC_CUDA_HEAPSIZE` | ||
(e.g. `NV_ACC_CUDA_HEAPSIZE=8GB` for 160K columns.) | ||
manage temporary arrays and needs a large `NV_ACC_CUDA_HEAPSIZE` (e.g. `NV_ACC_CUDA_HEAPSIZE=8GB` for 160K columns). | ||
Registration of fields in the OpenACC data map by Field API can be disabled by passing the | ||
`--without-mapped-fields` flag at the build stage. | ||
- **cloudsc-pyiface.py**: a combination of the cloudsc/cloudsc-driver routines | ||
of cloudsc-fortran with the uppermost `dwarf` program replaced with a | ||
corresponding Python script capable of HDF5 data load and | ||
|
@@ -320,8 +328,9 @@ transfer overheads will dominate timings, and that most supported GPU | |
variants aim to optimise compute kernel timings only. However, a | ||
dedicated variant `dwarf-cloudsc-gpu-scc-field` has been added to | ||
explore host-side memory pinning, which improves data transfer times, | ||
and alternative data layout strategies. By default, this will allocate | ||
each array variable individually in pinned memory. A runtime flag | ||
and alternative data layout strategies. By default, pinned memory is turned off | ||
but can be turned on by setting the environment variable `CLOUDSC_FIELD_API_PINNED=ON`. | ||
This will allocate each array variable individually in pinned memory. A runtime flag | ||
`CLOUDSC_PACKED_STORAGE=ON` can be used to enable "packed" storage, | ||
where multiple arrays are stored in a single base allocation, e.g. | ||
|
||
|
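The runtime switches that the README changes above describe for `dwarf-cloudsc-gpu-scc-field` can be combined in a small launcher. The following is only a sketch: the three environment variables and their values come from the README text, while the binary location `./bin/dwarf-cloudsc-gpu-scc-field` and the hypothetical `CLOUDSC_BINARY` override are assumptions about the local build layout, not something this commit confirms.

```shell
#!/bin/sh
# Sketch of a launcher for the Field API GPU variant.
# ASSUMPTION: the binary path below is a guess at the build layout;
# CLOUDSC_BINARY is a hypothetical override introduced for this example.
BINARY="${CLOUDSC_BINARY:-./bin/dwarf-cloudsc-gpu-scc-field}"

# Group multiple fields into one larger allocation ("field gangs").
export CLOUDSC_PACKED_STORAGE=ON

# Allocate fields in pinned host memory (off by default, requires CUDA).
export CLOUDSC_FIELD_API_PINNED=ON

# The CUDA runtime manages temporary arrays, so a large heap is needed;
# the README suggests 8GB for 160K columns.
export NV_ACC_CUDA_HEAPSIZE=8GB

if [ -x "$BINARY" ]; then
    "$BINARY"
else
    echo "binary not found: $BINARY (set CLOUDSC_BINARY to its location)" >&2
fi
```

Leaving either `CLOUDSC_PACKED_STORAGE` or `CLOUDSC_FIELD_API_PINNED` unset runs the variant with that feature disabled, which makes it straightforward to compare transfer timings with and without each feature.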