Releases: ecmwf-ifs/dwarf-p-cloudsc
v1.5.2
A bugfix release to point to the correct version tag for FIELD_API and fix Loki processing when using the OMNI frontend instead of the (default) FParser2.
What's Changed
- Ensure yoecldp.F90 is parsed with FParser by @reuterbal in #82
- Mark projects no longer as optional and fix field-api to v0.3.0 by @reuterbal in #83
- Version 1.5.2 by @reuterbal in #84
Full Changelog: v1.5.1...v1.5.2
v1.5.1
This is a bugfix release that fixes a performance bug in the code of Loki-generated variants, restoring the expected throughput performance on GPUs.
What's Changed
- Loki: enable imports and mark headers as ignored instead of blocked by @reuterbal in #80
- v1.5.1 by @reuterbal in #81
Full Changelog: v1.5.0...v1.5.1
v1.5.0
What's New
- A new variant based on Atlas, supporting FieldSet and MultiField storage backends (#60)
- A new GPU-optimised OpenACC variant "k-caching" (#46)
- HIP implementations for SCC, SCC-HOIST, SCC-K-CACHING (#59)
- A new Python-interface variant with a Python driver that calls the Fortran code (#38)
- A Python implementation generated via Loki (#51)
- SYCL implementations for SCC, SCC-HOIST, SCC-K-CACHING (#64)
- HDF5 input support for C-based variants
- Update to the FIELD_API variant to use the new open-source FIELD_API library
- Several updated or new architecture files
What's Changed
- Remove obsolete architectures, add march=znver2 option to gnu by @piotrows in #45
- Cloudsc variant using SCC optimization with field_api (standalone repo) by @awnawab in #25
- Openacc k-caching variant by @MichaelSt98 in #46
- PyIface - Fortran-Python bridging with CMake compilation by @piotrows in #38
- Python: Adapt to latest GT4Py release by @stubbiali in #50
- Pure Python variant with test setup by @mlange05 in #51
- Python-gt4py: Removing GT4Py-based implementation of CLOUDSC by @mlange05 in #52
- Pool allocator via Loki by @reuterbal in #49
- Enable gvmode and add nvidia/22.11 arch files by @reuterbal in #55
- Switch to CTest for CI and add NVHPC 23.5 to CI testing by @reuterbal in #56
- Architecture files for Leonardo by @reuterbal in #58
- Bundle integration for Atlas by @reuterbal in #57
- New Atlas-based variant using BlockStructured FunctionSpace by @sbrdar in #54
- CLOUDSC HIP (SCC, SCC-HOIST, SCC-K-CACHING) by @MichaelSt98 in #59
- Loki config update by @mlange05 in #61
- HIP updates by @MichaelSt98 in #63
- SYCL-specific env and README entry by @mlange05 in #68
- Introducing SYCL implementations/variants by @MichaelSt98 in #64
- Cuda update: metric col/s by @MichaelSt98 in #67
- Update to new open-source FIELD_API by @awnawab in #70
- use Atlas structures with an IFS-like variable batching by @sbrdar in #60
- Loki F2C using loki_transform instead of loki_transform_transpile by @MichaelSt98 in #69
- Emancipation from serialbox via HDF5 for C-style variants by @MichaelSt98 in #62
- Fix LUMI-G archs for HDF5 in HIP variants by @reuterbal in #72
- HDF5 support for CUDA/HIP/SYCL variants by @MichaelSt98 in #71
- Adjust CUDA-specific loki config file to Loki v0.2.0 by @mlange05 in #75
- Add semantic for reading thread count from $OMP_NUM_THREADS by @antoine-morvan in #73
- Compatibility with Loki v0.2 by @reuterbal in #74
- Update Github actions versions to remove node.js warnings by @reuterbal in #76
- Update loki version to 0.2.0 and field_api to 0.3.0 by @reuterbal in #77
- Version 1.5.0 by @reuterbal in #78
New Contributors
- @awnawab made their first contribution in #25
- @sbrdar made their first contribution in #54
- @antoine-morvan made their first contribution in #73
Full Changelog: v1.4.0...v1.5.0
Version 1.4.0
What's Changed
- Remove dependence on the global modules from the cloudsc-fortran by @piotrows in #35
- Bug-fix for Loki-SCC/H array demotion by @mlange05 in #39
- New Loki SCC-CUF variant by @MichaelSt98 in #31
- CUDA Fortran (CUF) 'k-caching' version (further optimized version) by @MichaelSt98 in #33
- Add col/s throughput metric in performance reports, add vector_length(NPROMA) to gang loops by @reuterbal in #42
- Initial FIELD API and gpu-scc-field variant by @mlange05 in #41
- CLOUDSC SCC-CUDA-C (semi-automatic) implementations by @MichaelSt98 in #40
Full Changelog: v1.3.0...v1.4.0
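The col/s throughput metric added to the performance reports in #42 can be read as grid columns processed per second of wall-clock time. The snippet below is a minimal sketch under that assumption, not the code used in the repository; the function name `columns_per_second` and the sample numbers are hypothetical and chosen only for illustration.

```python
def columns_per_second(num_columns: int, elapsed_seconds: float) -> float:
    """Return throughput as columns processed per second of wall-clock time.

    Assumption: col/s is the total number of grid columns divided by the
    elapsed time of the kernel. Both arguments are illustrative names.
    """
    if elapsed_seconds <= 0.0:
        raise ValueError("elapsed_seconds must be positive")
    return num_columns / elapsed_seconds


# Hypothetical inputs, not measured results
throughput = columns_per_second(163840, 0.5)
print(f"{throughput:.0f} col/s")
```

Unlike a raw runtime, a per-column rate stays comparable across runs that use different problem sizes.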
Version 1.3.0
This release includes a significant number of changes and new developments:
- An OpenMP-offload variant (dwarf-cloudsc-gpu-omp-scc-hoist), derived from the OpenACC SCC-hoist variant, developed by L. Lucido (Atos).
- A new CUDA Fortran implementation (dwarf-cloudsc-gpu-scc-cuf), implementing the SCC loop layout. This requires the --with-cuda flag to build.
- A new Python implementation, based on GT4Py, capable of generating CPU and GPU code (using cupy and, optionally, DaCe)
- Deprecation of the dwarf-cloudsc-gpu-claw variant, which no longer works correctly on recent NVIDIA software stacks. Building this variant requires adding an explicit --with-claw flag to the build command.
- With Loki publicly available, the references to the Loki repository have been updated and testing of the source-to-source translation variants is now possible.
- A JUBE benchmark configuration has been added to ease testing across various platforms.
- Clean-up of the CMake scripts
- New arch files for:
  - HPC2020, the ECMWF Atos system in Bologna
  - LUMI
  - MeluXina
  - Additional Isambard partitions
Version 1.2.0
Merge pull request #9 from ecmwf-ifs/develop Merge develop to main