Releases: ecmwf-ifs/dwarf-p-cloudsc

v1.5.2

16 Apr 10:17
95125c2

A bugfix release that points to the correct version tag for FIELD_API and fixes Loki processing when using the OMNI frontend instead of the default FParser2 frontend.

What's Changed

Full Changelog: v1.5.1...v1.5.2

v1.5.1

28 Mar 08:49
1aac43d

This is a bugfix release that fixes a performance bug in the Loki-generated variants, restoring the expected throughput on GPUs.

What's Changed

Full Changelog: v1.5.0...v1.5.1

v1.5.0

22 Mar 20:13
b92e249

What's New

  • A new variant based on Atlas, supporting FieldSet and MultiField storage backends (#60)
  • A new GPU-optimised OpenACC variant "k-caching" (#46)
  • HIP implementations for SCC, SCC-HOIST, SCC-K-CACHING (#59)
  • A new Python-interface variant with a Python driver that calls the Fortran code (#38)
  • A Python implementation generated via Loki (#51)
  • SYCL implementations for SCC, SCC-HOIST, SCC-K-CACHING (#64)
  • HDF5 input support for C-based variants
  • Update to the FIELD_API variant to use the new open-source FIELD_API library
  • Several updated or new architecture files

What's Changed

New Contributors

Full Changelog: v1.4.0...v1.5.0

Version 1.4.0

01 Mar 10:13
7da2893

What's Changed

  • Remove dependence on the global modules from the cloudsc-fortran by @piotrows in #35
  • Bug-fix for Loki-SCC/H array demotion by @mlange05 in #39
  • New Loki SCC-CUF variant by @MichaelSt98 in #31
  • CUDA Fortran (CUF) 'k-caching' version (further optimized version) by @MichaelSt98 in #33
  • Add col/s throughput metric in performance reports, add vector_length(NPROMA) to gang loops by @reuterbal in #42
  • Initial FIELD API and gpu-scc-field variant by @mlange05 in #41
  • CLOUDSC SCC-CUDA-C (semi-automatic) implementations by @MichaelSt98 in #40
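
The col/s throughput metric mentioned above can be illustrated with a minimal sketch. It assumes the metric is simply the number of grid columns processed divided by the elapsed wall-clock time; the function name, argument names, and sample numbers are hypothetical and are not taken from the CLOUDSC performance report code.

```python
def columns_per_second(num_columns: int, elapsed_seconds: float) -> float:
    """Return throughput in columns per second (col/s).

    Assumption: throughput = columns processed / wall-clock time.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_columns / elapsed_seconds


# Hypothetical run: 163840 columns processed in 0.25 s of wall-clock time.
throughput = columns_per_second(163_840, 0.25)
print(f"{throughput:.0f} col/s")
```

Unlike raw runtime, a per-column rate stays comparable across runs that use different problem sizes.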

Full Changelog: v1.3.0...v1.4.0

Version 1.3.0

20 Jan 13:49
f4a90b6

This release includes a significant number of changes and new developments:

  • An OpenMP-offload variant (dwarf-cloudsc-gpu-omp-scc-hoist), derived from the OpenACC SCC-hoist variant, developed by L. Lucido (Atos).
  • A new CUDA Fortran implementation (dwarf-cloudsc-gpu-scc-cuf), implementing the SCC loop layout. This requires the --with-cuda flag to build.
  • A new Python implementation, based on GT4Py, capable of generating CPU and GPU code (using cupy and, optionally, DaCe).
  • Deprecation of the dwarf-cloudsc-gpu-claw variant, which no longer works correctly on recent NVIDIA software stacks. Building this variant requires adding an explicit --with-claw flag to the build command.
  • With Loki publicly available, the references to the Loki repository have been updated and testing of the source-to-source translation variants is now possible.
  • A JUBE benchmark configuration has been added to ease testing across various platforms.
  • Clean-up of the CMake scripts
  • New arch files for:
    • HPC2020, the ECMWF Atos system in Bologna
    • LUMI
    • MeluXina
    • Additional Isambard partitions

Version 1.2.0

11 Feb 11:56
8cc1c8b
Merge pull request #9 from ecmwf-ifs/develop

Merge develop to main