Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata optimization: Only write ADIOS2 attributes from rank 0 in openPMD-api >= 0.16 #5223

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

franzpoeschel
Copy link
Contributor

We were thinking about using this as the default option generally in openPMD, but our documentation specifies that attribute writing is collective only in HDF5, and attributes can be written independently in ADIOS2. So it would have broken the API contract to assume that attributes outside of rank 0 can be ignored.
In PIConGPU, however, they can.

@psychocoderHPC psychocoderHPC added this to the 0.8.0 / Next stable milestone Nov 18, 2024
@psychocoderHPC psychocoderHPC added component: plugin in PIConGPU plugin refactoring code change to improve performance or to unify a concept but does not change public API labels Nov 18, 2024
}
},
"adios2": {
"attribute_writing_ranks": 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this more targeted? Like this

Suggested change
"attribute_writing_ranks": 0,
# if OPENPMDAPI_VERSION_GE(0, 16, 0)
"attribute_writing_ranks": 0,
# endif

(with appropriate string concatenation). The next person changing something in this configuration will have to do so in two places. I'm aware that these are close together and it's not a huge burden but still it's something that could get overlooked in a quick diff that doesn't show the other section. (In fact, I've only seen the duplication because I've looked at the full file to understand what's going on. The current diff doesn't even show the other version.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried that at first, it's far worse to read than just seeing directly the configuration listed for the supported versions:

#    if OPENPMDAPI_VERSION_GE(0, 16, 0)
#        define PICONGPU_ATTRIBUTE_WRITING_RANKS                                                                      \
            "\n"                                                                                                      \
            R"(    "attribute_writing_ranks": 0,)"
#    else
#        define PICONGPU_ATTRIBUTE_WRITING_RANKS ""
#    endif

        std::string const& defaultValues = R"(
{
  "hdf5": {
    "dataset": {
      "chunks": "none"
    }
  },
  "adios2": {)" PICONGPU_ATTRIBUTE_WRITING_RANKS
                                           R"(
    "engine": {
      "preferred_flush_target": "buffer",
      "parameters": {
        "BufferChunkSize": 2147381248
      }
    }
  }
}
        )";

The only solution that I'd really consider an improvement in maintainability would be to define separately:

        std::string const& baseConfiguration = R"(
{
  "hdf5": {
    "dataset": {
      "chunks": "none"
    }
  },
  "adios2": {
    "engine": {
      "preferred_flush_target": "buffer",
      "parameters": {
        "BufferChunkSize": 2147381248
      }
    }
  }
}
        )";
    std::string const& openPMD_0_16_additionalConfiguration = 
        R"({"adios2": {"engine":{"attribute_writing_ranks": 0}}})";

And then use openPMD::json::merge() instead of manual string concatenation, but that's an additional round of parsing, an additionaly call to an external library and an additional runtime error source, too.

For a long-term solution, I'd suggest that we add something like {"dont_warn": ["preferred_flush_target"]} to openPMD so that warnings about unknown keys can be suppressed, making this above distinction unnecessary:

{
  "hdf5": {
    "dataset": {
      "chunks": "none"
    }
  },
  "adios2": {
    "engine": {
      "dont_warn": ["preferred_flush_target"],
      "preferred_flush_target": "buffer",
      "parameters": {
        "BufferChunkSize": 2147381248
      }
    }
  }
}

@chillenzer
Copy link
Contributor

A very online "offline" discussion: The current solution oscillates between string and json representation quite a bit. A cleaner solution that would also allow for a very readable solution of this particular problem would be to stay completely in json-world. @franzpoeschel will consider if changing this a beneficial move and act accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: plugin in PIConGPU plugin refactoring code change to improve performance or to unify a concept but does not change public API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants