Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add shmem_{initialized, finalized} #470

Closed
wants to merge 3 commits into from

Conversation

nspark
Copy link
Contributor

@nspark nspark commented Jun 25, 2021

This PR adds shmem_initialized and shmem_finalized to query the state of the library.

Closes #457

@nspark
Copy link
Contributor Author

nspark commented Aug 16, 2021

Suggestion from 8/16 WG: Add note clarifying that shmem_initialized() == 0 is not necessarily sufficient to prevent multiple threads from racing to call shmem_init.

@manjugv
Copy link
Collaborator

manjugv commented Aug 27, 2021

August spec meeting: Please add this note in the shmem_initialized section as API notes.

@BryantLam
Copy link
Collaborator

BryantLam commented Aug 27, 2021

These were the states I could think of in a K-map style:

  1. Before the library is initialized, shmem_initialized() == 0 and shmem_finalized() == 0.

    • If initialization routine is called, go to (2).
    • If finalization routine is called, spec doesn't say what happens, but implies undefined behavior 🐢 backed up by non-authoritative error note in backmatter.
    • (mentioned above and will be added as a note) If library is not initialized (and is it valid to do so), it is user error to have multiple threads race on calling shmem_init.
  2. After library is initialized, shmem_initialized() == 1 and shmem_finalized() == 0.

  3. After library is finalized, shmem_initialized() == <don't care> and shmem_finalized() == 1.

    • If initialization routine is called, undefined behavior like (2) 🐌.
    • If finalization routine is called, undefined behavior like (1) 🐢 and because it's not the last OpenSHMEM call 🐙.
    • As currently defined, we can get away with shmem_initialized() == <don't care> because we are now finalized and any subsequent OpenSHMEM call (except shmem_initialized() and shmem_finalized()) would result in undefined behavior 🐌🐙. As written, we have to set shmem_initialized() to some value but the value doesn't matter; it's effectively useless (or in a different respect, error prone) due to all the undefined behavior.

From these states, this would be one of the only valid conditional guards:

void app_startup() {
    if (shmem_finalized()) {
        // OpenSHMEM is done. Undefined behavior to call any OpenSHMEM routine.
        return or abort();
    }
    if (!shmem_initialized()) {
        // OpenSHMEM is not initialized. Okay to call OpenSHMEM initialization logic.
        shmem_init() or shmem_init_thread();
        this_app_initialized_openshmem = 1;
    }
    // OpenSHMEM is initialized.
}

void app_teardown() {
    if (this_app_initialized_openshmem == 1) {
        shmem_finalize();
    }
}

Or if your apps are not tearing down in reverse-startup order, you can choose to not (RE: must not?) put shmem_finalize() in app_teardown() and finalize OpenSHMEM from the parent scope:

in parent {
    app_startup();
    // do work
    app_teardown();
    if (!shmem_finalized()) { // Note: Should not use `shmem_initialized()` due to aforementioned `<don't care>`.
        shmem_finalize();
    }
}

I think this is a bit error-prone in my mind, but it works and it's what we've specified.

🐌 If subsequent calls to initialization routines were allowed in an active OpenSHMEM program, one behavior could be to "do nothing and increment value of shmem_initialized()" for nesting or counting references (shmem_finalized() is counting now too 🐙). This would then change the conditional guard to something like:

void app_startup() {
    if (shmem_finalized() == shmem_initialized()) {
        return or abort();
    }
    shmem_init() or shmem_init_thread();
    // OpenSHMEM is initialized.
}

void app_teardown() {
    shmem_finalize();
}

in parent {
    app_startup();
    // do work
    app_teardown();
    assert(shmem_finalized() == shmem_initialized());
    // All apps tore down correctly, or either one of them or you have a bug.
}

The "do nothing" behavior could be a problem if subsequent calls of shmem_init_thread had incompatible arguments.

@jdinan
Copy link
Collaborator

jdinan commented Sep 2, 2021

If we follow the MPI semantics, the guard for calling shmem_init would be !shmem_initialized() && !shmem_finalized().

@nspark
Copy link
Contributor Author

nspark commented Sep 8, 2021

Added the following note:

Although shmem_initialized is thread-safe, its return value is not a sufficient guard to prevent multiple threads from racing to initialize the OpenSHMEM library concurrently, as shmem_initialized may return 0 to one thread while library initialization is in progress due to a call from another thread. Applications must ensure that only one call to shmem_init[_thread] is made to initialize the OpenSHMEM library.

@jdinan
Copy link
Collaborator

jdinan commented Sep 8, 2021

@nspark The note is reasonable, but why not add reference counted init/finalize and require that init functions are thread safe?

@nspark
Copy link
Contributor Author

nspark commented Sep 8, 2021

@jdinan IDK, that just seemed like a change of a larger scope. I'm happy to draft it, but I'd probably do it as a separate PR.

@nspark nspark removed the HadReading label Jan 28, 2022

\apidescription{
The \FUNC{shmem\_initialized} routine returns a value indicating
whether the \openshmem library has been initialized (i.e, a call to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want "has been" or "is currently"?


\apidescription{
The \FUNC{shmem\_finalized} routine returns a value indicating
whether the \openshmem library has been finalized (i.e, a call to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want "has been" or "is currently"? The problem with "has been" is that it tells you whether the API was ever called. If we assume that an OpenSHMEM library can't be reinitialized, then this is enough information to tell you the library state. If an OpenSHMEM library supports reinitialization then "has been" is not enough information to know the state of the library. You need to know "is currently".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, here is the NVSHMEM library state query function: https://docs.nvidia.com/hpc-sdk/nvshmem/api/docs/gen/api/setup.html#nvshmemx-init-status

@jdinan jdinan added this to the OpenSHMEM 1.6 milestone May 27, 2022
@jdinan
Copy link
Collaborator

jdinan commented May 30, 2024

Superseded by #512

@jdinan jdinan closed this May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add API for querying if OpenSHMEM is currently initialized
4 participants