OpenSHMEM Sessions (ctx hints) API #493

Merged
merged 32 commits into from
May 24, 2024
Commits
32 commits
2387433
Add 1st draft of bundle start/stop API & example
davidozog Oct 14, 2021
352f20f
Added change-log entry for the bundle routines
davidozog Oct 15, 2021
ca03037
Branch from wip/bundles w/ feedback (highlighted)
davidozog Dec 13, 2021
0b532e0
WIP text / rearrangment for Sessions
davidozog Mar 31, 2022
97fe276
Merge branch 'master' of github.com:openshmem-org/specification into …
davidozog Sep 22, 2022
9d7b266
Sessions: rewrite bundles text to suit sessions
davidozog Sep 23, 2022
ef0d9c6
Sessions: rewrite bundles text to suit sessions
davidozog Sep 23, 2022
64b51f6
Merge branch 'wip/sessions' of github.com:davidozog/openshmem-specifi…
davidozog Sep 28, 2022
09a0f10
Sessions: remove unecessary drafting artifacts
davidozog Sep 28, 2022
93f510e
Sessions: address some feedback from reading
davidozog Oct 10, 2022
0a9873a
Sessions: rewording and edits based on WG feedback
davidozog Mar 17, 2023
c2dd73e
Update content/sessions_intro.tex
davidozog Mar 17, 2023
96758c9
Update content/backmatter.tex
davidozog Mar 17, 2023
4616742
Sessions: better define "chain" and add an example
davidozog Jul 10, 2023
6417925
rm extraneous sessions options, clarify "chaining"
davidozog Nov 17, 2023
bd37f7c
sessions: swap shmem_session_start arguments
davidozog Dec 9, 2023
e015370
sessions: "chain together", not "optimizations"
davidozog Dec 9, 2023
242595b
sessions: small vs. large blocking puts for chains
davidozog Dec 9, 2023
ff3a2db
sessions: Say "HPC Challenge benchmark" not "GUPS"
davidozog Dec 9, 2023
94dd7c1
sessions: add quiet & comments to sessions example
davidozog Dec 9, 2023
7a6f600
sessions: no-op if ctx equals SHMEM_CTX_INVALID
davidozog Dec 20, 2023
9e24b47
sessions: redo options table, rename chain->batch
davidozog Dec 20, 2023
7034807
sessions: note "small" depends on implementation
davidozog Dec 20, 2023
4a0e482
sessions: update code example based on WG feedback
davidozog Dec 20, 2023
30ce68e
sessions/example: check ctx_create return & clean
davidozog Jan 17, 2024
caacb66
sessions/example: add config struct and mask
davidozog Jan 18, 2024
6d0f714
sessions: rename completion_rate to delivery_rate
davidozog Feb 13, 2024
933f8ff
sessions: remove delivery_rate config_t parameter
davidozog Apr 10, 2024
aa84ede
sessions: improve language regarding config_t hint
davidozog Apr 12, 2024
3fc9bf3
sessions: improve writing, fix example, SAME_AMO
davidozog May 9, 2024
26f3cc0
sessions: rename entire API to shmem_ctx_session_*
davidozog May 9, 2024
e22a3f1
sessions: remove SESSION_SAME_AMO hint for now...
davidozog May 9, 2024
4 changes: 4 additions & 0 deletions content/backmatter.tex
@@ -674,6 +674,10 @@ \section{Version 1.6}
operations for team-based reductions.
\ChangelogRef{teamreducetypes}%
%
\item Added the session routines, \FUNC{shmem\_session\_start} and
\FUNC{shmem\_session\_stop}, which allow users to pass hints to the
Collaborator


Should these start/stop functions have been updated to include ctx as well?

Collaborator Author


Oops, yes; I think these should have been renamed just like every other instance below.

How shall we handle this @jdinan? Do you think it's eligible for a doc edit?

\openshmem library to apply runtime optimizations.
\ChangelogRef{subsec:sessions}%
\item Added fine grained completion routine: \FUNC{shmem\_pe\_quiet}.
\ChangelogRef{subsec:shmem_pe_quiet}%
%
31 changes: 31 additions & 0 deletions content/sessions_intro.tex
@@ -0,0 +1,31 @@
\openshmem \emph{sessions} provide a mechanism for applications to inform the
\openshmem library of an upcoming sequence of communication routines that
exhibit suitable patterns for runtime optimizations.
A session is associated with a specific \openshmem communication context
(Section~\ref{sec:ctx}), and it indicates the beginning and ending of
communication phases on that context.
The \FUNC{shmem\_ctx\_session\_start} routine indicates the beginning of a session,
and the \FUNC{shmem\_ctx\_session\_stop} routine indicates the end of a session.
The \LibConstRef{SHMEM\_CTX\_SESSION\_*} options (Table~\ref{session_opts}) indicate
which patterns of \openshmem RMA and AMO routines will occur within a session.
These options serve only as \textit{hints} to the library; it is up to the
implementation whether or not to apply any optimizations within a session.
A session may be provided a configuration argument that specifies attributes
associated with the session. This configuration argument is of type
\CTYPE{shmem\_ctx\_session\_config\_t}, which is detailed further in
Section~\ref{subsec:shmem_ctx_session_config_t}.

Usage of the \openshmem session APIs on a particular context must comply with
the requirements of all options set on that context.
Starting and stopping \openshmem sessions should not affect the completion or
ordering semantics of any \openshmem routines in the program.
For these reasons, multi-threaded \openshmem programs may require additional
thread synchronization to ensure session hints are correctly applied to
shareable contexts.
Because sessions are associated with an \openshmem communication context,
routines not performed on a communication context (like collective routines)
are ineligible for session hints.

The \CTYPE{shmem\_ctx\_session\_config\_t} type requires the \CONST{SIZE\_MAX}
macro defined in \HEADER{stdint.h} by \Cstd[99]~\S7.18.3 and
\Cstd[11]~\S7.20.3.
79 changes: 79 additions & 0 deletions content/shmem_ctx_session_config_t.tex
@@ -0,0 +1,79 @@
\apisummary{
A structure type representing communication session configuration arguments
}

\begin{apidefinition}

\begin{Csynopsis}
typedef struct {
  size_t total_ops;
} shmem_ctx_session_config_t;
\end{Csynopsis}

\begin{apiarguments}
None.
\end{apiarguments}


\apidescription{
A communication session configuration object is provided as an argument to
the \FUNC{shmem\_ctx\_session\_start} routine.
The \VAR{shmem\_ctx\_session\_config\_t} object contains optional parameters
that are associated with the options of a communication session.
These parameters serve only as \textit{hints} to the library; it is up to
the implementation whether or not to use the parameter values within
a session.

The \VAR{total\_ops} member indicates the expected maximum number of all
calls to \openshmem RMA routines within the session (i.e., after a call to
\FUNC{shmem\_ctx\_session\_start} and before a corresponding call to
\FUNC{shmem\_ctx\_session\_stop}).
If \VAR{total\_ops} differs from the \textit{actual} number of calls to
\openshmem RMA routines within the session, then application performance
might be suboptimal; however, the results of any data transfers,
completions, or memory ordering operations are unaffected by the value of
\VAR{total\_ops}.

When passing a configuration structure to \FUNC{shmem\_ctx\_session\_start},
the mask parameter specifies which fields the application requests to
associate with the session.
Any configuration parameter value that is not indicated in the mask will be
ignored, and the default value will be used instead.
Therefore, a program must set only the fields for which it does not want
the default value.

A configuration mask is created through a bitwise OR operation of the
following library constants.
A configuration mask value of \CONST{0} indicates that the session
should be started with the default values for all configuration
parameters.

\widetablerow{\LibConstRef{SHMEM\_CTX\_SESSION\_TOTAL\_OPS}}{
The value of the \VAR{total\_ops} member of the \VAR{config} structure is
unmasked within the session and applied as a hint.
}

The default values for configuration parameters are:

\widetablerow{\VAR{total\_ops} = \CONST{SIZE\_MAX}}{
By default, the expected maximum number of calls to \openshmem RMA routines
in the session is set to the maximum value of a \CTYPE{size\_t} variable,
\CONST{SIZE\_MAX}. This default setting indicates that the \openshmem
application chooses not to specify a value for \VAR{total\_ops}.
}
}

\apinotes{
Users are encouraged to avoid calling the \FUNC{shmem\_fence},
\FUNC{shmem\_ctx\_fence}, \FUNC{shmem\_quiet}, or \FUNC{shmem\_ctx\_quiet}
routines within a session whenever possible, because the library must then
enforce strict completion to comply with their ordering semantics.
However, hints provided by \CTYPE{shmem\_ctx\_session\_config\_t} do not imply
the occurrence of any completion or memory ordering operations.
The requirements on buffers provided to \openshmem routines that are
\textit{in-use} (as described in Section
\ref{subsec:invoking_openshmem_operations}) apply regardless of any
\CTYPE{shmem\_ctx\_session\_config\_t} hints.
}

\end{apidefinition}
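
Review note: the masking rule described above can be modelled in plain C without linking against an OpenSHMEM library. This is only an illustrative sketch, not how any implementation resolves hints. The names `sketch_config_t`, `SKETCH_TOTAL_OPS`, and `sketch_effective_total_ops` are hypothetical stand-ins for `shmem_ctx_session_config_t`, `SHMEM_CTX_SESSION_TOTAL_OPS`, and library-internal logic; the bit value of the mask constant and the handling of a null `config` pointer are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for shmem_ctx_session_config_t. */
typedef struct {
    size_t total_ops;
} sketch_config_t;

/* Hypothetical mask bit; the real constant's value is implementation-defined. */
#define SKETCH_TOTAL_OPS (1L << 0)

/* Return the total_ops hint that applies to the session: the caller's value
 * when the mask selects the field, otherwise the default SIZE_MAX, which
 * means "the application did not specify a value". */
size_t sketch_effective_total_ops(const sketch_config_t *config,
                                  long config_mask)
{
    /* Assumption of this sketch: selecting a field requires a config object. */
    assert(config != NULL || (config_mask & SKETCH_TOTAL_OPS) == 0);

    if (config_mask & SKETCH_TOTAL_OPS) {
        return config->total_ops;
    }
    return SIZE_MAX;  /* field not selected: its stored value is ignored */
}
```

With a config whose `total_ops` is 1000, calling the function with `SKETCH_TOTAL_OPS` yields 1000, while a mask of `0` yields `SIZE_MAX` regardless of what the structure contains.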
113 changes: 113 additions & 0 deletions content/shmem_ctx_session_start.tex
@@ -0,0 +1,113 @@
\apisummary{
Start a communication session.
}

\begin{apidefinition}

\begin{Csynopsis}
void @\FuncDecl{shmem\_ctx\_session\_start}@(shmem_ctx_t ctx, long options, const shmem_ctx_session_config_t *config, long config_mask);
\end{Csynopsis}

\begin{apiarguments}
\apiargument{IN}{ctx}{A context handle specifying the context associated
with this session.}
\apiargument{IN}{options}{The set of requested options from
Table~\ref{session_opts} for this session. Multiple options may be
requested by combining them with a bitwise OR operation; otherwise,
\CONST{0} can be given if no options are requested.}
\apiargument{IN}{config}{
A pointer to the configuration parameters for the session.}
\apiargument{IN}{config\_mask}{
The bitwise mask representing the set of configuration parameters to use
from \VAR{config}.}
\end{apiarguments}

\apidescription{
\FUNC{shmem\_ctx\_session\_start} is a non-collective routine that begins a
session on communication context \VAR{ctx} with hints requested via
\VAR{options}.
Sessions on a communication context must be stopped with a call to
\FUNC{shmem\_ctx\_session\_stop} on the same context.
If a session is already started on a given context, another call to
\FUNC{shmem\_ctx\_session\_start} on that same context combines the new
\VAR{options} with the existing ones via a bitwise OR operation. In such a
case, member values in the \VAR{config} argument that are selected by
\VAR{config\_mask} replace any existing configuration values that are
already applied to the session.

If \VAR{ctx} compares equal to \LibConstRef{SHMEM\_CTX\_INVALID}, then
\FUNC{shmem\_ctx\_session\_start} performs no action and returns immediately.

No combination of \VAR{options} passed to \FUNC{shmem\_ctx\_session\_start}
results in undefined behavior, but some combinations may be detrimental for
performance; for example, when selecting an option that is not applicable
to the session. It is the user's responsibility to determine which
combination of \VAR{options} benefits the performance of the session.

The \VAR{config} argument specifies session configuration parameters,
which are described in Section~\ref{subsec:shmem_ctx_session_config_t}.

The \VAR{config\_mask} argument is a bitwise mask representing the set of
configuration parameters to use from \VAR{config}.
A \VAR{config\_mask} value of \CONST{0} indicates that the session should
be started with the default values for all configuration parameters.
See Section~\ref{subsec:shmem_ctx_session_config_t} for field mask names and
default configuration parameters.
}

\apireturnvalues{
None.
}

\sessiontablebegin

\sessiontablerow{\LibConstRef{SHMEM\_CTX\_SESSION\_BATCH}}{
A \textit{batch} is a series of calls to \openshmem routines that occur
within a session on a communication context (i.e., after a call to
\FUNC{shmem\_ctx\_session\_start} and before a corresponding call to
\FUNC{shmem\_ctx\_session\_stop}), that might tolerate an increase in
individual call latencies. Designating a batch may provide an opportunity
to decrease the overall overhead typically involved with the \openshmem
library implementing the series as individual RMA operations. In other
words, the performance of \openshmem programs that issue many consecutive
and small-sized RMA routines might be improved by informing the library
implementation ahead of time that it is free to delay transferring data
in order to buffer, combine, and/or coalesce the issued \openshmem
routines. The specific mechanisms for improving performance using
batching optimizations depend on the \openshmem library implementation.

The \VAR{SHMEM\_CTX\_SESSION\_BATCH} hint indicates that a communication
context will be used to issue a batch. An example of a batch is an
iterative loop of non-blocking RMA and/or AMO routines. A batch may
include a memory ordering or collective operation, but such routines
might require completions and/or synchronization that could degrade
performance.

Because sessions do not affect the completion or ordering semantics of any
\openshmem routines in the program, routines such as non-blocking RMAs,
non-blocking AMOs, non-blocking \OPR{put-with-signals}, blocking scalar
\OPR{puts}, small blocking \OPR{puts}, and blocking non-fetching AMOs are
viable candidates for batching. Other routines, such as large blocking
\OPR{puts}, all blocking \OPR{gets}, blocking fetching AMOs, and the
memory ordering routines might require the library to enforce
completions, reducing the potential benefit of batching.

The \VAR{total\_ops} field of \VAR{config} indicates the expected maximum
number of calls to \openshmem RMA routines within the session.
See Section~\ref{subsec:shmem_ctx_session_config_t} for details
about \VAR{shmem\_ctx\_session\_config\_t} parameters.
} \hline

\sessiontableend
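
Review note: the "buffer, combine, and/or coalesce" idea in the batch description can be illustrated with a toy model in plain C. Everything here is hypothetical: `sketch_batch_t`, `sketch_batch_xor`, `sketch_batch_flush`, and the capacity of 4 are invented for illustration and do not describe any OpenSHMEM implementation. The sketch only counts simulated network injections; a real library would additionally have to drain such a buffer whenever completion or ordering is required (for example at a quiet or fence), since sessions do not change those semantics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BATCH_CAPACITY 4  /* arbitrary buffer size for the sketch */

/* One pending non-fetching XOR update aimed at table[index] on PE pe. */
typedef struct {
    int pe;
    size_t index;
    uint64_t value;
} sketch_update_t;

typedef struct {
    sketch_update_t pending[SKETCH_BATCH_CAPACITY];
    size_t count;    /* updates currently buffered */
    size_t flushes;  /* simulated network injections so far */
} sketch_batch_t;

/* Simulate handing every buffered update to the network in one injection. */
void sketch_batch_flush(sketch_batch_t *b)
{
    if (b->count > 0) {
        b->flushes++;
        b->count = 0;
    }
}

/* Buffer an update instead of sending it immediately. Updates to the same
 * target are combined in place, which preserves the final value because XOR
 * is associative and commutative. */
void sketch_batch_xor(sketch_batch_t *b, int pe, size_t index, uint64_t value)
{
    for (size_t i = 0; i < b->count; i++) {
        if (b->pending[i].pe == pe && b->pending[i].index == index) {
            b->pending[i].value ^= value;
            return;
        }
    }
    if (b->count == SKETCH_BATCH_CAPACITY) {
        sketch_batch_flush(b);  /* buffer full: send what we have */
    }
    assert(b->count < SKETCH_BATCH_CAPACITY);
    b->pending[b->count++] = (sketch_update_t){ pe, index, value };
}
```

In this model, individual calls return before any data moves, which is the latency-for-throughput trade that the batch hint tells the library the application can tolerate.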

\apinotes{
The \FUNC{shmem\_ctx\_session\_start} routine provides hints for improving
performance, and \openshmem implementations are not required to apply any
optimization.
\FUNC{shmem\_ctx\_session\_start} is non-collective, so there is no implied
synchronization.
Blocking puts must be sufficiently small to benefit from batching, and the
exact threshold for this benefit depends on the \openshmem implementation
and/or the application.
}

\end{apidefinition}
46 changes: 46 additions & 0 deletions content/shmem_ctx_session_stop.tex
@@ -0,0 +1,46 @@
\apisummary{
Stop a communication session.
}

\begin{apidefinition}

\begin{Csynopsis}
void @\FuncDecl{shmem\_ctx\_session\_stop}@(shmem_ctx_t ctx);
\end{Csynopsis}

\begin{apiarguments}
\apiargument{IN}{ctx}{A context handle specifying the context associated
with this session.}
\end{apiarguments}

\apidescription{
The \FUNC{shmem\_ctx\_session\_stop} routine ends a session on context \VAR{ctx}.
If a session is already stopped on a given communication context, another
call to \FUNC{shmem\_ctx\_session\_stop} on that context has no effect.
}

\apireturnvalues{
None.
}

\apinotes{
Users are discouraged from including non-\openshmem code, such as a long
computation loop, within a session without first calling
\FUNC{shmem\_ctx\_session\_stop}.
}


\begin{apiexamples}

\apicexample
{The following \CorCpp{} program demonstrates the usage of
\FUNC{shmem\_ctx\_session\_start} and \FUNC{shmem\_ctx\_session\_stop} with a loop of
random atomic non-fetching XOR updates to a distributed table, similar to
the HPC Challenge RandomAccess GUPS (Giga-updates per second) benchmark
\footnote{http://icl.cs.utk.edu/projectsfiles/hpcc/RandomAccess/}.}
{./example_code/shmem_ctx_session_example.c}
{}
\end{apiexamples}

\end{apidefinition}
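
Review note: the start and stop rules from the two routine descriptions (no-op on an invalid context, options combined by bitwise OR on a repeated start, masked config values replacing earlier ones, and a repeated stop having no effect) can be captured in a small state model. This is a sketch under stated assumptions, not library code. All `sketch_*` and `SKETCH_*` names are hypothetical; `SKETCH_SESSION_OTHER` is an invented second option that exists only to show the OR behavior; and two choices are interpretations the text does not spell out: a repeated start with a mask of `0` keeps the existing hint, and stop on an invalid context does nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option and mask bits; real values are implementation-defined. */
#define SKETCH_SESSION_BATCH (1L << 0)
#define SKETCH_SESSION_OTHER (1L << 1)  /* invented, only for the OR demo */
#define SKETCH_TOTAL_OPS     (1L << 0)

typedef struct {
    size_t total_ops;
} sketch_config_t;

/* Per-context session state as tracked by this model. */
typedef struct {
    bool valid;        /* false plays the role of SHMEM_CTX_INVALID */
    bool active;       /* true between start and stop */
    long options;      /* OR of all options passed since the session began */
    size_t total_ops;  /* current hint; SIZE_MAX means "not specified" */
} sketch_ctx_t;

void sketch_session_start(sketch_ctx_t *ctx, long options,
                          const sketch_config_t *config, long config_mask)
{
    assert(ctx != NULL);
    if (!ctx->valid) {
        return;  /* invalid context: perform no action */
    }
    if (!ctx->active) {
        /* First start: begin from defaults. */
        ctx->active = true;
        ctx->options = 0;
        ctx->total_ops = SIZE_MAX;
    }
    ctx->options |= options;  /* repeated start: combine, do not replace */
    if (config != NULL && (config_mask & SKETCH_TOTAL_OPS)) {
        ctx->total_ops = config->total_ops;  /* masked field replaces old hint */
    }
}

void sketch_session_stop(sketch_ctx_t *ctx)
{
    assert(ctx != NULL);
    if (!ctx->valid || !ctx->active) {
        return;  /* already stopped (or invalid): no effect */
    }
    ctx->active = false;
    ctx->options = 0;
    ctx->total_ops = SIZE_MAX;
}
```

Note that neither function touches communication state: in line with the specification text, starting or stopping a session in this model has no completion or ordering side effects.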

4 changes: 2 additions & 2 deletions content/shmem_team_config_t.tex
@@ -32,8 +32,8 @@
See Section~\ref{sec:ctx} for more on communication contexts and
Section~\ref{subsec:shmem_team_create_ctx} for team-based context creation.

-When using the configuration structure to create teams, a mask parameter
-controls which fields may be accessed by the \openshmem library.
+When passing a configuration structure to a team creation routine, the mask parameter
+specifies which fields the application requests to associate with the new team.
Any configuration parameter value that is not indicated in the mask will be
ignored, and the default value will be used instead.
Therefore, a program must set only the fields for which it does not want the default value.
51 changes: 51 additions & 0 deletions example_code/shmem_ctx_session_example.c
@@ -0,0 +1,51 @@
#include <shmem.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

#define N_UPDATES (1lu << 18)
#define N_INDICES (1lu << 10)
#define N_VALUES (1lu << 31)

int main(void) {

  shmem_init();

  uint64_t *table = shmem_calloc(N_INDICES, sizeof(uint64_t));

  int mype = shmem_my_pe();
  int npes = shmem_n_pes();
  srand(mype);

  shmem_ctx_t ctx;
  int ret = shmem_ctx_create(0, &ctx);
  if (ret != 0) {
    printf("%d: Error creating context (%d)\n", mype, ret);
    shmem_global_exit(1);
  }

  shmem_ctx_session_config_t config;
  long config_mask;
  config.total_ops = N_UPDATES;
  config_mask = SHMEM_CTX_SESSION_TOTAL_OPS;

  shmem_ctx_session_start(ctx, SHMEM_CTX_SESSION_BATCH, &config, config_mask);

  for (size_t i = 0; i < N_UPDATES; i++) {
    int random_pe = rand() % npes;
    size_t random_idx = rand() % N_INDICES;
    uint64_t random_val = rand() % N_VALUES;
    shmem_ctx_uint64_atomic_xor(ctx, &table[random_idx], random_val, random_pe);
  }

  shmem_ctx_session_stop(ctx);
  shmem_ctx_quiet(ctx); /* shmem_ctx_session_stop() does not quiet the context. */
  shmem_sync_all();     /* shmem_ctx_session_stop() does not synchronize. */

  /* At this point, it is safe to check and/or validate the table result... */

  shmem_ctx_destroy(ctx);
  shmem_free(table);
  shmem_finalize();
  return 0;
}
13 changes: 12 additions & 1 deletion main_spec.tex
@@ -166,7 +166,6 @@ \subsubsection{\textbf{SHMEM\_CTX\_GET\_TEAM}}
\label{subsec:shmem_ctx_get_team}
\input{content/shmem_ctx_get_team.tex}


\subsection{Remote Memory Access Routines}\label{sec:rma}
\input{content/rma_intro.tex}

@@ -358,6 +357,18 @@ \subsubsection{\textbf{SHMEM\_SIGNAL\_FETCH}}\label{subsec:shmem_signal_fetch}
\input{content/shmem_signal_fetch.tex}


\subsection{Session Routines}\label{subsec:sessions}
\input{content/sessions_intro.tex}

\subsubsection{\textbf{SHMEM\_CTX\_SESSION\_CONFIG\_T}}\label{subsec:shmem_ctx_session_config_t}
\input{content/shmem_ctx_session_config_t.tex}

\subsubsection{\textbf{SHMEM\_CTX\_SESSION\_START}}\label{subsec:shmem_ctx_session_start}
\input{content/shmem_ctx_session_start.tex}

\subsubsection{\textbf{SHMEM\_CTX\_SESSION\_STOP}}\label{subsec:shmem_ctx_session_stop}
\input{content/shmem_ctx_session_stop.tex}


\subsection{Collective Routines}\label{subsec:coll}
\input{content/collective_intro.tex}