Merge pull request #540 from jdinan/drafts/osh-1.6-rc1
OpenSHMEM 1.6rc1 Draft
jdinan authored Sep 27, 2024
2 parents 1d6f40e + 82b8e19 commit 4b70f0b
Showing 86 changed files with 786 additions and 480 deletions.
18 changes: 16 additions & 2 deletions content/atomics_intro.tex
Original file line number Diff line number Diff line change
Expand Up @@ -29,11 +29,18 @@
The non-fetching routines include:
\FUNC{shmem\_atomic\_\{set, inc, add, and, or, xor\}[\_nbi]}.

\begin{DeprecateBlock}

Starting in \openshmem[1.4], the names of all \ac{AMO} functions include ``\_atomic\_'',
and the equivalent functions without ``\_atomic\_'' in the name are deprecated.

\end{DeprecateBlock}

\end{itemize}

\openshmem \ac{AMO} routines specified in this section have two variants. In
one of the variants, the context handle, \VAR{ctx}, is explicitly passed as
an argument. In this variant, the operation is performed on the specified
an argument. In this variant, the operation is performed on the specified
context. If the context handle \VAR{ctx} does not correspond to a valid
context, the behavior is undefined. In the other variant, the context handle
is not explicitly passed and thus, the operations are performed on the
Expand All @@ -56,7 +63,7 @@
integer types defined in \HEADER{stdint.h} by \Cstd[99]~\S7.18.1.1 and
\Cstd[11]~\S7.20.1.1. When the \Cstd translation environment
does not provide exact-width integer types with \HEADER{stdint.h}, an
\openshmem implemementation is not required to provide support for these types.
\openshmem implementation is not required to provide support for these types.

\begin{table}[h]
\begin{center}
Expand Down Expand Up @@ -123,3 +130,10 @@
\label{bitamotypes}
\end{center}
\end{table}
]






137 changes: 103 additions & 34 deletions content/backmatter.tex
Expand Up @@ -151,9 +151,9 @@ \chapter{Undefined Behavior in OpenSHMEM}\label{sec:undefined}
\tabularnewline
\hline
Use of non-symmetric variables & Some routines require remotely accessible
variables to perform their function. For example, a \PUT{} to a non-symmetric variable may
be trapped where possible and the library may abort the program. Another
implementation may choose to continue execution with or without a warning.
variables to perform their function. For example, an \openshmem library may detect a \PUT{} to a non-symmetric variable
and choose to abort the program.
However, another implementation may choose to continue execution with or without a warning.
\tabularnewline
\hline
Non-symmetric allocation of symmetric memory & The symmetric memory management routines are
Expand Down Expand Up @@ -648,12 +648,17 @@ \subsection{Table~\ref{p2psynctypes}: point-to-point synchronization types}
\chapter{Changes to this Document}\label{sec:changelog}

\section{Version 1.6}
\label{changelog:v1.6}
Major changes in \openshmem[1.6] include the addition of the new
\FUNC{shmem\_team\_ptr}, \FUNC{shmem\_ibget}, and \FUNC{shmem\_ibput}
functions.

The following list describes the specific changes in \openshmem[1.6]:
\begin{itemize}
\begin{enumerate}
%
\item Added an inclusive (\FUNC{shmem\_sum\_inscan}) and exclusive
(\FUNC{shmem\_sum\_exscan}) collective summation operation.
\ChangelogRef{subsec:shmem_scan}
%
\item Added support for initialization and finalization routines to be called
multiple times, and added an initialization status query API
Expand All @@ -668,23 +673,14 @@ \section{Version 1.6}
update a remote flag without associated data transfer of a put-with-signal operation.
\ChangelogRef{subsec:shmem_signal_add, subsec:shmem_signal_set}%
%
\item Clarified that \OPR{Fence} operations only guarantee ordering for
operations that are performed on the same context.
\ChangelogRef{subsec:shmem_fence}%
%
\item Added a team-based pointer query routine:
\FUNC{shmem\_team\_ptr}.
\ChangelogRef{subsec:shmem_team_ptr}%
%
\item Clarified that \FUNC{shmem\_team\_split\_strided} and
\FUNC{shmem\_team\_split\_strided} return a nonzero value when the parent
team compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID}.
\ChangelogRef{subsec:shmem_team_split_strided, subsec:shmem_team_split_2d}%
%
\item Removed \openshmem[1.5] Table 9, which was an incomplete duplicate of
\openshmem[1.5] Table 10, and clarified the types, names, and supporting
operations for team-based reductions.
\ChangelogRef{teamreducetypes}%
\item Clarified that the behavior of \FUNC{shmem\_team\_split\_strided} is
undefined when the input \VAR{start}, \VAR{stride}, and \VAR{size} arguments
imply a \textit{wrap-around} with respect to the parent team's \acp{PE}.
\ChangelogRef{subsec:shmem_team_split_strided}%
%
\item Added the session routines, \FUNC{shmem\_ctx\_session\_start} and
\FUNC{shmem\_ctx\_session\_stop}, which allow users to pass hints to the
Expand All @@ -703,11 +699,6 @@ \section{Version 1.6}
the world team.
\ChangelogRef{subsec:shmem_malloc, subsec:shmem_free, subsec:shmem_realloc,
subsec:shmem_align, subsec:shmmallochint, subsec:shmem_calloc}%
\item Corrected the level argument's recommended value in API notes for
\FUNC{shmem\_pcontrol} to indicate that the value should be greater than
2 to enable profiling with profile library defined effects and
additional arguments.
\ChangelogRef{subsec:shmem_pcontrol}
%
\item Clarified that \FUNC{shmem\_team\_get\_config} returns the current
configuration values, which may differ from the values assigned at the
Expand All @@ -722,7 +713,44 @@ \section{Version 1.6}
stride argument is 0 or negative.
\ChangelogRef{subsec:shmem_team_split_strided}
%
\end{itemize}
\item Clarified the requirements for the source buffer before entering the
collective routines.
\ChangelogRef{subsec:shmem_alltoall,subsec:shmem_broadcast,subsec:shmem_collect,subsec:shmem_reductions,subsec:shmem_scan}
%
\item Added a new Errata Section~\ref{sec:errata} that indicates errors or ambiguities in the
\openshmem specification and the version that required correction or clarification.
\ChangelogRef{sec:errata}
%
\item Removed \openshmem[1.5] Table 9, which was an incomplete duplicate of
\openshmem[1.5] Table 10, and clarified the types, names, and supporting
operations for team-based reductions. \label{changelog:reduction_table}
\ChangelogRef{teamreducetypes}%
%
\item Clarified that \VAR{source} and \VAR{dest} arrays must be the same
      across \acp{PE} in \openshmem reductions. \label{changelog:reduction_args}
\ChangelogRef{subsec:shmem_reductions}
%
\item Clarified that \OPR{Fence} operations only guarantee ordering for
operations that are performed on the same context. \label{changelog:fence_ctx}
\ChangelogRef{subsec:shmem_fence}%
%
\item Clarified that \FUNC{shmem\_test\_all} and \FUNC{shmem\_test\_all\_vector}
routines return 1 when the test set is empty. \label{changelog:test_all}
\ChangelogRef{subsec:shmem_test_all,subsec:shmem_test_all_vector}%
%
\item Clarified that \FUNC{shmem\_team\_split\_strided} and
    \FUNC{shmem\_team\_split\_2d} return a nonzero value when the parent
team compares equal to \LibConstRef{SHMEM\_TEAM\_INVALID}. \label{changelog:split_strided_2d}
\ChangelogRef{subsec:shmem_team_split_strided, subsec:shmem_team_split_2d}%
%
\item Corrected the level argument's recommended value in API notes for
\FUNC{shmem\_pcontrol} to indicate that the value should be greater than
2 to enable profiling with profile library defined effects and
additional arguments. \label{changelog:pcontrol}
\ChangelogRef{subsec:shmem_pcontrol}
%

\end{enumerate}
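
The difference between the inclusive (\FUNC{shmem\_sum\_inscan}) and exclusive (\FUNC{shmem\_sum\_exscan}) summations listed above can be illustrated with a small model. The sketch below is plain C and makes no \openshmem calls. The function names \texttt{model\_sum\_inscan} and \texttt{model\_sum\_exscan} are hypothetical, each \ac{PE} is assumed to contribute a single \texttt{long} value, and the exclusive result on the first \ac{PE} is assumed to be 0.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C model only: contrib[i] stands for the value contributed by the
 * PE at position i in the team, and result[i] for the value that PE would
 * hold after the scan. These helpers are hypothetical and are not part of
 * the OpenSHMEM API. */

/* Inclusive scan: result[i] = contrib[0] + ... + contrib[i]. */
void model_sum_inscan(const long *contrib, long *result, size_t npes)
{
    assert(npes == 0 || (contrib != NULL && result != NULL));
    long running = 0;
    for (size_t i = 0; i < npes; i++) {
        running += contrib[i];   /* include this PE's own contribution */
        result[i] = running;
    }
}

/* Exclusive scan: result[i] = contrib[0] + ... + contrib[i-1].
 * Assumption: the first PE receives 0. */
void model_sum_exscan(const long *contrib, long *result, size_t npes)
{
    assert(npes == 0 || (contrib != NULL && result != NULL));
    long running = 0;
    for (size_t i = 0; i < npes; i++) {
        result[i] = running;     /* exclude this PE's own contribution */
        running += contrib[i];
    }
}
```

With contributions 1, 2, 3, 4 from four \acp{PE}, the inclusive model yields 1, 3, 6, 10 and the exclusive model yields 0, 1, 3, 6. The actual routines operate element-wise on symmetric arrays; see the referenced section for their definition.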

\section{Version 1.5}
Major changes in \openshmem[1.5] include the addition of new team-based
Expand All @@ -732,7 +760,7 @@ \section{Version 1.5}
interface, and the removal of the entire \Fortran \ac{API}.

The following list describes the specific changes in \openshmem[1.5]:
\begin{itemize}
\begin{enumerate}
%
\item Removed \FUNC{SHMEM\_CACHE}.
\ChangelogRef{dep:shmem_cache}%
Expand Down Expand Up @@ -883,7 +911,7 @@ \section{Version 1.5}
\item Clarified the atomicity guarantees of the \openshmem memory model.
\ChangelogRef{subsec:amo_guarantees}%
%
\end{itemize}
\end{enumerate}

\section{Version 1.4}
Major changes in \openshmem[1.4] include
Expand All @@ -898,7 +926,7 @@ \section{Version 1.4}
and \Cstd[11] type-generic interfaces for point-to-point synchronization.

The following list describes the specific changes in \openshmem[1.4]:
\begin{itemize}
\begin{enumerate}
%
\item New communication management \ac{API}, including \FUNC{shmem\_ctx\_create};
\FUNC{shmem\_ctx\_destroy}; and additional \ac{RMA}, \ac{AMO}, and memory ordering
Expand Down Expand Up @@ -993,7 +1021,7 @@ \section{Version 1.4}
%
\item Expanded the type support for \ac{RMA}, \ac{AMO}, and point-to-point
synchronization operations.
%% cleveref will compress a list of references by default. It is better to not
%% cleveref will compress a list of references by default. It is better to not
%% compress this list of *table* references because the clickable hyperref
%% links are useful. You can tell cleveref to not compress the LHS and RHS by
%% inserting an empty item between them; i.e., `,,`.
Expand All @@ -1018,7 +1046,7 @@ \section{Version 1.4}
\item Clarified that complex-typed reductions in C are optionally supported.
\ChangelogRef{subsec:shmem_reductions}%
%
\end{itemize}
\end{enumerate}



Expand All @@ -1031,7 +1059,7 @@ \section{Version 1.3}
and \Cstd[11] type-generic interfaces for \ac{RMA} and \ac{AMO} operations.

The following list describes the specific changes in \openshmem[1.3]:
\begin{itemize}
\begin{enumerate}
%
\item Clarified implementation of \acp{PE} as threads.
%
Expand Down Expand Up @@ -1072,7 +1100,7 @@ \section{Version 1.3}
\item Deprecation of \FUNC{SHMEM\_CACHE}.
\ChangelogRef{dep:shmem_cache}%
%
\end{itemize}
\end{enumerate}



Expand All @@ -1087,7 +1115,7 @@ \section{Version 1.2}
and clarifications to several \ac{API} descriptions.

The following list describes the specific changes in \openshmem[1.2]:
\begin{itemize}
\begin{enumerate}
%
\item Added specification of \VAR{pSync} initialization for all routines that use it.
%
Expand Down Expand Up @@ -1143,7 +1171,7 @@ \section{Version 1.2}
support across versions of the \openshmem Specification.
\ChangelogRef{sec:dep}%
%
\end{itemize}
\end{enumerate}



Expand All @@ -1157,7 +1185,7 @@ \section{Version 1.1}
and general readability and usability improvements to the document structure.

The following list describes the specific changes in \openshmem[1.1]:
\begin{itemize}
\begin{enumerate}
%
\item Clarifications of the completion semantics of memory synchronization
interfaces.
Expand Down Expand Up @@ -1266,6 +1294,47 @@ \section{Version 1.1}
\item Name changes for UV and ICE for \ac{SGI} systems.
\ChangelogRef{sec:openshmem_history}%
%
\end{itemize}
\end{enumerate}

\chapter{Errata}\label{sec:errata}

Errors or ambiguities in the \openshmem specification may be discovered after
publication.
Errata, or corrections, are included in the sections below indicating the
version of the \openshmem specification that required the correction or
clarification.
These corrections have been applied to all subsequent versions of the
specification and this section serves as a historical record of the changes
made to assist users and implementers with applying the necessary corrections.
Errata that result in a change to the specification are also included in
Annex~\ref{sec:changelog}.
For an implementation to comply with a particular version of \openshmem, it
must account for all errata associated with that version as indicated below.

\section{Version 1.5}

\begin{enumerate}
\item Removed \openshmem[1.5] Table 9, which was an incomplete duplicate of
\openshmem[1.5] Table 10, and clarified the types, names, and supporting
operations for team-based reductions
(\ref{changelog:v1.6}.\ref{changelog:reduction_table}).
\item Clarified that \VAR{source} and \VAR{dest} arrays must be the same
across \acp{PE} in \openshmem reductions
(\ref{changelog:v1.6}.\ref{changelog:reduction_args}).
\item Clarified that \OPR{Fence} operations only guarantee ordering for operations
that are performed on the same context
(\ref{changelog:v1.6}.\ref{changelog:fence_ctx}).
\item Clarified that \FUNC{shmem\_test\_all} and
\FUNC{shmem\_test\_all\_vector} routines return 1 when the test set is empty
(\ref{changelog:v1.6}.\ref{changelog:test_all}).
\item Clarified that \FUNC{shmem\_team\_split\_strided} and
\FUNC{shmem\_team\_split\_2d} return nonzero when the parent team is
\LibConstRef{SHMEM\_TEAM\_INVALID}
(\ref{changelog:v1.6}.\ref{changelog:split_strided_2d}).
\item Corrected the \VAR{level} argument's recommended value in API notes for
\FUNC{shmem\_pcontrol} to indicate that the value should be greater than 2 to enable
profiling with profile library defined effects and additional arguments
(\ref{changelog:v1.6}.\ref{changelog:pcontrol}).
\end{enumerate}
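
The empty-test-set erratum for \FUNC{shmem\_test\_all} above can be read as a vacuous-truth rule: if nothing is tested, nothing fails, so the result is 1. The sketch below models that rule in plain C without calling \openshmem. The function \texttt{model\_test\_all\_eq} is hypothetical, it supports only an equality comparison, and it assumes that a nonzero \VAR{status} entry excludes the corresponding element from the test set.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C model only; not the OpenSHMEM routine. Returns 1 when every
 * element in the test set equals cmp_value, and 0 otherwise.
 * Assumption: status may be NULL (all elements tested), and a nonzero
 * status[i] removes ivars[i] from the test set. */
int model_test_all_eq(const long *ivars, size_t nelems,
                      const int *status, long cmp_value)
{
    assert(nelems == 0 || ivars != NULL);
    for (size_t i = 0; i < nelems; i++) {
        if (status != NULL && status[i] != 0)
            continue;            /* element excluded from the test set */
        if (ivars[i] != cmp_value)
            return 0;            /* a tested element does not match */
    }
    /* Every tested element matched. This point is also reached when the
     * test set is empty (nelems == 0 or all elements masked out). */
    return 1;
}
```

In this model the test set is empty either when \VAR{nelems} is 0 or when every \VAR{status} entry is nonzero, and both cases return 1 without inspecting any element.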

%end of setlength command that was started in frontmatter.tex
25 changes: 14 additions & 11 deletions content/collective_intro.tex
@@ -1,21 +1,24 @@
\emph{Collective routines} are defined as coordinated communication or synchronization
operations performed by a group of \acp{PE}.

\openshmem provides three types of collective routines:
\openshmem provides four types of collective routines:

\begin{enumerate}
\item Collective routines that operate on teams use a team handle parameter to determine
which \acp{PE} will participate in the routine, and use resources encapsulated by the team object
to perform operations. See Section~\ref{subsec:team} for details on team management.
\item Collective routines that operate on teams use a team handle parameter to determine
which \acp{PE} will participate in the routine, and use resources encapsulated by the team object
to perform operations. See Section~\ref{subsec:team} for details on team management.

\begin{DeprecateBlock}
\item Collective routines that operate on active sets use a set of parameters to determine
which \acp{PE} will participate and what resources are used to perform operations.
\end{DeprecateBlock}
\begin{DeprecateBlock}
\item Collective routines that operate on active sets use a set of parameters to determine
which \acp{PE} will participate and what resources are used to perform operations.

\item Collective routines that do not accept active set
parameters and, as required, the default context.
\end{DeprecateBlock}

\item Collective routines that accept neither team nor active set
parameters, which implicitly operate on the world team and, as
required, the default context.
\item Collective routines that do not accept team
parameters, which implicitly operate on the world team and, as
required, the default context.
\end{enumerate}

Concurrent accesses to symmetric memory by an \openshmem collective
Expand Down
54 changes: 53 additions & 1 deletion content/coverpage.tex
Expand Up @@ -47,8 +47,60 @@ \section*{Sponsored by}
\end{itemize}

\section*{Authors and Collaborators}
This document is a collaborative effort consisting of several releases of \openshmem versions 1.0 through 1.5. This section lists the authors and contributors in reverse chronological order, starting with \openshmem 1.5.
This document is a collaborative effort consisting of several releases of \openshmem versions 1.0 through 1.6. This section lists the authors and contributors in reverse chronological order, starting with \openshmem 1.6.

\subsection*{\openshmem 1.6}
\begin{multicols}{2}
\begin{itemize}
\setlength\itemsep{0.1em}
\item Ferrol Aderholdt, NVIDIA
\item Muhammad Awad, \ac{AMD}
\item Matthew Baker, \ac{ORNL}
\item Swen Boehm, \ac{ORNL}
\item Aurelien Bouteiller, \ac{UTK}
\item Mark Brown, Intel
\item Bob Cernohous, \ac{HPE}
\item James Dinan\footnotemark[1], NVIDIA
\item Megan Grodowitz, Arm Inc.
\item Max Grossman, Georgia Tech
\item Yanfei Guo, \ac{ANL}
\item Khaled Hamidouche, NVIDIA
\item Jeff Hammond, NVIDIA
\item Akihiro Hayashi, Georgia Tech
\item Oscar Hernandez, \ac{ORNL}
\item Kieran Holland, Intel
\item Robert Kierski, \ac{HPE}
\item Bryant Lam, \ac{DoD}
\item Akhil Langer, NVIDIA
\item Tiffany M. Mintz, \ac{ORNL}
\item Bryan Morgan, Intel
\item William Okuno\footnotemark[2], \ac{HPE}
\item David Ozog\footnotemark[5], Intel
\item Nicholas Park, \ac{DoD}
\item Wendy Poole, \ac{LANL}
\item Steve Poole\footnotemark[6], \ac{OSSS}
\item Swaroop Pophale, \ac{ORNL}
\item Sreeram Potluri, NVIDIA
\item Brandon Potter\footnotemark[4], \ac{AMD}
\item Howard Pritchard, \ac{LANL}
\item Md. Wasi-ur- Rahman\footnotemark[11], Intel
\item Naveen Ravichandrasekaran\footnotemark[9], \ac{HPE}
\item Michael Raymond, \ac{HPE}
\item Elliot Ronaghan\footnotemark[8], \ac{HPE}
\item James Ross, \ac{ARL}
\item Pavel Shamis, NVIDIA
\item Sameer Shende, \ac{UO}
\item Danielle Sikich, \ac{HPE}
\item Brian Smith, Cornelis Networks
\item Lawrence Stewart\footnotemark[7], Intel
\item Zach Tiffany, NVIDIA
\item Manjunath Gorentla Venkata\footnotemark[10], NVIDIA
\item Kevin Waters\footnotemark[3], \ac{DoD}
\item Aaron Welch, \ac{ORNL}
\item Nathan Wichmann, \ac{HPE}
\item Jeffrey Young, Georgia Tech
\end{itemize}
\end{multicols}

\subsection*{\openshmem 1.5}
\begin{multicols}{2}
Expand Down