Interoperability with other programming models #243
Comments
Updates in 20180904's Thread WG call: Slides: Summary of feedback (mainly for progress property
|
Notes from WG Meeting Tue Sep 4th:
|
Try to explain myself after a bit more thought.
I think SHMEM queries SHMEM attributes, not MPI/UPC/Fortran. I think there are 3 attributes (+detail/sub attributes if desired).
For 2), I don't think we can define whether MPI progress (MPI_TEST) makes progress on SHMEM, only that SHMEM needs progress. E.g., Cray XC supports Cray MPI, Intel MPI, OpenMPI (and sometimes MVAPICH). I don't see how Cray SHMEM defines 'unified progress' or anything about MPI, only that SHMEM itself needs (or doesn't need) progress from other programming models/applications. (edit) And I don't see SHMEM defining whether MPI makes automatic progress or requires manual calls to MPI_Test/MPI_Wait. |
From section 4.1 [Progress of OpenSHMEM Operations] in spec (v 1.4), one interpretation is that progress without API call is required. If so, SHMEM_REQUIRES_PROGRESS (or SHMEM_PROGRESS_MANUAL) is not allowed by the spec and SHMEM_PROGRESS_AUTO is guaranteed. If the interpretation of the spec allows relaxed progress, why not query that property outside the scope of interop? Regarding SHMEM_PROGRESS_ALL (or SHMEM_PROGRESS_X) : If an application is written to rely on this property, isn't the programmer also assuming a specific OpenSHMEM implementation? If so, why not leave this as a property that can only be queried by a vendor-specific shmemx_ API? |
OSHMPI has unfinished support for MPMD and it is straightforward to extend that to DPM (dynamic process management). There isn't any serious impediment to doing it, but I had no motivation other than curiosity. If somebody in the OpenSHMEM community wants to know how this works, I might be able to implement it for testing. |
@minsii Regarding progress, this is rather complicated and perhaps not amenable to a simple boolean descriptor. For example, OpenSHMEM might progress MPI RMA but not two-sided. How do we express that? |
@bcernohous Thanks a lot for the detailed explanation. However, it might be inappropriate to consider the progress property only from the SHMEM side. This is information we want only for a hybrid SHMEM + X program. As @anshumang mentioned, the OpenSHMEM spec actually guarantees that the progress of SHMEM can only be
@anshumang Thanks for pointing out the progress spec. After reading the spec again, I agree with your interpretation, i.e., a SHMEM implementation is required to make progress even when the program does not make an API call. It is a stronger progress guarantee than that in MPI. Then the info we really need to query is "whether SHMEM makes progress for X". Regarding the shmemx suggestion, I do not think the query API has to be vendor-specific; however, the vendor can define different values for the property (e.g., SHMEM_PROGRESS_MPI|SHMEM_PROGRESS_UPC, or none).
Another issue I am reading from Bob's comment is that, although a runtime implementation may support both MPI and SHMEM, the user may link with another implementation of one of the models at runtime (or preload at execution time). E.g., MVAPICH gives both MPI and SHMEM and supports
I guess the runtime may figure out some properties (e.g., progress, pe mapping) after |
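To make the "different values for the property" idea concrete, here is a minimal sketch of a bitmask-style progress property. It also splits MPI into RMA and two-sided bits, which is one possible way to express the case where SHMEM progresses MPI RMA but not two-sided. All `DEMO_` names and the query function are hypothetical stand-ins for illustration; nothing here is defined by the OpenSHMEM spec or by any vendor.

```c
#include <stdbool.h>

/* Hypothetical property bits; none of these names exist in OpenSHMEM. */
enum {
    DEMO_PROGRESS_NONE    = 0,
    DEMO_PROGRESS_MPI_RMA = 1 << 0, /* SHMEM also progresses MPI one-sided ops */
    DEMO_PROGRESS_MPI_P2P = 1 << 1, /* SHMEM also progresses MPI two-sided ops */
    DEMO_PROGRESS_UPC     = 1 << 2  /* SHMEM also progresses UPC               */
};

/* Stand-in for what a unified vendor runtime might report (assumed value). */
static int demo_query_progress(void)
{
    return DEMO_PROGRESS_MPI_RMA | DEMO_PROGRESS_UPC;
}

/* True only if every bit in `wanted` is set in `props`. */
static bool demo_progresses(int props, int wanted)
{
    return (props & wanted) == wanted;
}
```

A hybrid program could then check `demo_progresses(demo_query_progress(), DEMO_PROGRESS_MPI_P2P)` and fall back to explicit MPI progress calls when the answer is false.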
@jeffhammond I do not think OpenSHMEM itself can support dynamic progress because Example of unsupported dynamic process:
However, it might be allowed if some PEs are forked separately (e.g., through Example that might be supported:
|
That was my original understanding, but from today's call it seemed like that wasn't universal. So I think the only question is whether shmem makes progress for XXXX. And that's a hard thing to answer. As you've suggested, it's not static. I'd guess it's very limited (my vendor shmem + my vendor mpi), so is a query really useful vs. knowing which libraries you linked with? Header versions? |
Yes, @minsii, I was thinking of a case where an MPI application spawned islands, each of which would call |
@bcernohous Indeed, I agree that |
@bcernohous I think the runtime can figure out some properties after all hybrid models are initialized in program. The supported properties include:
The assumption is that only a unified runtime (e.g., one that implements both SHMEM and X) wants to return a non-none value. Once both SHMEM and X(es) are initialized, the runtime will know whether the internal X is being used. E.g., the vendor runtime developer can simply set a global variable
The only property we cannot support is initialization ordering, because the user program needs this info before any initialization call, but the runtime cannot figure out the correct value until it is initialized. |
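A minimal sketch of the "set a global variable" idea, under the assumption of a unified runtime that ships its own MPI next to SHMEM. Every name here (`demo_shmem_init`, `demo_internal_mpi_init`, `demo_query_interop`, the `DEMO_PROP_` values) is hypothetical and only mocks the behavior; it is not how any real runtime is implemented.

```c
#include <stdbool.h>

enum { DEMO_PROP_NONE = 0, DEMO_PROP_PROGRESS_MPI = 1 };

/* Globals a unified runtime could set from its own init paths. */
static bool demo_shmem_up = false;
static bool demo_internal_mpi_up = false;

/* Mock init entry points of the two models inside the same runtime. */
static void demo_shmem_init(void)        { demo_shmem_up = true; }
static void demo_internal_mpi_init(void) { demo_internal_mpi_up = true; }

/* Returns a non-none value only once both models are initialized and the
 * runtime knows that its own (internal) MPI is the one in use. */
static int demo_query_interop(void)
{
    if (demo_shmem_up && demo_internal_mpi_up)
        return DEMO_PROP_PROGRESS_MPI;
    return DEMO_PROP_NONE;
}
```

The sketch also shows why initialization ordering cannot be answered this way: before either init call runs, the query has nothing to go on and can only report none.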
Slides used for threads WG meeting on September 18: Comments from today's meeting:
|
@naveen-rn and I have discussed the query API offline. Here is a summary of the comments:
Alternative options:
I especially like the idea of
For pe-rank mapping, the query feature helps a user program only when the SHMEM implementation supports it, so a portable program still has to write the manual version. Besides, it saves cost only at program init time (e.g., no info exchange). The benefit of this feature seems limited, and it likely increases code complexity in the SHMEM implementation. Thus, I suggest that we discard the pe-mapping proposal in the initial version and go with option-2. |
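For reference, the "manual version" a portable program writes is small. In a real hybrid program the input array would come from an info exchange at init time, for example an `MPI_Allgather` over each process's `shmem_my_pe()` value, which yields the PE number indexed by MPI rank. The sketch below covers only the local step after that exchange; the helper name `demo_invert_mapping` is hypothetical.

```c
#include <stddef.h>

/* Given pe_of_rank[r] = PE number of MPI rank r (as gathered at init),
 * fill rank_of_pe[p] = MPI rank of PE p.
 * Returns 0 on success, -1 if the input is not a permutation of 0..n-1. */
static int demo_invert_mapping(const int *pe_of_rank, int *rank_of_pe, int n)
{
    for (int pe = 0; pe < n; pe++)
        rank_of_pe[pe] = -1;            /* mark every PE as unassigned */

    for (int rank = 0; rank < n; rank++) {
        int pe = pe_of_rank[rank];
        if (pe < 0 || pe >= n || rank_of_pe[pe] != -1)
            return -1;                  /* out of range or duplicate PE */
        rank_of_pe[pe] = rank;
    }
    return 0;
}
```

Both tables are built once at startup, which matches the point above that a query API would save only this init-time exchange.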
A pull request has been created to prepare the spec document draft: minsii#1 |
The slides used for 2018-10-30's WG call: SHMEM_Interoperability_20181030.pptx PDF of spec draft (work in progress): |
Jeff S articulates the race condition very well in his slides: |
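Thanks @manjugv. I agree that when we consider SHMEM may be called by multiple libraries, shmem_initialized and shmem_finalized seem insufficient. The MPI sessions proposal is made for this purpose.
To address the specific issue where two threads concurrently make a shmem_init_thread() call, rather than defining such a case as undefined behavior:
Option 1 is to make the entire test-and-init an atomic operation, but it might be hard if the two threads are maintained by different libraries.
Option 2 is to require shmem_init_thread() to be a no-op if SHMEM has already been initialized. The shmem_init_thread() call can be implemented as an atomic op to synchronize between multiple calling threads. This change might be similar to the deprecated start_pes (i.e., calling start_pes more than once has no subsequent effect). |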
Option 2 is to require shmem_init_thread() to be a no-op if SHMEM has already been initialized.
As per the current spec, shmem_init_thread returns 0 upon success; otherwise, it returns a non-zero value.
So, don’t we already have the required semantics?
…-Naveen.
From: Min Si [mailto:[email protected]]
Sent: Tuesday, October 30, 2018 10:01 PM
To: openshmem-org/specification <[email protected]>
Cc: Naveen Ravichandrasekaran <[email protected]>; Mention <[email protected]>
Subject: Re: [openshmem-org/specification] Interoperability with other programming models (#243)
Thanks @manjugv<https://github.com/manjugv> . I agree that when we consider SHMEM may be called by multiple libraries, shmem_initialized and shmem_finalized seem insufficient. The MPI sessions proposal is made for this purpose.
To address the specific issue where two threads concurrently make shmem_init_thread() call rather than defining such a case as undefined behavior:
Option 1 is to make the entire test-and-init as an atomic operation, but it might be hard if two threads are maintained by different libraries.
if (!shmem_initialized())
    shmem_init_thread(requested, &provided);
Option 2 is to require shmem_init_thread() to be a no-op if SHMEM has already been initialized. The shmem_init_thread() call can be implemented as an atomic op to synchronize between multiple calling threads. This change might be similar to the deprecated start_pes (i.e., calling start_pes more than once has no subsequent effect).
|
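A minimal sketch of Option 2 from the comment above, using C11 atomics so that only one caller performs the real initialization and every later or concurrent call is a no-op that still returns 0. `demo_init_thread` is a hypothetical mock, not the OpenSHMEM `shmem_init_thread`, and the spin-wait is just one way to make late callers wait until initialization has finished.

```c
#include <stdatomic.h>

enum { DEMO_UNINIT = 0, DEMO_INITIALIZING = 1, DEMO_READY = 2 };

static atomic_int demo_state = DEMO_UNINIT;
static int demo_init_count = 0; /* number of real initializations (demo only) */

/* Mock idempotent init: returns 0 on success, like shmem_init_thread. */
static int demo_init_thread(void)
{
    int expected = DEMO_UNINIT;

    /* Exactly one caller wins the transition UNINIT -> INITIALIZING. */
    if (atomic_compare_exchange_strong(&demo_state, &expected,
                                       DEMO_INITIALIZING)) {
        demo_init_count++;      /* resource allocation would happen here */
        atomic_store(&demo_state, DEMO_READY);
        return 0;
    }

    /* Everyone else waits until the winner is done, then does nothing. */
    while (atomic_load(&demo_state) != DEMO_READY) {
        /* spin */
    }
    return 0;
}
```

With this shape, two libraries can each call the init routine without coordinating, which is the same effect the thread attributes to the deprecated start_pes.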
Not yet, the current spec still says "If the call to shmem_init_thread is unsuccessful in allocating and initializing resources for the OpenSHMEM library, then the behavior of any subsequent call to the OpenSHMEM library is undefined." If the second |
Here are some updates about the init/finalize questions.
It would be great if we can discuss the above thoughts in tomorrow's thread WG call (see attached slides). |
@naveen-rn Thanks for pointing out the previous discussion! Let me summarize that mail chain:
Option 4 seems to be the most useful one, but it is actually the same as deprecated |
Slides for 2019-01-29's thread WG meeting: |
@jeffhammond Agreed that we must list it as a limitation of OSHMPI. But eventually this can be fixed by fixing the MPI spec :-) |
As discussed at the previous threads WG meeting (see threads-1-29-2019), the topic about supporting multiple init/finalize calls within a program has been separated into ticket #263 |
Attached draft of spec change for 2019-03-19 thread WG meeting: Also see git diff at minsii#1 |
@minsii Is this issue resolved? If it is only partially resolved, it might be helpful to create one or more issues for additional work to be done on interoperability (and possibly assign to the OpenSHMEM 1.6 milestone). |
Goal: define a new interoperability section
Context: Appendix D has been removed because the contents are outdated. We want to make a new interoperability section that defines the interoperability of OpenSHMEM with other programming models such as MPI, UPC, CAF, in order to help portable hybrid OpenSHMEM + X programs.
The section is planned to cover:
A PR is created for MPI interoperability. See minsii#1