
Counting semaphore variant of lock API #485

Open
nspark opened this issue Nov 29, 2021 · 6 comments
nspark (Contributor) commented Nov 29, 2021

The current locking API provides a fair-queued lock across the PEs, allowing one PE to hold the lock at a time. There is user interest in a variant that allows up to N PEs to hold the "lock" at a time, as with a counting semaphore. The same fair-queued, FIFO behavior is strongly desirable.

One significant use case is parallel file I/O. At large PE counts, applications can often see better performance for parallel writes if only some of the PEs write at a time. Such an API would provide a fair mechanism for PEs to enqueue as writers (as with shmem_*_lock), while also supporting multiple concurrent writers (a new capability).

nspark (Contributor, Author) commented Feb 8, 2022

One can provide such a lock using the existing lock routines and AMOs, roughly as follows:

```c
#include <shmem.h>   // OpenSHMEM locks and AMOs
#include <unistd.h>  // usleep

static long lock = 0;            // symmetric
static long active_counter = 0;  // symmetric; the counter lives on root_pe

const long max_active = 4;              // max concurrent semaphore holders
const int root_pe = shmem_n_pes() / 2;  // PE hosting the shared counter

// Acquire the counting semaphore
// -- (Wait to) acquire the lock; PEs queue here in FIFO order
shmem_set_lock(&lock);
// -- Bump the active counter; num_active includes this PE's own increment
long num_active = shmem_atomic_fetch_inc(&active_counter, root_pe) + 1;
// -- If (one) too many are active, wait-loop with exponential backoff
unsigned int dt_us = 10;
while (num_active > max_active) {
  usleep(dt_us);
  dt_us = (dt_us > 25000u ? 50000u : (2 * dt_us));
  num_active = shmem_atomic_fetch(&active_counter, root_pe);
}
// -- Release the lock, admitting the next queued PE
shmem_clear_lock(&lock);

// Guarded section: up to max_active PEs execute this concurrently
do {
  something_useful();
} while (!done);

// Release the counting semaphore
shmem_atomic_add(&active_counter, -1, root_pe);
```

There is some potential for optimization here, but hopefully this gives a good sketch of how this can be done. The shmem_*_lock API provides the FIFO queueing for the semaphore's counter, while the counter itself allows more than one PE to access the guarded section.

jdinan (Collaborator) commented Feb 13, 2022

Would the lock type of the proposed API also be long or something else? IIUC, the above implementation would require more space than we have in a single long.

tonycurtis commented Feb 25, 2022 via email

jdinan (Collaborator) commented Feb 28, 2022

I think we would need a point where we could symmetrically allocate memory for the lock. Locks are global objects that aren't tied to a team, so this could be tricky. We do have extra space in the MCS lock algorithm that we used in SOS. We assume that sizeof(long) == 2*sizeof(int). Both ints are used on PE 0, and there is one int free on every other PE, so you could place the counter on another PE (leaving single-PE jobs as an exercise for the reader). That gives you a 32-bit counter if we assume an LP64 ABI. Hopefully this is enough space for the counter; we do need a way to describe the counter's range to the user, especially if it could be less than the maximum value of the function argument used to pass it.

However, if we're going to introduce new API, I would really prefer to fix this design issue with the legacy lock API.

nspark (Contributor, Author) commented Feb 28, 2022

> However, if we're going to introduce new API, I would really prefer to fix this design issue with the legacy lock API.

Do you mean you'd rather the lock routines use an opaque type?

nspark (Contributor, Author) commented Feb 28, 2022

Some working group comments:

  • Generally favor an opaque type; prevents abuse of assumptions about types/ABIs
  • Good to have a static initializer
  • Maybe collapse "traditional" locks together with "counting" locks
  • Consider supporting lock use from multiple threads within a PE

naveen-rn self-assigned this Feb 6, 2023