Using argodsm in a server client style application #145

Open
suyashmahar opened this issue Oct 11, 2023 · 6 comments

@suyashmahar

I was wondering whether argodsm can be used for SHM-style communication between server and client apps that don't share code. For example, I want to create a complex graph in one application, pass a pointer to it to another application, and traverse the graph there. With node-local SHM, I would just memory-map the same region in both applications. Can I do this in argodsm?

One way of doing this that I see is to allocate some memory that I want to share, and send the pointer over, but I was a little confused by the tutorial.

The tutorial explains that conew_() returns the same pointer on every instance of a parallel program:

> This allocation function argo::conew_array is run on all the nodes, returning the same pointer to all of them, thus initializing the data variable in all of them.

But I don't understand how the implementation knows which call corresponds to which pointer.

Appreciate any help!

@davidklaftenegger
Member

I am not sure I understand the question correctly. conew_() will allocate, collectively on all nodes, a part of the global memory such that it can be reliably accessed on all nodes, and the pointer will then be represented identically on all nodes. If the question is about how I would go about writing such code,

char* shared_memory_region = argo::conew_array<char>(shared_region_size); // must be called on all nodes to succeed
if(argo::node_id() == 0) {
    call_server_code(shared_memory_region); // will execute on exactly one node
} else {
    call_client_code(shared_memory_region); // will execute on N-1 nodes
}

would be my first attempt. This would ensure all nodes call argo::conew_array at the same time, and then have access to the same pointer.

argo::conew_() is just a way of doing what you called "passing the pointer" to other nodes' binaries. To allocate globally accessible memory without such synchronization between all nodes, there is argo::new_().

I hope this helps, and if it does not please feel free to ask again.
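On the question of how a collective call can "know" which pointer to return: one common way to get this behaviour is for every node to run the same deterministic allocator over the same sequence of collective calls, so that call order alone decides the result. The sketch below is a toy model of that idea in plain C++. It is not ArgoDSM's implementation, and the names `CollectivePool` and `run_node` are hypothetical.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Toy model only: NOT ArgoDSM's implementation.
// Each simulated node owns one of these and hands out offsets into a
// global address space that is assumed to be mapped identically everywhere.
struct CollectivePool {
    std::size_t next = 0;   // next free offset
    std::size_t capacity;   // total bytes available

    explicit CollectivePool(std::size_t cap) : capacity(cap) {}

    // Deterministic: the result depends only on the sizes requested so far.
    std::size_t allocate(std::size_t bytes) {
        if (bytes > capacity - next) throw std::bad_alloc();
        std::size_t offset = next;
        next += bytes;
        return offset;
    }
};

// Simulates one node performing a sequence of collective allocations.
std::vector<std::size_t> run_node(const std::vector<std::size_t>& sizes) {
    CollectivePool pool(1 << 20);
    std::vector<std::size_t> offsets;
    for (std::size_t s : sizes) offsets.push_back(pool.allocate(s));
    return offsets;
}
```

In this model, two nodes that both request 64, 128 and 32 bytes in that order each obtain offsets 0, 64 and 192 without exchanging any messages. It also shows why a collective call has to be made by all nodes in the same order: a node that skips or reorders a call would compute different offsets from then on.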

@suyashmahar
Author

I see. So if I understand this correctly, and if I just want a big, shared memory region, I can just do this:

auto my_shm = argo::new_array<char>(1UL << 30); // 1 GiB
if (argo::node_id() == 0) {
  call_server_code(my_shm);
} else {
  call_client_code(my_shm);
}

Now, in the client and the server, I can just do writes to my_shm on one node, and they will show up on the other node (assuming appropriate synchronization).

Thanks.

@davidklaftenegger
Member

If each node executes new_(), each node will get a distinct memory region; if you want all of them to allocate the same memory region you should use conew_(). After allocating with conew_(), assuming appropriate synchronization according to the memory model, you can do writes to my_shm on one node, and they can be read on the other node. The distinction might be irrelevant to your use case, but data might never be transferred (and thus never "show up") if it is not read.

@suyashmahar
Author

> If each node executes new_(), each node will get a distinct memory region;

It would still be "distributed and shared memory", right? That is, I can access that allocated region as long as I have a pointer to it in the client or the server.

--

From what I understand, allocating a region with new_() and sharing the pointer with other nodes over TCP or something accomplishes the same end result as conew_(), but with extra steps.

I guess I'm misunderstanding something here, or maybe I am not familiar with HPC-style programming.
What I am trying to achieve is local mmap-style shared memory, but over RDMA. With local SHM, I'd mmap a file at a fixed address from /dev/shm inside an application. Now, I have a region of memory in my virtual address space that I can write to, and the writes will show up in any application that also has the region mapped.

So, the reason I specifically want to do it this way is to make sure I can map it to the same address across applications and can run my own allocator in the region (imagine a simple bump allocator).

Thank you so much for the help!
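For readers unfamiliar with the term, a "simple bump allocator" over a caller-provided region might look like the sketch below. It is only an illustration: `BumpHeader`, `bump_init` and `bump_alloc` are hypothetical names, the region could be any byte buffer (for example one obtained from a collective allocation), and the code assumes a single allocating thread or node. Allocating concurrently from several nodes would need synchronization that respects the DSM's memory model.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical bookkeeping stored at the very start of the shared region,
// so that every participant that can see the region sees the same state.
struct BumpHeader {
    std::size_t capacity;  // total bytes in the region, header included
    std::size_t used;      // bytes consumed so far, header included
};

// Places the header at the start of `region`, which must be suitably
// aligned and at least sizeof(BumpHeader) bytes long.
inline BumpHeader* bump_init(void* region, std::size_t capacity) {
    assert(capacity >= sizeof(BumpHeader));
    return new (region) BumpHeader{capacity, sizeof(BumpHeader)};
}

// Returns `bytes` bytes whose offset from the region start is a multiple
// of `align` (a power of two), or nullptr when the region is exhausted.
inline void* bump_alloc(BumpHeader* h, std::size_t bytes,
                        std::size_t align = alignof(std::max_align_t)) {
    std::size_t start = (h->used + align - 1) & ~(align - 1);
    if (start > h->capacity || bytes > h->capacity - start) return nullptr;
    h->used = start + bytes;
    return reinterpret_cast<unsigned char*>(h) + start;
}
```

Because the allocator only tracks offsets from the start of the region, the pointers it returns are meaningful to another process only if that process has the region mapped at the same address, which is the property being asked for here.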

@davidklaftenegger
Member

> It would still be "distributed and shared memory", right? That is, I can access that allocated region as long as I have a pointer to it in the client or the server.

Yes

> From what I understand, allocating a region with new_() and sharing the pointer with other nodes over TCP or something accomplishes the same end result as conew_(), but with extra steps.

Yes. I just don't see a good reason to use TCP communication when conew_() already solves this problem.

> I guess I'm misunderstanding something here, or maybe I am not familiar with HPC-style programming. What I am trying to achieve is local mmap-style shared memory, but over RDMA. With local SHM, I'd mmap a file at a fixed address from /dev/shm inside an application. Now, I have a region of memory in my virtual address space that I can write to, and the writes will show up in any application that also has the region mapped.

This should indeed work.
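For comparison, the node-local pattern described in the quote can be sketched with plain POSIX calls. This is a minimal illustration under some assumptions: it uses a scratch file under `/tmp` rather than `/dev/shm`, it creates both mappings inside one process so that it runs standalone, and `map_shared` and `shared_mapping_demo` are hypothetical helper names. In a real server and client, each process would open the same file and create its own mapping.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Maps `bytes` of the file behind `fd` as shared, writable memory.
// Returns nullptr on failure.
inline char* map_shared(int fd, std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? nullptr : static_cast<char*>(p);
}

// Two mappings of one file observe each other's writes.
inline bool shared_mapping_demo() {
    char path[] = "/tmp/shm_demo_XXXXXX";  // scratch file (an assumption)
    int fd = mkstemp(path);
    if (fd < 0) return false;
    unlink(path);  // the file goes away once fd and mappings are released

    const std::size_t bytes = 4096;
    if (ftruncate(fd, bytes) != 0) { close(fd); return false; }

    char* writer = map_shared(fd, bytes);  // stands in for the server
    char* reader = map_shared(fd, bytes);  // stands in for the client
    bool ok = writer != nullptr && reader != nullptr;
    if (ok) {
        std::strcpy(writer, "hello");
        ok = std::strcmp(reader, "hello") == 0;  // visible via the other mapping
    }
    if (writer) munmap(writer, bytes);
    if (reader) munmap(reader, bytes);
    close(fd);
    return ok;
}
```

Note that the two mappings here generally land at different virtual addresses, so raw pointers stored inside the region would not be valid through both. The fixed-address requirement in the question exists to avoid exactly that, and it matches the earlier statement in this thread that a conew_() pointer is represented identically on all nodes.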

> So, the reason I specifically want to do it this way is to make sure I can map it to the same address across applications and can run my own allocator in the region (imagine a simple bump allocator).

If your custom allocator performs better than just calling new_() directly in your code then I would want to see the code. I know our allocator is simplistic, so improving upon it is probably not too difficult, but our goal would be to provide an API that does not make it necessary for the user to provide another memory allocator. At the very least I would like to see a performance comparison against calling new_() directly, if you really implement your own allocator on an ArgoDSM-provided memory region.

@blingjistar

@suyashmahar Hi, friend. Did you achieve your goal in the end? We now face the same situation and are confused about how to solve it.
Best wishes
