Data model and serialization in mobsim #1

Open
kainagel opened this issue Sep 2, 2018 · 0 comments

kainagel commented Sep 2, 2018

Current status of our discussions is the following:

  • Our intuition is that shared memory parallelization is limited by memory access: as more and more threads access the same memory, memory access becomes the bottleneck fairly early (at about 5 threads).
  • In consequence, we want to try true distributed memory parallelization, so that we obtain more physical memory bandwidth.

The current approach is roughly as follows (this description contains some of my own opinions):

  • Start as many instances of Control(l)er as we have physical machines, each one in a separate JVM. I call them computational nodes (CPNs).
  • Each CPN reads the full network, but marks only some part of it as "local" (more precisely: some of the nodes).
  • Each CPN reads the full population, but immediately ignores all persons that do not start on the local part of the network.
  • On all CPNs, the mobsim is started simultaneously, at the same time step. Agents depart on the networks where they are registered, and keep moving forward until they hit a link where the toNode is on a different machine.
  • From here on, we need to worry about communication, which is where the real questions lie.
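
The partitioning described in the bullets above could be sketched roughly as follows. This is only an illustration, not MATSim code: `PartitionSketch`, `Link`, `Person`, `isSplit` and `localPersons` are hypothetical names, and the rule that a person belongs to the CPN owning the fromNode of the start link is an assumption made for this sketch (the text above only says that persons start "on the local part of the network").

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch of the partitioning step; none of these names are MATSim API. */
final class PartitionSketch {

    /** A directed link, reduced to its id and its two end nodes. */
    record Link(String id, String fromNode, String toNode) {}

    /** A person, reduced to the link on which the first departure happens. */
    record Person(String id, String startLinkId) {}

    /** A link is "split" if its two end nodes are owned by different CPNs. */
    static boolean isSplit(Link link, Map<String, Integer> nodeOwner) {
        return !nodeOwner.get(link.fromNode()).equals(nodeOwner.get(link.toNode()));
    }

    /**
     * Every CPN reads the full population and keeps only its local persons.
     * ASSUMPTION: a person is local if the fromNode of the start link is local.
     */
    static List<Person> localPersons(List<Person> all,
                                     Map<String, Link> links,
                                     Map<String, Integer> nodeOwner,
                                     int myCpn) {
        return all.stream()
                .filter(p -> nodeOwner.get(links.get(p.startLinkId()).fromNode()) == myCpn)
                .collect(Collectors.toList());
    }
}
```

With such a node-to-CPN map, every CPN can decide locally, without communication, which persons to keep and which links will need communication during the mobsim.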

(As a side note: For replanning, the first idea would be to make each CPN responsible for the agents that start locally.)

Re communication in the mobsim. Recall that a link is composed of a queue and a buffer. When fromNode and toNode belong to different CPNs, then the queue belongs to the fromNode CPN, and the buffer to the toNode CPN. A time step approximately looks like

processNodes() ; // intersections; moves vehicles from buffers across node to queue
sndRcvEmptySpacesInBuffers() ; // tell links about empty spaces in buffers
processLinks() ; // move vehicles from queue to buffer if possible
sndRcvVehicles() ; // move vehicles from buffers in one CPN to buffer in other CPN if link is split between CPNs.

sndRcvEmptySpacesInBuffers() is fairly straightforward, so I will not talk about it.
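
For reference, a minimal in-process sketch of how the two halves of a split link might interact during one time step. All names (`SplitLinkSketch`, `UpstreamHalf`, `DownstreamHalf`, `onEmptySpaces`, ...) are hypothetical, vehicles are reduced to string ids, the message passing is replaced by plain method calls, and travel times and flow capacities are ignored, so this only illustrates the ordering of the two exchanges.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a link split between two CPNs; not MATSim code. */
final class SplitLinkSketch {

    /** Buffer half of the link; lives on the CPN that owns the toNode. */
    static final class DownstreamHalf {
        private final int bufferCapacity;
        private final Deque<String> buffer = new ArrayDeque<>();

        DownstreamHalf(int bufferCapacity) {
            this.bufferCapacity = bufferCapacity;
        }

        /** Payload of the "empty spaces" message that is sent upstream. */
        int emptySpaces() {
            return bufferCapacity - buffer.size();
        }

        /** Receiving side of the vehicle exchange. */
        void receive(List<String> vehicleIds) {
            buffer.addAll(vehicleIds);
        }
    }

    /** Queue half of the link; lives on the CPN that owns the fromNode. */
    static final class UpstreamHalf {
        private final Deque<String> queue = new ArrayDeque<>();
        private int remoteEmptySpaces = 0;

        void enqueue(String vehicleId) {
            queue.addLast(vehicleId);
        }

        /** Receiving side of the "empty spaces" exchange. */
        void onEmptySpaces(int emptySpaces) {
            this.remoteEmptySpaces = emptySpaces;
        }

        /** Link processing: release at most as many vehicles as the remote buffer can take. */
        List<String> processLink() {
            List<String> outgoing = new ArrayList<>();
            while (remoteEmptySpaces > 0 && !queue.isEmpty()) {
                outgoing.add(queue.pollFirst());
                remoteEmptySpaces--;
            }
            return outgoing; // these vehicles are what the vehicle exchange has to ship
        }

        int queueLength() {
            return queue.size();
        }
    }
}
```

The point of the sketch is that the empty-spaces message carries a single integer per split link, whereas the list returned by `processLink()` carries whole vehicles with their passengers.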

In contrast, sndRcvVehicles() is a challenge. We also need to communicate all passengers, which means serialization and deserialization. The current approach is that one can implement arbitrary agents here, as long as they fulfill the appropriate interface. I think that we would want to keep that property. This, however, means that we need to force implementers of the agents to think about serialization. Using the standard Serializable seems like a terrible idea here, since implementers can just add it as an interface without doing anything else, leading to wrong code. So it seems that we need to introduce our own interface, maybe MatsimSerializable. Does this make sense? Other ideas?
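
To make the proposal more concrete, here is one possible shape of such an interface. Only the name `MatsimSerializable` comes from the text above; the methods `typeId()` and `writeTo(...)`, the `AgentReader` and `Registry` helpers, and the toy `SimpleAgent` are assumptions made for illustration. The idea is that, unlike the marker interface `Serializable`, an interface with abstract methods does not compile until the implementer has written the serialization code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; only the name MatsimSerializable is taken from the discussion. */
final class SerializationSketch {

    /** Abstract methods force every agent implementation to write its own state. */
    interface MatsimSerializable {
        String typeId();
        void writeTo(DataOutput out) throws IOException;
    }

    /** Counterpart on the receiving CPN: rebuilds one agent type from the stream. */
    @FunctionalInterface
    interface AgentReader {
        MatsimSerializable readFrom(DataInput in) throws IOException;
    }

    /** Maps type ids to readers so that arbitrary agent types can be received. */
    static final class Registry {
        private final Map<String, AgentReader> readers = new HashMap<>();

        void register(String typeId, AgentReader reader) {
            readers.put(typeId, reader);
        }

        byte[] toBytes(MatsimSerializable agent) {
            try {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(bytes);
                out.writeUTF(agent.typeId()); // the type id goes first
                agent.writeTo(out);
                out.flush();
                return bytes.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        MatsimSerializable fromBytes(byte[] data) {
            try {
                DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
                String typeId = in.readUTF();
                AgentReader reader = readers.get(typeId);
                if (reader == null) {
                    throw new IOException("no reader registered for type " + typeId);
                }
                return reader.readFrom(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Toy agent with two fields, standing in for a real agent implementation. */
    static final class SimpleAgent implements MatsimSerializable {
        static final String TYPE_ID = "SimpleAgent";
        final String personId;
        final int currentLegIndex;

        SimpleAgent(String personId, int currentLegIndex) {
            this.personId = personId;
            this.currentLegIndex = currentLegIndex;
        }

        @Override
        public String typeId() {
            return TYPE_ID;
        }

        @Override
        public void writeTo(DataOutput out) throws IOException {
            out.writeUTF(personId);
            out.writeInt(currentLegIndex);
        }

        static SimpleAgent readFrom(DataInput in) throws IOException {
            return new SimpleAgent(in.readUTF(), in.readInt());
        }
    }
}
```

In this sketch, the receiving CPN would register `SimpleAgent::readFrom` under `SimpleAgent.TYPE_ID` once at startup. A different design could use a factory method on the interface or an existing serialization library instead; the sketch only shows the explicit write/read contract.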
