-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve engine peformance on ancestry traversal #1174
Improve engine peformance on ancestry traversal #1174
Conversation
@Twisol , I talked to either @mattdailis , @DavidLegg , or both about how this would affect incremental simulation. The incremental sim implementation needs to be able to re-run tasks, and the appropriate parent needs to be re-run to make sure the appropriate context for the child is not lost. So, for example, if resource R has changed, and child task C from the previous simulation reads R, we need to re-run C in the appropriate context. The current code looks up the parent activity or daemon task P that spawned C and re-executes P. So, the question is can incremental sim either record (in cell read history) or lookup in a past simulation engine an appropriate activity/daemon to re-execute in the context these code changes? The re-execution of P could be expensive in incremental simulation if P has many task children. An optimization on re-execution could be saving away lambda functions for some or all of the tasks. Doing this for every cell read would probably use more memory than that saved by the changes here. Thoughts? |
Hi Brad!
I'm not entirely certain this is true. When a child is spawned, we are given a |
Are there |
So, I see this: public static <T> TaskFactory<T> threaded(final Supplier<T> task) {
return executor -> new ThreadedTask<>(executor, ModelActions.context, task);
} But, can I get from a |
Saving the TaskFactory is basically the same as saving the lambda. I fear these could be very large in comparison to ids for Clipper, for example. |
Yes, every task is given by a task factory. Both the spawn method used by tasks to spawn children, as well as the
I'm not confident I understand how the code you've quoted relates to your question. At the moment, the engine throws away task factories when they're used once; holding on to them for longer (and associating them to task IDs) would definitely be necessary for incremental resimulation. The current engine is not incremental, so it doesn't keep factories around longer than it needs them for. The alternative is, as you mentioned, re-running tasks up to the point where the desired task was spawned. As you (also) mentioned, that approach itself comes with tradeoffs -- mostly trading space for time, I think. The major issue I see is that, for daemon tasks, we might need to step the task forward many, many times to get back to the point the daemon was at, whether it's using a while loop or is trampolined by
That depends on how much data the lambda references from outside its scope. Java (as with most languages) performs closure conversion on lambdas, meaning that the code of the lambda itself is only defined once ever. When a new instance of that lambda is created, its constructor accepts any data the lambda needs to reference from the caller's scope. So, the amount of data the lambda pins in memory long-term is going to depend precisely on how many things in the caller's scope the lambda depends on. This is exactly the same situation as with activities: an activity contains a number of fields, and the activity's run method reads those fields. As long as the activity instance lives, so too do its fields. So you can think of closure conversion as a kind of "promotion" to anonymous activity types. |
14d5fab
to
86cf706
Compare
86cf706
to
01c470d
Compare
01c470d
to
3fc628d
Compare
3fc628d
to
8edb2fd
Compare
8edb2fd
to
230edf8
Compare
230edf8
to
d201218
Compare
d201218
to
536a764
Compare
536a764
to
c3244b8
Compare
c3244b8
to
65494ad
Compare
The span that a task contributes to can, and will change across a step.
99bae03
to
e5ae0f5
Compare
Description
This PR attempts to address the needs of #1062 by improving the efficiency of existing capabilities rather than introducing a new capability. In particular, the "spawn-based trampolining" approach to reducing the replay overhead of long-lived replaying tasks was not viable because it produced an ever-growing ancestry lineage that blew up memory usage (and would have blown up results computation time if the simulation got that far). This PR enables the engine to prune any ancestry information that is neither relevant to (1) computing task span information nor (2) determining when a parent task should resume after a called child task completes.
In the course of making this change, the span (duration) information for each task has been factored out into its own concept; moreover, child tasks will by default contribute to their parent's span rather than constructing a new span. This makes tasks anonymous and purely internal by default. Instead, a task can now enter a new nested "scope of work", which is associated with a new span. The activity mappers generated by the Merlin framework now automatically enter and exit from these scopes, ensuring that every activity gets its own span, since the intended semantics of activities is that their scopes of work are meaningfully distinguished from any activities they were delegated from.
Verification
develop
to smoke out any differences. Other ideas for verification would be welcome -- the engine is not especially friendly to unit testing at this time...Documentation
Other than documenting new capabilities, no existing capabilities have changed, so long as models built prior to this change are recompiled for the new version of the framework. (The responsibility for entering new spans has effectively shifted from the engine to the framework.)
Future work