Launch tasks in from a shallow Python stack to avoid recursion errors #3478
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Prior to this PR, task dependencies were launched from fairly arbitrary places in the call stack: for example, wherever a particular Future callback happens, sometimes deep inside executor-specific threads and/or in launch code for earlier tasks. In situations described in #3472 this could result in a a Python call stack overflow, as long call chains accumulate.
This PR makes all task launches be queued to happen from a near-empty stack in a single thread (managed as a concurrent.futures.ThreadPoolExcutor) - with a "invoke this call soon" pattern.
The launch_if_ready method was already intended to be launched multiple times from multiple threads. this PR might make the invocation to launch_if_ready happen a bit later, but correctness-wise that should be fine: a task can only become more ready to run, not less ready.
This PR introduces a test
test_dependency_deep
, which simulates a situation where this #3472 failure happens (before this PR), and tests (for the future) that tasks are not launched beyond a fairly arbitrary stack depth.Changed Behaviour
Performance of task launching is probably a bit different now. I have not quantified this.
Exceptions during task launch (from parsl, or from plugged in executors) will now appear in a different call stack, and so be reported differently.
Fixes
Fixes #3472
Type of change