-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update monitoring task_inputs after input Futures are resolved #3054
Update monitoring task_inputs after input Futures are resolved #3054
Conversation
The monitoring task table stores a column that should store the inputs argument of a task, after Future resolution. Prior to this PR: Although the sending side of the code, DataFlowKernel._send_task_log_info, is correctly invoked to send that information, it first sends an earlier representation of that task where inputs are still unresolved futures - for example: [<AppFuture at 0x7fca83d22610 state=pending>, <AppFuture at 0x7fca83d7b550 state=pending>, <AppFuture at 0x7fca83d80110 state=pending>] When a later update is sent, not all columns of the task table are updated; only a specific list of columns that are expected to change. task_inputs was not on that list, so even after future resolution, the monitoring task_inputs column was not updated. This PR adds task_inputs to the list of columns that are updated on every task update message. This PR adds a test which if run without this change results in this test failure: ``` result = connection.execute(text("SELECT task_inputs FROM task")) (task_inputs, ) = result.first() > assert task_inputs == "[" + repr(TOKEN) + "]" E AssertionError: assert '[<Future at ...ate=pending>]' == '[54739]' E - [54739] E + [<Future at 0x7f946e3f9010 state=pending>] parsl/tests/test_monitoring/test_incomplete_futures.py:66: AssertionError ```
f2 = this_app(inputs=[f1]) | ||
|
||
f1.set_result(TOKEN) | ||
|
||
assert f2.result() == TOKEN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any value in checking the intermediate state of the monitoring DB between these statements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a bit non-deterministic so I didn't go that way - after line 43, the task row will appear (with an in-progress value) "eventually" in the monitoring database, but it's not synchronous with line 43 completing.
What I would expect is that various records would appear - basically the asserts that are in eg. PR #3056 - but I think it would be some code that has to sit there and poll the DB a few times until the "eventually consistent" DB reflects that the task is launched.
The interesting thing to check for the specific bug in this PR would be to check that the input field does not have TOKEN
stored in it until after line 45, as a pre-condition.
That's also what could happen instead of shutting down the DFK with cleanup
a few lines later: don't shut down parsl here, but instead poll the DB until it looks right.
I don't really have any strong feelings about whether it's the right thing to do or not.
…g-record-resolved-inputs
The monitoring task table stores a column that should store the inputs argument of a task, after Future resolution.
Prior to this PR:
Although the sending side of the code, DataFlowKernel._send_task_log_info, is correctly invoked to send that information, it first sends an earlier representation of that task where inputs are still unresolved futures - for example:
[<AppFuture at 0x7fca83d22610 state=pending>, <AppFuture at 0x7fca83d7b550 state=pending>, <AppFuture at 0x7fca83d80110 state=pending>]
When a later update is sent, not all columns of the task table are updated; only a specific list of columns that are expected to change. task_inputs was not on that list, so even after future resolution, the monitoring task_inputs column was not updated.
This PR adds task_inputs to the list of columns that are updated on every task update message.
This PR adds a test which if run without this change results in this test failure:
Changed Behaviour
Better info in monitoring db for tasks with
inputs=
parameters.Type of change