-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bad decoding of exception in radical error handling path #3328
Comments
This marker originally referred to issue #363 for tests which should not be expected to pass if running in an environment where stderr and stdout are not staged back to the submitting system by the executor. Since it was created, it then expanded in use to more generally refer to any test which had a problem in environments which did not properly make either output files or stdout/stderr available on the submitting system, as a separate behaviour from the core of parsl behaving properly. An example of that would be running in a multi-site environment without working file staging. However, since all that happened, the feel in the Parsl community is that running with a shared filesystem is the core supported usage mode of Parsl; and also since then, notions of file staging have become much stronger, both in Parsl core and in individual executors, making it a much more reasonable test requirement that if you make a file, it can be expected to be available for the test to inspect - and that if you are in an environment that does not support that, you should not expect the test suite (or Parsl) to work. Because of all of that, this PR removes the issue363 test marker entirely. It was removed from Work Queue and Task Vine test environments in PR #3327, leaving only the Radical-PILOT test environment setting that. It turns out only two tests don't work in the Radical-PILOT environment when issue363 is removed: One requires support for the slightly awkward "stdout/err opening modes, specified by tuples" feature, and this PR labels and disables that test using a new 'executor_supports_std_stream_tuples' label. (In general, an executor which does staging of stdout/err streams is likely to not support the features offered by std stream tuples, such as appending to stderr files from multiple task - so this label is not Radical-PILOT specific) Another test looks like it is broken exception handling in the Radical-PILOT executor, and so it is labelled with a new issue specific label, relating to the bug report for that issue - issue #3328, issue3328.
This marker originally referred to issue #363 for tests which should not be expected to pass if running in an environment where stderr and stdout are not staged back to the submitting system by the executor. Since it was created, it then expanded in use to more generally refer to any test which had a problem in environments which did not properly make either output files or stdout/stderr available on the submitting system, as a separate behaviour from the core of parsl behaving properly. An example of that would be running in a multi-site environment without working file staging. However, since all that happened, the feel in the Parsl community is that running with a shared filesystem is the core supported usage mode of Parsl; and also since then, notions of file staging have become much stronger, both in Parsl core and in individual executors, making it a much more reasonable test requirement that if you make a file, it can be expected to be available for the test to inspect - and that if you are in an environment that does not support that, you should not expect the test suite (or Parsl) to work. Because of all of that, this PR removes the issue363 test marker entirely. It was removed from Work Queue and Task Vine test environments in PR #3327, leaving only the Radical-PILOT test environment setting that. It turns out only two tests don't work in the Radical-PILOT environment when issue363 is removed: One requires support for the slightly awkward "stdout/err opening modes, specified by tuples" feature, and this PR labels and disables that test using a new 'executor_supports_std_stream_tuples' label. (In general, an executor which does staging of stdout/err streams is likely to not support the features offered by std stream tuples, such as appending to stderr files from multiple task - so this label is not Radical-PILOT specific) Another test looks like it is broken exception handling in the Radical-PILOT executor, and so it is labelled with a new issue specific label, relating to the bug report for that issue - issue #3328, issue3328.
@AymenFJA elif isinstance(exception, str):
parsl_task.set_exception(eval(exception)) to the checks. @benclifford PS.: I should not condone the casual use of |
@andre-merzky @AymenFJA thanks for looking into this - I'll have a poke at that test with your changes above. (Without digging in, I expect the difference in exceptions comes from the different code paths that the radical executor uses for bash apps - but let me spend some time investigating before believing me). Serialization of exceptions has given us problems in the past, but I haven't personally been hitting problems recently - it might just be that we dealt with all the common cases that we were encountering - we have a bit of special logic in parsl.app.errors.RemoteExceptionWrapper, and I can see that at least some part of the radical executor is using that:
|
If you want to avoid this In that case, tests would have to be loosened up to expect different exception types - I'm not in principle against that. |
Yes, I would prefer that option. It's not ideal for sure, but very portable... |
@benclifford @andre-merzky, sorry for the late response. I believe this is not an issue of serialization. It seems like we set the sandbox of the task to the path of the stdout/stderr assigned by Parsl task kwargs. Specifically for the test shown in this ticket, it's set for a path that leads to a Interestingly, this test did not fail when we ran the entire test suite. But running it individually showed this issue. Regardless I tested the changes and it passed on my local machine with the release version of RP and Parsl. |
so I guess there's two things going on here - 1) one the underlying bug in choice of directories that I think @AymenFJA fixed in 61f9b92, and 2) the problem reported at the start of this issue that some errors happening around that code path are not properly reported to the end user, which is I think that Andre is addressing here above: #3328 (comment) That 2nd problem still exists, I guess, in the case of any other errors that are not the fixed directory choice problem, right? |
Hey @benclifford, I just pushed a commit that addresses the 2nd problem here: 0903330 . |
@benclifford is there anything else that needs to be done in this ticket? Thank you. |
@benclifford should we close this ticket? as the related PR with the fixes is merged? |
Describe the bug
Here's an error I see in CI (when working on some stderr behaviour elsewhere). I think it looks like one piece of code is serializing an exception as a bytestring of the repr of the exception, but another piece of the code is expecting to unserialize it as base64.
cc @AymenFJA
To Reproduce
Expected behavior
This should be raising the relevant exception (the PermissionError) up to the user program, rather than raising a DeserializationError.
Environment
my laptop
parsl master 13c3147
The text was updated successfully, but these errors were encountered: