-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make deliberately scaled-in unstarted blocks not be failures #3594
Changes from all commits
a56397b
7ad5608
66f94af
86f2ff1
5575b19
e569c2a
058187a
4e4017d
e875dda
660318a
7d9b420
885d094
2e5ea4f
99924ae
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
import time | ||
|
||
import pytest | ||
|
||
import parsl | ||
from parsl.channels import LocalChannel | ||
from parsl.config import Config | ||
from parsl.executors import HighThroughputExecutor | ||
from parsl.launchers import WrappedLauncher | ||
from parsl.providers import LocalProvider | ||
|
||
|
||
def local_config(): | ||
# see the comments inside test_regression for reasoning about why each | ||
# of these parameters is set why it is. | ||
return Config( | ||
max_idletime=1, | ||
|
||
strategy='htex_auto_scale', | ||
strategy_period=1, | ||
|
||
executors=[ | ||
HighThroughputExecutor( | ||
label="htex_local", | ||
encrypted=True, | ||
provider=LocalProvider( | ||
init_blocks=1, | ||
min_blocks=0, | ||
max_blocks=1, | ||
launcher=WrappedLauncher(prepend="sleep inf ; "), | ||
), | ||
) | ||
], | ||
) | ||
|
||
|
||
@parsl.python_app | ||
def task(): | ||
return 7 | ||
|
||
|
||
@pytest.mark.local | ||
def test_regression(try_assert): | ||
# The above config means that we should start scaling out one initial | ||
# block, but then scale it back in after a second or so if the executor | ||
# is kept idle (which this test does using try_assert). | ||
|
||
# Because of 'sleep inf' in the WrappedLaucher, the block will not ever | ||
# register. | ||
|
||
# The bug being tested is about mistreatment of blocks which are scaled in | ||
# before they have a chance to register, and the above forces that to | ||
# happen. | ||
|
||
# After that scaling in has happened, we should see that we have one block | ||
# and it should be in a terminal state. The below try_assert waits for | ||
# that to become true. | ||
|
||
# At that time, we should also see htex reporting no blocks registered - as | ||
# mentioned above, that is a necessary part of the bug being tested here. | ||
|
||
# Give 10 strategy periods for the above to happen: each step of scale up, | ||
# and scale down due to idleness isn't guaranteed to happen in exactly one | ||
# scaling step. | ||
|
||
htex = parsl.dfk().executors['htex_local'] | ||
|
||
try_assert(lambda: len(htex.status_facade) == 1 and htex.status_facade['0'].terminal, | ||
timeout_ms=10000) | ||
|
||
assert htex.connected_blocks() == [], "No block should have connected to interchange" | ||
|
||
# Now we can reconfigure the launcher to let subsequent blocks launch ok, | ||
# and run a trivial task. That trivial task will scale up a new block and | ||
# run the task successfully. | ||
|
||
# Prior to issue #3568, the bug was that the scale in of the first | ||
# block earlier in the test case would have incorrectly been treated as a | ||
# failure, and then the block error handler would have treated that failure | ||
# as a permanent htex failure, and so the task execution below would raise | ||
# a BadStateException rather than attempt to run the task. | ||
|
||
assert htex.provider.launcher.prepend != "", "Pre-req: prepend attribute should exist and be non-empty" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would use of _inf_sleep = "sleep inf ; "
...
launcher=WrappedLauncher(prepend=_inf_sleep),
...
assert htex.provider.launcher.prepend.startswith(_inf_sleep), "..." |
||
htex.provider.launcher.prepend = "" | ||
assert task().result() == 7 | ||
Comment on lines
+84
to
+85
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you distill the comment above into the fail msg for this assert? Not an easy task, I know, but when this does fail, makes it that easier to dig in. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using the
fail_msg
argument to explain that this is a block. (Analogous toassert ..., "fail_msg"
)