-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make deliberately scaled-in unstarted blocks not be failures #3594
Changes from 13 commits
a56397b
7ad5608
66f94af
86f2ff1
5575b19
e569c2a
058187a
4e4017d
e875dda
660318a
7d9b420
885d094
2e5ea4f
99924ae
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
import time | ||
|
||
import pytest | ||
|
||
import parsl | ||
from parsl.channels import LocalChannel | ||
from parsl.config import Config | ||
from parsl.executors import HighThroughputExecutor | ||
from parsl.launchers import WrappedLauncher | ||
from parsl.providers import LocalProvider | ||
|
||
|
||
def local_config(): | ||
return Config( | ||
max_idletime=1, | ||
strategy='htex_auto_scale', | ||
strategy_period=1, | ||
executors=[ | ||
HighThroughputExecutor( | ||
label="htex_local", | ||
address="127.0.0.1", | ||
cores_per_worker=1, | ||
encrypted=True, | ||
provider=LocalProvider( | ||
channel=LocalChannel(), | ||
init_blocks=1, | ||
min_blocks=0, | ||
max_blocks=1, | ||
# TODO: swap out the launcher during the test. make it use a much longer sleep (or infinite sleep) | ||
# to give blocks that run but never complete themselves. then swap that for a launcher which | ||
# launches the worker pool immediately. | ||
launcher=WrappedLauncher(prepend="sleep inf ; "), | ||
), | ||
) | ||
], | ||
) | ||
|
||
|
||
@parsl.python_app | ||
def task(): | ||
return 7 | ||
|
||
|
||
@pytest.mark.local | ||
def test_regression(try_assert): | ||
# config means that we should start with one block and very rapidly scale | ||
# it down to 0 blocks. Because of 'sleep inf' in the WrappedLaucher, the | ||
# block will not ever register. This lack of registration is part of the | ||
# bug being tested for regression. | ||
|
||
# After that scaling has happened, we should see that we have one block | ||
# and it should be in a terminal state. We should also see htex reporting | ||
# no blocks connected. | ||
|
||
# Give 10 strategy periods for the above to happen: each step of scale up, | ||
# and scale down due to idleness isn't guaranteed to happen in exactly one | ||
# scaling step. | ||
|
||
htex = parsl.dfk().executors['htex_local'] | ||
|
||
try_assert(lambda: len(htex.status_facade) == 1 and htex.status_facade['0'].terminal, | ||
timeout_ms=10000) | ||
|
||
assert htex.connected_blocks() == [], "No block should have connected to interchange" | ||
|
||
# Now we can reconfigure the launcher to let subsequent blocks launch ok, | ||
# and run a trivial task. That trivial task will scale up a new block and | ||
# run the task successfully. | ||
|
||
# Prior to issue #3568, the bug was that the scale in of the first | ||
# block earlier in the test case would have been treated as a failure, | ||
# and then the block error handler would have treated that failure as a | ||
# permanent htex failure, and so the task execution below would raise | ||
# a BadStateException rather than attempt to run the task. | ||
|
||
assert htex.provider.launcher.prepend != "", "Pre-req: prepend attribute should exist and be non-empty" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would use of _inf_sleep = "sleep inf ; "
...
launcher=WrappedLauncher(prepend=_inf_sleep),
...
assert htex.provider.launcher.prepend.startswith(_inf_sleep), "..." |
||
htex.provider.launcher.prepend = "" | ||
assert task().result() == 7 | ||
Comment on lines
+84
to
+85
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you distill the comment above into the fail msg for this assert? Not an easy task, I know, but when this does fail, makes it that easier to dig in. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using the
fail_msg
argument to explain that this is a block. (Analogous toassert ..., "fail_msg"
)