docs: document that parallelized=True resources with add_limit(x) usually yield x-1 #2142
base: devel
limit = 5
result = list(sync_resource1().add_limit(limit))
allowed_result_range = range(limit - int(parallelized), limit + 1)
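For readers following the diff, the tolerance this range encodes can be spelled out in isolation. The sketch below is only an illustration: `allowed_lengths` is a hypothetical helper, not part of dlt or of this PR, and it assumes the test goes on to check `len(result)` against the range.

```python
def allowed_lengths(limit: int, parallelized: bool) -> range:
    """Hypothetical helper mirroring the range built in the diff above."""
    # int(True) == 1, so a parallelized resource may come up one item short;
    # int(False) == 0, so a sequential resource must hit the limit exactly.
    return range(limit - int(parallelized), limit + 1)


if __name__ == "__main__":
    print(list(allowed_lengths(5, parallelized=False)))  # [5]
    print(list(allowed_lengths(5, parallelized=True)))   # [4, 5]
```

With `limit = 5`, a sequential resource must therefore return exactly 5 items, while a parallelized one may return 4 or 5.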
This is not ideal.
Ideal would be to make the parallel yields exact. I tried that for a few hours, to no avail. The next best option would be to run this test x times when parallelized=True and then assert that the average number of yielded items stays below 4.5.
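The "next best" option could look roughly like the following. This is a sketch under stated assumptions, not code from the PR: `make_simulated_counter` and `mean_yields` are hypothetical names, the counter simulates the roughly 80/20 behaviour described in this thread with a seeded random generator instead of running a real dlt resource, and `runs=200` is an arbitrary choice.

```python
import random
from statistics import mean
from typing import Callable


def make_simulated_counter(limit: int, short_rate: float, seed: int) -> Callable[[], int]:
    """Hypothetical stand-in for one run of a limited, parallelized resource.

    Returns a function that yields `limit - 1` with probability `short_rate`
    and `limit` otherwise. It simulates the behaviour; it does not call dlt.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible

    def count_yields() -> int:
        return limit - 1 if rng.random() < short_rate else limit

    return count_yields


def mean_yields(count_yields: Callable[[], int], runs: int) -> float:
    """Run the counter `runs` times and return the average item count."""
    return mean([count_yields() for _ in range(runs)])


if __name__ == "__main__":
    counter = make_simulated_counter(limit=5, short_rate=0.8, seed=0)
    average = mean_yields(counter, runs=200)
    # The statistical check proposed above: mostly 4 items, sometimes 5.
    assert average < 4.5
    print(f"average items per run: {average:.2f}")
```

In an actual test, the simulated counter would be replaced by something like `lambda: len(list(sync_resource1().add_limit(limit)))`, at the cost of a slower and still probabilistic test.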
@joscha thanks for this PR. If you look at the PR list, you can see that I have changed the add_limit implementation. You could check out that branch and see whether you are still seeing those problems there, or maybe even add your test there. I think it might actually be resolved.
Will give your branch a try!
Yes, the current code of #2131 (3738c29) reliably produces exactly
@joscha amazing, your helping out on this is very much appreciated :)
I am glad it now produces exactly the requested number of items, so the test can assert equality:

import dlt
import pytest

@pytest.mark.parametrize("parallelized", [False, True])
def test_limit_sync_resource(parallelized: bool) -> None:
    @dlt.resource(parallelized=parallelized)
    def sync_resource1():
        for i in range(1, 10):
            yield i

    limit = 5
    result = list(sync_resource1().add_limit(limit))
    # Both the sequential and the parallelized variant must hit the limit exactly.
    assert len(result) == limit

which is precise enough to keep.
Most of the time (roughly 80% of runs), parallelized resources that are limited yield one item fewer than the limit.