logs warning if deduplication state is large #1877
I'd like to simplify the code a little
dlt/extract/incremental/__init__.py
Outdated
@@ -118,6 +118,8 @@ class Incremental(ItemTransform[TDataItem], BaseConfiguration, Generic[TCursorVa
     EMPTY: ClassVar["Incremental[Any]"] = None
     placement_affinity: ClassVar[float] = 1  # stick to end

+    DEFAULT_DUPLICATE_CURSOR_WARNING_THRESHOLD: ClassVar[int] = 200
Incremental is also a dataclass. You should declare duplicate_cursor_warning_threshold under lag, and that is a good place to attach the default value, so DEFAULT_DUPLICATE_CURSOR_WARNING_THRESHOLD is no longer needed.
TBH, my preference would be to not create a new field at all and just keep DEFAULT_DUPLICATE_CURSOR_WARNING_THRESHOLD. If someone wants to change it, they can do so for all instances by changing the ClassVar, which is IMO pretty cool. WDYT?
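To illustrate the ClassVar option, here is a minimal sketch. IncrementalSketch is a hypothetical stand-in, not dlt's actual Incremental class; only the constant name and its value of 200 come from the diff above. It shows that a ClassVar is not treated as a dataclass field and that reassigning it on the class changes the value seen by every instance.

```python
from dataclasses import dataclass, fields
from typing import ClassVar, Optional


@dataclass
class IncrementalSketch:
    # Hypothetical stand-in for dlt's Incremental; not the real class.
    cursor_path: str
    lag: Optional[float] = None
    # ClassVar annotations are skipped by @dataclass, so this is not an init field.
    DEFAULT_DUPLICATE_CURSOR_WARNING_THRESHOLD: ClassVar[int] = 200


a = IncrementalSketch("updated_at")
b = IncrementalSketch("created_at")

# Reassigning the class attribute is seen by every instance, existing or new.
IncrementalSketch.DEFAULT_DUPLICATE_CURSOR_WARNING_THRESHOLD = 1000

# Only real dataclass fields are listed; the ClassVar is excluded.
field_names = [f.name for f in fields(IncrementalSketch)]
```

The trade-off is that the threshold cannot be set per instance through the constructor; a regular dataclass field declared under lag would allow that at the cost of one more configuration option.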
Good point, changed in ca3471c
…e, update the warning message, change the test to check for single warning
Description
As a first step towards resolving the related issue, this PR logs a warning when the deduplication state grows beyond a large number of primary-key hashes.
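The check described here could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from this PR: DedupStateSketch, unique_hashes, and check_size are hypothetical names, and warning only once per instance is an assumption loosely based on the commit note about testing for a single warning.

```python
import logging
from typing import ClassVar, List

logger = logging.getLogger("incremental_sketch")


class DedupStateSketch:
    # Hypothetical helper, not part of dlt. Threshold value taken from the diff.
    DEFAULT_DUPLICATE_CURSOR_WARNING_THRESHOLD: ClassVar[int] = 200

    def __init__(self) -> None:
        self.unique_hashes: List[str] = []  # hashes of primary keys seen so far
        self._warned = False  # assumption: warn at most once per instance

    def check_size(self) -> bool:
        """Return True if the state exceeds the threshold; log a warning the first time."""
        count = len(self.unique_hashes)
        too_large = count > self.DEFAULT_DUPLICATE_CURSOR_WARNING_THRESHOLD
        if too_large and not self._warned:
            logger.warning(
                "Deduplication state holds %d hashes (threshold %d)",
                count,
                self.DEFAULT_DUPLICATE_CURSOR_WARNING_THRESHOLD,
            )
            self._warned = True
        return too_large
```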
Related Issues
#1131