
add incremental value info to table computed in the DltResource #1974

Open
rudolfix opened this issue Oct 21, 2024 · 1 comment · May be fixed by #2033
Assignees
Labels
enhancement New feature or request

Comments

@rudolfix
Collaborator

Background
We do not currently annotate table schemas with Incremental settings, even when they are present. These annotations may be useful downstream for systems that automatically generate pipelines to incrementally append or merge data and that have no access to the original Python code.

Tasks

  1. Standardize the column annotation for Incremental. We should be able to recreate an Incremental instance from those settings. Look at what we already do when defining incremental for rest_api; we may be able to reuse the same TypedDict.
  2. When computing the schema, create a proper annotation on the table. We allow only one incremental, so the code that diffs and merges tables needs to be updated so that only one incremental is allowed.
  3. For incrementals that do not refer to a single field (a simple JSON path), we do not generate annotations.
  4. With that in place, it is trivial to add incremental to the dlt.resource decorator. Accept both an Incremental instance and the new TypedDict. There is existing code that will use such an incremental to override the one defined in the resource function arguments.
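
To make tasks 1 to 3 concrete, here is a minimal sketch in plain Python. It is not dlt's implementation: `IncrementalHint`, `annotate_table`, `find_incremental`, the `"incremental"` column key, and the table layout are all hypothetical names chosen for illustration, and the "simple path" check is an assumed approximation of "refers to a single field".

```python
import re
from typing import Any, Dict, Optional, TypedDict


class IncrementalHint(TypedDict, total=False):
    """Hypothetical serializable form of an incremental definition."""
    cursor_path: str
    initial_value: Any
    last_value_func: str  # e.g. "max" or "min"


# Assumption: a "simple" path is a bare field name with no JSONPath operators.
_SIMPLE_FIELD = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def annotate_table(table: Dict[str, Any], hint: IncrementalHint) -> Dict[str, Any]:
    """Return `table` with the hint stored on its cursor column, or unchanged."""
    path = hint["cursor_path"]
    if not _SIMPLE_FIELD.match(path):
        return table  # task 3: no annotation for non-trivial JSON paths
    columns = {name: dict(col) for name, col in table.get("columns", {}).items()}
    # task 2: only one incremental per table, so drop any earlier annotation
    for col in columns.values():
        col.pop("incremental", None)
    column = columns.setdefault(path, {"name": path})
    column["incremental"] = dict(hint)
    return {**table, "columns": columns}


def find_incremental(table: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the single incremental annotation of a table, if any (task 1)."""
    for col in table.get("columns", {}).values():
        if "incremental" in col:
            return col["incremental"]
    return None
```

In this sketch, annotating the same table a second time with a different cursor column replaces the earlier annotation, which is one possible reading of "only one incremental is allowed"; raising an error on conflict would be another.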

Tests

  1. On top of the usual tests, also test the updated decorator:
  • Does the value in the decorator override the default in the function signature?
  • Is it passed to an argument with the Incremental type but without a value?
  • What happens if there is no incremental argument?
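
The three questions above could be turned into test cases along these lines. This is a sketch against a toy stand-in, not against dlt itself: `Incremental` and `resource` below are hypothetical simplifications defined in the snippet, and the behavior in the third case (the function is called unchanged) is an assumption, since the issue leaves that question open.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Incremental:
    """Stand-in for dlt's incremental class; not the real implementation."""
    cursor_path: str
    initial_value: Any = None


def resource(incremental: Optional[Incremental] = None) -> Callable:
    """Toy decorator: inject `incremental` into the parameter annotated as Incremental."""
    def decorator(func: Callable) -> Callable:
        sig = inspect.signature(func)
        # Name of the first parameter annotated with Incremental, if there is one.
        target = next(
            (p.name for p in sig.parameters.values() if p.annotation is Incremental),
            None,
        )

        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if incremental is not None and target is not None:
                kwargs[target] = incremental  # the decorator value wins
            return func(*args, **kwargs)

        return wrapper
    return decorator


def test_decorator_overrides_signature_default() -> None:
    @resource(incremental=Incremental("updated_at"))
    def items(cursor: Incremental = Incremental("created_at")) -> str:
        return cursor.cursor_path

    assert items() == "updated_at"


def test_decorator_fills_argument_without_value() -> None:
    @resource(incremental=Incremental("updated_at"))
    def items(cursor: Incremental) -> str:
        return cursor.cursor_path

    assert items() == "updated_at"


def test_no_incremental_argument_is_left_alone() -> None:
    @resource(incremental=Incremental("updated_at"))
    def items() -> str:
        return "unchanged"

    assert items() == "unchanged"
```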
@rudolfix rudolfix added the enhancement New feature or request label Oct 21, 2024
@rudolfix rudolfix moved this from Todo to In Progress in dlt core library Oct 21, 2024
@axellpadilla
Contributor

Hi, interesting. I have one question: it's fairly common to use multiple columns (like an OR clause) to load data based on different incremental values. Imagine a MongoDB source that has a created date and an updated date, where the updated date only exists once the record has been updated for the first time after creation. Is there any way to solve this officially with dlt?
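
To make the scenario concrete, the two-column case can be sketched in plain Python as a single derived cursor. This is only an illustration of the question, not an official dlt mechanism; the field names `created_at` and `updated_at` and both helper functions are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List, Optional


def effective_cursor(record: Dict[str, Any]) -> datetime:
    """Use `updated_at` when present, otherwise fall back to `created_at`."""
    return record.get("updated_at") or record["created_at"]


def new_or_changed(
    records: Iterable[Dict[str, Any]], last_value: Optional[datetime]
) -> List[Dict[str, Any]]:
    """Keep records created or updated after the last seen cursor value."""
    if last_value is None:
        return list(records)  # first run: load everything
    return [r for r in records if effective_cursor(r) > last_value]


rows = [
    {"_id": 1, "created_at": datetime(2024, 1, 1)},
    {"_id": 2, "created_at": datetime(2024, 1, 1), "updated_at": datetime(2024, 3, 1)},
    {"_id": 3, "created_at": datetime(2024, 2, 15)},
]
result = new_or_changed(rows, last_value=datetime(2024, 2, 1))
```

Here `result` contains the records with `_id` 2 (updated after the cursor) and `_id` 3 (created after the cursor), which is the OR behavior described above.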

Projects
Status: In Progress

3 participants