Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(Workable): use append write disposition for candidate subresources #366

Merged
merged 1 commit into from
Apr 22, 2024

Conversation

ethanve
Copy link
Contributor

@ethanve ethanve commented Feb 22, 2024

Tell us what you do here

Given that there's no primary key set for candidate sub-resources, we should just use append

  • implementing verified source (please link a relevant issue labeled as verified source)
  • fixing a bug (please link a relevant bug report)
  • improving, documenting, or customizing an existing source (please link an issue or describe below)
  • anything else (please link an issue or describe below)

Relevant issue

issue #

More PR info

)
yield candidates_resource | dlt.transformer(
name=f"candidates_{sub_endpoint}", write_disposition="merge"
name=f"candidates_{sub_endpoint}", write_disposition="append"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey! you are right, we missed primary_key (or merge_key) here, candidates details response has no id key to make it primary. I think it makes more sense to set write_disposition='replace', to avoid data duplication.

Copy link
Contributor Author

@ethanve ethanve Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AstrakhantsevaAA given that candidates is incremental, it will replace all existing data with just updated ones.
So for example,

  1. Initial sync syncs 100 candidates and 10 offers
  2. Candidate 20 now has an offer
  3. Second pull will replace candidate offers with only a single offer for candidate 20

While merge may have duplicates, it's better than the alternative of replacing everything

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right!

Copy link
Collaborator

@AstrakhantsevaAA AstrakhantsevaAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!

)
yield candidates_resource | dlt.transformer(
name=f"candidates_{sub_endpoint}", write_disposition="merge"
name=f"candidates_{sub_endpoint}", write_disposition="append"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right!

@AstrakhantsevaAA AstrakhantsevaAA added the ci from fork Allows to run tests from PR coming from fork label Mar 4, 2024
@AstrakhantsevaAA
Copy link
Collaborator

@ethanve can you rebase on master pls?

@rudolfix rudolfix self-requested a review April 2, 2024 11:47
Copy link
Contributor

@rudolfix rudolfix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@rudolfix rudolfix merged commit d4806af into dlt-hub:master Apr 22, 2024
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci from fork Allows to run tests from PR coming from fork
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants