Feature: extract multiple columns from one prejoined object #290

tkiehn
Collaborator

@tkiehn tkiehn commented Nov 27, 2024

Description

Change prejoin processing to produce only one left join for each unique combination of relation, columns, and operator.
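
As a rough illustration of the deduplication idea, here is a minimal Python sketch. It is not the package's Jinja implementation: `group_prejoins` is a hypothetical helper, the choice of the join columns as part of the grouping key is an interpretation of "columns", and the `operator` key with its `AND` default is an assumption made for the example.

```python
from collections import OrderedDict


def group_prejoins(prejoins):
    """Group prejoin definitions so that each unique join target appears once.

    Hypothetical helper: the grouping key (relation, join columns, operator)
    follows the PR description; the actual macro logic may differ.
    """
    groups = OrderedDict()
    for pj in prejoins:
        key = (
            pj["ref_model"],
            pj["this_column_name"],
            pj["ref_column_name"],
            pj.get("operator", "AND"),  # assumed default, not confirmed here
        )
        # Collect every extracted column under the same join key.
        groups.setdefault(key, []).extend(pj["extract_columns"])
    return groups


prejoins = [
    {"ref_model": "business_raw", "this_column_name": "ContractId",
     "ref_column_name": "ContractId", "extract_columns": ["id"]},
    {"ref_model": "business_raw", "this_column_name": "ContractId",
     "ref_column_name": "ContractId", "extract_columns": ["number"]},
]

grouped = group_prejoins(prejoins)
# Both definitions share one key, so a single left join would cover them.
print(len(grouped))
```

Keeping the groups in insertion order means the generated joins would follow the order in which the prejoins were declared.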

Also add new syntax for prejoins, allowing extraction of multiple columns from the same prejoin-source:

  prejoined_columns:
      - extract_columns: 
           - id
           - number
        aliases:
           - businessid
           - businessnumber
        ref_model: 'business_raw'
        this_column_name: 'ContractId'
        ref_column_name: 'ContractId'

In this example, we join the source model with business_raw, extract id and number, and assign an alias to each column.
Aliases are optional, but if they are given, the number of aliases has to match the number of extract_columns.
Each prejoin target needs its own list entry, containing at least extract_columns, ref_model (or source_name and source_table), and the join column names (this_column_name and ref_column_name).
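
The alias rule above could be checked along these lines. This is a minimal Python sketch under stated assumptions, not the actual macro code: `resolve_aliases` is a hypothetical helper, and falling back to the column's own name when no aliases are given is an assumption.

```python
def resolve_aliases(prejoin):
    """Pair each extracted column with its output name.

    Hypothetical helper: aliases are optional, but when given there must
    be exactly one alias per extracted column.
    """
    columns = prejoin["extract_columns"]
    aliases = prejoin.get("aliases")
    if aliases is None:
        # Assumption: without aliases, each column keeps its own name.
        return [(col, col) for col in columns]
    if len(aliases) != len(columns):
        raise ValueError(
            f"{len(columns)} extract_columns but {len(aliases)} aliases"
        )
    return list(zip(columns, aliases))


prejoin = {
    "extract_columns": ["id", "number"],
    "aliases": ["businessid", "businessnumber"],
    "ref_model": "business_raw",
    "this_column_name": "ContractId",
    "ref_column_name": "ContractId",
}

pairs = resolve_aliases(prejoin)
print(pairs)
```

Failing early on a length mismatch avoids silently dropping or misassigning an alias.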

The old syntax is still valid and can be used if desired.

Fixes #287

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation or included information that needs updates

TODO: Update the docs to include the new syntax and add information about aliases

@tkiehn tkiehn added the "testing" label ("To trigger the automated test workflow as internal User.") Nov 27, 2024
@tkiehn tkiehn linked an issue Nov 27, 2024 that may be closed by this pull request
@remoteworkflow

Link to workflow summary: https://github.com/ScalefreeCOM/datavault4dbt-ci-cd/actions/runs/12053297132


RESULTS for Synapse:
✅ dbt-tests
✅ dbt-macro-tests


RESULTS for Postgres:
✅ dbt-tests
✅ dbt-macro-tests


RESULTS for BigQuery:
✅ dbt-tests
✅ dbt-macro-tests


RESULTS for Redshift:
✅ dbt-tests
✅ dbt-macro-tests


RESULTS for Snowflake:
❌ dbt-tests
❌ dbt-macro-tests


RESULTS for Exasol:
❌ dbt-tests
✅ dbt-macro-tests


RESULTS for Fabric:
❌ dbt-tests
✅ dbt-macro-tests


RESULTS for Oracle:
❌ dbt-tests
✅ dbt-macro-tests


RESULTS for Databricks:
✅ dbt-tests
✅ dbt-macro-tests

@remoteworkflow remoteworkflow bot removed the "testing" label ("To trigger the automated test workflow as internal User.") Nov 27, 2024
@tkiehn
Collaborator Author

tkiehn commented Nov 29, 2024

Closing for now until rework is done

@tkiehn tkiehn closed this Nov 29, 2024