Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix running models that use alias while supporting dbt versions #662

Merged
merged 3 commits into from
Nov 17, 2023

Conversation

binhnq94
Copy link
Contributor

@binhnq94 binhnq94 commented Nov 9, 2023

Description

Current version, cosmos will got bug Not found node because it run with alias selection as: --models customers_abc_v1 and --models customers_abc_v2 .
I propose to parsing node selection in unique_id instead of using alias .
So node selection should be: unique_id.split('.', 2)[2] , reference to function and resource-details document.
In addition, with this change help cosmos also support versioned models on dbt-core >=1.5.0 instead >=1.6.0 as current version.

Related Issue(s)

closes #636

Breaking Change?

Cosmos will support dynamic alias and versioned models

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

@binhnq94 binhnq94 requested a review from a team as a code owner November 9, 2023 06:46
@binhnq94 binhnq94 requested a review from a team November 9, 2023 06:46
Copy link

netlify bot commented Nov 9, 2023

👷 Deploy Preview for amazing-pothos-a3bca0 processing.

Name Link
🔨 Latest commit 6526251
🔍 Latest deploy log https://app.netlify.com/sites/amazing-pothos-a3bca0/deploys/65568e2909a64100086cc3d8

@dosubot dosubot bot added area:parsing Related to parsing DAG/DBT improvement, issues, or fixes dbt:run Primarily related to dbt run command or functionality parsing:custom Related to custom parsing, like custom DAG parsing, custom DBT parsing, etc labels Nov 9, 2023
@binhnq94 binhnq94 force-pushed the dynamic_alias_and_versioned_model branch from d5e89d2 to ccc1dc3 Compare November 9, 2023 07:04
@binhnq94 binhnq94 changed the title Change to select by unique_id Change to selecting by node parsed from unique_id Nov 9, 2023
Copy link

codecov bot commented Nov 9, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (91babb9) 92.76% compared to head (6526251) 92.77%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #662      +/-   ##
==========================================
+ Coverage   92.76%   92.77%   +0.01%     
==========================================
  Files          55       55              
  Lines        2238     2243       +5     
==========================================
+ Hits         2076     2081       +5     
  Misses        162      162              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tatiana tatiana added the priority:high High priority issues are blocking or critical issues without a workaround and large impact label Nov 9, 2023
Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @binhnq94 , thank you very much for doing this work!

It is a critical part of Cosmos, and I took some time to review. I am still learning about dbt's aliasses and naming strategies - and I'm slightly concerned that this change may break users or have unintended side-effects. That said, I believe the best way will be for us to release it as part of an alpha release and collect user feedback.

Please, when you have a chance, address the comments inline - I'll be testing it with a few dbt examples and we'll aim to release this as part of an alpha 1.3 release.

cosmos/dbt/graph.py Outdated Show resolved Hide resolved
cosmos/dbt/graph.py Outdated Show resolved Hide resolved
@tatiana
Copy link
Collaborator

tatiana commented Nov 16, 2023

@binhnq94 confirming the overall approach we are adopting after this PR:

We will be naming Airflow tasks based on the model and version and no longer aliases since those were designed to reference SQL tables and not as model identifiers.

When we do dbt ls in this PR's example_model_version, we have the following list of models/seeds:

jaffle_shop.customers.v1
jaffle_shop.customers.v2
jaffle_shop.orders
jaffle_shop.staging.stg_customers.v1
jaffle_shop.staging.stg_customers.v2
jaffle_shop.staging.stg_orders.v1
jaffle_shop.staging.stg_payments
jaffle_shop.raw_customers
jaffle_shop.raw_orders
jaffle_shop.raw_payments

When using Cosmos 1.2.0 - 1.2.4, we rendered based on the alias, which meant that customers_v1 and customers_v2 were not correctly rendered in the Airflow DAG (they were named customers_USA_v1 and customers_USA_v2:
Screenshot 2023-11-16 at 12 35 09

When Cosmos attempted to run the dbt command, it failed to find the models with those alias-based names, although the task succeeded:

Running command: ['/Users/tati/Code/cosmos-clean/astronomer-cosmos/venv-38/bin/dbt', 'run', '--models', 'customers_USA_v1', '--profiles-dir', '/var/folders/td/522y78v91d1f5wgh67mj3p0m0000gn/T/tmpoi48p9to', '--profile', 'default', '--target', 'dev']

Since dbt does not recognise the alias as a model:

[2023-11-16T12:32:37.420+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:32:37  The selection criterion 'customers_USA_v1' does not match any nodes

With the change introduced by this PR, we can render the example_model_version correctly and execute all its tasks when triggering the DAG, regardless of whether they define aliases or not:
Screenshot 2023-11-16 at 12 16 34

[2023-11-16T12:42:57.339+0000] {subprocess.py:73} INFO - (astronomer-cosmos) - Running command: ['/Users/tati/Code/cosmos-clean/astronomer-cosmos/venv-38/bin/dbt', 'run', '--models', 'customers.v1', '--profiles-dir', '/var/folders/td/522y78v91d1f5wgh67mj3p0m0000gn/T/tmpx23gphsc', '--profile', 'default', '--target', 'dev']

That matches an existing model and runs it successfully:

[2023-11-16T12:42:59.873+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:42:59  1 of 1 START sql table model public.customers_USA_v1 ........................... [RUN]
[2023-11-16T12:42:59.873+0000] {subprocess.py:94} INFO - 12:42:59  1 of 1 START sql table model public.customers_USA_v1 ........................... [RUN]
[2023-11-16T12:42:59.873+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:42:59  1 of 1 START sql table model public.customers_USA_v1 ........................... [RUN]
[2023-11-16T12:42:59.968+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:42:59  1 of 1 OK created sql table model public.customers_USA_v1 ...................... [SELECT 100 in 0.09s]
[2023-11-16T12:42:59.968+0000] {subprocess.py:94} INFO - 12:42:59  1 of 1 OK created sql table model public.customers_USA_v1 ...................... [SELECT 100 in 0.09s]
[2023-11-16T12:42:59.968+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:42:59  1 of 1 OK created sql table model public.customers_USA_v1 ...................... [SELECT 100 in 0.09s]
[2023-11-16T12:42:59.981+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:42:59
[2023-11-16T12:42:59.981+0000] {subprocess.py:94} INFO - 12:42:59
[2023-11-16T12:42:59.981+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:42:59
[2023-11-16T12:42:59.981+0000] {subprocess.py:94} INFO - (astronomer-cosmos) - 12:42:59  Finished running 1 table model in 0 hours 0 minutes and 0.23 seconds (0.23s).

I'm confident with this change, and once you address the minor feedback & rebase, we can merge it into the main. Since this is a fix, we can release it in a micro release of Cosmos (e.g. 1.2.5).

@tatiana
Copy link
Collaborator

tatiana commented Nov 16, 2023

@binhnq94 I'd love your thoughts on this ticket I logged as a follow up: #682

@binhnq94 binhnq94 force-pushed the dynamic_alias_and_versioned_model branch from 6ec9df1 to a58b040 Compare November 16, 2023 13:58
@tatiana tatiana changed the title Change to selecting by node parsed from unique_id Fix running models that use alias while supporting dbt versions Nov 17, 2023
Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this and addressing all the feedback, @binhnq94 , great work!

@tatiana tatiana merged commit e23a445 into astronomer:main Nov 17, 2023
42 checks passed
@binhnq94 binhnq94 deleted the dynamic_alias_and_versioned_model branch November 20, 2023 02:01
This was referenced Nov 23, 2023
@tatiana
Copy link
Collaborator

tatiana commented Nov 23, 2023

@binhnq94 This was released in 1.2.5 🎉

pull bot pushed a commit to pgoslatara/astronomer-cosmos that referenced this pull request Nov 24, 2023
Bug fixes

* Fix running models that use alias while supporting dbt versions by
@binhnq94 in astronomer#662
* Make profiles_yml_path optional for ExecutionMode.DOCKER and
KUBERNETES by @MrBones757 in astronomer#681
* Prevent overriding dbt profile fields with profile args of type or
method by @jbandoro in astronomer#702
* Fix LoadMode.DBT_LS fail when dbt outputs WarnErrorOptions by
@adammarples in astronomer#692
* Add support for env vars in RenderConfig for dbt ls parsing by
@jbandoro in astronomer#690
* Add support for Kubernetes on_warning_callback by @david-mag in astronomer#673
* Fix ExecutionConfig.dbt_executable_path to use ``default_factory`` by
@jbandoro in astronomer#678

Others

* Docs fix: example DAG in the README and docs/index by @tatiana in astronomer#705
* Docs improvement: highlight DAG examples in README by @iancmoritz and
@jlaneve in astronomer#695

(cherry picked from commit 2878d6a)
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
…onomer#662)

Current version, cosmos will got bug `Not found node` because it run
with alias selection as: `--models customers_abc_v1 ` and `--models
customers_abc_v2` .
I propose to parsing node selection in `unique_id` instead of using
`alias` .
So node selection should be: `unique_id.split('.', 2)[2]` , reference to
[function](https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/contracts/graph/node_args.py#L26)
and [resource-details
document](https://docs.getdbt.com/reference/artifacts/manifest-json#resource-details).
In addition, with this change help cosmos also support versioned models
on dbt-core `>=1.5.0` instead `>=1.6.0` as current version.

Cosmos will support dynamic aliases and versioned models

Closes: astronomer#636
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
Bug fixes

* Fix running models that use alias while supporting dbt versions by
@binhnq94 in astronomer#662
* Make profiles_yml_path optional for ExecutionMode.DOCKER and
KUBERNETES by @MrBones757 in astronomer#681
* Prevent overriding dbt profile fields with profile args of type or
method by @jbandoro in astronomer#702
* Fix LoadMode.DBT_LS fail when dbt outputs WarnErrorOptions by
@adammarples in astronomer#692
* Add support for env vars in RenderConfig for dbt ls parsing by
@jbandoro in astronomer#690
* Add support for Kubernetes on_warning_callback by @david-mag in astronomer#673
* Fix ExecutionConfig.dbt_executable_path to use ``default_factory`` by
@jbandoro in astronomer#678

Others

* Docs fix: example DAG in the README and docs/index by @tatiana in astronomer#705
* Docs improvement: highlight DAG examples in README by @iancmoritz and
@jlaneve in astronomer#695

(cherry picked from commit 2878d6a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:parsing Related to parsing DAG/DBT improvement, issues, or fixes dbt:run Primarily related to dbt run command or functionality parsing:custom Related to custom parsing, like custom DAG parsing, custom DBT parsing, etc priority:high High priority issues are blocking or critical issues without a workaround and large impact
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support naming tasks after the model name and not alias
2 participants