Correct stale root_path in partial parse file #950

Merged
tatiana merged 13 commits into main from correct-project-dir-partial-parse-file on May 10, 2024

Conversation

pankajkoti
Contributor

@pankajkoti pankajkoti commented May 10, 2024

After partial parsing was enabled in PR #904, testing showed that
seed files could not be located because the partial parse file
contained a stale root_path from previous command runs.
The issue occurs on specific earlier versions of dbt-core such as
1.5.4 and 1.6.5, but not on the more recent versions 1.5.8, 1.6.6
and 1.7.0. I suspect that PR dbt-labs/dbt-core#8762
is the fix and that it was backported to later releases
of 1.5.x and 1.6.x.

However, irrespective of the dbt-core version, this PR attempts to
correct the root_path in the partial parse file by replacing it with
the project directory where the project files are located.
This ensures that the feature runs correctly on both older and
newer versions of dbt-core.
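
A minimal sketch of the correction described above, assuming the partial parse file has already been deserialized into a dictionary whose `nodes` entries may carry a `root_path` key. The function name `correct_root_path`, the sample node IDs and the paths are hypothetical, and this is an illustration rather than the code in cosmos/cache.py. Reading and writing the file on disk is left out.

```python
from pathlib import Path
from typing import Any


def correct_root_path(manifest: dict[str, Any], project_dir: Path) -> int:
    """Point every node's stale ``root_path`` at ``project_dir``.

    Returns the number of nodes that were changed.
    """
    target = str(project_dir)
    patched = 0
    for node in manifest.get("nodes", {}).values():
        current = node.get("root_path")
        # Assumption: nodes without a root_path need no correction.
        if current is not None and current != target:
            node["root_path"] = target
            patched += 1
    return patched


# Hypothetical data: one seed with a stale path, one model without root_path.
manifest = {
    "nodes": {
        "seed.jaffle_shop.raw_customers": {"root_path": "/tmp/old-run/jaffle_shop"},
        "model.jaffle_shop.customers": {},
    }
}
changed = correct_root_path(manifest, Path("/usr/local/airflow/dbt/jaffle_shop"))
print(changed)  # prints 1
```

Rewriting the path unconditionally, instead of branching on the dbt-core version, keeps the behaviour identical on versions that already write a correct root_path: a second call with the same project directory changes nothing.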

closes: #937

netlify bot commented May 10, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit b00c50f
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/663e51fc72e0a70007da7c44

cosmos/cache.py Outdated

# Update root_path in partial parse file to point to the needed project directory. This is necessary because in some
# earlier versions of dbt (e.g. 1.5.4), the root_path was hardcoded to a stale directory and is not updated to the
# needed project directory. This seems to have been resolved in later versions of dbt, but we still need to handle
Collaborator

It's worth saying explicitly that this was not needed in dbt 1.7.4 and 1.8.0

Contributor Author

yes, added some more information.

Collaborator

@tatiana tatiana left a comment

This is looking great, @pankajkoti! It's probably worth adding a comment in the PR description that we observed this issue only for specific versions of dbt (1.5.4) and that it was not observed in dbt 1.7.4 onwards.

codecov bot commented May 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.72%. Comparing base (87e8085) to head (54d4674).

❗ Current head 54d4674 differs from pull request most recent head b00c50f. Consider uploading reports for the commit b00c50f to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #950      +/-   ##
==========================================
+ Coverage   95.70%   95.72%   +0.01%     
==========================================
  Files          59       59              
  Lines        2867     2876       +9     
==========================================
+ Hits         2744     2753       +9     
  Misses        123      123              


pankajkoti and others added 8 commits May 10, 2024 21:26
After partial parsing was enabled in PR #904, testing showed that
seed files could not be located because the partial parse file
contained a stale `root_path` from previous command runs. This PR
attempts to correct the `root_path` in the partial parse file by
replacing it with the project directory where the project files are
located.

closes: #937
@pankajkoti pankajkoti force-pushed the correct-project-dir-partial-parse-file branch from 5f8057c to 2ad1340 Compare May 10, 2024 15:56
@pankajkoti
Contributor Author

This is looking great, @pankajkoti! It's probably worth adding a comment in the PR description that we observed this issue only for specific versions of dbt (1.5.4) and that it was not observed in dbt 1.7.4 onwards.

added some more description. please check.

@pankajkoti pankajkoti marked this pull request as ready for review May 10, 2024 16:20
@pankajkoti pankajkoti requested a review from a team as a code owner May 10, 2024 16:20
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label May 10, 2024
@pankajkoti pankajkoti requested a review from tatiana May 10, 2024 16:20
@dosubot dosubot bot added area:parsing Related to parsing DAG/DBT improvement, issues, or fixes area:testing Related to testing, like unit tests, integration tests, etc dbt:parse Primarily related to dbt parse command or functionality labels May 10, 2024
Collaborator

@tatiana tatiana left a comment

Thanks for fixing this, @pankajkoti ! We're close to the 1.4 release!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label May 10, 2024
@tatiana tatiana merged commit ebae8ef into main May 10, 2024
65 checks passed
@tatiana tatiana deleted the correct-project-dir-partial-parse-file branch May 10, 2024 16:58
@tatiana tatiana mentioned this pull request May 13, 2024
tatiana added a commit that referenced this pull request May 13, 2024
Features

* Add dbt docs natively in Airflow via plugin by @dwreeves in #737
* Add support for ``InvocationMode.DBT_RUNNER`` for local execution mode
by @jbandoro in #850
* Support partial parsing to render DAGs faster when using
``ExecutionMode.LOCAL``, ``ExecutionMode.VIRTUALENV`` and
``LoadMode.DBT_LS`` by @dwreeves in #800
* Improve performance by 22-35% or more by caching partial parse
artefact by @tatiana in #904
* Add Azure Container Instance as Execution Mode by @danielvdende in
#771
* Add dbt build operators by @dylanharper-qz in #795
* Add dbt profile config variables to mapped profile by @ykuc in #794
* Add more template fields to ``DbtBaseOperator`` by @dwreeves in #786
* Add ``pip_install_options`` argument to operators by @octiva in #808

Bug fixes

* Make ``PostgresUserPasswordProfileMapping`` schema argument optional
by @FouziaTariq in #683
* Fix ``folder_dir`` not showing on logs for ``DbtDocsS3LocalOperator``
by @PrimOox in #856
* Improve ``dbt ls`` parsing resilience to missing tags/config by
@tatiana in #859
* Fix ``operator_args`` modified in place in Airflow converter by
@jbandoro in #835
* Fix Docker and Kubernetes operators execute method resolution by
@jbandoro in #849
* Fix ``TrinoBaseProfileMapping`` required parameter for non method
authentication by @AlexandrKhabarov in #921
* Fix global flags for lists by @ms32035 in #863
* Fix ``GoogleCloudServiceAccountDictProfileMapping`` when getting
values from the Airflow connection ``extra__`` keys by @glebkrapivin in
#923
* Fix using the dag as a keyword argument as ``specific_args_keys`` in
DbtTaskGroup by @tboutaour in #916
* Fix ACI integration (``DbtAzureContainerInstanceBaseOperator``) by
@danielvdende in #872
* Fix setting dbt project dir to the tmp dir by @dwreeves in #873
* Fix dbt docs operator to not use ``graph.gpickle`` file when
``--no-write-json`` is passed by @dwreeves in #883
* Make Pydantic a required dependency by @pankajkoti in #939
* Gracefully error if users try to ``emit_datasets`` with ``Airflow
2.9.0`` or ``2.9.1`` by @tatiana in #948
* Fix parsing tests that have no parents by @jlaneve in #933
* Correct ``root_path`` in partial parse cache by @pankajkoti in #950

Docs

* Fix docs homepage link by @jlaneve in #860
* Fix docs ``ExecutionConfig.dbt_project_path`` by @jbandoro in #847
* Fix typo in MWAA getting started guide by @jlaneve in #846
* Fix typo related to exporting docs to GCS by @tboutaour in #922
* Improve partial parsing docs by @tatiana in #898
* Improve docs for datasets for airflow >= 2.4 by @SiddiqueAhmad in #879
* Improve test behaviour docs to highlight ``warning`` feature in the
``virtualenv`` mode by @mc51 in #910
* Fix docs typo by @SiddiqueAhmad in #917
* Improve Astro docs by @RNHTTR in #951

Others

* Add performance integration tests by @jlaneve in #827
* Enable ``append_env`` in ``operator_args`` by default by @tatiana in
#899
* Change default ``append_env`` behaviour depending on Cosmos
``ExecutionMode`` by @pankajkoti and @pankajastro in #954
* Expose the ``dbt`` graph in the ``DbtToAirflowConverter`` class by
@tommyjxl in #886
* Improve dbt docs plugin rendering padding by @dwreeves in #876
* Add ``connect_retries`` to databricks profile to fix expensive
integration failures by @jbandoro in #826
* Add import sorting (isort) to Cosmos by @jbandoro in #866
* Add Python 3.11 to CI/tests by @tatiana and @jbandoro in #821, #824
and #825
* Fix failing ``test_created_pod`` for
``apache-airflow-providers-cncf-kubernetes`` after v8.0.0 update by
@jbandoro in #854
* Extend ``DatabricksTokenProfileMapping`` test to include session
properties by @tatiana in #858
* Fix broken integration test uncovered from Pytest 8.0 update by
@jbandoro in #845
* Add Apache Airflow 2.9 to the test matrix by @tatiana in #940
* Replace deprecated ``DummyOperator`` by ``EmptyOperator`` if Airflow
>=2.4.0 by @tatiana in #900
* Improve logs to troubleshoot issue in 1.4.0a2 with astro-cli by
@tatiana in #947
* Fix issue when publishing a new release to PyPI by @tatiana in #946
* Pre-commit hook updates in #820, #834, #843 and #852, #890, #896,
#901, #905, #908, #919, #931, #941
@tatiana tatiana added this to the 1.4.0 milestone May 13, 2024
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
Co-authored-by: Tatiana Al-Chueyr <[email protected]>
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
Labels
area:parsing Related to parsing DAG/DBT improvement, issues, or fixes area:testing Related to testing, like unit tests, integration tests, etc dbt:parse Primarily related to dbt parse command or functionality lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Bug when using enable_mock_profile=False in simple_dag.py with 1.4.0a2
2 participants