-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve CI failures for Apache Airflow 2.7 jobs #1182
Conversation
✅ Deploy Preview for sunny-pastelito-5ecb04 canceled.
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1182 +/- ##
=======================================
Coverage 95.95% 95.95%
=======================================
Files 64 64
Lines 3656 3656
=======================================
Hits 3508 3508
Misses 148 148 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for working on this, @pankajkoti , and coming up with a solution so efficiently!
New Features * Introduction of experimental support to run dbt BQ models using Airflow deferrable operators by @pankajkoti @pankajastro @tatiana in #1224 #1230. This is a first step in this journey and we would really appreciate feedback from the community. For more information, check the documentation: https://astronomer.github.io/astronomer-cosmos/getting_started/execution-modes.html#airflow-async-experimental This work has been inspired by the talk "Airflow at Monzo: Evolving our data platform as the bank scales" by @jonathanrainer @ed-sparkes given at Airflow Summit 2023: https://airflowsummit.org/sessions/2023/airflow-at-monzo-evolving-our-data-platform-as-the-bank-scales/. * Support using ``DatasetAlias`` and fix orphaning unreferenced dataset by @tatiana in #1217 #1240 Documentation: https://astronomer.github.io/astronomer-cosmos/configuration/scheduling.html#data-aware-scheduling * Add GCP_CLOUD_RUN_JOB execution mode by @ags-de #1153 Learn more about it: https://astronomer.github.io/astronomer-cosmos/getting_started/gcp-cloud-run-job.html Enhancements * Create single virtualenv when ``DbtVirtualenvBaseOperator`` has ``virtualenv_dir=None`` and ``is_virtualenv_dir_temporary=True`` by @kesompochy in #1200 * Consistently handle build and imports in ``cosmos/__init__.py`` by @tatiana in #1215 * Add enum constants to init for direct import by @fabiomx in #1184 Bug fixes * URL encode dataset names to support multibyte characters by @t0momi219 in #1198 * Fix invalid argument (``full_refresh``) passed to DbtTestAwsEksOperator (and others) by @johnhoran in #1175 * Fix ``printer_width`` arg type in ``DbtProfileConfigVars`` by @jessicaschueler in #1191 * Fix task owner fallback by @jmaicher in #1195 Docs * Add scarf to readme and docs for website analytics by @cmarteepants in #1221 * Add ``virtualenv_dir`` param to ``ExecutionConfig`` docs by @pankajkoti in #1173 * Give credits to @LennartKloppenburg in CHANGELOG.rst by @tatiana #1174 * Refactor docs for async mode execution by @pankajkoti in #1241 Others * Remove PR branch added for testing a change in CI in #1224 by @pankajkoti in #1233 * Fix CI wrt broken coverage upload artifact @pankajkoti in #1210 * Fix CI issues - Upgrade actions/upload-artifact & actions/download-artifact to v4 and set min version for packaging by @pankajkoti in #1208 * Resolve CI failures for Apache Airflow 2.7 jobs by @pankajkoti in #1182 * CI: Update GCP manifest file path based on new secret update by @pankajkoti in #1237 * Pre-commit hook updates in #1176 #1186, #1186, #1201, #1219, #1231
New Features * Introduction of experimental support to run dbt BQ models using Airflow deferrable operators by @pankajkoti @pankajastro @tatiana in #1224 #1230. This is a first step in this journey and we would really appreciate feedback from the community. For more information, check the documentation: https://astronomer.github.io/astronomer-cosmos/getting_started/execution-modes.html#airflow-async-experimental This work has been inspired by the talk "Airflow at Monzo: Evolving our data platform as the bank scales" by @jonathanrainer @ed-sparkes given at Airflow Summit 2023: https://airflowsummit.org/sessions/2023/airflow-at-monzo-evolving-our-data-platform-as-the-bank-scales/. * Support using ``DatasetAlias`` and fix orphaning unreferenced dataset by @tatiana in #1217 #1240 Documentation: https://astronomer.github.io/astronomer-cosmos/configuration/scheduling.html#data-aware-scheduling * Add GCP_CLOUD_RUN_JOB execution mode by @ags-de #1153 Learn more about it: https://astronomer.github.io/astronomer-cosmos/getting_started/gcp-cloud-run-job.html Enhancements * Create single virtualenv when ``DbtVirtualenvBaseOperator`` has ``virtualenv_dir=None`` and ``is_virtualenv_dir_temporary=True`` by @kesompochy in #1200 * Consistently handle build and imports in ``cosmos/__init__.py`` by @tatiana in #1215 * Add enum constants to init for direct import by @fabiomx in #1184 Bug fixes * URL encode dataset names to support multibyte characters by @t0momi219 in #1198 * Fix invalid argument (``full_refresh``) passed to DbtTestAwsEksOperator (and others) by @johnhoran in #1175 * Fix ``printer_width`` arg type in ``DbtProfileConfigVars`` by @jessicaschueler in #1191 * Fix task owner fallback by @jmaicher in #1195 Docs * Add scarf to readme and docs for website analytics by @cmarteepants in #1221 * Add ``virtualenv_dir`` param to ``ExecutionConfig`` docs by @pankajkoti in #1173 * Give credits to @LennartKloppenburg in CHANGELOG.rst by @tatiana #1174 * Refactor docs for async mode execution by @pankajkoti in #1241 Others * Remove PR branch added for testing a change in CI in #1224 by @pankajkoti in #1233 * Fix CI wrt broken coverage upload artifact @pankajkoti in #1210 * Fix CI issues - Upgrade actions/upload-artifact & actions/download-artifact to v4 and set min version for packaging by @pankajkoti in #1208 * Resolve CI failures for Apache Airflow 2.7 jobs by @pankajkoti in #1182 * CI: Update GCP manifest file path based on new secret update by @pankajkoti in #1237 * Pre-commit hook updates in #1176 #1186, #1186, #1201, #1219, #1231
**New Features** * Support using ``DatasetAlias`` and fix orphaning unreferenced dataset by @tatiana in #1217 #1240 Documentation: https://astronomer.github.io/astronomer-cosmos/configuration/scheduling.html#data-aware-scheduling * Add GCP_CLOUD_RUN_JOB execution mode by @ags-de #1153 Learn more about it: https://astronomer.github.io/astronomer-cosmos/getting_started/gcp-cloud-run-job.html * Introduction of experimental support to run dbt BQ models using Airflow deferrable operators by @pankajkoti @pankajastro @tatiana in #1224 #1230. This is the first step in the journey of running dbt resources with native Airflow, and we would appreciate feedback from the community. For more information, check the documentation: https://astronomer.github.io/astronomer-cosmos/getting_started/execution-modes.html#airflow-async-experimental This work has been inspired by the talk "Airflow at Monzo: Evolving our data platform as the bank scales" by @jonathanrainer @ed-sparkes given at Airflow Summit 2023: https://airflowsummit.org/sessions/2023/airflow-at-monzo-evolving-our-data-platform-as-the-bank-scales/. **Enhancements** * Create single virtualenv when ``DbtVirtualenvBaseOperator`` has ``virtualenv_dir=None`` and ``is_virtualenv_dir_temporary=True`` by @kesompochy in #1200 * Consistently handle build and imports in ``cosmos/__init__.py`` by @tatiana in #1215 * Add enum constants to init for direct import by @fabiomx in #1184 **Bug fixes** * URL encode dataset names to support multibyte characters by @t0momi219 in #1198 * Fix invalid argument (``full_refresh``) passed to DbtTestAwsEksOperator (and others) by @johnhoran in #1175 * Fix ``printer_width`` arg type in ``DbtProfileConfigVars`` by @jessicaschueler in #1191 * Fix task owner fallback by @jmaicher in #1195 **Docs** * Add scarf to readme and docs for website analytics by @cmarteepants in #1221 * Add ``virtualenv_dir`` param to ``ExecutionConfig`` docs by @pankajkoti in #1173 * Give credits to @LennartKloppenburg in CHANGELOG.rst by @tatiana #1174 * Refactor docs for async mode execution by @pankajkoti in #1241 Others * Remove PR branch added for testing a change in CI in #1224 by @pankajkoti in #1233 * Fix CI wrt broken coverage upload artifact @pankajkoti in #1210 * Fix CI issues - Upgrade actions/upload-artifact & actions/download-artifact to v4 and set min version for packaging by @pankajkoti in #1208 * Resolve CI failures for Apache Airflow 2.7 jobs by @pankajkoti in #1182 * CI: Update GCP manifest file path based on new secret update by @pankajkoti in #1237 * Pre-commit hook updates in #1176 #1186, #1186, #1201, #1219, #1231 --------- Co-authored-by: Pankaj Koti <[email protected]>
Description
This is an effort to fix the currently broken CI jobs. The CI pipeline was running smoothly until the Cosmos 1.6.0 release (#1103). However, after the recent release of Apache Airflow providers, we’ve been encountering deep resolution errors specifically with Apache Airflow 2.7 jobs.
To address this:
Despite pre-installing the providers, we’re noticing that on Python 3.8 and 3.9 with Apache Airflow 2.7, Cosmos still attempts to sync dependencies and continues to encounter deep resolution errors. Therefore, we’re specifically excluding those combinations from running in Unit and Integration test jobs. However, we have Python 3.10 and 3.11 jobs enabled and running fine for Airflow 2.7 in our CI.
Related Issue(s)
related: #1180
Breaking Change?
Checklist