-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support rendering tasks with non-ASCII characters #1278
Support rendering tasks with non-ASCII characters #1278
Conversation
❌ Deploy Preview for sunny-pastelito-5ecb04 failed.
|
Hi team, ( @tatiana @pankajkoti ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @t0momi219, thank you very much for the detailed explanation of the problem and for proposing a fix.
I was surprised with the amount of lines changed to fix the issue. It feels like you tried to do two things at once: fix the problem while refactoring the code. Would it be possible to solve the bug with less changes to the code?
As an example, what if we:
- Introduced a function (e.g.
normalize_node_name
that takes anode_name
, normalizes it, handling non-ASCII characters) - Replaced the ocurrences of
node.name
bynormalize_node_name(node.name)
Hi @tatiana @pankajkoti , PR Changes
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1278 +/- ##
=======================================
Coverage 96.51% 96.52%
=======================================
Files 71 71
Lines 4220 4228 +8
=======================================
+ Hits 4073 4081 +8
Misses 147 147 ☔ View full report in Codecov by Sentry. |
Hi @tatiana, @pankajkoti I am also working on the following PR, and I plan to modify the same generate_task_or_group function in both: #1317 Therefore, I’d like to complete this PR first before moving on to the next PR’s changes. If possible, could you please review this PR again? I believe I’ve completed the code adjustments. If there are still any issues, please let me know. Thank you, and I apologize if further modifications are needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @t0momi219, Could you please rebase it.
…1415) In PR [#1278](#1278), we introduced support for rendering non-ASCII characters in task IDs. However, due to limited access, we were unable to implement the same functionality for the build operator. This PR aims to extend that functionality by adding support for rendering build task IDs with non-ASCII characters.
…1415) In PR [#1278](#1278), we introduced support for rendering non-ASCII characters in task IDs. However, due to limited access, we were unable to implement the same functionality for the build operator. This PR aims to extend that functionality by adding support for rendering build task IDs with non-ASCII characters.
**New Features** * Support customizing Airflow operator arguments per dbt node by @wornjs in #1339. [More information](https://astronomer.github.io/astronomer-cosmos/getting_started/custom-airflow-properties.html). * Support uploading dbt artifacts to remote cloud storages via callback by @pankajkoti in #1389. [Read more](https://astronomer.github.io/astronomer-cosmos/configuration/callbacks.html). * Add support to ``TestBehavior.BUILD`` by @tatiana in #1377. [Documentation](https://astronomer.github.io/astronomer-cosmos/configuration/testing-behavior.html). * Add support for the "at" operator when using ``LoadMode.DBT_MANIFEST`` or ``CUSTOM`` by @benjy44 in #1372 * Add dbt clone operator by @pankajastro in #1326, as documented in [here](https://astronomer.github.io/astronomer-cosmos/getting_started/operators.html). * Support rendering tasks with non-ASCII characters by @t0momi219 in #1278 [Read more](https://astronomer.github.io/astronomer-cosmos/configuration/task-display-name.html) * Add warning callback on source freshness by @pankajastro in #1400 [Read more](https://astronomer.github.io/astronomer-cosmos/configuration/source-nodes-rendering.html#on-warning-callback-callback) * Add Oracle Profile mapping by @slords and @pankajkoti in #1190 and #1404 * Emit telemetry to Scarf during DAG run by @tatiana in #1397 * Save tasks map as ``DbtToAirflowConverter`` property by @internetcoffeephone and @hheemskerk in #1362 **Bug Fixes** * Fix the mock value of port in ``TrinoBaseProfileMapping`` to be an integer by @dwolfeu #1322 * Fix access to the ``dbt docs`` menu item outside of Astro cloud by @tatiana in #1312 * Add missing ``DbtSourceGcpCloudRunJobOperator`` in module ``cosmos.operators.gcp_cloud_run_job`` by @anai-s in #1290 * Support building ``DbtDag`` without setting paths in ``ProjectConfig`` by @tatiana in #1307 * Fix parsing dbt ls outputs that contain JSONs that are not dbt nodes by @tatiana in #1296 * Fix Snowflake Profile mapping when using AWS default region by @tatiana in #1406 * Fix dag rendering for taskflow + DbtTaskGroup combo by @pankajastro in #1360 **Enhancements** * Improve dbt command execution logs to troubleshoot ``None`` values by @tatiana in #1392 * Add logging of stdout to dbt graph run_command by @KarolGongola in #1390 * Save tasks map as DbtToAirflowConverter property by @internetcoffeephone and @hheemskerk in #1362 * Support rendering build operator task-id with non-ASCII characters by @pankajastro in #1415 **Docs** * Remove extra ` char from docs by @pankajastro in #1345 * Add limitation about copying target dir files to remote by @pankajkoti in #1305 * Generalise example from README by @ReadytoRocc in #1311 * Add security policy by @tatiana, @chaosmaw and @lzdanski in # 1385 * Mention in documentation that the callback functionality is supported in ``ExecutionMode.VIRTUALENV`` by @pankajkoti in #1401 **Others** * Restore Jaffle Shop so that ``basic_cosmos_dag`` works as documented by @tatiana in #1374 * Remove Pytest durations from tests scripts by @tatiana in #1383 * Remove typing-extensions as dependency by @pankajastro in #1381 * Pin dbt-databricks version to < 1.9 by @pankajastro in #1376 * Refactor ``dbt-sqlite`` tests to use ``dbt-postgres`` by @pankajastro in #1366 * Remove 'dbt-core<1.8.9' pin by @tatiana in #1371 * Remove dependency ``eval_type_backport`` by @tatiana in #1370 * Enable kubernetes tests for dbt>=1.8 by @pankajastro #1364 * CI Workaround: Pin dbt-core, Disable SQLite Tests, and Correctly Ignore Clone Test to Pass CI by @pankajastro in #1337 * Enable Azure task in the remote store manifest example DAG by @pankajkoti in #1333 * Enable GCP remote manifest task by @pankajastro in #1332 * Add exempt label option in GH action stale job by @pankajastro in #1328 * Add integration test for source node rendering by @pankajastro in #1327 * Fix vulnerability issue on docs dependency by @tatiana in #1313 * Add postgres pod status check for k8s tests in CI by @pankajkoti in #1320 * [CI] Reduce the amount taking to run tests in the CI from 5h to 11min by @tatiana in #1297 * Enable secret detection precommit check by @pankajastro in #1302 * Fix security vulnerability, by not pinning Airflow 2.10.0 by @tatiana in #1298 * Fix Netlify build timeouts by @tatiana in #1294 * Add stalebot to label/close stale PRs and issues by @tatiana in #1288 * Unpin dbt-databricks version by @pankajastro in #1409 * Fix source resource type tests by @pankajastro in #1405 * Increase performance tests models by @tatiana in #1403 * Drop running 1000 models in the CI by @pankajkoti in #1411 * Fix releasing package to PyPI by @tatiana in #1396 * Pre-commit hook updates in #1394, #1373, #1358, #1340, #1331, #1314, #1301 Co-authored-by: Pankaj Koti <[email protected]> Co-authored-by: Pankaj Singh <[email protected]> Closes: #1193 --------- Co-authored-by: Pankaj Koti <[email protected]> Co-authored-by: Pankaj Singh <[email protected]>
Description
When running models that have names containing multibyte characters, runtime errors occur in Airflow environments where statsd is enabled (e.g., MWAA uses this statsd metric for collecting metrics in Cloudwatch).
Related Issue: apache/airflow#18010
To address this, Airflow 2.9 introduced the ability to render tasks using display_name, which allows task names to be rendered separately from their task_id.
Reference: https://airflow.apache.org/docs/apache-airflow/stable/_modules/airflow/example_dags/example_display_name.html
This PR adds support for display_name, enabling users who use non-ASCII characters as their native language to display task names in their own language, even in environments like MWAA.
Details
The normalize_task_id parameter is added to RenderConfig.
This option accepts a function to generate a task ID from a node. This allows users to generate arbitrary task IDs from models. If a function is passed to this option, Cosmos will use the model name as the display_name for tasks while rendering them.
Related Issue(s)
closes #1277
Breaking Change?
Checklist