
How to store all test results #801

Closed
DanMawdsleyBA opened this issue Jan 16, 2024 · 9 comments · Fixed by #1389
Assignees
Labels
area:testing Related to testing, like unit tests, integration tests, etc dbt:test Primarily related to dbt test command or functionality execution:callback Tasks related to callback when executing tasks parsing:dbt_manifest Issues, questions, or features related to dbt_manifest parsing triage-needed Items need to be reviewed / assigned to milestone
Milestone

Comments

@DanMawdsleyBA
Contributor

I'm looking to store test results (both the successes and the failures, so https://docs.getdbt.com/reference/resource-configs/store_failures won't be sufficient). I'm considering using a dbt package such as https://github.com/brooklyn-data/dbt_artifacts, which works with an on-run-end hook that uploads all the results at the end. As far as I'm aware, Airflow doesn't offer an equivalent way to always run a task at the end of a run the way dbt does. So is there any way that I can materialize the results of all the tests using Cosmos? If not currently, would it be possible to add somehow?

For added context we're also looking to integrate these test results into Atlan.


dosubot bot commented Jan 16, 2024

🤖

Hi @DanMawdsleyBA! I'm Dosu, a friendly bot here to assist you with your questions and issues related to Cosmos while we wait for a human maintainer. Don't worry, although I'm a bot, I'm here to make your experience smoother and easier. Looking forward to helping you out!

Yes, it is possible to materialize and store all test results, both successes and failures, using Cosmos. You can achieve this with the on_warning_callback parameter of the create_test_task_metadata function. This callback function is called on warnings with the additional Context variables test_names and test_results, each of type List.

You can define a callback function that logs all test results and stores them in a desired location. This location can then be integrated with Atlan.

Here is an example of how you can define the callback function:

import logging

logger = logging.getLogger(__name__)

def log_test_results(context):
    test_names = context['test_names']
    test_results = context['test_results']

    for test_name, test_result in zip(test_names, test_results):
        # Log the test result
        logger.info(f"Test {test_name} result: {test_result}")

        # Store the test result in a desired location
        # (a database, a file, an external catalog, etc.)
        store_test_result(test_name, test_result)
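The store_test_result helper above is left undefined; a minimal sketch, assuming a local JSON-lines file as the destination (both the helper and the file path are illustrative, not part of Cosmos), might be:

```python
import json
from pathlib import Path

# Hypothetical destination; in practice this could be a database table,
# an object-store key, or a push to a catalog such as Atlan.
RESULTS_FILE = Path("/tmp/test_results.jsonl")

def store_test_result(test_name: str, test_result: str) -> None:
    """Append a single test result as one JSON line for later ingestion."""
    record = {"test_name": test_name, "test_result": test_result}
    with RESULTS_FILE.open("a") as f:
        f.write(json.dumps(record) + "\n")
```

A line-per-record file keeps appends cheap and makes the results easy to bulk-load later.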

You can then pass this function as the on_warning_callback parameter when calling create_test_task_metadata:

test_meta = create_test_task_metadata(
    "test",
    execution_mode,
    test_indirect_selection,
    task_args=task_args,
    node=node,
    on_warning_callback=log_test_results,
)

This will ensure that all test results are logged and stored, and can then be integrated into Atlan.

The create_test_task_metadata function in the Cosmos codebase does support the on_warning_callback parameter for logging and storing all test results. This is evident in the DbtTestKubernetesOperator class where the on_warning_callback parameter is used in the __init__ method. If the on_warning_callback is provided, it is stored as an instance variable and used in the _handle_warnings method to process warnings from the logs.

In the _handle_warnings method, if there are warnings in the logs, the on_warning_callback is called with the updated context that includes the test names and results.

You can find more details in the cosmos/airflow/graph.py and cosmos/operators/kubernetes.py files in the Cosmos repository.

I hope this helps! If you have any other questions, feel free to ask.

-Dosu


@dosubot dosubot bot added area:testing Related to testing, like unit tests, integration tests, etc dbt:test Primarily related to dbt test command or functionality parsing:dbt_manifest Issues, questions, or features related to dbt_manifest parsing labels Jan 16, 2024
@DanMawdsleyBA
Contributor Author

@tatiana / any thoughts on the above?

It might also be desirable to have a task run at the end that syncs the run_results.json with an S3 bucket (perhaps something similar to the docs operator), as Atlan can interpret it directly. This task would have to always run regardless of whether previous tasks have failed (in order to capture test results that are failures).

Although my concern with the above would be what happens if an Airflow deployment gets redeployed: would those run_results.json files be overwritten or deleted?
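As a rough illustration of the always-run sync step described above, the copy logic could look like the sketch below. The function name, directory layout, and run-scoped key are assumptions; a real task would replace the local copy with an S3 upload and be scheduled with Airflow's trigger_rule="all_done" so it fires even when upstream tasks fail. Namespacing by run ID also sidesteps the overwrite concern.

```python
import shutil
from pathlib import Path

def sync_run_results(target_dir: str, archive_dir: str, run_id: str) -> Path:
    """Copy run_results.json to a run-scoped location.

    Stand-in for an S3 upload: keying the destination by run_id means
    redeployments or later runs never overwrite earlier results.
    """
    src = Path(target_dir) / "run_results.json"
    dest = Path(archive_dir) / run_id / "run_results.json"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(src, dest)
    return dest
```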

@tatiana
Collaborator

tatiana commented Jan 18, 2024

Hi @DanMawdsleyBA, this is a very valid use case - and it would be great if Cosmos could support it.

At the moment, Cosmos 1.3 only offers a built-in "post command" customisation via the callback feature, which only works on failed tests.

The other possibility you'd have is to customise how you'd like dbt test to behave in Cosmos, using this feature.

How are you running your tests in Cosmos? Are you using TestBehavior.AFTER_ALL or TestBehavior.AFTER_EACH (default)?

From an Atlan perspective, would it be okay if those test results files were sent on a per-task basis, considering you're running using TestBehavior.AFTER_EACH?

Have you tried using https://github.com/brooklyn-data/dbt_artifacts with Cosmos? From what I understood from the documentation, it would be triggered as part of the dbt command execution itself, so I'd expect it to work.

Another feature we could consider implementing is to expose in Cosmos the ability to upload the artefacts generated in target to (S3, GCS, Azure, etc.). What do you think? One challenge with this approach may be how we'd like to differentiate between DAG runs/task runs.
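For reference, the two behaviours mentioned above are selected via Cosmos's RenderConfig; a minimal configuration sketch (import paths as I understand them from the Cosmos docs, DAG wiring elided) might be:

```python
from cosmos import RenderConfig
from cosmos.constants import TestBehavior

# Default: one test task after each model (TestBehavior.AFTER_EACH).
# AFTER_ALL instead runs all tests in a single task at the end of the
# DAG, producing one combined set of test results per DAG run.
render_config = RenderConfig(test_behavior=TestBehavior.AFTER_ALL)
```

This render_config would then be passed to DbtDag or DbtTaskGroup; with the default AFTER_EACH, test results would instead be produced per task, as discussed above.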

@Metystes

Another feature we could consider implementing is to expose in Cosmos the ability to upload the artefacts generated in target to (S3, GCS, Azure, etc.). What do you think?

That would be great functionality, and it could be useful in other scenarios. For example, when running a bigger dbt project as a single task (to avoid the overhead of starting new pods) and that task fails, we could use the retry command to rerun only the failed models. To do that, we would need to have run_results.json. Plus, it would help with debugging to see the compiled code.

@ms32035
Contributor

ms32035 commented Apr 5, 2024

@tatiana I will add one thing to the problem.

In case tests fail, this line:

self.handle_exception(result)

will actually throw an exception before the callback is executed, so run_results.json is lost and in many cases there's no way to report on failed tests.

@fabiomx
Contributor

fabiomx commented Apr 17, 2024

@ms32035,

yes, FYI, I raised the same issue here: #867

@tatiana tatiana added this to the 1.5.0 milestone May 17, 2024
@tatiana tatiana added the triage-needed Items need to be reviewed / assigned to milestone label May 17, 2024
@tatiana tatiana modified the milestones: 1.5.0, 1.6.0 May 17, 2024
@tatiana tatiana modified the milestones: Cosmos 1.6.0, Cosmos 1.7.0 Jul 30, 2024
@tatiana tatiana modified the milestones: Cosmos 1.7.0, Triage Sep 20, 2024
@luis-fnogueira

@tatiana / any thoughts on the above?

It might also be desirable to have a task run at the end that syncs the run_results.json with an S3 bucket (perhaps something similar to the docs operator), as Atlan can interpret it directly. This task would have to always run regardless of whether previous tasks have failed (in order to capture test results that are failures).

Although my concern with the above would be what happens if an Airflow deployment gets redeployed: would those run_results.json files be overwritten or deleted?

That'd be amazing!

@tatiana
Collaborator

tatiana commented Dec 12, 2024

Duplicated request: #1259 - we can follow up once the feature is implemented there.

@pankajkoti
Contributor

Hi @DanMawdsleyBA, @Metystes, @fabiomx, we recently merged PR #1389, which introduces minor changes to the existing callback functionality and will be included in the upcoming Cosmos 1.8.0 release. In the PR, we also made changes so that the callback is called first and the exceptions are raised/handled afterwards.

To allow users to try out these changes ahead of the official release, we have prepared an alpha release. You can install it using the following link: astronomer-cosmos 1.8.0a3. PR #1389 also provides examples showcasing how to use this callback functionality.

For additional guidance, refer to the documentation on leveraging callbacks: Callback Configuration. The helper functions demonstrated in the examples can be found here: cosmos/io.py. However, you are not limited to these; you can create your own custom callback functions using these examples as a reference and pass them via the callback argument.
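By way of illustration, a custom callback following the pattern described above might look like the sketch below. The signature (a project directory path plus keyword arguments) and the destination path are assumptions based on the examples in PR #1389, and a real deployment would replace the local copy with an object-store upload via one of the helpers in cosmos/io.py:

```python
import shutil
from pathlib import Path

# Assumed destination for illustration only.
ARCHIVE_PATH = Path("/tmp/archived_run_results.json")

def archive_run_results(project_dir: str, **kwargs) -> None:
    """Copy target/run_results.json somewhere durable after the dbt run.

    The local copy stands in for an S3/GCS/Azure upload; swap in the
    appropriate provider hook or cosmos/io.py helper in production.
    """
    run_results = Path(project_dir) / "target" / "run_results.json"
    if run_results.exists():
        shutil.copy(run_results, ARCHIVE_PATH)
```

A function like this would then be supplied through the callback argument mentioned above.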

We would greatly appreciate any feedback you have after testing this alpha release!
