[CT-93] [Feature] Enrich data from --store-failures #4624

Closed
1 task done
iknox-fa opened this issue Jan 25, 2022 · 5 comments
Labels
enhancement (New feature or request), stale (Issues that have gone stale)

Comments

@iknox-fa
Contributor

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

We have had a number of people asking about getting a better handle on the rows saved when a run/build is executed with --store-failures enabled. Here are a few examples.

There are a few different ways to go about it, so this warrants further ticket refinement.

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

@iknox-fa iknox-fa added the enhancement and triage labels Jan 25, 2022
@github-actions github-actions bot changed the title from "[Feature] Enrich data from --store-failures" to "[CT-93] [Feature] Enrich data from --store-failures" Jan 25, 2022
@dlawrences

Based on our experience at VmX and this reply #5313 (comment), I would suggest the following:

  • allow the option of storing all results, instead of only all failures, if desired by the user
  • interpret each individual model record's test result based on a configurable metadata column, which could default to a standard column if not defined (we've used __dbt_test_passed, which is obviously pretty binary)
  • apply https://docs.getdbt.com/reference/resource-configs/limit on the results of the test, regardless of whether it's configured to store all results or only all failures
  • give the option to serialise all model data returned by a test into a JSON object that can be inserted into a VARIANT column in Snowflake (or a similar type in other RDBMSs): this is what we're doing and it works beautifully
    • if configured this way, dbt can make use of a single target table to store test results; if not configured this way, dbt should define on the fly tables to store the results of each test
  • expose the compiled test name using a context variable that is available in a test block, for people to use in custom reporting if needed
  • build a hidden DAG node if the tested model additionally uses a https://docs.getdbt.com/reference/resource-configs/where clause: in that case, we consider that it's not the model being tested but an artifact computed at test time that represents a subset of the model. That artifact should be part of the DAG, and tests should report that they tested the artifact rather than the model
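
To make the suggestions above concrete, here is a minimal Python sketch of the "single target table" idea: each row returned by a test is serialised to a JSON payload, tagged with the test name and a pass/fail flag read from a configurable column, and optionally limited. This is not dbt's API. The function name `serialise_test_rows`, its parameters, and the record layout are all hypothetical, and a row without the metadata column is assumed to be a failure.

```python
import json
from typing import Any, Dict, Iterable, List, Optional


def serialise_test_rows(
    test_name: str,
    rows: Iterable[Dict[str, Any]],
    passed_column: str = "__dbt_test_passed",  # column name taken from the comment above
    store_all: bool = False,                   # hypothetical switch: keep passing rows too
    limit: Optional[int] = None,               # hypothetical cap on stored rows
) -> List[Dict[str, Any]]:
    """Turn test result rows into uniform records for one shared results table."""
    records: List[Dict[str, Any]] = []
    for row in rows:
        # Assumption: a row without the metadata column counts as a failure.
        passed = bool(row.get(passed_column, False))
        if passed and not store_all:
            continue
        # The limit applies to stored rows, whether or not passing rows are kept.
        if limit is not None and len(records) >= limit:
            break
        payload = {k: v for k, v in row.items() if k != passed_column}
        records.append(
            {
                "test_name": test_name,
                "passed": passed,
                # JSON text that could be loaded into a VARIANT-like column.
                "payload": json.dumps(payload, sort_keys=True, default=str),
            }
        )
    return records


if __name__ == "__main__":
    sample = [
        {"id": 1, "__dbt_test_passed": True},
        {"id": 2, "__dbt_test_passed": False},
        {"id": 3},
    ]
    for record in serialise_test_rows("not_null_orders_id", sample, limit=10):
        print(record)
```

Because every record has the same three fields, results from any test fit one table. Without the JSON payload, each test would need its own table shaped like its result set.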

@ismailsimsek

ismailsimsek commented Aug 18, 2022

  • +1 for an option to limit the number of rows saved, e.g. save only 10 rows out of the failed test records!
  • exposing the compiled test query could be helpful too, e.g. printing it to the log only when the test fails.

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale label Feb 15, 2023
@github-actions
Contributor

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Feb 23, 2023
@simonhallberg

Is there a way to specify the table where the records should be saved? I would like to be able to choose an existing table and reload it on every new run, instead of creating a new table every time a test is run.
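
As an illustration of the "reload an existing table" behaviour asked about here, the following is a rough Python sketch using the standard-library `sqlite3` module as a stand-in warehouse. It is not something dbt provides. The helper `reload_failures`, the table name, and the two-column layout are hypothetical. On each run, the previous rows for a test are deleted and the new ones inserted in a single transaction.

```python
import sqlite3
from typing import Iterable


def reload_failures(
    conn: sqlite3.Connection,
    table: str,
    test_name: str,
    payloads: Iterable[str],
) -> int:
    """Replace the stored failures for one test in a fixed, user-chosen table."""
    # Table names cannot be bound as SQL parameters, so validate before formatting.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(test_name TEXT NOT NULL, payload TEXT NOT NULL)"
        )
        # Drop the rows left by the previous run of this test.
        conn.execute(f"DELETE FROM {table} WHERE test_name = ?", (test_name,))
        conn.executemany(
            f"INSERT INTO {table} (test_name, payload) VALUES (?, ?)",
            [(test_name, p) for p in payloads],
        )
    cursor = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE test_name = ?", (test_name,)
    )
    return cursor.fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    reload_failures(connection, "test_failures", "unique_orders_id", ['{"id": 1}', '{"id": 2}'])
    # A second run replaces the first run's rows rather than adding a new table.
    print(reload_failures(connection, "test_failures", "unique_orders_id", ['{"id": 7}']))
```

Keying the delete on the test name lets several tests share the table without overwriting each other's rows.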

4 participants