[CT-1123] Adapter pre_model_hook + post_model_hook for tests and compilation, too #212
Comments
Any updates on this topic?
Do you plan to resolve this issue?
@jtcohen6 Any update on timeline for this one? This would be a big help.
per BLG -- maybe use protocols?
@jtcohen6 Is there any update on this? We're looking to reduce the Snowflake costs of our pipelines, and this functionality would support us in that.
Thank you to all of you who have asked about this recently -- it's helpful for us to see the interest level. While we don't currently have a timeline for implementing this feature, we are still interested in it.
I’ve created a dbt project to demonstrate how to implement WAP on table materializations using “dummy” post-test-hooks. It applies the approach to the example jaffle_shop_db dbt project, which uses the dbt-duckdb adapter. Would this feature replace the need for these “dummy” post-test-hooks? Otherwise I'm happy to raise a ticket for implementing post-test-hooks, since my approach is just a workaround.
@SoumayaMauthoorMOJ cool that you were able to create an example applying Write-Audit-Publish (WAP). This issue (#212) seems focused on enabling a different set of use-cases, though. Rather, what you are asking about seems like it is covered by this Discussion: dbt-labs/dbt-core#5687. If applicable, do you want to add your thoughts or questions to that Discussion?
@dbeatty10 thanks for clarifying. I did add a comment already (see here), but I didn't get a response, so I thought an issue might be a good way to push the idea forward?
@SoumayaMauthoorMOJ Ah, I see that now! dbt-labs/dbt-core#5687 is still the best place at this stage, and it would also be the perfect place to share this! If you are looking to instigate further discussion, you could try posing some questions that invite the community to think about and interact with your idea. For example, you could ask folks for feedback on the pros and cons of your approach. You could also ask if anyone has ideas for how to enable that pattern in dbt-core without the need for the “dummy” post-test-hooks.
Hi,
@vskarine We're still interested in this feature, but we don't currently have a timeline for implementing it. |
Thanks for the update. I guess for now the only way for us to do it is to change the warehouse size in the profile before each pipeline run.
Hey @dbeatty10,
Awesome @bjarneschroeder ! 🏆 Give it a go and let us know if you need any help along the way. |
Hey @dbeatty10, quick update: I'm on it, but I found that it takes me more time than I expected to understand the overall structure of dbt and how different parts of the project interact with each other under the hood. After diving deeper into the project, playing around with a custom project, and implementing some first changes, I wanted to check out the test cases which currently cover hook execution, and I found what look like appropriate test cases. TLDR: still on it, just slower than expected.
Hey @bjarneschroeder, any luck with some progress on this? :) |
Hey @jwolos I started a new job a few weeks ago which keeps me very busy. I unfortunately do not really have the time to work on this at the moment. Sorry! |
Upvoting this as a request from several active customers |
We are using Snowflake, and for our environment the default warehouse is a Small. This is fast enough to run 95% of our models. Where we have a model whose row count pushes the limit of that warehouse, we have used the `snowflake_warehouse` config with a `get_warehouse` macro to set a larger warehouse. Using the macro, we can set different warehouses for each of the environments the model is being built in (Dev, UAT, Production). What we are experiencing now is that the model is successfully built with an XLarge warehouse, but the tests are failing, as these default back to the environment's default warehouse, which is a Small. The test is a unique test on a table with 14b rows, so it is not complex SQL, just a lot of data.
@AlexanderStephenson beginning in dbt-core v1.9 (currently in beta), you can do something like this to configure models:

```yaml
- name: my_model
  config:
    snowflake_warehouse: something
  columns:
    - name: id
      tests:
        - accepted_values:
            values: [2]
            config:
              severity: warn
              snowflake_warehouse: something_else
```

Wanna give that a shot and see if it works for you?
Closing as resolved by dbt-labs/dbt-core#10767. Please open an issue here if you try this in dbt Core v1.9+ and it doesn't work for you. |
Adapting from dbt-labs/dbt-snowflake#23 (comment):

Let's talk about `adapter.pre_model_hook` + `adapter.post_model_hook`.

Background

Here's where they're triggered to run, right before and after a model materialization:

https://github.com/dbt-labs/dbt-core/blob/8c8be687019014ced9be37c084f944205fc916ab/core/dbt/task/run.py#L279-L283

These are different from user-provided `pre-hook` and `post-hook`, which run within materializations. (I wish we had named these things a bit more distinctly.) They are also no-ops by default:

https://github.com/dbt-labs/dbt-core/blob/8c8be687019014ced9be37c084f944205fc916ab/core/dbt/adapters/base/impl.py#L1093-L1116
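In sketch form, the default implementations look roughly like this (a simplified paraphrase of the linked `impl.py` code, not the exact source):

```python
from typing import Any, Mapping


class BaseAdapter:
    """Simplified paraphrase of the adapter-level hook interface."""

    def pre_model_hook(self, config: Mapping[str, Any]) -> Any:
        # Runs right before a model materialization. Whatever this
        # returns is handed back to post_model_hook, so an adapter can
        # thread state (e.g. the warehouse to restore) through a
        # node's execution. No-op by default.
        return None

    def post_model_hook(self, config: Mapping[str, Any], context: Any) -> None:
        # Runs right after the materialization, receiving the value
        # returned by pre_model_hook. No-op by default.
        pass
```

An adapter plugin overrides these to wrap every node execution, without the user writing any `pre-hook`/`post-hook` config.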
For certain adapter plugins, these "internal" hooks are the appropriate mechanism for database-specific behavior that needs to wrap a node's execution. For instance, on `dbt-snowflake`, this is where we turn the `snowflake_warehouse` config into a `use warehouse` command. @dataders and I were just discussing the same principle for `use database` (compute and storage) in serverless Azure Synapse (?).

Current limitations
- `TestRunner` does not inherit from `ModelRunner` or extend its `execute` method (where `adapter.pre_model_hook` + `adapter.post_model_hook` get called). We've gotten the request to support `snowflake_warehouse` on tests several times.
- Queries run during compilation (e.g. introspective queries issued via `run_query`) will use the default warehouse (`target.warehouse` or user/role-configured), rather than the value of `snowflake_warehouse` configured on the model.

Why improve this
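One way to close the gap for tests would be to bracket a test's execution in the same adapter-level hooks that models get. A hypothetical sketch of that wiring (all names here are illustrative, not dbt's actual API):

```python
# Hypothetical sketch: a test runner wrapping execution in the adapter-level
# hooks, mirroring what ModelRunner.execute does for models. FakeAdapter and
# run_test_node are illustrative stand-ins, not dbt's actual classes.
class FakeAdapter:
    def __init__(self):
        self.calls = []

    def pre_model_hook(self, config):
        # e.g. dbt-snowflake would issue `use warehouse ...` here
        self.calls.append(("pre", config.get("snowflake_warehouse")))
        return "previous-warehouse"  # state to restore afterwards

    def post_model_hook(self, config, context):
        # restore whatever pre_model_hook saved
        self.calls.append(("post", context))


def run_test_node(adapter, config, run_sql):
    """Execute a test's SQL bracketed by the adapter hooks (sketch)."""
    context = adapter.pre_model_hook(config)
    try:
        return run_sql()
    finally:
        # restore even if the test query raises
        adapter.post_model_hook(config, context)


adapter = FakeAdapter()
result = run_test_node(adapter, {"snowflake_warehouse": "XL_WH"}, lambda: "pass")
```

The `try`/`finally` matters: the post hook must restore the session's warehouse even when the test query fails, or later nodes would keep running on the oversized warehouse.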