User seeing initial wh var being used rather than incremental var when running incremental model #41
I just wanted to note that I had multiple people do the package implementation and they all saw the same issue I was having: when I build in dbt, the log shows the expected warehouse, but when I check in Snowflake, the wrong warehouse is being used. However, the macro looks straightforward, and I'm curious whether this is actually due to a dbt bug rather than an issue with this package.
Hey folks - just wanted to clarify that dbt itself has never supported such behaviour, i.e. dynamically switching the warehouse based on whether the model's table already exists.
Feel free to read the relevant issues here: dbt-labs/dbt-core#7868 and here: dbt-labs/dbt-snowflake#103. The problem is that dbt has to set the model config prior to even touching the data warehouse to determine the existence of a table. I can illustrate simply.
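To see why, note that any macro called from a model's `config()` block is rendered while dbt parses the project, before any connection to the warehouse exists. A minimal sketch (a hypothetical macro, not part of this package) that makes the two rendering phases visible:

```sql
-- macros/which_phase.sql (hypothetical illustration)
{% macro which_phase() %}
    {% if execute %}
        {# rendered again at runtime, when adapter calls can hit the warehouse #}
        {{ log("runtime render: adapter.get_relation can return a real relation", info=True) }}
    {% else %}
        {# rendered first at parse time, when adapter calls are stubbed and return none #}
        {{ log("parse-time render: adapter.get_relation returns none here", info=True) }}
    {% endif %}
{% endmacro %}
```

The config value dbt acts on is the one resolved at parse time, which is exactly when `adapter.get_relation` cannot see any tables.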
Modify the macro that comes from this package:

```sql
-- dbt_packages/snowflake_utils/macros/warehouse_size.sql
{% macro warehouse_size() %}
    {% set initial_wh = var('snowflake_utils:initial_run_warehouse', none) %}
    {% set full_wh = var('snowflake_utils:full_refresh_run_warehouse', none) %}
    {% set inc_wh = var('snowflake_utils:incremental_run_warehouse', none) %}
    {% set relation = adapter.get_relation(this.database, this.schema, this.table) %}
    {% do log("--------------------------------------------", True) %}
    {% do log(relation, True) %}
    {% do log("--------------------------------------------", True) %}
    {#-- use alternative warehouse if full-refresh run #}
    {# if relation is not none and flags.FULL_REFRESH and full_wh is not none #}
    {% if flags.FULL_REFRESH and full_wh is not none %}
        {% if execute %}
            {{ dbt_utils.log_info("Full Refresh Run - Using alternative warehouse " ~ full_wh | upper) }}
        {% endif %}
        {% do return(full_wh) %}
    {#-- use alternative warehouse if incremental run #}
    {% elif relation is not none and flags.FULL_REFRESH == False and inc_wh is not none %}
        {% if execute %}
            {{ dbt_utils.log_info("Incremental Run - Using alternative warehouse " ~ inc_wh | upper) }}
        {% endif %}
        {% do return(inc_wh) %}
    {#-- use alternative warehouse if initial run #}
    {% elif relation is none and initial_wh is not none %}
        {% if execute %}
            {{ dbt_utils.log_info("Initial Run - Using alternative warehouse " ~ initial_wh | upper) }}
        {% endif %}
        {% do return(initial_wh) %}
    {#-- use target warehouse if variable not configured for a condition #}
    {% else %}
        {{ dbt_utils.log_info("Using target warehouse " ~ target.warehouse | upper) }}
        {% do return(target.warehouse) %}
    {% endif %}
{% endmacro %}
```

```yaml
# dbt_project.yml
name: my_dbt_project
profile: all
config-version: 2
version: 1.0

models:
  my_dbt_project:
    +materialized: table
    +query_tag: "{{ invocation_id }}"

# The default snowflake warehouse, i.e. target.warehouse, is 'analytics'
vars:
  "snowflake_utils:initial_run_warehouse": "wh_initial"
  "snowflake_utils:incremental_run_warehouse": "wh_incremental"
  "snowflake_utils:full_refresh_run_warehouse": "wh_fullrefresh"
```
```sql
-- models/my_inc.sql
{{
    config(
        materialized = "incremental",
        snowflake_warehouse = snowflake_utils.warehouse_size()
    )
}}

/* {{ invocation_id }} */
select 1 id
```

First, let's delete the incremental model so that it doesn't exist. Then let's do our first run:

```
$ dbt --debug run -s my_inc
============================== 2023-09-27 01:08:30.230650 | 3120452f-2d89-4797-8cd7-3e70d8bc95cb ==============================
01:08:30.230650 [info ] [MainThread]: Running with dbt=1.4.7
01:08:30.234895 [debug] [MainThread]: running dbt with arguments {'debug': True, 'write_json': True, 'use_colors': True, 'printer_width': 80, 'version_check': True, 'partial_parse': False, 'static_parser': True, 'profiles_dir': '/Users/jeremy/.dbt', 'send_anonymous_usage_stats': True, 'quiet': False, 'no_print': False, 'cache_selected_only': False, 'select': ['my_inc'], 'which': 'run', 'rpc_method': 'run', 'indirect_selection': 'eager'}
01:08:30.235581 [debug] [MainThread]: Tracking: tracking
01:08:30.236721 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11defa100>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11defadf0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11defad00>]}
01:08:30.301385 [debug] [MainThread]: checksum: f4f4cdde9b257b3a8e3791263c44f34b0bd0facc33ef1ba84a563bd63785ce73, vars: {}, profile: None, target: None, version: 1.4.7
01:08:30.303503 [debug] [MainThread]: Partial parsing not enabled
01:08:32.113136 [debug] [MainThread]: 1603: static parser failed on my_inc.sql
01:08:32.129938 [info ] [MainThread]: --------------------------------------------
01:08:32.130781 [info ] [MainThread]: None
01:08:32.131231 [info ] [MainThread]: --------------------------------------------
01:08:32.145543 [debug] [MainThread]: 1602: parser fallback to jinja rendering on my_inc.sql
01:08:32.148513 [debug] [MainThread]: 1603: static parser failed on foo.sql
01:08:32.155109 [debug] [MainThread]: 1602: parser fallback to jinja rendering on foo.sql
01:08:32.158015 [debug] [MainThread]: 1603: static parser failed on my_inc_b.sql
01:08:32.163338 [info ] [MainThread]: --------------------------------------------
01:08:32.164088 [info ] [MainThread]: None
01:08:32.164701 [info ] [MainThread]: --------------------------------------------
01:08:32.167961 [debug] [MainThread]: 1602: parser fallback to jinja rendering on my_inc_b.sql
01:08:32.257965 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '3120452f-2d89-4797-8cd7-3e70d8bc95cb', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11e0e45e0>]}
01:08:32.268349 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': '3120452f-2d89-4797-8cd7-3e70d8bc95cb', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dff4d30>]}
01:08:32.269248 [info ] [MainThread]: Found 3 models, 0 tests, 0 snapshots, 0 analyses, 429 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
01:08:32.269858 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '3120452f-2d89-4797-8cd7-3e70d8bc95cb', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dfffca0>]}
01:08:32.272476 [info ] [MainThread]:
01:08:32.273702 [debug] [MainThread]: Acquiring new snowflake connection 'master'
01:08:32.275337 [debug] [ThreadPool]: Acquiring new snowflake connection 'list_development'
01:08:32.298494 [debug] [ThreadPool]: Using snowflake connection "list_development"
01:08:32.299298 [debug] [ThreadPool]: On list_development: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "connection_name": "list_development"} */
show terse schemas in database development
limit 10000
01:08:32.299852 [debug] [ThreadPool]: Opening a new connection, currently in state init
01:08:34.716089 [debug] [ThreadPool]: SQL status: SUCCESS 294 in 2 seconds
01:08:34.729384 [debug] [ThreadPool]: On list_development: Close
01:08:35.323272 [debug] [ThreadPool]: Acquiring new snowflake connection 'list_development_dbt_jyeo'
01:08:35.335590 [debug] [ThreadPool]: Using snowflake connection "list_development_dbt_jyeo"
01:08:35.336415 [debug] [ThreadPool]: On list_development_dbt_jyeo: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "connection_name": "list_development_dbt_jyeo"} */
show terse objects in development.dbt_jyeo
01:08:35.337120 [debug] [ThreadPool]: Opening a new connection, currently in state closed
01:08:37.188054 [debug] [ThreadPool]: SQL status: SUCCESS 33 in 2 seconds
01:08:37.194920 [debug] [ThreadPool]: On list_development_dbt_jyeo: Close
01:08:37.830846 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '3120452f-2d89-4797-8cd7-3e70d8bc95cb', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11deeffd0>]}
01:08:37.832870 [info ] [MainThread]: Concurrency: 1 threads (target='sf')
01:08:37.833749 [info ] [MainThread]:
01:08:37.843046 [debug] [Thread-1 ]: Began running node model.my_dbt_project.my_inc
01:08:37.844422 [info ] [Thread-1 ]: 1 of 1 START sql incremental model dbt_jyeo.my_inc ............................. [RUN]
01:08:37.845740 [debug] [Thread-1 ]: Acquiring new snowflake connection 'model.my_dbt_project.my_inc'
01:08:37.846506 [debug] [Thread-1 ]: Began compiling node model.my_dbt_project.my_inc
01:08:37.854740 [info ] [Thread-1 ]: --------------------------------------------
01:08:37.855617 [info ] [Thread-1 ]: None
01:08:37.856326 [info ] [Thread-1 ]: --------------------------------------------
01:08:37.871553 [info ] [Thread-1 ]: 14:08:37 + Initial Run - Using alternative warehouse WH_INITIAL
01:08:37.872508 [debug] [Thread-1 ]: Writing injected SQL for node "model.my_dbt_project.my_inc"
01:08:37.874446 [debug] [Thread-1 ]: Timing info for model.my_dbt_project.my_inc (compile): 2023-09-27 01:08:37.847129 => 2023-09-27 01:08:37.874318
01:08:37.875188 [debug] [Thread-1 ]: Began executing node model.my_dbt_project.my_inc
01:08:37.877964 [debug] [Thread-1 ]: Using snowflake connection "model.my_dbt_project.my_inc"
01:08:37.878609 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.my_inc"} */
select current_warehouse() as warehouse
01:08:37.879108 [debug] [Thread-1 ]: Opening a new connection, currently in state closed
01:08:39.668617 [debug] [Thread-1 ]: SQL status: SUCCESS 1 in 2 seconds
01:08:39.686045 [debug] [Thread-1 ]: Using snowflake connection "model.my_dbt_project.my_inc"
01:08:39.686953 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.my_inc"} */
use warehouse wh_initial
01:08:40.011748 [debug] [Thread-1 ]: SQL status: SUCCESS 1 in 0 seconds
01:08:40.056684 [debug] [Thread-1 ]: Using snowflake connection "model.my_dbt_project.my_inc"
01:08:40.057297 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.my_inc"} */
show parameters like 'query_tag' in session
01:08:40.431236 [debug] [Thread-1 ]: SQL status: SUCCESS 1 in 0 seconds
01:08:40.433219 [debug] [Thread-1 ]: Setting query_tag to '3120452f-2d89-4797-8cd7-3e70d8bc95cb'. Will reset to '' after materialization.
01:08:40.434271 [debug] [Thread-1 ]: Using snowflake connection "model.my_dbt_project.my_inc"
01:08:40.434784 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.my_inc"} */
alter session set query_tag = '3120452f-2d89-4797-8cd7-3e70d8bc95cb'
01:08:40.762479 [debug] [Thread-1 ]: SQL status: SUCCESS 1 in 0 seconds
01:08:40.813883 [debug] [Thread-1 ]: Writing runtime sql for node "model.my_dbt_project.my_inc"
01:08:40.815754 [debug] [Thread-1 ]: Using snowflake connection "model.my_dbt_project.my_inc"
01:08:40.816402 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.my_inc"} */
create or replace transient table development.dbt_jyeo.my_inc as
(
/* 3120452f-2d89-4797-8cd7-3e70d8bc95cb */
select 1 id
);
01:08:42.433106 [debug] [Thread-1 ]: SQL status: SUCCESS 1 in 2 seconds
01:08:42.450984 [debug] [Thread-1 ]: Using snowflake connection "model.my_dbt_project.my_inc"
01:08:42.451786 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.my_inc"} */
drop view if exists development.dbt_jyeo.my_inc__dbt_tmp cascade
01:08:42.877673 [debug] [Thread-1 ]: SQL status: SUCCESS 1 in 0 seconds
01:08:42.915554 [debug] [Thread-1 ]: No original query_tag, unsetting parameter.
01:08:42.917145 [debug] [Thread-1 ]: Using snowflake connection "model.my_dbt_project.my_inc"
01:08:42.918077 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.my_inc"} */
alter session unset query_tag
01:08:43.225724 [debug] [Thread-1 ]: SQL status: SUCCESS 1 in 0 seconds
01:08:43.228577 [debug] [Thread-1 ]: Using snowflake connection "model.my_dbt_project.my_inc"
01:08:43.229217 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: /* {"app": "dbt", "dbt_version": "1.4.7", "profile_name": "all", "target_name": "sf", "node_id": "model.my_dbt_project.my_inc"} */
use warehouse ANALYTICS
01:08:43.562021 [debug] [Thread-1 ]: SQL status: SUCCESS 1 in 0 seconds
01:08:43.565530 [debug] [Thread-1 ]: Timing info for model.my_dbt_project.my_inc (execute): 2023-09-27 01:08:37.875604 => 2023-09-27 01:08:43.565388
01:08:43.566718 [debug] [Thread-1 ]: On model.my_dbt_project.my_inc: Close
01:08:44.157249 [debug] [Thread-1 ]: Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '3120452f-2d89-4797-8cd7-3e70d8bc95cb', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x121fc9130>]}
01:08:44.159139 [info ] [Thread-1 ]: 1 of 1 OK created sql incremental model dbt_jyeo.my_inc ........................ [SUCCESS 1 in 6.31s]
01:08:44.162546 [debug] [Thread-1 ]: Finished running node model.my_dbt_project.my_inc
01:08:44.165893 [debug] [MainThread]: Acquiring new snowflake connection 'master'
01:08:44.167293 [debug] [MainThread]: Connection 'master' was properly closed.
01:08:44.168027 [debug] [MainThread]: Connection 'model.my_dbt_project.my_inc' was properly closed.
01:08:44.168750 [info ] [MainThread]:
01:08:44.169521 [info ] [MainThread]: Finished running 1 incremental model in 0 hours 0 minutes and 11.90 seconds (11.90s).
01:08:44.170380 [debug] [MainThread]: Command end result
01:08:44.184477 [info ] [MainThread]:
01:08:44.185244 [info ] [MainThread]: Completed successfully
01:08:44.185802 [info ] [MainThread]:
01:08:44.186317 [info ] [MainThread]: Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
01:08:44.187056 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x121fc6bb0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x121fc66d0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x121fc6610>]}
01:08:44.187991 [debug] [MainThread]: Flushing usage events
```

^ Worth calling out now: the log I added printed `None` for the relation 3 times above, twice during parsing (once for each model that calls the macro) and once while compiling the node at runtime.
Now that the model exists, let's rerun, hoping that the incremental warehouse (`wh_incremental`) is used this time:

```
$ dbt --debug run -s my_inc
```
And now we have:
^ The `Incremental Run - Using alternative warehouse WH_INCREMENTAL` logging would have given users a false sense of "yup, this works". Because, as you can tell from the subsequent `use warehouse` line, that warehouse is not really the one being used.
So no matter the scenario, the snowflake warehouse is always set to whatever the macro returned at parse time. This is the case for dbt 1.3, 1.4, 1.5 and 1.6, so I'm not 100% sure how users were using it in the first place.

EDIT: Minor exception to the above wrt this package's […]
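If the goal is simply a bigger warehouse for full-refresh runs, one parse-time-safe alternative (a sketch, not part of this package) is to branch only on information dbt already knows at parse time, such as the --full-refresh flag:

```sql
-- macros/parse_time_warehouse.sql (hypothetical sketch, not this package's macro)
{% macro parse_time_warehouse() %}
    {# flags.FULL_REFRESH is known at parse time, unlike table existence #}
    {% if flags.FULL_REFRESH %}
        {% do return(var('snowflake_utils:full_refresh_run_warehouse', target.warehouse)) %}
    {% else %}
        {% do return(target.warehouse) %}
    {% endif %}
{% endmacro %}
```

Because this never asks the adapter whether the table exists, the value resolved at parse time and the warehouse dbt actually uses will agree.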
Very late response on this, my apologies! I'm no longer actively involved in dbt projects. When I initially wrote this macro (almost 4 years ago!), it was built with insights and feedback from Drew Banin, and it was absolutely working on the project I built it for. dbt has evolved significantly in the last 4 years, and this macro was a bit of a hack to begin with, so it's no surprise that changes under the hood broke it at some point. As we haven't worked for that client in 3+ years, the issue was not apparent to us. I'm no longer working with dbt, so I'm not the best person to investigate and fix this, unfortunately. If someone wants to investigate whether a refactor is possible and open a PR, we're happy to merge any changes. Otherwise, I guess we might have to deprecate it if it's not working at all. @JFrackson fyi
Hi, thanks @MartinGuindon for the update. Running with dbt=1.7.6, I can confirm that this macro doesn't work as expected for incremental models.
Hey all, I have a user running into an issue while using snowflake_utils 0.5.0 (with the 3 vars declared in dbt_project.yml) and an incremental model.
If the model already exists, the logging shows the following:
It looks like it's going to set the correct TRANSFORMING warehouse, but it goes with the initial warehouse var and errors out.
Tested on:
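For anyone debugging this, one way to confirm which warehouse actually executed a model, regardless of what the dbt log claims, is to filter Snowflake's query history on the query tag set in dbt_project.yml above (a sketch; substitute the invocation_id of the run you want to inspect):

```sql
-- run in Snowflake; the tag value below is the invocation_id from the log above
select query_text, warehouse_name, start_time
from table(information_schema.query_history())
where query_tag = '3120452f-2d89-4797-8cd7-3e70d8bc95cb'
order by start_time desc;
```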