Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spike+][jinja if statement] [false positives] environment-aware logic in config should not cause resources to always be selected by state:modified #9563

Closed
3 tasks done
Tracked by #9562
graciegoheen opened this issue Feb 13, 2024 · 7 comments
Assignees
Labels
enhancement New feature or request state: modified state Stateful selection (state:modified, defer)

Comments

@graciegoheen
Copy link
Contributor

graciegoheen commented Feb 13, 2024

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Many folks want to have logic in their dbt project that purposely differs by environment.

For example, let's say I want to materialize a model as a view in dev but as a table in prod.

To accomplish this, I use a jinja if statement in my config block:

{{ config(
    materialized = "{{ 'table' if target.name == 'prod' else 'view' }}"
) }}

select *
from ...

When selecting --select state:modified and comparing my dev environment to a manifest from prod, this model will ALWAYS be marked as modified. This is because, the materialized config appears different - in prod it's table, in dev it's view.

Instead, if I've changed nothing about this model, it shouldn't be marked as modified because the "if" statement/jinja logic hasn't changed.

Describe alternatives you've considered

A current work-around is to move all environment-based logic to my dbt_project.yml file.

If instead, I were to configure this model as so:

models:
  <path_to_my_model>:
    +materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"

it would NOT be picked up when selecting --select state:modified.

But this is cumbersome, unexpected, and can lead to an unnecessarily massive dbt_project.yml file.

Who will this benefit?

Anyone who wants to use state:modified and has environment-based logic in their dbt project.

Are you interested in contributing this feature?

No response

Anything else?

Potential solution proposed here #3680

This is relevant for all resources that can be configured, not just models.

@graciegoheen graciegoheen added enhancement New feature or request triage and removed triage labels Feb 13, 2024
@graciegoheen graciegoheen changed the title jinja if statement, environment-aware logic should not be Always selected by state:modified [jinja if statement] environment-aware logic in config should not cause resources to Always selected by state:modified Feb 13, 2024
@jtcohen6
Copy link
Contributor

@graciegoheen How would you feel about recommending an in-between solution:

# models/path/to/my_model.yml
models:
  - name: my_model
    config:
      materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"

Effectively, the same mechanism as in dbt_project.yml: we saved the raw Jinja text as unrendered_config, and compare the node's unrendered_config instead of the resolved config values. It does mean that, if someone edited the Jinja expression to be "{{ 'table' if target.name != 'prod' else 'view' }}" — adding a bit of whitespace, no functional change — it would still be detected as a change by state:modified.

Reasons I like this approach:

  • [usability] Avoids the problem of massive overloaded dbt_project.yml
  • [feasibility] I believe it would be much easier for us to implement than statically analyzing + saving {{ config() }} arguments within model files

@graciegoheen
Copy link
Contributor Author

graciegoheen commented Feb 13, 2024

@jtcohen6 can you help me understand the difference between:

models:
  - name: my_model
    config:
      materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"

and

{{ config(
    materialized = "{{ 'table' if target.name == 'prod' else 'view' }}"
) }}

select *
from ...

Why can we save the raw Jinja text as unrendered_config in the first scenario but not the second?

How would you feel about recommending an in-between solution

It's better than nothing! But still confusing for users that this works in yml files but not in my config block

@jtcohen6
Copy link
Contributor

Why can we save the raw Jinja text as unrendered_config in the first scenario but not the second?

It has to do with the way dbt uses Jinja for rendering + storing configs. In the first case, because it's a standalone Jinja expression in yaml, we can save "{{ 'table' if target.name == 'prod' else 'view' }}" as the unrendered config value associated with just the materialized key.

But in the second case, what the user actually provides (no nested curlies except for pre/post-hook) is:

{{ config(
    materialized = ('table' if target.name == 'prod' else 'view')
) }}

select *
from ...

As I understand it, dbt resolves the entire {{ config() }} macro call, storing the fully resolved results ('table' or 'view'). There is no intermediate step where we can store just ('table' if target.name == 'prod' else 'view') as the unrendered config value of the materialized key. If we did want to do that, we'd need a static analysis/extraction step, as described in #3680.

I'm definitely not saying it's impossible, just more difficult to implement. I think the conditional-Jinja-in-yaml case offers us a quicker win with lower lift.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Feb 14, 2024

To support this pattern with sources (and probably other "yaml-only" resource types), I believe we'd need to also do the work of treating database + schema + identifier/alias as configs, such that we can save their unrendered values when configs are calculated. In theory, that's happening here, but it doesn't work as I'd expect for the one actual source config we do support today, enabled.

Original issue for this:

@graciegoheen graciegoheen changed the title [jinja if statement] environment-aware logic in config should not cause resources to Always selected by state:modified [jinja if statement] [false positives] environment-aware logic in config should not cause resources to Always selected by state:modified Feb 14, 2024
@graciegoheen graciegoheen added state Stateful selection (state:modified, defer) state: modified labels Feb 14, 2024
@michael-urrutia
Copy link

michael-urrutia commented Feb 14, 2024

I ran into a similar issue recently where we had a macro for selecting the Snowflake warehouse size. Worth calling out that in an ideal world the logic could be encapsulated within a macro that is called within the project file. This is important to avoid redundancy and create more dry code, as well as handle complicated logic more elegantly. It is my understanding that macros cannot be currently referenced in the dbt_project.yml file today - issue here. Although different, I think these issues are pretty coupled. Model configs should be able to be applied at the project or individual model level with identical & expected behavior

@joellabes
Copy link
Contributor

joellabes commented Feb 15, 2024

I'd be all-in on jinja-in-yaml if we also put yaml-in-sql 😉

(but seriously: by doing this, we would still have the ability to colocate config with code, if that's what you want).

@ChenyuLInx ChenyuLInx changed the title [jinja if statement] [false positives] environment-aware logic in config should not cause resources to Always selected by state:modified [Spike+][jinja if statement] [false positives] environment-aware logic in config should not cause resources to Always selected by state:modified Jul 22, 2024
@MichelleArk
Copy link
Contributor

Closed by: https://github.com/dbt-labs/dbt-core/pull/10487/files#diff-b14dff35dc9759591980bed526df1689bbe35b4a06e176d8af0880ae43afbb3a

@dbeatty10 dbeatty10 changed the title [Spike+][jinja if statement] [false positives] environment-aware logic in config should not cause resources to Always selected by state:modified [Spike+][jinja if statement] [false positives] environment-aware logic in config should not cause resources to always be selected by state:modified Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request state: modified state Stateful selection (state:modified, defer)
Projects
None yet
Development

No branches or pull requests

5 participants