Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement charindex macro DNA-15944 [DNA-20680] #78

Merged
merged 1 commit into from
Nov 15, 2023

Conversation

silviustanimir
Copy link
Contributor

@silviustanimir silviustanimir commented Nov 13, 2023

Description

  • Implement charindex macro for Spark compatbility.

Release

  • Direct release (main)
  • Merge to dev (or other) branch
    • Why: Merge to Databricks dev branch

Did you consider?

  • Does it Work on Automation Suite / SQL Server
  • Does it Work on Automation Cloud / Snowflake
  • What is the performance impact?
  • The version number in dbt_project.yml

@silviustanimir silviustanimir self-assigned this Nov 13, 2023
@silviustanimir silviustanimir changed the title feat: implement charindex macro Implement charindex macro DNA-15944 [DNA-2068] Nov 13, 2023
Base automatically changed from dev/databricks to main November 13, 2023 14:48
@silviustanimir silviustanimir force-pushed the feat/charindex branch 3 times, most recently from 0e33c68 to e5b8c10 Compare November 13, 2023 20:22
@silviustanimir silviustanimir changed the title Implement charindex macro DNA-15944 [DNA-2068] Implement charindex macro DNA-15944 [DNA-20680] Nov 14, 2023
{%- macro charindex(expression_to_find, expression_to_search, start_location=None) -%}
{%- if start_location is none -%}
{%- if target.type == 'databricks' -%}
position('{{ expression_to_find }}', '{{ expression_to_search }}')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see everywhere single quotes, but the input is already text/string. Is putting here quotes not breaking the functionality?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, on the contrary, it saves us from using extra escaping in the models. I.e., we can use
{{ pm_utils.charindex('Approve purchase order', 'Purchase_order_item_event_log_with_internal_id."Activity"') }} > 0
instead of
{{ pm_utils.charindex("'Approve purchase order'", 'Purchase_order_item_event_log_with_internal_id."Activity"') }} > 0
You could say this is redundant in the expression_to_search if we suppose that that is always a field in a table and not a hardcoded string, but we cover more use cases this way.

Copy link
Contributor

@KosmasH KosmasH Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you compile now the code we are getting position('-', 'EKPO_input.NETWR_input')
while I believe we want position('-', EKPO_input.NETWR_input)
which you get if the macro implementation is position('{{ expression_to_find }}', {{ expression_to_search }})

@KosmasH
Copy link
Contributor

KosmasH commented Nov 14, 2023

You mentioned in the Jira task the following: "Note that line 4 yields NULL in SQL Server and 0 in Databricks and line 7 yields 1 in Databricks and 0 in SQL Server."
Should we not cover those differences with simple if statements? So that behavior is fully consistent. If there are reasons that this is not achievable, minimum is to call this out in the readme.

@silviustanimir
Copy link
Contributor Author

You mentioned in the Jira task the following: "Note that line 4 yields NULL in SQL Server and 0 in Databricks and line 7 yields 1 in Databricks and 0 in SQL Server." Should we not cover those differences with simple if statements? So that behavior is fully consistent. If there are reasons that this is not achievable, minimum is to call this out in the readme.

Line 4 is already covered in the macro, it was more as a reference info for the integration tests that we'll implement later. I updated the macro to also support Line 7.

README.md Outdated
This macro returns the starting position of the first occurrence of a string in another string. The search is not case-sensitive. If the string is not found, the function returns 0. This macro can be used to check whether a string contains another string.

Usage:
`{{ pm_utils.charindex('[expression_to_find]', '[expression_to_search]', '[start_location]') }}`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to field

{% endif -%}
{% else -%}
{% if target.type == 'databricks' -%}
position('{{ expression_to_find }}', {{ field }}, '{{ start_location }}')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the start location is expected to be an integer, in other words it does not work correctly. It runs, it compiles, but the result is incorrect.

@silviustanimir silviustanimir merged commit 3cdf83f into main Nov 15, 2023
3 checks passed
@silviustanimir silviustanimir deleted the feat/charindex branch November 15, 2023 17:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants