-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to FULL OUTER JOIN
for derived & ratio metrics
#842
Merged
Merged
Changes from 5 commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
e0cb00f
Switch to OUTER JOIN for derived metrics
courtneyholcomb fab0350
Changelog
courtneyholcomb 7aa4682
Update snapshots & test SQL
courtneyholcomb a14e415
Update SQL engine snapshots
courtneyholcomb 3e5e8c2
Update test case SQL
courtneyholcomb dbb8a2f
Use FULL OUTER JOIN for ratio metrics, too
courtneyholcomb b243aad
Update docstring for visit_combine_metrics_node
courtneyholcomb 4c4f440
Remove join_type param
courtneyholcomb File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
kind: Breaking Changes | ||
body: Use FULL OUTER JOIN to combine input metrics for derived metrics. This is a change from using INNER JOIN and may result in changes in output. | ||
time: 2023-11-02T18:28:15.181064-07:00 | ||
custom: | ||
Author: courtneyholcomb | ||
Issue: "842" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tlento do you have context on why for
INNER JOIN
we add thisNULL
column comparison, but forFULL OUTER JOIN
we don't? Wondering if that's intentional or something I should fix!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The NULL comparison is needed because NULL = NULL evaluates to NULL, which is false-y in SQL comparisons, but we have to include sets in the INNER JOIN where the dimension values on both sides are NULL. If a row doesn't match on an INNER JOIN it gets discarded, so we must handle that case in the join condition.
For FULL OUTER JOIN we can't do OR conditions (or inequalities) - a bunch of our engines don't support them. Everything has to be a strict intersection of equality checks. However, if you do a FULL OUTER JOIN and a comparison fails due to a NULL = NULL check, you still get the rows from both sides. They're just not matched to anything, so you effectively get one row per aggregated measure source. That's why we have all these MAX expressions, because that's the hack I put in to deduplicate those output rows.
I have to say I find all of this absurd, but that is where we are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh, I actually wrote it down!
https://github.com/dbt-labs/metricflow/blob/main/metricflow/plan_conversion/dataflow_to_sql.py#L906
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thank you!! Super helpful