Add New SavedQuery Protocol #148

plypaul · 2023-09-14T00:39:30Z

Resolves #144

Description

Please see the associated issue for details.

Checklist

I have read the contributing guide and understand what's expected of me
I have signed the CLA
This PR includes tests, or tests are not required/relevant for this PR
I have run changie new to create a changelog entry

QMalcolm

This looks fantastic! Some small questions that we should figure out before approving and merging. Just want to have things really nailed down before starting the core work.

QMalcolm · 2023-09-14T19:22:47Z

dbt_semantic_interfaces/protocols/saved_query.py

+
+    @property
+    @abstractmethod
+    def group_by_item_names(self) -> Sequence[str]:  # noqa: D


I assume the idea is that group_by_item_names aren't required, but should be represented as an empty set when there are none? Was there a conversation somewhere about whether or not we should mark them as optional (Optional[Sequence[str]])? I'd just like to understand the logic so I can keep it in mind when reviewing other parts of the protocol and ensuring consistency.

As a general rule, collections should not be typed as optional unless there's a meaningful difference between "no collection" and "empty collection".

So we should not make these optional return types, but on implementation we can default them to an empty Sequence type so users are not required to pass in a value.

Perfect, that works for me 🙂
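
A minimal sketch of the convention agreed on above, assuming a trimmed-down protocol and a dataclass standing in for the real pydantic implementation. The names `SavedQueryLike`, `ExampleSavedQuery`, and `count_group_by_items` are hypothetical and are not part of dbt_semantic_interfaces.

```python
from abc import abstractmethod
from dataclasses import dataclass, field
from typing import List, Protocol, Sequence


class SavedQueryLike(Protocol):
    """Hypothetical, simplified protocol used only for illustration."""

    @property
    @abstractmethod
    def group_by_item_names(self) -> Sequence[str]:
        """Never None: "no items" is represented as an empty sequence."""
        ...


@dataclass(frozen=True)
class ExampleSavedQuery:
    """Hypothetical implementation (the real one is a pydantic model)."""

    name: str
    metrics: List[str]
    # Defaulting to an empty list means callers may omit the field,
    # while the declared type stays non-Optional.
    group_by_item_names: List[str] = field(default_factory=list)


def count_group_by_items(saved_query: SavedQueryLike) -> int:
    # No None check is needed because the protocol promises a sequence.
    return len(saved_query.group_by_item_names)


query = ExampleSavedQuery(name="weekly_revenue", metrics=["revenue"])
```

Consumers such as `count_group_by_items` can iterate or take the length directly, which is the practical benefit of avoiding `Optional[Sequence[str]]` here.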

QMalcolm · 2023-09-14T19:26:11Z

dbt_semantic_interfaces/validations/saved_query.py

+    * Check if metric names exist in the manifest.
+    * Check that the where filter is valid using the same logic as WhereFiltersAreParsable


Do we also need to check group_by_item_names to check that they reference existing things? Additionally, what are valid group_by_item_names, is it just categorical dimensions?

@QMalcolm Yeah, that would be good to check, but I didn't see similar checks being done for filters. Is that right?

Oh right, group_bys take the form of Dimension(...). Then we need a validation similar to the WhereFilter validations that checks that they're parsable by the jinja templater, because as with WhereFilters, the jinja template parser is the only thing that can enforce the protocol structure of these structured strings.

Changes to this look good!
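
A rough sketch of the first check listed in the docstring above (metric names must exist in the manifest), assuming the manifest's metric names are already available as a set. `ExampleQuery` and `find_unknown_metrics` are hypothetical names for illustration and do not reflect the actual validation rule's code.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass(frozen=True)
class ExampleQuery:
    """Hypothetical stand-in for a saved query."""

    name: str
    metrics: Sequence[str]


def find_unknown_metrics(saved_query: ExampleQuery, known_metric_names: Set[str]) -> List[str]:
    """Return the metric names in the saved query that the manifest does not define."""
    return [metric for metric in saved_query.metrics if metric not in known_metric_names]


unknown = find_unknown_metrics(
    ExampleQuery(name="weekly_revenue", metrics=["revenue", "bookings"]),
    known_metric_names={"revenue"},
)
```

In a real rule, each unknown name would presumably be turned into a validation issue rather than returned as a bare string.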

tlento · 2023-09-14T20:33:37Z

dbt_semantic_interfaces/protocols/saved_query.py

+
+    @property
+    @abstractmethod
+    def group_by_item_names(self) -> Sequence[str]:  # noqa: D


As a general rule, collections should not be typed as optional unless there's a meaningful difference between "no collection" and "empty collection".

So we should not make these optional return types, but on implementation we can default them to an empty Sequence type so users are not required to pass in a value.

tlento · 2023-09-14T20:34:25Z

dbt_semantic_interfaces/implementations/saved_query.py

+
+    name: str
+    metrics: List[str]
+    group_bys: List[str]


@QMalcolm does it matter that this is not part of the protocol? It shouldn't, but I wanted to make sure this won't cause havoc with the core parsing setup.

Good catch. It would make things weird. To stick to the protocol we'd have to serialize group_by_item_names, but then MetricFlow uses these pydantic objects for deserialization, and since there is no setter for group_by_item_names on this object, they'd get dropped 😬 In a separate comment you brought up a question about why we have both group_bys and group_by_item_names, and I think that's actually the better question. I think we just want group_bys? I'm not seeing what group_by_item_names provides.

QMalcolm

We're looking really good, we just need to do a bit more on the handling of group_bys, I believe.

QMalcolm · 2023-09-15T20:15:56Z

dbt_semantic_interfaces/validations/saved_query.py

+        issues: List[ValidationIssue] = []
+
+        for group_by_item_name in saved_query.group_bys:
+            structured_name = DunderedNameFormatter.parse_name(group_by_item_name)


We already talked about it in a call, but documenting it here. This assumes that the group_bys will be strings of the form '<primary_entity>__<dimension>'. Although eventually we want to support that, the strings right now will be of the form 'Dimension("<primary_entity>__<dimension>")'. We should probably have a GroupByParser, similar to the WhereFilterParser, for parsing group_bys, and use that for validating that the group_bys are structured correctly.

Re-used the WhereFilterParser for now - there's some work going on right now to support Dimension(...).grain(...), so I'll wait until that's in place to switch out the implementation.
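
To make the two string shapes discussed above concrete, here is a stdlib-only sketch that accepts the `Dimension("<primary_entity>__<dimension>")` form and rejects the bare dundered form. This is an assumption-laden illustration: it uses a regular expression as a stand-in, whereas the PR itself relies on the jinja-based WhereFilterParser, and `parse_group_by` is a hypothetical helper rather than a library function.

```python
import re
from typing import Optional, Tuple

# Matches Dimension('name') or Dimension("name"); the closing quote must
# be the same character as the opening quote.
_GROUP_BY_PATTERN = re.compile(r"""^Dimension\(\s*(['"])(?P<name>\w+)\1\s*\)$""")


def parse_group_by(group_by: str) -> Optional[Tuple[str, str]]:
    """Return (primary_entity, dimension), or None if the string is malformed."""
    match = _GROUP_BY_PATTERN.match(group_by.strip())
    if match is None:
        return None
    # The quoted name must be exactly "<primary_entity>__<dimension>".
    parts = match.group("name").split("__")
    if len(parts) != 2 or not all(parts):
        return None
    return parts[0], parts[1]
```

A validation rule built on a helper like this could report an issue for every group-by where the result is None. It does not cover the `Dimension(...).grain(...)` form mentioned above.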

QMalcolm

Some small changes left, but the overall implementation looks great!

QMalcolm · 2023-09-18T19:34:50Z

dbt_semantic_interfaces/validations/saved_query.py

+    * Check if metric names exist in the manifest.
+    * Check that the where filter is valid using the same logic as WhereFiltersAreParsable


Changes to this look good!

QMalcolm · 2023-09-18T19:48:31Z

dbt_semantic_interfaces/parsing/schemas.py

+            "items": {"type": "string"},
+        },
+    },
+    "additionalProperties": False,


Nit: I don't think we have to, but we should probably mark the properties name and metrics as required. I think parsing will error out either way if they're not specified, as the creation of the object will fail. But specifying them in the jsonschema spec will give better errors, I believe.
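
A sketch of what the suggestion could look like, written as a Python dict in the same style as the diff above. The property list is an assumption pieced together from this thread (`name`, `metrics`, `group_bys`) and is not the schema from the PR; `missing_required_keys` is a hypothetical stdlib helper that only mimics the kind of error a JSON Schema validator would report for the standard `required` keyword.

```python
from typing import Any, Dict, List

# Hypothetical, trimmed-down schema for illustration only.
example_saved_query_schema: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "metrics": {"type": "array", "items": {"type": "string"}},
        "group_bys": {"type": "array", "items": {"type": "string"}},
    },
    # Listing keys here lets schema validation report a missing field
    # before object construction is attempted.
    "required": ["name", "metrics"],
    "additionalProperties": False,
}


def missing_required_keys(instance: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return the required keys that are absent from the instance."""
    return [key for key in schema.get("required", []) if key not in instance]


missing = missing_required_keys({"name": "weekly_revenue"}, example_saved_query_schema)
```

With `required` in place, a YAML entry that omits `metrics` can fail with a message naming the missing property instead of a less specific object-construction error.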

QMalcolm

Looks good to me! Sorry that there was so much back and forth on this one 🙃

This probably should have been part of #148, but we forgot. I'm adding these tests here because I plan to add `label` to the `SavedQuery` protocol in the coming commits. With that, I'll want to update the parsing tests to check it. To do that, the tests need to exist.

plypaul added 2 commits September 12, 2023 12:34

Fix type error in the PrimaryEntityRule.

cde4a7d

Configure ruff in pre-commit to auto-fix unused import errors.

a408679

cla-bot bot added the cla:yes label Sep 14, 2023

plypaul mentioned this pull request Sep 14, 2023

Support Saved Queries in MetricFlow dbt-labs/metricflow#773

Closed

plypaul force-pushed the plypaul--49--saved-queries2 branch from 9d81690 to 57df3a2 Compare September 14, 2023 01:07

plypaul marked this pull request as ready for review September 14, 2023 01:09

plypaul requested a review from QMalcolm September 14, 2023 01:09

QMalcolm requested changes Sep 14, 2023

tlento self-requested a review September 14, 2023 20:31

tlento reviewed Sep 14, 2023

plypaul requested a review from QMalcolm September 15, 2023 18:12

QMalcolm requested changes Sep 15, 2023

plypaul force-pushed the plypaul--49--saved-queries2 branch 2 times, most recently from 5c3593c to 00a0821 Compare September 16, 2023 00:38

plypaul requested a review from QMalcolm September 16, 2023 00:56

plypaul force-pushed the plypaul--49--saved-queries2 branch from 00a0821 to 5e75b16 Compare September 16, 2023 00:59

QMalcolm requested changes Sep 18, 2023

plypaul requested a review from QMalcolm September 18, 2023 22:05

QMalcolm approved these changes Sep 18, 2023

plypaul added 6 commits September 19, 2023 11:25

Add saved query protocol and implementation.

0d4b70f

Update JSON schema to include saved queries.

50d0418

Update YAML parsing code to handle saved queries.

c5b5d73

Add validation for saved queries.

53bfb33

Add tests for the saved query validation rule.

776fb67

Add change log for #144

37c96f2

plypaul force-pushed the plypaul--49--saved-queries2 branch from 961e4ab to 37c96f2 Compare September 19, 2023 18:26

plypaul merged commit 3d02807 into main Sep 19, 2023
9 checks passed

plypaul deleted the plypaul--49--saved-queries2 branch September 19, 2023 18:34

QMalcolm mentioned this pull request Oct 23, 2023

Validate Saved Query Names #185

Merged

4 tasks
