-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add New SavedQuery Protocol #148
Conversation
9d81690
to
57df3a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fantastic! Some small questions that we should figure out before approving and merging. Just want to have things really nailed down before starting the core work.
|
||
@property | ||
@abstractmethod | ||
def group_by_item_names(self) -> Sequence[str]: # noqa: D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the idea is that group_by_item_names
aren't required, but should be represented as an empty set when there are none? Was there a conversation somewhere about whether not we should mark them as optional (Optional[Sequence[str]]
)? I'd just like to understand the logic so I can keep it in mind when reviewing other parts of the protocol and ensuring consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general rule, collections should not be typed as optional unless there's a meaningful difference between "no collection" and "empty collection".
So we should not make these optional return types, but on implementation we can default them to an empty Sequence type so users are not required to pass in a value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, that works for me 🙂
* Check if metric names exist in the manifest. | ||
* Check that the where filter is valid using the same logic as WhereFiltersAreParsable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to check group_by_item_names
to check that they reference existing things? Additionally, what are valid group_by_item_names
, is it just categorical dimensions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@QMalcolm Yeah, that would be good to check, but I didn't see similar checks being done for filters. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right group_bys take the form of Dimension(...)
. Then we need a validation similar to the WhereFilter
validations that checks that they're parsable by the jinja templater, because as with WhereFilters
the jinja template parser is the only thing can enforce the protocol structure of these structured strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to this look good!
|
||
@property | ||
@abstractmethod | ||
def group_by_item_names(self) -> Sequence[str]: # noqa: D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general rule, collections should not be typed as optional unless there's a meaningful difference between "no collection" and "empty collection".
So we should not make these optional return types, but on implementation we can default them to an empty Sequence type so users are not required to pass in a value.
|
||
name: str | ||
metrics: List[str] | ||
group_bys: List[str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@QMalcolm does it matter that this is not part of the protocol? It shouldn't, but I wanted to make sure this won't cause havoc with the core parsing setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. It would make things weird. To stick to the protocol we'd have to serialize group_by_item_names
, but then MetricFlow uses these pydantic objects for deserialization, and since their is no setter for group_by_item_names
on this object, they'd get dropped 😬 In a separate comment you brought up a question about why do we have both group_bys
and group_by_item_names
, and I think that's actually the better question. I think we just want group_bys
? I'm not seeing what group_by_item_names
provides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're looking really good, we just need to do a bit more on handing of group_bys I believe.
issues: List[ValidationIssue] = [] | ||
|
||
for group_by_item_name in saved_query.group_bys: | ||
structured_name = DunderedNameFormatter.parse_name(group_by_item_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already talked about it in a call, but documenting it here. This assumes that the group_bys
will be strings of the form '<primary_enitity>__<dimension>'
. Although eventually we want to support that, the strings right now will be of the form 'Dimension("<primary_entity>__<dimension>")'
. We should probably have a GroupByParser
, similar to the WhereFilterParser
for parsing group_bys, and use that for validating that the group_bys have valid structuring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-used the WhereFilterParser
for now - there's some work going on right now to support Dimension(...).grain(...)
, so I'll wait until that's in place to switch out the implementation.
5c3593c
to
00a0821
Compare
00a0821
to
5e75b16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small changes, left, but the overall implementation looks great!
* Check if metric names exist in the manifest. | ||
* Check that the where filter is valid using the same logic as WhereFiltersAreParsable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to this look good!
"items": {"type": "string"}, | ||
}, | ||
}, | ||
"additionalProperties": False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I don't think we have to, but we should probably mark the properties name
and metrics
to be required. I think parsing will error out either way if its not specified, as the creation of the object will fail. But specifying them in the jsonschema spec will give better errors I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Sorry that there was so much back and forth on this one 🙃
961e4ab
to
37c96f2
Compare
This probably should have been part of #148, but we forgot. I'm adding these tests here because I plan to add `label` to the `SavedQuery` protocol in the coming commits. With that, I'll want to updated the parsing tests to check it. To do that, the tests need to exist.
Resolves #144
Description
Please see the associated issue for details.
Checklist
changie new
to create a changelog entry