-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: order_by
for compile_sql
now works as expected
#49
Conversation
The tests for `query` and `compile_sql` did not test all allowed parameters, which made the `order_by` bug go unnoticed. This commit fixes that by adding all the remaining parameters to the tests. Note that this commit is in a broken state since the fix hasn't been applied yet. That will come in a future patch.
d056afa
to
55b4ea6
Compare
""" | ||
|
||
name: str | ||
grain: Optional[TimeGranularity] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may need to update this (along with whereever else uses TimeGranularity
), since I believe with custom granularity, this can be an arbitrary string. cc: @courtneyholcomb to confirm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's correct!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep! I wanted to do that in a followup PR to avoid mixing things but I'm aware :)
In the folowup PR, I'll do something along the lines of
Grain = Union[TimeGranularity, str]
Then update all the call sites that take TimeGranularity
and make them use Grain
instead.
@@ -24,27 +52,30 @@ def test_serialize_query_params_complete_query() -> None: | |||
"metrics": ["a", "b"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any tests against object syntax through adbc? Something like this should work
{{ semantic_layer.query(metrics=[Metric("orders")]) }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh the SDK only generates object syntax for order by. For metrics and group by we just use regular string arrays.
Is there a reason we would need that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For order by there are tests tho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For extra context: there's no way for a user to submit a raw string to be sent via ADBC. They have to do like:
with client.session():
table = client.query(metrics=["my_metric"], group_by=["my_dim"])
print(table)
Under the hood, the SDK will generate a SQL statement like the following and send it via ADBC.
SELECT * FROM {{ semantic_layer.query(metrics=["my_metric"], group_by=["my_dim"]) }}
d954309
to
3a107d2
Compare
This commit improves our validation and representation of query parameters, and fixes the bug with `order_by`. We're still in an inconsistent state: we gotta propagate the changes and use them in other classes. Will do that in the following patch.
This commit makes the ADBC and GraphQL protocol implementations use the new stricter representation for query params. It updates the tests accordingly.
This commit updates the public `.pyi` files to ensure we use the new order by spec
Added changelog entries related to the order by changes.
3a107d2
to
72dbfa5
Compare
Summary
This PR changes the internal representation of
QueryParameters
to add stricter validation and fix the ambiguity oforder_by
parameters when sent via GraphQL. It also exposes newOrderByMetric
andOrderByGroupBy
top-level objects so that users can specify their clauses more precisely if strings don't suffice.I recommend you review this commit by commit :)
Rationale of the changes
Improving the internal representation of query parameters
To facilitate typing across different classes, we created a
QueryParameters
class that is aTypedDict
. While this is convenient for static typing ofkwargs
,TypedDicts
don't do any validation. To fix that, I created two dataclasses:AdhocQueryParametersStrict
andSavedQueryQueryParametersStrict
, which represent "strictly" validated parameters that are ready to be sent to the API. Then, I improved thevalidate_query_parameters
method so that it returns one of these, and raises errors in case there was any validation error of the parameters dict.Remove ambiguity of
order_by
The GraphQL API forces us to specify whether an
order_by
clause is a metric or a dimension. To remove ambiguity from the representation, the "strict" parameter dataclasses only supportOrderByMetric
andOrderByGroupBy
, and it is the job of the validator to convert strings like-my_metric
into an object likeOrderByMetric(name="my_metric", descending=True)
. Now, theprotocol
implementations don't need to worry about what is what since that's solved in the validation layer, before calling the ADBC or GraphQL protocols.Making protocols use the new representation
Finally, I changed the GraphQL and ADBC protocols to use the return value of
validate_query_parameters
when building a query request.Breaking changes
Since we don't have the list of known metrics/dimensions at query time when querying through a saved query, you now have to specifically provide a
OrderByMetric
orOrderByGroupBy
in that case, otherwise the SDK will raise an error. For adhoc queries, you can still provide a string and the SDK will parse the spec objects from that.