Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to fold a query into an SqlArray if it contains multiple fields? #558

Open
abigailalice opened this issue Aug 26, 2022 · 10 comments
Open

Comments

@abigailalice
Copy link

abigailalice commented Aug 26, 2022

I'm trying to return some data from my database representing a one-to-many relation, and am trying to do it in a single query, but don't see how to accomplish it. There is arrayAgg :: Aggregator (Field a) (Field (SqlArray a)), but the type seems to restrict me from returning more than a single field in each element of the array. In Postgresql you can do things like select array_agg((x,y)) from sometable;, so I don't think this should be a limitation, but I'm not sure if the more first-class-ish tuples that can be put into arrays are actually supported. Unless there's some combinator that can do things like Field_ n a -> Field_ m b -> Field (Field_ n a, Field_ m b) I'm not sure how I might do this.

For the moment I just have a for loop with a separate query for every element retrieved from the first table, but this is obviously not ideal if there's a way around it. Ultimately I want the returned type from the query to be something like [(a, [b])].

@tomjaguarpaw
Copy link
Owner

Opaleye doesn't support SQL-level tuples (anonymous records) yet, so you can't do quite that, but you can aggregate into two separate arrays and then zip them on the Haskell side. That is, return something like [(a, [b], [c])] and then zip the bs and cs to get a [(b, c)]. Does that make sense?

@abigailalice
Copy link
Author

abigailalice commented Aug 28, 2022

That seems simple enough, though would the type of arrayAgg restrict that from working with nullable fields? I think I might be able to work around that if it's a limitation, but I'm curious if there's a way to do that.

@tomjaguarpaw
Copy link
Owner

Hmm, yes, but that seems like a bug in arrayAgg. Can you try

arrayAgg_ :: Aggregator (F.Field_ n a) (F.Field (T.SqlArray_ n a))
arrayAgg_ = Opaleye.Internal.Aggregate.makeAggr HPQ.AggrArr

and let me know if that works OK? If so I can add it to Opaleye.

@abigailalice
Copy link
Author

Should I need a DefaultFromFields (SqlArray_ 'Nullable sqlType) [Maybe haskellType] instance? Trying to test it in my own code said it was a missing instance, and I do see there's a Default (Inferrable FromField) (SqlArray_ Nullable sqlType) [Maybe hsType] instance that looks pretty close, but I don't quite get how they relate to each other.

@tomjaguarpaw
Copy link
Owner

Thanks for reporting all of these omissions regarding nullable fields. It looks like I made a lot of oversights when converting from Column to Field!

You are correct that the instance has been incorrectly omitted. Please try adding it back locally with

instance (Typeable b, DefaultFromField a b) =>
         DefaultFromField (T.SqlArray_ Nullable a) [Maybe b] where
  defaultFromField = fromFieldArrayNullable defaultFromField

I will add this instance in a bugfix release too.

tomjaguarpaw added a commit that referenced this issue Sep 1, 2022
This should have been added when we did the Column -> Field conversion

Addresses #558 (comment)
tomjaguarpaw added a commit that referenced this issue Sep 1, 2022
This should have been added when we did the Column -> Field conversion

Addresses #558 (comment)
tomjaguarpaw added a commit that referenced this issue Sep 1, 2022
This should have been added when we did the Column -> Field conversion

Addresses #558 (comment)
@tomjaguarpaw
Copy link
Owner

The instance issue should be fixed in https://hackage.haskell.org/package/opaleye-0.9.4.0. Let me know if this doesn't solve the problem for you.

@abigailalice
Copy link
Author

So I tried using arrayAgg_ as you suggested and it looks like things are working, though my use of it wasn't super extensive. The workaround with the zipping instead of a runSelect inside of a loop also seemed to work well; I assume the DBMS would rewrite the shared part of the aggregations into a single aggregation, but regardless it sped things from a couple of seconds to close to instantaneous.

@tomjaguarpaw
Copy link
Owner

Nice! I should add arrayAgg_ to Opaleye then.

@tomjaguarpaw
Copy link
Owner

OK, arrayAgg_ is available at https://hackage.haskell.org/package/opaleye-0.9.5.1

@abigailalice
Copy link
Author

abigailalice commented Sep 12, 2022

So if this can be accomplished with the product-profunctors interface instead of lenses it might be worthwhile adding an example to the documentation for arrayAgg_ mentioning the zipping idiom, but even if not I thought mentioning it might be helpful if anyone ends up with the same issue. But I ended up needing to do a large number of such aggregations, and found the easiest way to accomplish them was doing something like

aggregateOf
    :: ((Select s -> Select (Field (SqlArray_ n a))) -> w -> m t
    -> Lens.Getting (Field_ n a) s (Field_ n a)
    -> w
    -> m t
aggregateOf l l' = l (aggregate arrayAgg_ . fmap (Lens.view l'))

...
unzippedRows <- pure SomeRow
    { someColumn1 = selectSomeRow
    , someColumn2 = selectSomeRow
    }
    >>= aggregateOf someColumn1Lens someColumn1Lens
    >>= aggregateOf someColumn2Lens someColumn2Lens

where selectSomeRow :: Select (SomeRow (Field SqlCol1Field) (Field SqlCol2Field)). The basic idea is that the SomeRow datatype contains the select statement to retrieve itself in each field, and aggregateOf projects out the individual field, aggregates it, and then traverses out the select, so it now contains the list of that row.

It still makes me a touch nervous to trust postgresql to share the work duplicated in each aggregation, but I don't think there's any way to avoid that with the zipping idiom regardless of whether it's abstracted like this or not. I don't know if that concern is really warranted given how thoroughly DBMSes are optimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants