-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GA4GH-wide pagination syntax #29
Comments
Currently, I'm only aware of the following GA4GH specs that incorporate pagination: looping in @mcupak and @denis-yuen from Data Connect (DC) and TRS, respectively. Miro and Denis, would you be interested in contributing to a pagination schema that could be used across multiple GA4GH specs? |
Yup, thanks for the tag. We have other endpoints which have grown in usage to the point that we need to add pagination to them as well so definitely would like to keep an eye on this conversation and/or contribute |
As another input to the discussion, here are Google's best practices for API pagination: https://cloud.google.com/apis/design/design_patterns#list_pagination |
Thanks @jb-adams! We also deferred a decision on pagination to a future release of the Service Registry so as to start small and not have to commit to a particular opinionated way of doing it, so that would be a good candidate for taking this up when it's ready as well. Some more details on pagination in Data Connect. To sum up, we go very minimal - we don't specify a way of requesting pages on the client side, we simply say the server can provide a next page URL, which can be any URL (relative or absolute). That way implementations choose their own pagination strategy, only relying on clients resolving the links. The next page URL is null/empty on the last page, and it's the only field we use in pagination. |
FWIW, that seems to match TRS. We provide |
I was thinking this too, that currently DC's
One idea for a path to a unified schema could involve:
Thoughts on the above? I also see that DC's pagination is given in the response body, whereas TRS's is in the header. I don't think this should be a blocker for the development of the schema itself, as it could be usable in either location (although represented differently either as JSON or HTTP headers, and this will affect the way it's written in the OpenAPI) |
Generally feels good. A few notes:
We should also standardize request parameters:
On syntax:
|
The final page. I agree that this could be optional since it is a just a convenience to quickly reach the last page (which may be more or less useful depending on how the pages is sorted). Ditto with self link.
I was under the impression that having these kind of links in the header is a standard. Searching around, this argues that GitHub is following rfc 5988
https://blog.arkey.fr/2017/10/02/pagination_hateoas_link.en/ |
Thanks @denis-yuen for the pointer to GitHub API pagination -- I hadn't seen that before, but like it as a pattern, and saying "we're the same as GitHub" is a good story. It has pretty much all the same features as TRS (so would be easy for any TRS server/client to adopt), but uses different syntax (which would be a temporary nuisance). Mapping my previous opinions to the GitHub syntax, every API that supports pagination:
|
A couple of quick comments
|
I'm supportive of the | The DataConnect use of pagination is using it as a way of polling if results are ready yet. i.e. you follow "next page" URLs until data appears. Is that a common paging need? Or a different use case? The | Beacon is another GA4GH API with a current approach to pagination that should be brought into the discussion looping @jrambla into this discussion. Jordi, we are discussing if we can converge around an approach to pagination that will be picked up by multiple GA4GH specs. |
I haven't looked at it in a while, but I vaguely recall some of the difference for TRS might be due to the fact that we were working in swagger 2.0 as opposed to openapi 3.0 now. But I'm commenting totally without double-checking |
Re polling support in DataConnect -- if I'm reading it right, I don't see a problem. The DC spec uses an empty data payload ("a TableData object whose data property is a zero element array") to mean "this page of results isn't ready yet", and clients are expected to use the I agree with @ianfore that it's a different use case, and I don't expect most APIs will need to support long-running queries, but I don't mind the slight overloading for those that do, like DataConnect. (And it feels equally clean in either the current syntax or the newly proposed syntax.) |
Beacon approach is using a naive old version of pagination, that we haven't revised in several years. For Beacon v2 we are reviewing it. Therefore I would not consider the current approach ( The discussion above triggers a couple of fundamental question for me:
My personal preference in Beacon has been for making it as much as protocol agnostic as possible. |
Notes from the FASP hackathon on this issue can be found here under item 2B |
Thanks for the notes. I can't tell current status -- do we feel better or
worse about the proposed GitHub-like approach, and (assuming better) how
close are we to having a votable PR?
…On Tue, Aug 10, 2021 at 5:46 AM Jeremy Adams ***@***.***> wrote:
Notes from the FASP hackathon on this issue can be found here
<https://docs.google.com/document/d/1h1EP8Z8SyTS_T4pwMVO_SfvEynHw2U7hNk8I8qhp4OA/edit#heading=h.1n4wzfw8085v>
under item 2B
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#29 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABMOBXJBYWJT53PGPDP5TTTT4ENTLANCNFSM44R7YPJQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&utm_campaign=notification-email>
.
|
Checking in, since this was mentioned at the Steering Committee just now as ongoing, but I haven't seen any updates in almost a year -- what's the current status, and how close are we to having a votable PR? |
FWIW, probably not very useful, but I haven't seen/heard anything on my end. |
Me neither, althogh, as you can deduct, we have something already in the approved Beacon v2 |
@dglazer, you are right; nothing much has happened since the last discussions before last year's FASP hackathon. We will table this issue for discussion at TASC's upcoming meeting. It looks like the GitHub-like approach is what most would go for. |
In most of the REST API's that I have used, Pagination is not strictly in the header but gets incorporated directly in the response body. As an implementor, putting the pagination in the headers adds a bit of a conundrum since we often fetch data from a remote serve and pass the response body back to a UI or client. At this stage all of the context around headers has been stripped out and is no longer accessible to the end users. Additionally, placing pagination outside of the Response Body means that any internal handling of that response needs to pass more information around then just translating the body from json into some sort of Object. Another issue is one of practical integration with other tools ie I am definitely in favour of having all information related to pagination directly in the response body. It is easy to interpret, it is disjoint from the actual original request (ie you can look at a JSON alone and determine what the next page should be), and IMO its going to be the easiest thing for people to implement on both sides of the implementation spectrum (IE servers and clients) I am a bit concerned about the Github Header based approach of pagination |
It is also worth asking the question of who the end user is, and how we expect they will interact with our apis. GitHub's approach is a tried and true method as evidenced by its usage in the GitHub api. That being said, I don't really know anyone other then application developers who've used the GitHub api directly. |
I'm still more keen on the body option, as stated in my comment on Jul 4, 2021. |
@jrambla That is a good point. I think Data Connect demonstrated the power of using pagination within the body by allowing you to create a fully functioning API WITH pagination strictly by using static files plopped in a bucket. |
FWIW, did a bit more looking around. It looks like rfc 5988 is followed by rfc 8288 which has a few more examples. Can also search github for examples of libraries and code that use them, for example https://github.com/search?q=rfc8288&type=commits If we do end up rolling our own standard in the response body, I would suggest keeping the link headers so that we remain compatible even if it is redundant. |
@denis-yuen thanks for the links to the rfc. I think it's 100% acceptable to allow both approaches, but just decide that there is a minimum approach that all apis should have in regards to pagination. |
Draft proposal will be distributed asap for Monday's TASC meeting |
Closing this as #55 will ensure the change is moved to approved |
Cloud WS is looking to incorporate a harmonized, GA4GH-wide pagination syntax into their specs. This is most pertinent for DRS, as we want to update the spec to allow bulk object request and hence paginated results.
Pagination is currently handled by a number of GA4GH specs, but there is not necessarily interop between them. A goal would be to converge upon a minimal common pagination schema that existing specs can make use of, that the DRS spec will then uptake if approved.
@dglazer @ianfore
The text was updated successfully, but these errors were encountered: