Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use more standard paging for GET /runs endpoint #159

Open
ghost opened this issue Oct 1, 2020 · 8 comments
Open

Use more standard paging for GET /runs endpoint #159

ghost opened this issue Oct 1, 2020 · 8 comments

Comments

@ghost
Copy link

ghost commented Oct 1, 2020

The GET /runs endpoint (https://ga4gh.github.io/workflow-execution-service-schemas/docs/#operation/ListRuns) uses an unorthodox pagination method.

It would be better to use more standard pagination, such as that suggested here

@ruchim
Copy link
Collaborator

ruchim commented Dec 8, 2020

thanks @kaushik-work -- will check with the TES team to make sure we have consistency in pagination across TES & WES.

@ruchim
Copy link
Collaborator

ruchim commented Dec 8, 2020

@kellrott -- is this an example of where WES could learn from what the TES standard is doing? Or is this a problem that needs solving for TES as well?

@patmagee
Copy link
Contributor

patmagee commented Jan 7, 2021

FWIW Search Has implemented somthing similar to this here

{
  "pagination": {
    "next_page_url": "http://.....",
    "previous_page_url": "http://...."
  }
}

@denis-yuen
Copy link
Member

@Mifrill
Copy link

Mifrill commented Jul 14, 2022

any updates on that?

As an alternative it also might be useful to add total_pages or total_runs as new property for endpoint response:

properties:
runs:
type: array
items:
$ref: './RunStatus.yaml'
description: A list of workflow runs that the service has executed or is executing. The list is filtered to only include runs that the caller has permission to see.
next_page_token:
type: string
description: A token which may be supplied as `page_token` in workflow run list request to get the next page of results. An empty string indicates there are no more items to return.

or like it's done in TRS via header:

last_page:
  description: A URL that can be used to reach the last page based on the current
    page record limit.
  schema:
    type: string

https://github.com/ga4gh/tool-registry-service-schemas/blob/2ecf30ebd877256d521c5d82db577c69decf0b1b/openapi/openapi.yaml#L181-L185

since currently there is no way to implement "orthodox" pagiantion with first / next / last references for pages, only possible to implement Next or Prev page, that might be solved by addition total number of runs.

@Mifrill
Copy link

Mifrill commented Jul 28, 2022

@denis-yuen

TRS uses HTTP headers for pagination links

That's correct, but why WES is not?
https://github.com/ga4gh/workflow-execution-service-schemas/blob/develop/openapi/components/schemas/RunRequest.yaml

responses:
200:
description: ''
headers: {}
content:
application/json:
schema:
$ref: '#/components/schemas/RunListResponse'

Why it's using page_size and page_token query params instead of TRE format?
https://github.com/ga4gh/tool-registry-service-schemas/blob/c343e9c64e6e480f0b5a212d93473d8aba59ed0e/openapi/openapi.yaml?rgh-link-date=2021-01-07T15%3A07%3A12Z#L203-L204

@patmagee
Copy link
Contributor

@Mifrill I think the likely answer to that question is that WES was developed independently of TRS. TRS was limited in how they communicated pagination information because they chose to return a List instead of an Object when doing a listing and I am guessing that lead them to consider the headers. Personally, I am not a fan of solely using the headers to convey pagination. I think it is definitely helpful to have it there, but they are much more awkward to work with. For example, it becomes pretty difficult to page over responses if you are using tools like curl and jq if the pagination is not embedded in the JSON response body. The github docs that @denis-yuen linked provides a bit of an example, but not one that is usable programatically.

As for the page_token and the other parameters (page_size, limit, offset etc), I think the sweet spot here is defining a single way to specify the page size that you want (let's call this limit), then allowing the engine to return a complete URL to the next page. This would allow a user to simply follow the links, and they could point anywhere! Requiring the user to pass the next_page_token instead of simply providing the next_page_url seem's like a solution that is 1/2 there. AT the same time, requiring every single implementation to have limit and offset may not work depending on the underlying infrastracture/db. For example, if your implementation does not allow for slicing of data (similar to a relational database) but only streaming and holding a cursor, how would offset be used?

For the same reason, I do not think it is possible to require the last_page url or even previous_page url, since the system might not actually be capable of generating those links. In a busy system a single WES instance can be responsible for dealing with 1,000,000's of run logs so it may not be performant to provide the last page

@denis-yuen
Copy link
Member

(note related discussion in ga4gh/TASC#29 )

For example, it becomes pretty difficult to page over responses if you are using tools like curl and jq if the pagination is not embedded in the JSON response body. The github docs that @denis-yuen linked provides a bit of an example, but not one that is usable programatically.

FWIW, we do use pagination programmatically as a client using a library which is aware of headers
https://github-api.kohsuke.org/apidocs/org/kohsuke/github/PagedIterable.html
And GitHub links to https://github.com/octokit/octokit.rb#pagination

That said, I don't think there's any reason why the information could not be duplicated into the JSON body under a _links section, we did have a suggestion to add this into TRS here ga4gh/tool-registry-service-schemas#160

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants