Custom Response Parsing #127

Open
ChristianGruen opened this issue Oct 22, 2018 · 0 comments

Here is a summary of the discussion on custom response parsing (#108, #125, and others):

In version 1 of the HTTP Client Module, the override-media-type option was available (inspired by the override-content-type option in XProc). It could be used to override the Content-Type header of a response.

In practice, the approach turned out to be fairly flexible, but it had some shortcomings: it did not allow fine-grained processing of multipart bodies, and it was not intuitive enough for all users.

The following alternatives have been proposed in the scope of version 2 of the spec:

parse-response (boolean)

Adam pointed out that the name may be misleading, so it’s named parse-response-entity-body in the current draft. My suggestion in #108 was to call it parse-bodies: Only responses are “parsed” (requests are serialized), and the plural form indicates that we may have multiple bodies in a response.

parse-response (enum)

Adam made a suggestion for extending the proposal in #108:

  • raw. We don't have an equivalent option at the moment, but the idea is that the raw response from the server is returned. i.e. no parsing occurs, no status, no headers. This has applications for debugging and also for logging responses.
  • status. This would be equivalent to status-only: true().
  • headers. This would be the equivalent to parse-response-entity-body : false()
  • multipart-raw. This would extract the headers of the response, and locate the multipart bodies, however this would present each multipart in a raw manner, i.e. no multipart headers would be parsed.
  • full. This would be the default, and basically the same as the current parse-response-entity-body : true()
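The correspondence between the enum values and the existing boolean options can be sketched in plain XQuery 3.1. The function name local:enum-to-options is hypothetical and only illustrates the list above; raw and multipart-raw map to an empty map because, as noted, they have no equivalent in the current options.

```xquery
xquery version "3.1";

(: Hypothetical helper: translates a proposed parse-response enum value into
   the boolean options named in the list above. Not part of any draft. :)
declare function local:enum-to-options($value as xs:string) as map(*) {
  switch ($value)
    case 'status'  return map { 'status-only': true() }
    case 'headers' return map { 'parse-response-entity-body': false() }
    case 'full'    return map { 'parse-response-entity-body': true() }
    (: raw and multipart-raw have no equivalent in the current options :)
    case 'raw'
    case 'multipart-raw' return map { }
    default return error(
      xs:QName('local:unknown-value'),
      'Unknown parse-response value: ' || $value
    )
};

(: Print each enum value next to the options it would stand for :)
for $value in ('status', 'headers', 'full', 'raw', 'multipart-raw')
return $value || ' -> ' ||
  serialize(local:enum-to-options($value), map { 'method': 'json' })
```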

parse-response (map)

In #125, the proposal was extended to a nested map (for further discussion, see #125 (comment)).

I decided to summarize the proposals as I believe that a plain and simple solution might lead to less confusion and may even be more flexible, because a user can always do post-processing in XQuery.

In my opinion, the major requirement for (non-implicit) response parsing is to be able to retrieve bodies (single part, multiple bodies) in their original representation. In #125, I proposed the following solution:

parse / parse-bodies (string)

Option   Description
auto     implicit parsing (default)
string   return all bodies as strings
binary   return all bodies as binaries
skip     ignore response body

I believe this approach would be sufficient to cover most challenges people will be confronted with (but, honestly, not all that we could envision):

  • In most cases, people will use the default (auto).
  • If the requested result cannot be converted to the implicit target format, or if a format other than the one resulting from the implicit conversion is required, the string option can be used for textual results. All bodies will be converted to strings, based on the encoding that the server (optionally) returns via the original Content-Type header and the charset option.
  • The binary option is helpful…
    • if the result is not text,
    • if the string conversion fails,
    • if some bodies of a multipart response are textual and some are binary, or
    • if the result needs to be processed only as a simple stream.
  • The skip option is used if only the headers of a result are required.
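For the string option, the encoding would have to be taken from the Content-Type header. A minimal sketch of that lookup in plain XQuery 3.1 follows; the function name local:charset and the UTF-8 fallback are assumptions made for illustration and are not taken from the draft.

```xquery
xquery version "3.1";

(: Hypothetical helper: returns the charset parameter of a Content-Type
   value, or the supplied fallback if the header carries none. :)
declare function local:charset(
  $content-type as xs:string?,
  $fallback     as xs:string
) as xs:string {
  (: everything after the first ';' is a list of name=value parameters :)
  let $params := tail(tokenize($content-type, ';'))
  let $charset := (
    for $param in $params
    let $name := lower-case(normalize-space(substring-before($param, '=')))
    where $name = 'charset'
    (: strip optional surrounding double quotes from the value :)
    return replace(normalize-space(substring-after($param, '=')), '^"|"$', '')
  )[1]
  return ($charset, $fallback)[1]
};

(: yields 'ISO-8859-1', then the fallback 'UTF-8' (an assumed default) :)
local:charset('text/plain; charset=ISO-8859-1', 'UTF-8'),
local:charset('application/json', 'UTF-8')
```

If no usable encoding can be determined, or decoding fails, the binary option described in the list remains available.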

Some more thoughts on this simplified approach are listed in #125.

Examples for using the approach:

(: return single JSON response as XML :)
http:get('http://json.db/doc123', map { 'parse': 'string' })?body
=> fn:json-to-xml()

(: store returned multipart bodies :)
for $part at $pos in http:get('http://multipart.db/data123', map { 'parse': 'binary' })?body
return file:write-binary($pos || '.bin', $part?body)

(: ignore response bodies :)
http:get('http://json.db/doc123', map { 'parse': 'skip' })
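To illustrate the kind of post-processing a user could do after requesting bodies as strings, here is a self-contained sketch in plain XQuery 3.1. It does not call the HTTP module: the function local:parse-body, the map keys type and body, and the sample data are all hypothetical stand-ins for whatever structure the module would finally return.

```xquery
xquery version "3.1";

(: Hypothetical helper: parses a body that was returned as a string,
   choosing the parser from the media type. Unknown types stay strings. :)
declare function local:parse-body(
  $media-type as xs:string,
  $text       as xs:string
) as item()? {
  if (matches($media-type, '^application/(.+\+)?json$')) then
    parse-json($text)
  else if (matches($media-type, '^(application|text)/(.+\+)?xml$')) then
    parse-xml($text)
  else
    $text
};

(: Stand-in data for two parts of a multipart response; the 'type' and
   'body' keys are assumptions, not the module's actual result structure. :)
let $parts := (
  map { 'type': 'application/json', 'body': '{ "id": 123 }' },
  map { 'type': 'text/xml',         'body': '<doc id="123"/>' }
)
for $part in $parts
return serialize(
  local:parse-body($part?type, $part?body),
  map { 'method': 'adaptive' }
)
```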

@adamretter: Maybe my thoughts are too plain and simple? Do you have some more use cases in mind that we should consider? Looking forward to feedback!
