Here is a summary of the discussion on custom response parsing (#108, #125, and others):
In version 1 of the HTTP Client Module, the `override-media-type` option was available (inspired by the `override-content-type` option in XProc). It could be used to overwrite the `Content-Type` header of a response.
In practice, the approach turned out to be fairly flexible, but it had some shortcomings: it did not allow for fine-grained processing of multipart bodies, and it was not intuitive enough for all users.
The following alternatives have been proposed in the scope of version 2 of the spec:
parse-response (boolean)
Parsing of the response body can be disabled via the `parse-response` option. All bodies of single and multipart responses will be returned as binary items of type `xs:base64Binary`, and the values can be processed (stored, parsed, forwarded) in a second step.
Adam pointed out that the name may be misleading, so it’s named parse-response-entity-body in the current draft. My suggestion in #108 was to call it parse-bodies: Only responses are “parsed” (requests are serialized), and the plural form indicates that we may have multiple bodies in a response.
parse-response (enum)
Adam made a suggestion for extending the proposal in #108:
- `raw`. We don't have an equivalent option at the moment, but the idea is that the raw response from the server is returned, i.e. no parsing occurs, and no status or headers are extracted. This has applications for debugging and also for logging responses.
- `status`. This would be equivalent to `status-only: true()`.
- `headers`. This would be equivalent to `parse-response-entity-body: false()`.
- `multipart-raw`. This would extract the headers of the response and locate the multipart bodies, but it would present each part in a raw manner, i.e. no multipart headers would be parsed.
- `full`. This would be the default, and basically the same as the current `parse-response-entity-body: true()`.
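To make the five levels easier to compare, here is a minimal, language-neutral sketch in Python. It is not taken from the spec or from any implementation: the function names (`split_response`, `raw_parts`, `view`) and the shape of the returned values are assumptions made for illustration, repeated headers and boundaries occurring inside part bodies are ignored, and `full` leaves the body as bytes instead of parsing it.

```python
from email.message import Message
from typing import Any


def split_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw HTTP/1.x response into status code, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])
    headers: dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # last value wins
    return status, headers, body


def raw_parts(body: bytes, content_type: str) -> list[bytes]:
    """Locate the multipart parts without parsing the part headers."""
    msg = Message()
    msg["Content-Type"] = content_type
    boundary = msg.get_boundary()
    if boundary is None:
        raise ValueError("no multipart boundary in Content-Type")
    segments = body.split(b"--" + boundary.encode("ascii"))
    # segments[0] is the preamble, segments[-1] follows the closing delimiter
    return [s.removeprefix(b"\r\n").removesuffix(b"\r\n") for s in segments[1:-1]]


def view(raw: bytes, level: str = "full") -> Any:
    """Hypothetical model of the proposed parse-response enum values."""
    if level == "raw":
        return raw  # nothing is parsed at all
    status, headers, body = split_response(raw)
    if level == "status":
        return status
    if level == "headers":
        return {"status": status, "headers": headers}
    if level == "multipart-raw":
        parts = raw_parts(body, headers.get("content-type", ""))
        return {"status": status, "headers": headers, "parts": parts}
    if level == "full":
        # a real implementation would also parse the body here
        return {"status": status, "headers": headers, "body": body}
    raise ValueError(f"unknown level: {level!r}")
```

Each level returns a strict subset or refinement of the next one, which is what makes an enum a natural fit for this proposal.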
parse-response (map)
In #125, the proposal was extended to a nested map (for further discussion, see #125 (comment)).
I decided to summarize the proposals as I believe that a plain and simple solution might lead to less confusion and may even be more flexible, because a user can always do post-processing in XQuery.
In my opinion, the major requirement for (non-implicit) response parsing is to be able to retrieve bodies (single part, multiple bodies) in their original representation. In #125, I proposed the following solution:
parse / parse-bodies (string)

| Option | Description |
| --- | --- |
| `auto` | implicit parsing (default) |
| `string` | return all bodies as strings |
| `binary` | return all bodies as binaries |
| `skip` | ignore response body |
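The intended semantics of the four values can be modelled in a few lines. The following Python sketch is only an illustration under stated assumptions: `parse_body` and its parameters are hypothetical names, and the `auto` branch is a deliberately simplified stand-in for the implicit conversion rules of the spec (JSON and `text/*` only).

```python
import json
from typing import Any, Optional


def parse_body(raw: bytes, media_type: str, mode: str = "auto",
               encoding: str = "utf-8") -> Optional[Any]:
    """Return one response body according to the proposed parse mode."""
    if mode == "skip":
        return None                  # the body is ignored
    if mode == "binary":
        return raw                   # original bytes, untouched
    if mode == "string":
        return raw.decode(encoding)  # text, whatever the media type says
    if mode == "auto":
        # Simplified assumption: the real implicit rules cover more types.
        if media_type == "application/json" or media_type.endswith("+json"):
            return json.loads(raw.decode(encoding))
        if media_type.startswith("text/"):
            return raw.decode(encoding)
        return raw
    raise ValueError(f"unknown parse mode: {mode!r}")
```

For a JSON response, `auto` yields a parsed structure, `string` the unparsed text, and `binary` the original bytes, so any further conversion is left to the caller.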
I believe this approach would be sufficient to cover most challenges people will be confronted with (but, honestly, not all that we could envision):
- In most cases, people will use the default (`auto`).
- If the result cannot be converted to the implicit target format, or if a format other than the one chosen by the implicit conversion is required, the `string` option can be used for textual results. All bodies will be converted to strings, based on the encoding that is (optionally) returned by the server via the original `Content-Type` header and on the `charset` option.
- The `binary` option is helpful…
  - if the result is not text,
  - if the string conversion fails,
  - if some bodies of a multipart response are textual and some are binary, or
  - if the result only needs to be processed as a simple stream.
- The `skip` option is used if only the headers of a result are required.
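The interplay between string decoding and the binary fallback might look as follows. This Python sketch rests on assumptions that the discussion does not settle: the helper names are hypothetical, and the precedence (charset from the `Content-Type` header first, then the `charset` option, then UTF-8) is just one possible reading.

```python
from email.message import Message
from typing import Optional, Union


def resolve_charset(content_type: Optional[str],
                    charset_option: Optional[str] = None) -> str:
    """Pick an encoding; the precedence used here is an assumption."""
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    return msg.get_content_charset() or charset_option or "utf-8"


def string_or_binary(raw: bytes, content_type: Optional[str],
                     charset_option: Optional[str] = None) -> Union[str, bytes]:
    """Try the string conversion and keep the bytes if it fails."""
    try:
        return raw.decode(resolve_charset(content_type, charset_option))
    except (UnicodeDecodeError, LookupError):
        return raw  # what a caller would get by asking for 'binary' instead
```

In the proposal itself the fallback is the caller's decision (a second request with `binary`, or post-processing); the function above only condenses that decision into one place.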
Some more thoughts on this simplified approach are listed in #125.
Examples for using the approach:
(: return single JSON response as XML :)
http:get('http://json.db/doc123', map { 'parse': 'string' })?body
=> fn:json-to-xml()

(: store returned multipart bodies :)
for $part at $pos in http:get('http://multipart.db/data123', map { 'parse': 'binary' })?body
return file:write-binary($pos || '.bin', $part?body)
@adamretter: Maybe my thoughts are too plain and simple? Do you have more use cases in mind that we should consider? Looking forward to feedback!