[CV2-5117] spec doc describing batch format only #116
base: master
# Expected incoming request and outgoing response data structures for Presto models

Presto models should use these structures so that tools that consume the data can access it in a consistent way across models.

Details of a specific model's input and response formats are usually in the `<model_name>_response.py` files, e.g. `classycat_response.py`.

## Incoming requests to Presto

General design:

* Models that don't handle requests in parallel still need to implement the batch format (but can return errors if they receive more than one item)

> **Review comment:** I think all models should support batch format, and those that can't do parallel ops just do them serially then return, IMO...

* "Top level" elements in the payload (the outermost dictionary keys) are processed and controlled by the Presto system
* Elements inside `model_parameters`, and inside individual input items, are passed through to the model
* From the perspective of the calling service, all of the items in a single batch call must be dispatched to the same Presto model service (don't mix items for paraphrase and classycat in the same call)

> **Review comment:** As I had mentioned in the previous PR too, this is misleading and should be reworded more specifically, like: "you can include only one model name in a batch call." What happens in the background of that request, and whether the request gets routed to multiple models (classycat LLM vs. classycat local classification), is not a concern for the consumer of the service, as long as it is getting out what it expected when issuing the call.
>
> **Review comment:** Hard agree on this assumption.
* Any parameter settings to the model must apply to all items in the batch

> **Review comment:** Why?
>
> **Review comment:** My question here would be: how would global parameters even be honored / used? Should all that info just get delegated down to the per-item level of detail? Could get messy to have that info at the task level.
* The request itself needs a unique id (this is different from the id of the content)

> **Review comment:** Does it though? I could see not providing it and just getting a job id back that's a randomly assigned UUID on the response from Presto.
* There is a single callback URL for the entire batch
* Some items need additional input parameters per item (e.g. "language" for YAKE); Presto just passes these through
* The input structure will be returned in the response by the Presto system, so individual models don't have to copy inputs to outputs, as long as:
  * All input_items include an `id` element that can be used as a reference from the output items to match them up to input parameters (see the matching sketch below)

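To illustrate the id-based matching, a minimal sketch in Python, assuming the request/response shapes proposed below (the helper name is hypothetical):

```
# Hypothetical helper: pair each result in a response with the input item
# it was computed from, using the shared "id" element.
def match_results_to_inputs(response: dict) -> list[tuple[dict, dict]]:
    inputs_by_id = {item["id"]: item for item in response["input_items"]}
    return [(inputs_by_id[result["id"]], result) for result in response["results"]]
```
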
Example PROPOSED input object structure (classycat)
```
{
    "request_id": 122,
    "model_name": "classycat__Model",  # model to be used
    "model_parameters": {
        "event_type": "classify",
        "schema_id": "4a026b82-4a16-440d-aed7-bec07af12205"
    },
    "input_items": [
        {
            "id": "11",
            "text": "this is some text to classify",
            "workspace_id": "timpani_meedan",
            "target_state": "classified"
        },
        {
            "id": "12",
            "text": "Este é um texto para classificar",
            "workspace_id": "timpani_meedan",
            "target_state": "classified"
        }
    ],
    "callback_url": "http://conductor/batch_reply/"
}
```
Example PROPOSED input object structure (YAKE)
```
{
    "request_id": 422,
    "model_name": "yake_keywords__Model",  # model to be used
    "model_parameters": {
        "max_ngram_size": 3,
        "deduplication_threshold": 0.25
    },
    "input_items": [
        {
            "id": "11",
            "text": "this is some text to classify",
            "workspace_id": "timpani_meedan",
            "target_state": "keyworded",
            "language": "en"
        }
    ],
    "callback_url": "http://conductor/batch_reply/"
}
```

Example PROPOSED input object structure (paraphrase-multilingual)

> **Review comment:** I think it's important to have a rough idea of what this looks like, since this is a very different model endpoint from classycat and yake.
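A rough, hypothetical sketch, reusing the same batch envelope as the examples above; the model name, `target_state` value, and (empty) parameters are assumptions rather than a settled format:

```
{
    "request_id": 523,
    "model_name": "paraphrase_multilingual__Model",  # hypothetical name, for illustration
    "model_parameters": {},
    "input_items": [
        {
            "id": "21",
            "text": "this is some text to vectorize",
            "workspace_id": "timpani_meedan",
            "target_state": "vectorized"  # assumed value, for illustration
        }
    ],
    "callback_url": "http://conductor/batch_reply/"
}
```
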
Example 'curl' command (put here after implementation)
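Until the implemented command is filled in, a hypothetical sketch of the call, reusing the classycat payload above; the endpoint path `http://presto/batch` is an assumption:

```
# Hypothetical invocation; the real endpoint URL goes here after implementation.
curl -X POST http://presto/batch \
  -H "Content-Type: application/json" \
  -d '{
        "request_id": 122,
        "model_name": "classycat__Model",
        "model_parameters": {
          "event_type": "classify",
          "schema_id": "4a026b82-4a16-440d-aed7-bec07af12205"
        },
        "input_items": [
          {
            "id": "11",
            "text": "this is some text to classify",
            "workspace_id": "timpani_meedan",
            "target_state": "classified"
          }
        ],
        "callback_url": "http://conductor/batch_reply/"
      }'
```
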
## Outgoing requests from Presto

* The outgoing response (itself a callback request) includes the payloads and parameters from the incoming request
* Items in the response must include their `id`, so that the caller can match and look up item-level properties from the incoming request
* A status code is present at the top level, so it can be checked to know how to parse the rest of the payload (error, etc.); see the sketch after this list

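A minimal consumer-side sketch of that top-level status check, in Python; the handler is hypothetical and assumes the response shapes shown below:

```
import logging

# Hypothetical callback handler: check the top-level status before parsing results.
def handle_callback(payload: dict) -> None:
    if payload["status_code"] == 200:
        for result in payload["results"]:
            print(result["id"], result)  # hand off to real per-item processing here
    else:
        # An error response may carry no "results" element, only the status fields.
        logging.error("request %s failed: %s %s", payload["request_id"],
                      payload["status_code"], payload["status_message"])
```
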
Example PROPOSED output structure
```
{
    "request_id": 122,
    "model_name": "classycat__Model",  # model to be used
    "model_parameters": {
        "event_type": "classify",
        "schema_id": "4a026b82-4a16-440d-aed7-bec07af12205"
    },
    "input_items": [
        {
            "id": "11",
            "text": "this is some text to classify",
            "workspace_id": "timpani_meedan",
            "target_state": "classified",
            "language": "en"
        },
        {
            "id": "12",
            "text": "Este é um texto para classificar",
            "workspace_id": "timpani_meedan",
            "target_state": "classified",
            "language": "pt"
        }
    ],
    "callback_url": "http://conductor/batch_reply/",

    # elements below here are added in the response
    "retry_count": 0,
    "status_code": 200,
    "status_message": "success",
    "results": [
        {
            "id": "11",  # this id can be used to look up params from the input
            "text": "this is some text to classify",  # text is redundant with input, but helpful for model-level debugging
            "labels": [
                "Testing",
                "Classification"
            ]
        },
        {
            "id": "12",
            "text": "Este é um texto para classificar",
            "labels": [
                "Testing",
                "Classification"
            ]
        }
    ]
}
```

> **Review comment** (on `target_state` in the input items): Do we currently have a model that can support multiple target states? Otherwise I don't see how this is helpful and not something that would be a constant on every request.
>
> **Review comment:** Yeah, I'm thinking the same thing here - `target_state` seems implied by `model_parameters`. I think you either have it at the top level or at the item level, but not both levels.
Example PROPOSED output object structure (YAKE)
```
{
    "request_id": 422,
    "model_name": "yake_keywords__Model",  # model to be used
    "model_parameters": {
        "max_ngram_size": 3,
        "deduplication_threshold": 0.25
    },
    "input_items": [
        {
            "id": "11",
            "text": "this is some text to classify",
            "workspace_id": "timpani_meedan",
            "target_state": "keyworded",
            "language": "en"
        }
    ],
    "callback_url": "http://conductor/batch_reply/",
    "retry_count": 0,
    "status_code": 200,
    "status_message": "success",
    "results": [
        {
            "id": "11",  # this id can be used to look up params from the input
            "text": "this is some text to classify",
            "keywords": [
                ["lookup", 0.0020211625251083634],
                ["params", 0.0020211625251083634]
            ]
        }
    ]
}
```

## Error response structures

> **Review comment:** It would be helpful if you could elaborate more on how we want to use the HTTP error codes in responses. If an example helps, classycat returns a whole range of different error codes that would be good for explaining/reviewing a variety of use cases.
>
> **Review comment:** Status codes make sense as top-level responses, sort of (they aren't actually responses, they are sending requests, so it is not technically a status code; I get it if we want to use typical codes as loanwords for why something didn't work, but I can also see this getting strange over time, since the receiving end isn't actually going to rescue them with HTTP exception logic). The issue I see with them is that they apply only to the top-level response but not on a per-item basis. How do we indicate a per-item error, if at all?
"status_code": 403 | ||
"status_message":"unable to access model" | ||
|
||
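A hypothetical complete error callback, assuming the error fields sit in the same envelope as a success response and that `results` is omitted when the whole batch fails (this is an assumption; per-item errors are an open question below):

```
{
    "request_id": 122,
    "model_name": "classycat__Model",
    "model_parameters": { ... },   # echoed from the request
    "input_items": [ ... ],        # echoed from the request
    "callback_url": "http://conductor/batch_reply/",
    "retry_count": 2,
    "status_code": 403,
    "status_message": "unable to access model"
}
```
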
## Not covered

* Per-item errors?