-
Notifications
You must be signed in to change notification settings - Fork 498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add flow framework documentation #6257
Conversation
Signed-off-by: Fanit Kolchina <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! A few comments!
* Workflow steps requiring an OpenSearch plugin which is not installed | ||
* Workflow steps relying on previous node input that is provided by those steps | ||
* Workflow step fields with invalid values | ||
* Workflow graph (node/edge) configurations containing cycles or having duplicate IDs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a good place to reference the get workflow steps API.
|
||
Once a workflow is created, provide its `workflow_id` to other APIs. | ||
|
||
The `POST` method creates a new workflow. The `PUT` method updates an existing workflow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The workflow must not have been provisioned for PUT to be successful. We might want to add that here.
|
||
# Get workflow steps | ||
|
||
OpenSearch validates workflows using the [JSON file](https://github.com/opensearch-project/automating-workflows/blob/main/src/main/resources/mappings/workflow-steps.json) that lists the required inputs and generated outputs for all steps. The Get Workflow Steps API retrieves this file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Links to the github repo should have flow-framework
rather than automating-workflows
.
But I'm not sure we should refer to this file by link, particularly on the main
branch:
- We have an open issue to store it programmatically and eliminate the file, which almost certainly will be done before 2.13.
- Any new steps we add will change this file on
main
for upcoming versions but wouldn't be reflected in the released API which includes this file as a classpath resource.
|
||
### Example response | ||
|
||
OpenSearch responds with the [JSON file](https://github.com/opensearch-project/automating-workflows/blob/main/src/main/resources/mappings/workflow-steps.json) containing the steps. The order of fields in the returned steps may not exactly match the original JSON but will function identically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment here about a direct file link. Do you think we should include the JSON from a sample workflow step showing what it contains (input, output, required plugins, and default step timeout)? Such as
{
"register_remote_model": {
"inputs": [
"name",
"connector_id"
],
"outputs": [
"model_id",
"register_model_status"
],
"required_plugins":[
"opensearch-ml"
]
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I will delete the links to the file and add an example. The users will see the template once they call the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall with few comments.
``` | ||
|
||
YAML templates permit comments. | ||
{: .tip} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my understanding. What does .tip
does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- source: create_connector_1 | ||
dest: register_model_2 | ||
- source: register_model_2 | ||
dest: deploy_model_3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dbwiddis do you think we should also mention that if previous_node_inputs
are defined then defining edges is optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is mentioned immediately above in the comment (line 142-144)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a callout in the tutorial so this info is explicitly mentioned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried to mention it in multiple places, such as where previous_node_inputs are defined as well.
|`create_connector` |[Create Connector]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/connector-apis/create-connector/) |Creates a connector to a model hosted on a third-party platform. | | ||
|`delete_connector` |[Delete Connector]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/connector-apis/delete-connector/) |Deletes a connector to a model hosted on a third-party platform. | | ||
|`register_remote_model` |[Register Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/) |Registers a model hosted on a third-party platform. If the `user_inputs` field contains a `deploy` key that is set to `true`, also deploys the model. | | ||
|`register_model_group` |[Register Model Group]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/register-model-group/) |Registers a model group. The model group will be deleted automatically once no model is present in the group. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshpalis should we also add local models workflow steps here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes we have RegisterLocalPretrainedModel
, RegisterLocalCustomModel
, and RegisterLocalSparseEncodingModel
steps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshpalis Can you provide a one liner for each step here so that @kolchfa-aws can update the doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshpalis I updated the doc with these 3 steps. I assume they also have an option of deploying when user_inputs
contains deploy
=true
. Please review. Thanks!
Signed-off-by: Fanit Kolchina <[email protected]>
Co-authored-by: Owais Kazi <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kolchfa-aws Exceptional writing--clear, simple language for a technically complex topic. Well done!
|
||
## 2. Use the deployed model for inference | ||
|
||
A CoT agent can use the deployed model in a tool using the [ML Commons Agent Framework](Link TBD). This step doesn’t strictly correspond to an API but represents a component of the body that the Register Agent API requires. This simplifies the register request and allows reuse of the same tool in multiple agents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous comments re: MLCommons
Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Provisioning a workflow refers to a one-time setup process, usually performed by a cluster administrator to create resources which will be used by end users. | ||
|
||
The `workflows` field of the template may contain multiple workflows. The workflow with the key `provision` can be executed with this API. This API is also executed when the [Create or Update Workflow API]({{site.url}}{{site.baseurl}}/automating-workflows/api/create-workflow/) is called with the `provision` parameter set to `true`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am adding code now to prevent provisioning a workflow that has already been provisioned. This is very similar (and will use the same code) as the create workflow not allowing you to update after it's been provisioned:
You can only update a workflow if it has not yet been provisioned.
{: .note}
Can we add a similar note:
You can only provision a workflow if it has not yet been provisioned. Deprovision the workflow if you need to repeat provisioning.
{: .note}
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Overall LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @kolchfa-aws!
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Hi @kolchfa-aws , for 2.12, Flow framework would be launched as experimental. Is there a way to capture it in your docs? |
sorry i accidentally closed the PR but have reopened it. |
@minalsha Yes, I'll add an experimental label to all pages. We also include a link for users to provide feedback for experimental issues. Is there an issue or a forum thread where you'd like to collect feedback? |
Signed-off-by: Fanit Kolchina <[email protected]>
I would default to just asking them to create an Issue on Github? |
{: .info} | ||
|
||
OpenSearch use case templates provide a compact description of the setup process in a JSON or YAML document. These templates describe automated workflow configurations for conversational chat or query generation, AI connectors, tools, agents, and other components that prepare OpenSearch as a backend for generative models. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a link here to sample templates? I have PR open opensearch-project/flow-framework#498 which will add some sample templates in this folder. (I'll have a README there with more detailed instructions shortly.)
https://github.com/opensearch-project/flow-framework/tree/main/sample-templates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great, thanks! Added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: issue on github. Normally, we want the users to follow the progress of the feature, which they can't do by opening an issue. Is there a meta issue or rfc that we can reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
opensearch-project/flow-framework#475 is probably a good place!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Added. Let me know if there is anything else for this issue. Otherwise, we can merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's it!
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Closes #5386
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.