Question/Enhancement/Research Batch Management #81
Comments
Hi @kwojcicki, thank you for creating the issue. It makes a lot of sense to have a batch management system that uses OpenFaaS. I'm not entirely convinced that OpenFaaS itself needs to support batch as a runtime for functions. Similarly, a batch management system may provide different constructs for batch jobs, with function execution being just one of them. The current direction of Faas-flow is to provide an SDK for constructing a workflow as a DAG of Operations, plus an Executor that implements the controller and executes the Operations in order. Currently the SDK is tightly coupled with OpenFaaS, but in the near future it will look like:
In short, the SDK is supposed to provide extensibility. For example, Now, as per the current status, Faas-flow supports some of the features you mentioned but not all of them. For example, it doesn't support pausing/canceling a job, exponential backoff of failed functions, or caching of function results out of the box. Although these may be implemented inside the With all that, the Faas-flow SDK stays the same. We can make an OpenFaaS batch management system by packaging all the implementation related to batch execution in OpenFaaS into a function template, and provide a simpler wrapper on top of the core SDK. Currently, we have only one implementation as Now coming to
This got very lengthy; let me know your thoughts.
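To make the "workflow as a DAG of Operations, run in order by an Executor" idea above concrete, here is a minimal sketch in Go. It is not the faas-flow SDK: the names `DAG`, `Operation`, `AddNode`, `AddEdge`, and `Execute` are hypothetical and chosen only for illustration, and the ordering uses Kahn's topological sort as an assumed stand-in for whatever the real Executor does.

```go
package main

import (
	"errors"
	"fmt"
)

// Operation is a hypothetical unit of work attached to a DAG node.
type Operation func() error

// DAG is an illustrative workflow graph, not the faas-flow type.
type DAG struct {
	ops   map[string]Operation
	next  map[string][]string // edges: parent -> children
	indeg map[string]int      // number of parents per node
	order []string            // insertion order, for deterministic runs
}

func NewDAG() *DAG {
	return &DAG{ops: map[string]Operation{}, next: map[string][]string{}, indeg: map[string]int{}}
}

// AddNode registers an operation; re-adding an ID replaces its operation.
func (d *DAG) AddNode(id string, op Operation) {
	if _, exists := d.ops[id]; !exists {
		d.order = append(d.order, id)
	}
	d.ops[id] = op
}

// AddEdge declares that "to" may only run after "from".
func (d *DAG) AddEdge(from, to string) error {
	if d.ops[from] == nil || d.ops[to] == nil {
		return errors.New("both nodes must be added before the edge")
	}
	d.next[from] = append(d.next[from], to)
	d.indeg[to]++
	return nil
}

// Execute runs every operation after all of its parents (Kahn's algorithm)
// and returns the IDs in the order they ran.
func (d *DAG) Execute() ([]string, error) {
	indeg := make(map[string]int, len(d.indeg))
	for id, n := range d.indeg {
		indeg[id] = n
	}
	var ready, visited []string
	for _, id := range d.order {
		if indeg[id] == 0 {
			ready = append(ready, id)
		}
	}
	for len(ready) > 0 {
		id := ready[0]
		ready = ready[1:]
		if err := d.ops[id](); err != nil {
			return visited, fmt.Errorf("operation %q failed: %w", id, err)
		}
		visited = append(visited, id)
		for _, child := range d.next[id] {
			indeg[child]--
			if indeg[child] == 0 {
				ready = append(ready, child)
			}
		}
	}
	if len(visited) != len(d.order) {
		return visited, errors.New("cycle detected: some operations never became ready")
	}
	return visited, nil
}

func main() {
	d := NewDAG()
	for _, id := range []string{"load", "resize", "store"} {
		name := id
		d.AddNode(name, func() error { fmt.Println("running", name); return nil })
	}
	_ = d.AddEdge("load", "resize")
	_ = d.AddEdge("resize", "store")
	order, err := d.Execute()
	fmt.Println(order, err)
}
```

A batch-oriented wrapper could sit on top of something like this without the graph-building code changing, which is the kind of extensibility described above.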
Thanks for giving such a detailed response @s8sg ^_^
Thanks for the great explanation of where faas-flow is progressing to!
Based on your explanation of where faas-flow is going, it seems like much of what I am asking for (and what people generally ask of a batch management platform) is almost there. I think a minimal viable batch management system would let you view the status of a batch and the jobs within it, cancel a batch or job, and maybe retry a job (it may be tough to define when a job should be restarted and from where).
I think batches will be a common use case for OpenFaaS, and these are some good first steps toward supporting them :) Let me know what you think.
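As a rough illustration of that minimal feature set (view status, cancel, retry), here is a small in-memory sketch in Go. Everything in it is an assumption made for the example: the `Batch` and `JobState` types, the state names, and the rule that only failed jobs can be retried. None of it is an existing faas-flow or OpenFaaS API.

```go
package main

import "fmt"

// JobState is a hypothetical set of states for one function call in a batch.
type JobState string

const (
	Pending   JobState = "pending"
	Running   JobState = "running"
	Succeeded JobState = "succeeded"
	Failed    JobState = "failed"
	Canceled  JobState = "canceled"
)

// Batch maps each job ID to its current state.
type Batch struct {
	ID   string
	Jobs map[string]JobState
}

// Summary counts the jobs in each state, which is enough for a status view.
func (b *Batch) Summary() map[JobState]int {
	counts := make(map[JobState]int)
	for _, s := range b.Jobs {
		counts[s]++
	}
	return counts
}

// Cancel marks every unfinished job as canceled and reports how many changed.
func (b *Batch) Cancel() int {
	n := 0
	for id, s := range b.Jobs {
		if s == Pending || s == Running {
			b.Jobs[id] = Canceled
			n++
		}
	}
	return n
}

// Retry puts a failed job back to pending; any other state is left alone.
func (b *Batch) Retry(jobID string) bool {
	if b.Jobs[jobID] != Failed {
		return false
	}
	b.Jobs[jobID] = Pending
	return true
}

func main() {
	b := &Batch{ID: "batch1", Jobs: map[string]JobState{
		"x": Succeeded, "y": Running, "z": Failed,
	}}
	fmt.Println("retried z:", b.Retry("z"))
	fmt.Println("canceled:", b.Cancel())
	fmt.Println("summary:", b.Summary())
}
```

A real system would keep this state in a shared store rather than in process memory, and would still need an answer to the harder question raised above of where a retried job should resume.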
@kwojcicki A few questions.
Status viewing:
Canceling
Retrying
I think the best would be if users could provide a flow-batch-id so they themselves can specify function calls x,y,z are in flow-batch-id: "batch1", while function calls q,w,e are in flow-batch-id: "batch2".
Yup, so going from my previous example, it would be neat if faas-flow-tower knew there are 2 batches and could show a grid with 2 rows, one for each batch, with the possibility of expanding each row to show the individual job statuses.
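The grouping described above could be sketched like this in Go. The `Call` and `Row` types and the `GroupByBatch` helper are hypothetical, and `BatchID` simply stands in for the proposed user-supplied flow-batch-id; how that value would actually reach faas-flow-tower is left open.

```go
package main

import (
	"fmt"
	"sort"
)

// Call is a hypothetical record of one function call tagged with a batch ID.
type Call struct {
	Name    string
	BatchID string // stands in for the proposed flow-batch-id
	Done    bool
}

// Row is one line of the imagined grid: a batch plus its expandable job list.
type Row struct {
	BatchID string
	Total   int
	Done    int
	Calls   []string
}

// GroupByBatch folds individual calls into one row per batch, sorted by ID.
func GroupByBatch(calls []Call) []Row {
	index := make(map[string]*Row)
	for _, c := range calls {
		r, ok := index[c.BatchID]
		if !ok {
			r = &Row{BatchID: c.BatchID}
			index[c.BatchID] = r
		}
		r.Total++
		if c.Done {
			r.Done++
		}
		r.Calls = append(r.Calls, c.Name)
	}
	rows := make([]Row, 0, len(index))
	for _, r := range index {
		rows = append(rows, *r)
	}
	sort.Slice(rows, func(i, j int) bool { return rows[i].BatchID < rows[j].BatchID })
	return rows
}

func main() {
	calls := []Call{
		{Name: "x", BatchID: "batch1", Done: true},
		{Name: "q", BatchID: "batch2", Done: false},
		{Name: "y", BatchID: "batch1", Done: false},
		{Name: "w", BatchID: "batch2", Done: true},
		{Name: "z", BatchID: "batch1", Done: true},
		{Name: "e", BatchID: "batch2", Done: true},
	}
	for _, r := range GroupByBatch(calls) {
		fmt.Printf("%s: %d/%d done, calls %v\n", r.BatchID, r.Done, r.Total, r.Calls)
	}
}
```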
I think having the batch jobs go through a faas-flow-queue may help in identifying all batches. So if I queue 2 batches, each with 5,000 jobs (i.e. 10,000 function calls), then when each function call is queued up, faas-flow (not necessarily this specific library but some faas-flow-batch component) can put in the state store function with
By saying "one need to know how to handle the statestore", do you mean how to know when a function is canceled, or what should be in the statestore after a flow has received a signal to cancel itself and then successfully canceled itself?
I assume when you say call the flow function, you mean call some component that updates the statestore, not the actual function endpoint itself, because calling the function endpoint would spawn another faas-flow, I believe?
Great inputs. Let's have a separate issue for each of the items. I'm not very experienced in UI, so PRs would be great. For queuing ReportExecutionForward(nodeId string, requestId string) We can handle the call in the template and that information in the monitoring backend; for that we need changes in the template.
Not necessarily. We can call a
@kwojcicki I'm considering having a separate template as As the
I have some questions about batch jobs.
TLDR
Wondering what place faas-flow/faas-tower has in batch management, i.e. whether batch management should be built on top of faas-flow or kept completely separate. What I call batch management (a job here refers to a function call that is part of a batch, not a k8s job): viewing batch/job states, pausing/canceling jobs/batches, notification if a batch/job fails, exponential backoff of failed functions, and caching of function results.
Longer version
I've recently been doing some PoC work with OpenFaaS at my job. Aside from using it as just a FaaS, many people are interested in shifting their batch processing to OpenFaaS. I don't think that's necessarily wrong, and OpenFaaS itself may potentially add some batch management functionality (openfaas/faas#657). In the meantime, I was looking to build a small management layer that could do the tasks stated above.
I was looking to get some input from you (@s8sg) on whether you think it's best to build on top of faas-tower to get this functionality (using the distributed tracing you already have built) or to add some extra functionality to faas-flow that could facilitate batch management 😄
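For the "exponential backoff of failed functions" item in the TLDR, this is roughly what such a management layer would have to do for each job. It is a minimal sketch in Go under stated assumptions: `RetryWithBackoff` and `errTransient` are hypothetical names, the delay simply doubles after every failure, and there is no jitter or maximum delay, which a production version would likely want.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a placeholder error used only by this example.
var errTransient = errors.New("transient failure")

// RetryWithBackoff calls fn until it succeeds or maxAttempts is reached,
// sleeping base, 2*base, 4*base, ... between failed attempts.
// It returns the number of attempts made and the last error, if any.
func RetryWithBackoff(maxAttempts int, base time.Duration, fn func() error) (int, error) {
	if maxAttempts < 1 {
		return 0, errors.New("maxAttempts must be at least 1")
	}
	var err error
	delay := base
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return attempt, nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return maxAttempts, err
}

func main() {
	calls := 0
	// Simulated function call that fails twice before succeeding.
	attempts, err := RetryWithBackoff(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println("attempts:", attempts, "error:", err)
}
```

In a batch setting, `fn` would wrap the invocation of a single function call, and the attempt count could be recorded alongside the job state.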