[💡FEATURE REQUEST]: Adaptive scaling of the workers #97
Comments
@stefanos82 hi! The idea of RR is that we keep as much code as possible in memory and don't have to bootstrap the system on each request. |
This is possible and has been planned from the beginning (that's why the worker pool is named StaticPool and sits behind an interface for use in Server). The only trick in this feature is to properly define the scaling logic that pushes workers to and pulls them from the allocation channel. We are actively discussing this feature internally, because if it's done wrong the effect on the application can be very harmful. |
Hmmm, I see. Well, since we care about performance, maybe we should have figured the number of workers based on CPU cores? That's also an option I would say.

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        fmt.Println(runtime.NumCPU())
    } |
I can only set it as default value for the pool.numWorkers option. |
OK, what if I would like to increase or decrease the number of CPU workers on the fly. Is there any hotkey for this option? |
Currently this API is not exposed, but it is possible to configure pool with different configuration: https://github.com/spiral/roadrunner/blob/master/server.go#L113 |
You mean to use Reconfigure... when exactly? I'm referring to runtime. If for instance, I run Am I forced to stop it, increase the number for workers in |
Currently yes. Reconfigure is what is used in http:reset, which you call at runtime (without stopping the server). I guess I can add a flag to this function to alter the number of workers. |
That would be more than awesome. |
After a couple of intense internal discussions (thank you @ValeryPiashchynski, Andrew M, Alexei N, and @vvval) we have come up with a plan to add a basic balancing mechanism based on 2 derivative metrics: allocation time and processing time. Though more metrics can be added in the future, these two should cover many possible use cases, such as many fast queries, a small number of large queries, and so on. If anyone has anything to share regarding adaptive scaling algorithms, we are glad to listen. |
It would be very helpful if you could expand on this, much like a case study: what led you to choose these two derivative metrics, and so forth. I could investigate it and see whether there is a better alternative that could be applied. |
Well, because they are both derivative :) Each metric depends on CPU load, number of connections, processing time, etc.
In theory, even one of these metrics should include enough information to scale the system up and down. I will try to explain a couple of scenarios (green = processing time, orange = allocation time):
-- please do not consider this whole chart as the timeline for the app, it's an example ---
This is not final; we are still discussing it and are open to suggestions. This is our first shot (before the implementation). I'm ready to accept that there is a fatal flaw in this logic; however, these metrics are easy to retrieve and process, so they look promising for the first version of the adaptive scaling mechanism. Clearly both metrics should be used in combination with CPU and memory stats, min/max boundaries, and proper hysteresis logic. We also have to consider the cost of worker creation. |
I believe we can also calculate the 2nd derivative to build better prediction logic, but this is not the type of rabbit hole I would like to jump into... yet. |
Very informative. Now I have a clearer view about the whole thing, thank you. |
This article could be used as a source of inspiration: Building a Worker Pool in Golang It does not mean it demonstrates 100% what I have suggested, but the concept around dynamism is demonstrated in it. Please bear in mind that there is a high possibility that I'm wrong about the article's content and that I most probably have misunderstood its concept. Nevertheless, it's a very informative article that makes you appreciate the use of channels. |
@wolfy-j It's not visible for me I'm afraid. Can you take a screenshot and paste it here please? |
Idea:
The difference between those timers tells us how long the request waited for the actual execution (we may also sub an
Results (example): 3 seconds waiting for the worker, 500ms actual work in the PHP worker. 3-0.5=2.5s (data science here), threshold - 1s --> decision: allocate a worker (or 2-3-5 according to the step but no more than Next request: 2 seconds waiting for the worker (time reduced for example), 500ms execution --> decision: allocate a worker. |
Hello! How is work progressing on this feature? I see that the label "v2023.1.0" has been removed. Is the feature still planned? |
Not planned ATM. Still not sure the RR should have this feature in the modern era of K8s and other orchestration tools that can scale pods on demand. From the RR side, it provides metrics to make the decision about scaling (like queue size) for the orchestration tools. |
Reopening, in the next release (v2024.3) RR would have a How it works: |
Wow, what a gift! |
With PHP-FPM we have three options: static, dynamic, and ondemand. Can we accomplish such a thing with rr? I don't think it makes sense to waste resources when a website is more or less in idle mode; it should be able to limit its workers to the lowest level possible for obvious reasons.
Thoughts and / or suggestions?