Changing TX-Queuelength to enhance Bandwidth #100
Buffer sizes should be symmetric for servers:
Increase input queue lengths:
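(The commands originally quoted here were not preserved. A minimal sketch of what these two suggestions could look like on a supernode; the interface name and the concrete sizes are assumptions, not taken from this issue:)

```sh
# Ring buffers: make RX and TX symmetric (check the hardware limits with `ethtool -g` first).
ethtool -g eth0
ethtool -G eth0 rx 4096 tx 4096

# Input queue length: packets queued per CPU between the driver and the network stack.
sysctl -w net.core.netdev_max_backlog=16384
```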
I guess this goes into the ffh.supernode role.
Please make sure that this does not introduce additional bufferbloat.
@lemoer I did, but I didn't measure any improvement, so it does not seem to be our bottleneck. A queue length of 16384 is enough to flatten spikes on a 2 Gbit/s link for about 5 ms. So this is nothing critical for bufferbloat, but just what we need on a 2 Gbit/s link so it does not start dropping excessively under high load. About the ring buffer sizes: they should be symmetric on a server that has symmetric up and down rates. Debian's defaults are for "client devices", which download more than they upload.
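(A rough sanity check of the 5 ms figure, assuming minimum-size packets of roughly 64 bytes as the worst case for packet rate; the packet size is an assumption, not stated above:)

```sh
# 2 Gbit/s at 64-byte packets is roughly 3.9 Mpps; a queue of 16384 packets then covers about 4 ms.
echo "scale=2; 16384 * 1000 / (2*10^9 / (64*8))" | bc
# -> 4.19   (milliseconds of buffering)
```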
@CodeFetch If you did not measure any improvement, can this issue be closed?
We will sooner or later run into these issues when we have higher bandwidth demands, e.g. with WireGuard, because Debian's default configuration is not meant for routers, i.e. machines that receive and transmit packets symmetrically.
So, what did you mean by "but I didn't measure any improvement"?
@1977er That it's not the bottleneck at the moment. But I think it becomes relevant with more than 1 Gbit/s of traffic being forwarded from different connections.
This will never happen with our current hardware and our resources.
@CodeFetch @1977er So this is stalled/blocked, at least until we have rolled out WireGuard for the broad mass?
If these settings do no harm, we can introduce them for future use.
@CodeFetch if they don't, do you want to implement this?
@1977er Settings like txqueuelen can drastically reduce network performance due to bufferbloat. Therefore they can cause harm.
I guess then it's settled. As long as it's not needed, don't change it. @CodeFetch if you disagree, please re-open again.
That's not how CoDel etc. work, as these algorithms look at latency when deciding whether to drop packets. Therefore we should give it a try, as the current buffer sizes are not optimized for routing, and this could partially explain our current problems.
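(For illustration, a sketch of how a latency-based AQM such as fq_codel sits on top of a long txqueuelen; the interface name is an assumption, and 5 ms / 100 ms are simply the fq_codel defaults:)

```sh
# fq_codel drops based on how long packets have waited in the queue,
# not on how many packets fit into it, so a long txqueuelen alone
# does not have to cause bufferbloat.
tc qdisc replace dev eth0 root fq_codel target 5ms interval 100ms
tc -s qdisc show dev eth0   # drop and delay statistics
```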
Tested the 16k setting again on a supernode with quite decent WireGuard usage (one that already drops packets due to overload). Looking at the "% of neighbours with TQ < 95%" chart, I can see no effect after introducing the setting (injected at 9:38).
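(The exact commands used for the test are not recorded in this comment; applying and verifying the 16k setting would look roughly like this, with the interface name being an assumption:)

```sh
# Inject the setting on the live interface ...
ip link set dev eth0 txqueuelen 16384
# ... then check whether the TX drop counter keeps increasing afterwards.
ip -s link show dev eth0
```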
I suggest closing this issue until we know that we need this.
Maybe we can re-test this once we have WireGuard rolled out, just to settle this once and for all then.
@1977er @AiyionPrime We should at least make these settings symmetric, as we are using the servers as routers and not as desktop machines. The current, asymmetric values are definitely wrong.
Source: https://www.bufferbloat.net/projects/codel/wiki/Best_practices_for_benchmarking_Codel_and_FQ_Codel/ Whatever that means...

Edit: "Time Squeeze" (the time_squeeze counter in /proc/net/softnet_stat) constantly increases, so we might want to try tuning the NAPI budgets.

Edit2: txqueuelen and netdev_max_backlog need to be set high enough to avoid packet drops. At the same time, the NAPI weight and budget need to be set high enough for the softirq to handle all packets in time. Our issue is indeed softirq backpressure, but it seems that the reason for it is that the CPU cores are actually saturated. Maybe it makes sense to dedicate cores to softirq handling as recommended here (to reduce rescheduling and thereby increase cache warmth?), but that needs to be tested:

Edit3: I like this guide more than the previous ones, as it is more compact while still covering all the tips I've read about so far:
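(A small sketch for the Edit/Edit2 part: checking whether the softirq budget is exhausted and, if so, raising the NAPI budget. The concrete values here are assumptions; the third column of /proc/net/softnet_stat is the time_squeeze counter:)

```sh
# Per-CPU softnet statistics (hex): 1st column = processed packets, 2nd = dropped
# (backlog full), 3rd = time_squeeze (softirq ran out of budget before the queue was empty).
cat /proc/net/softnet_stat

# If time_squeeze keeps growing, give NAPI more headroom per softirq run:
sysctl -w net.core.netdev_budget=600
sysctl -w net.core.netdev_budget_usecs=8000
```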
Currently we use Debian's default values for e.g. the txqueuelen. These values are not optimized for routing. Increasing the buffer sizes and queue lengths will likely reduce the number of context switches, but setting them too high might starve userspace. Therefore we need to find reasonable values depending on the interface type (e.g. a userspace VPN like fastd vs. kernel-space WireGuard).
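(Purely as an illustration of where such per-interface-type tuning would end up; the interface names and numbers are hypothetical, and finding the right values is exactly the open question of this issue:)

```sh
# Hypothetical per-interface values – not recommendations.
ip link set dev mesh-vpn txqueuelen 1000    # fastd (userspace tun device)
ip link set dev wg-mesh  txqueuelen 16384   # WireGuard (kernel space)
```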