Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[backport -> release/3.5.x] perf(proxy): use higher default keepalive request value for Nginx tuning (#12223) #12530

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

chobits
Copy link
Contributor

@chobits chobits commented Feb 5, 2024

Bumped default values of nginx_http_keepalive_requests and upstream_keepalive_max_requests to 10000.
backport #12223

Summary

Checklist

  • N/AThe Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • N/AThere is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

Fix KAG-3360

@github-actions github-actions bot added core/templates core/wasm Everything relevant to [proxy-]wasm labels Feb 5, 2024
@chobits chobits changed the title perf(proxy): use higher default keepalive request value for Nginx tun… perf(proxy): use higher default keepalive request value for Nginx tuning (#12223) Feb 5, 2024
@bungle
Copy link
Member

bungle commented Feb 5, 2024

What I have found is that on Mac the default ulimit -n seems to be 256 and these settings make the Kong not to work correctly. So perhaps a word somewhere about ulimit -n, would be good.

@chobits
Copy link
Contributor Author

chobits commented Feb 6, 2024

What I have found is that on Mac the default ulimit -n seems to be 256 and these settings make the Kong not to work correctly. So perhaps a word somewhere about ulimit -n, would be good.

hi @bungle What do you find in macOS? I tested this branch in my macOS Bazel environment and did not find any related error reports.
I re-reviewed the implementation of these two parameters in Nginx, thinking that they should not be related to any ulimit -n options (fd limit). These two options simply increase the allowed number of requests within a single connection.


If I set the limit to 256 mannually to test, the warnning message I get is not related to this PR, I revert this PR, and still get them:

(kong-dev) xc kong $ ulimit -n 256
(kong-dev) xc kong $ kong restart
2024/02/06 16:24:28 [warn] ulimit is currently set to "256". For better performance set it to at least "4096" using "ulimit -n"
Kong stopped
2024/02/06 16:24:29 [warn] ulimit is currently set to "256". For better performance set it to at least "4096" using "ulimit -n"
Kong started

@@ -0,0 +1,3 @@
message: Bumped default values of `nginx_http_keepalive_requests` and `upstream_keepalive_max_requests` to `10000`.
type: performance
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this PR modifies the default value of NGINX, it doesn't have a significant impact on the client's side. Therefore, I don't believe we need to categorize this feature as a breaking_change.

Typically, users won't notice this alteration. However, under stressful testing scenarios, they might observe an increase in NGINX's performance, specifically in RPS

@bungle
Copy link
Member

bungle commented Feb 6, 2024

What I have found is that on Mac the default ulimit -n seems to be 256 and these settings make the Kong not to work correctly. So perhaps a word somewhere about ulimit -n, would be good.

hi @bungle What do you find in macOS?

See this: https://kongstrong.slack.com/archives/C02L7HR45DY/p1706256832168089

I think that was related. Aka kong started, and displayed those warnings. But then timers or events didn't work anymore.

In logs you will get:


2024/02/06 19:51:51 [alert] 78828#0: 1024 worker_connections are not enough
2024/02/06 19:51:51 [alert] 78828#0: lua failed to run timer with function defined at @/Users/bungle/.luaver/luarocks/3.9.1_5.1/share/lua/5.1/resty/timerng/thread/loop.lua:162: could not create fake connection
2024/02/06 19:51:51 [alert] 78828#0: 1024 worker_connections are not enough
2024/02/06 19:51:51 [alert] 78828#0: lua failed to run timer with function defined at @/Users/bungle/.luaver/luarocks/3.9.1_5.1/share/lua/5.1/resty/timerng/thread/loop.lua:162: could not create fake connection
2024/02/06 19:51:51 [alert] 78828#0: 1024 worker_connections are not enough
2024/02/06 19:51:51 [error] 78828#0: *3 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: closed, context: ngx.timer
2024/02/06 19:51:51 [debug] 78828#0: *5 [lua] upstreams.lua:108: no upstreams were specified

Aka, kong starts and accepts admin api calls, but doesn't work. But perhaps it is releated to some other bump, e.g. did we change something else too? Before some change, it worked fine with ulimit -n 256, it does not work anymore.

@chobits
Copy link
Contributor Author

chobits commented Feb 7, 2024

What I have found is that on Mac the default ulimit -n seems to be 256 and these settings make the Kong not to work correctly. So perhaps a word somewhere about ulimit -n, would be good.

hi @bungle What do you find in macOS?

See this: https://kongstrong.slack.com/archives/C02L7HR45DY/p1706256832168089

I think that was related. Aka kong started, and displayed those warnings. But then timers or events didn't work anymore.

In logs you will get:


2024/02/06 19:51:51 [alert] 78828#0: 1024 worker_connections are not enough
2024/02/06 19:51:51 [alert] 78828#0: lua failed to run timer with function defined at @/Users/bungle/.luaver/luarocks/3.9.1_5.1/share/lua/5.1/resty/timerng/thread/loop.lua:162: could not create fake connection
2024/02/06 19:51:51 [alert] 78828#0: 1024 worker_connections are not enough
2024/02/06 19:51:51 [alert] 78828#0: lua failed to run timer with function defined at @/Users/bungle/.luaver/luarocks/3.9.1_5.1/share/lua/5.1/resty/timerng/thread/loop.lua:162: could not create fake connection
2024/02/06 19:51:51 [alert] 78828#0: 1024 worker_connections are not enough
2024/02/06 19:51:51 [error] 78828#0: *3 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: closed, context: ngx.timer
2024/02/06 19:51:51 [debug] 78828#0: *5 [lua] upstreams.lua:108: no upstreams were specified

Aka, kong starts and accepts admin api calls, but doesn't work. But perhaps it is releated to some other bump, e.g. did we change something else too? Before some change, it worked fine with ulimit -n 256, it does not work anymore.

Got it, let me try, will get back to you if I have some new discoveries.

@chobits
Copy link
Contributor Author

chobits commented Feb 19, 2024

We have pinpointed the reason why kong worker generated so many alert logs in #12530 (comment), it is related to this PR: https://github.com/Kong/kong-ee/pull/7880. So we can advance the merging progress of this PR.

@ADD-SP ADD-SP force-pushed the backport-12223-to-release/3.5.x branch from c2646de to 3ea7ce3 Compare February 19, 2024 09:38
@ADD-SP
Copy link
Contributor

ADD-SP commented Feb 19, 2024

Rebased on master as master has too many changes.

Copy link
Contributor

@ADD-SP ADD-SP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge on CI passes

@chobits chobits closed this Feb 20, 2024
@chobits chobits force-pushed the backport-12223-to-release/3.5.x branch from cf748ed to c7589e6 Compare February 20, 2024 02:06
…ing (#12223)

Bumped default values of `nginx_http_keepalive_requests` and
`upstream_keepalive_max_requests` to `10000`.

KAG-3360

---------

Co-authored-by: Datong Sun <[email protected]>
(cherry picked from commit f7e6eee)
@chobits chobits reopened this Feb 20, 2024
@chobits chobits changed the title perf(proxy): use higher default keepalive request value for Nginx tuning (#12223) [backport -> release/3.5.x] perf(proxy): use higher default keepalive request value for Nginx tuning (#12223) Feb 20, 2024
@ADD-SP ADD-SP merged commit 0082757 into release/3.5.x Feb 20, 2024
47 checks passed
@ADD-SP ADD-SP deleted the backport-12223-to-release/3.5.x branch February 20, 2024 03:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core/templates core/wasm Everything relevant to [proxy-]wasm size/XS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants