Skip to content
This repository has been archived by the owner on Jan 24, 2018. It is now read-only.

Only use flask for debug #1607

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

david4096
Copy link
Member

@david4096 david4096 commented Mar 9, 2017

This PR provides an experimental "run under gunicorn" option to ga4gh_server.

This PR runs the server behind the gunicorn WSGI server when running via ga4gh_server. This makes the software much easier to use and more practical for people who don't know what "Apache" or "nginx" are.

This spawns multiple worker processes that can each handle their own requests. This standalone WSGI HTTP Server is well tested with Flask, however, our implementation includes some challenges. The first is that TLS mode won't work directly, and logging needs to be handled differently.

I'd like to get this experimental feature out there for folks to test, I think that by default when running ga4gh_server we would like to run behind gunicorn as opposed to the SimpleHTTPServer. It would allow us to simplify the Apache instructions, and the dockerfile. For apache, one simply sets up a proxy and doesn't need to use mod_wsgi.

(env) ➜  ga4gh-server git:(1548_gunicorn_optional) ✗ python server_dev.py
--------------------------------------------------------------------------------
DEBUG in __init__ [ga4gh/server/network/__init__.py:54]:
Peer already in registry http://1kgenomes.ga4gh.org UNIQUE constraint failed: peer.url
--------------------------------------------------------------------------------
[2017-03-08 17:56:01 +0000] [16971] [INFO] Starting gunicorn 19.7.0
[2017-03-08 17:56:01 +0000] [16971] [INFO] Listening at: http://127.0.0.1:8000 (16971)
[2017-03-08 17:56:01 +0000] [16971] [INFO] Using worker: sync
[2017-03-08 17:56:01 +0000] [16976] [INFO] Booting worker with pid: 16976
[2017-03-08 17:56:01 +0000] [16977] [INFO] Booting worker with pid: 16977
[2017-03-08 17:56:01 +0000] [16978] [INFO] Booting worker with pid: 16978
[2017-03-08 17:56:01 +0000] [16979] [INFO] Booting worker with pid: 16979
[2017-03-08 17:56:01 +0000] [16980] [INFO] Booting worker with pid: 16980
[2017-03-08 17:56:01 +0000] [16981] [INFO] Booting worker with pid: 16981
[2017-03-08 17:56:01 +0000] [16982] [INFO] Booting worker with pid: 16982
[2017-03-08 17:56:01 +0000] [16983] [INFO] Booting worker with pid: 16983
[2017-03-08 17:56:01 +0000] [16984] [INFO] Booting worker with pid: 16984

Close #1548

TODO

  • Verify if auth works in this configuration
  • Set up logging to stdout (where are worker logs redirected to?)
  • Make gunicorn default production run setting

@david4096 david4096 force-pushed the 1548_gunicorn_optional branch 2 times, most recently from ea9874e to 5fac2e3 Compare March 9, 2017 02:05
@codecov-io
Copy link

codecov-io commented Mar 9, 2017

Codecov Report

Merging #1607 into master will decrease coverage by 0.07%.
The diff coverage is 58.33%.

@@            Coverage Diff             @@
##           master    #1607      +/-   ##
==========================================
- Coverage   86.65%   86.57%   -0.08%     
==========================================
  Files          33       33              
  Lines        7471     7494      +23     
  Branches      926      928       +2     
==========================================
+ Hits         6474     6488      +14     
- Misses        805      813       +8     
- Partials      192      193       +1
Impacted Files Coverage Δ
ga4gh/server/cli/server.py 66.03% <58.33%> (-3.97%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec57524...6f59917. Read the comment docs.

@david4096 david4096 changed the title Add gunicorn option Only use flask for debug Mar 10, 2017
@david4096 david4096 force-pushed the 1548_gunicorn_optional branch from 5fac2e3 to 9164eef Compare March 10, 2017 05:30
@david4096
Copy link
Member Author

david4096 commented Mar 10, 2017

To better demonstrate what this gives us, I ran the locust load testing suite with this locustfile against this PR in two modes. This test is a little better isolated than the previous which used curl. Locust uses gevent to spawn workers at a lower cost on the system.

First with the -d option which uses the flask debug we already ship with. 68 requests/second

"Method" "Name" "# requests" "# failures" "Median response time" "Average response time" "Min response time" "Max response time" "Average Content Size" "Requests/s"
"GET" "/" 42 1 5000 14854 1254 67302 16007 0.32
"POST" "/datasets/search" 565 2 2200 5941 3 69022 158 4.36
"POST" "/features/search" 631 3 2100 6016 3 67749 2 4.87
"POST" "/featuresets/search" 631 0 2000 5738 5 67694 364 4.87
"POST" "/readgroupsets/search" 669 1 2000 6138 13 69053 23775 5.16
"POST" "/reads/search" 6355 19 2000 5938 3 69506 10048 49.02
"None" "Total" 8893 26 2000 5987 3 69506 9081 68.60

And then with the gunicorn server (running without -d), 130 requests/sec

"Method" "Name" "# requests" "# failures" "Median response time" "Average response time" "Min response time" "Max response time" "Average Content Size" "Requests/s"
"GET" "/" 18 0 800 796 669 921 16007 0.22
"POST" "/datasets/search" 747 0 370 514 3 2683 158 9.04
"POST" "/features/search" 748 0 390 527 3 2630 2 9.05
"POST" "/featuresets/search" 800 0 390 521 3 2634 364 9.68
"POST" "/readgroupsets/search" 802 0 440 555 8 2492 23775 9.70
"POST" "/reads/search" 7645 0 430 557 3 2760 9718 92.51
"None" "Total" 10760 0 420 549 3 2760 8741 130.20

Note the failures with the builtin flask server, which doesn't handle simultaneous requests well.

@david4096
Copy link
Member Author

david4096 commented Mar 10, 2017

If we merge this we should follow it with a PR that improves test coverage under gunicorn mode in test_gestalt.

Tests that involve ensuring an announce request is sent and received will be easier to write, I believe.

@david4096 david4096 requested review from ejacox, dcolligan and kozbo March 10, 2017 21:39
@david4096
Copy link
Member Author

I don't want to rush this into the release, but I would love to get feedback!

Copy link
Member

@dcolligan dcolligan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks fine as long as we never run gunicorn unless explicitly invoked

parser = common_cli.createArgumentParser("GA4GH reference server")
addServerOptions(parser)
return parser


def number_of_workers():
return (multiprocessing.cpu_count() * 2) + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably have a comment here justifying this formula

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

help="Don't use the flask reloader")
help="Don't use the Flask reloader (for Flask debug)")
parser.add_argument(
"--debug", "-d", action='store_true', default=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the docs say the -g flag is to run gunicorn

Copy link
Member Author

@david4096 david4096 Mar 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I started by making running under gunicorn a flag, but I believe the default functionality when running ga4gh_server should be to run in a multiworker environment. The -d will run using the Flask debug server (werkzeug).

@@ -194,6 +194,7 @@ def getCmdLine(self):
python server_dev.py
--dont-use-reloader
--disable-urllib-warnings
-d
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

best to use the non-shorthand version of the flag here, for clarity in reading this file

host=parsedArgs.host, port=parsedArgs.port,
use_reloader=not parsedArgs.dont_use_reloader,
ssl_context=sslContext)
if parsedArgs.debug:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why this is called debug ... seems like this should be run_guincorn or something...

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants