Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Defer HTTP parsing to Hyper (h11 project) #262

Closed
wants to merge 37 commits into from

Conversation

ian-otto
Copy link
Contributor

@ian-otto ian-otto commented Jan 22, 2020

❓ What kind of change does this PR introduce?

  • 🐞 bug fix
  • 🐣 feature
  • πŸ“‹ docs update
  • πŸ“‹ tests/coverage improvement
  • πŸ“‹ refactoring
  • πŸ’₯ other

πŸ“‹ What is the related issue number (starting with #)

Resolves #201

❓ What is the current behavior? (You can also link to an open issue here)

Currently, the Cheroot project has its own HTTP parser and logic.

❓ What is the new behavior (if this is a feature change)?

This PR will defer the primary HTTP parser to the h11 project's implementation.

πŸ“‹ Other information:

Depends on: python-hyper/h11#99

πŸ“‹ Checklist:

  • I think the code is well written
  • I wrote good commit messages
  • I have squashed related commits together after the changes have been approved
  • Unit tests for the changes exist
  • Integration tests for the changes exist (if applicable)
  • I used the same coding conventions as the rest of the project
  • The new code doesn't generate linter offenses
  • Documentation reflects the changes
  • The PR relates to only one subject with a clear title
    and description in grammatically correct, complete sentences

This change is Reviewable

@ian-otto
Copy link
Contributor Author

Something of note, is I'm not sure whether I should anticipate supporting Python 2.7 specifically in this PR as it appears there is a discussion of its deprecation (#252)

@lgtm-com

This comment has been minimized.

cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
line = self.rfile.readline()
while line:
event = self.conn.next_event()
if event is h11.NEED_DATA:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz reverse this check in order to reduce the following block indentation

Suggested change
if event is h11.NEED_DATA:
if event is not h11.NEED_DATA:
continue

cheroot/server.py Outdated Show resolved Hide resolved
if event is h11.NEED_DATA:
self.conn.receive_data(self.rfile.read(size))
event = self.conn.next_event()
assert isinstance(event, h11.Data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this some sort of an exit-check for ensuring contracts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything other than h11.Data here should be undefined behavior. Currently, I believe the only possible way we don't is if we get an EndOfMessage, but I'm pretty sure that were that to happen, h11 would raise an exception. I believe I saw this in the example code that they provide, so I kept it in for fun.

I can't really think of any other way it would be anything else, so I kept it in as a general catch-all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. This is a practice people follow when doing contract programming just as a sanity-check. asserts are stripped off in optimized mode. But it's useful to keep this as an exit-check.

@webknjaz
Copy link
Member

webknjaz commented Jan 22, 2020

I'm not sure whether I should anticipate supporting Python 2.7 specifically in this PR as it appears there is a discussion of its deprecation

@ian-otto I wanted to deprecate Python 2.7 after the HTTP parser replacement so that the last py2-compatible release would get the changes and then we'd start dropping support for this. Will probably merge #243 too before that. And maybe even #256 if we agree on it. That said, this PR should also be tested against CherryPy 17 which is the last major version to support Python 2.

@njsmith thanks a lot for your comments here!

cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
if isinstance(req_line, h11.Request):
self.started_request = True
self.uri = req_line.target
scheme, netloc, path, query, fragment = urllib.parse.urlsplit(self.uri)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could save a few lines by doing the assignments directly

Suggested change
scheme, netloc, path, query, fragment = urllib.parse.urlsplit(self.uri)
_scheme, _netloc, self.path, self.qs, _fragment = urllib.parse.urlsplit(self.uri)

cheroot/server.py Outdated Show resolved Hide resolved
"""The HTTPConnection object on which this request connected."""

inheaders = {}
"""A dict of request headers."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a dict, though? I seem to recall that there could be header duplicates...

cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
cheroot/server.py Outdated Show resolved Hide resolved
))

res = h11.Response(status_code=status, headers=self.outheaders,
http_version=self.response_protocol, reason=self.status[3:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe?

Suggested change
http_version=self.response_protocol, reason=self.status[3:])
http_version=self.response_protocol, reason=self.status[4:])

@ian-otto
Copy link
Contributor Author

So I've been peeking over the code reviews and I agree with most, if not all of them. I do have a few questions in regards to a few of them however.

There are a few instances where I'm attempting to take things from h11 (a lot of byte-strings), and shim them into the format that the old HTTPRequest class used, which increases the complexity and I believe a few of the errors I've made throughout.

@webknjaz Would you like me to attempt to make this change backwards compatible with the existing class, or throw that out in favor of increasing interoperability with h11? Note that some of the expectations (such as sending an HTTP 1.0 request and expecting an HTTP 1.0 response) are already going to be violated.

Co-authored-by: Sviatoslav Sydorenko <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Jul 30, 2020

This pull request fixes 2 alerts when merging fa618ab into b6edd3a - view on LGTM.com

fixed alerts:

  • 2 for Unused local variable

cheroot/server.py Outdated Show resolved Hide resolved
@lgtm-com
Copy link

lgtm-com bot commented Jul 31, 2020

This pull request fixes 2 alerts when merging b8d555a into b6edd3a - view on LGTM.com

fixed alerts:

  • 2 for Unused local variable

setup.cfg Outdated Show resolved Hide resolved
Comment on lines +198 to +199
# TODO: h11 doesn't have this specific of messaging. See python-hyper/h11#98
# assert response.read(21) == b'Malformed Request-URI'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ian-otto I think this can be corrected now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and no. My PR over in that repo only handled updating the messaging for a few specific cases that I felt were very important to be distinct. That said, h11 still gives fairly generic errors when it comes to request line problems. Part of me thinks that in that case, it may be okay to let that happen, since request-line issues are generally pretty easy to spot. I'm curious on your opinion though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we could leave it as-is for now and maybe add that in the future. OTOH it'd nice UX/DX-wise to have precise messages in general.

@ian-otto
Copy link
Contributor Author

As an aside, one of the tests seems to intermittently hang for seemingly days depending on CI platform. Can we set a lower timeout?

@webknjaz
Copy link
Member

@ian-otto what CI job are you talking about specifically? There's some known issues that happen flakily and are unrelated to your changes here...

@ian-otto
Copy link
Contributor Author

ian-otto commented Aug 12, 2020

Back when it was still able to execute tests (running from h11's Git), I had a test timeout after 6 hours on GitHub, and another fail after 12 hours. Here are the related workflow runs:

https://github.com/ian-otto/cheroot/actions/runs/189982387
https://github.com/ian-otto/cheroot/actions/runs/189218067

Edit: I've seen the test after this fail on my local machine as well, I do think it's related to some changed behavior between the old parser and h11: [gw1] [ 95%] PASSED cheroot/test/test_ssl.py::test_https_over_http_error[::] . I'll do some digging to figure out exactly which test seems to fail.

@webknjaz
Copy link
Member

webknjaz commented Aug 12, 2020

@ian-otto there's no 12-hour jobs among your links. Looks like the workflows were stuck because of GitHub's outage. It should be ok now. Also, a few TLS tests are extremely flaky (mostly in slow envs like macOS or Windows) so ignore them.

GH has some free minutes quota for GHA, I'm curious if those outages were factored-in...

@ian-otto
Copy link
Contributor Author

ian-otto commented Aug 12, 2020

I see that you've made a few commits to this PR, but I can't pull them in. I'm not particularly sure how this works, is there a way to fetch them?

P.S. The test suite that timed out on tests/3.5-ubuntu16.04. It then aborted the run, I think it did that repeatedly for every non-2.7 build.

@webknjaz
Copy link
Member

@ian-otto try git pull --rebase, this should help if you have extra commits that haven't been pushed to GitHub and now the history diverges. This may produce some conflicts for you to resolve, though. If you have more problems with that, just tell me and I'll try to help.

Also, this branch is a bit out of date and would need to get the updates from master. Normally, I'd suggest rebasing but seeing that your branch already contains some merge commits, I'd say just merge-in master from upstream (because rebasing with merge commits is PITA most of the time).

@webknjaz
Copy link
Member

webknjaz commented Oct 8, 2020

@ian-otto hey, do you need me to help you out with Git?

tox.ini Outdated Show resolved Hide resolved
setup.cfg Outdated Show resolved Hide resolved
@@ -341,6 +342,7 @@ def test_malformed_request_line(
c.close()


@pytest.mark.xfail(reason='h11 normalizes the method and ignores non-capitalized methods')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ian-otto looks like v0.11.0 fixes this

@stale

This comment was marked as outdated.

@stale stale bot added the stale This thing has been ignored for too long label Dec 25, 2020
@webknjaz webknjaz removed the stale This thing has been ignored for too long label Dec 29, 2020
@jaraco
Copy link
Member

jaraco commented Nov 19, 2022

Closing as abandoned. Feel free to revive (original poster or others).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Defer HTTP/1.1 parsing to Hyper (h11 project)
4 participants