Django with django-seo-js returning "invalid code length" error to browser and curl #23

Open
jdotjdot opened this issue Apr 9, 2016 · 2 comments

Comments


jdotjdot commented Apr 9, 2016

Hey,

All of a sudden, Google, Facebook, and other crawlers started reporting my website as unavailable. We're using django-seo-js==0.2.4 and the paid version of Prerender.io. Testing with _escaped_fragment_ in the browser and with cURL showed that the responses are, for some reason, coming back invalid:

curl: (61) Error while processing content unencoding: invalid code lengths set

I made no changes to anything affecting Prerender.io or the configuration of this library; this just started happening out of the blue.

Stepping through the code in more depth, Prerender.io appears to cache the content correctly, and calling self.backend.get_response_for_url(url) also returns a response with the correctly rendered HTML content. That covers both getting the response from Prerender and transforming the requests response into a Django HttpResponse object.

When that gets returned, though, for some reason both the browser and curl think it's invalid.

I've done plenty of debugging but I'm a bit at a loss here; all I can think of is that base.py:56 is too naive with r['content-length'] = len(response.content), or that it's some type of gzip issue, where headers or encodings are getting passed on that shouldn't be.
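
To make the gzip suspicion concrete, here is a small standalone sketch (standard library only, no Django or requests; the sample page is made up) of how the byte count changes once a gzip body has been decoded. requests hands back the already-decoded bytes in response.content, so a length computed from it only matches a body that is sent without a Content-Encoding header.

```python
import gzip

# Hypothetical rendered page standing in for what Prerender returns.
html = b"<html><body>" + b"<p>hello crawler</p>" * 50 + b"</body></html>"

# What travels over the wire when the upstream answers with Content-Encoding: gzip.
compressed = gzip.compress(html)

# What a client that transparently decodes gzip ends up holding.
decoded = gzip.decompress(compressed)

upstream_length = len(compressed)    # the length the upstream would advertise
recomputed_length = len(decoded)     # the length of the decoded body
print(upstream_length, recomputed_length)
```

The two numbers differ, so recomputing content-length from the decoded body looks reasonable on its own; the trouble would only start if the response still claimed to be gzip-encoded.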

Ultimately, though, my site is currently not crawlable, and that's obviously a major issue for us.


jdotjdot commented Apr 9, 2016

Some more research suggests this might be because django-seo-js depends on requests 2.2.1, an older version of requests, and may be incompatible with the current requests 2.9.1.


jdotjdot commented Apr 9, 2016

It turned out that the issue was django_seo_js passing on a Content-Encoding header from PrerenderIO, which was causing all the problems.
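
A rough standalone illustration of that failure mode, using only the standard library. Here client_decode is a hypothetical, simplified stand-in for what a browser or curl does when it sees Content-Encoding: gzip, not their actual implementation, and the sample body and headers are made up.

```python
import gzip
import zlib


def client_decode(body, headers):
    """Decode a body the way a gzip-aware client would (simplified stand-in)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if lowered.get("content-encoding", "").lower() == "gzip":
        # wbits=MAX_WBITS | 16 tells zlib to expect a gzip container.
        return zlib.decompress(body, wbits=zlib.MAX_WBITS | 16)
    return body


html = b"<html><body>rendered page</body></html>"
upstream_headers = {"Content-Encoding": "gzip", "Content-Type": "text/html"}

# The proxy holds already-decoded bytes, yet forwards the upstream header.
try:
    client_decode(html, upstream_headers)
    forwarded_ok = True
except zlib.error:
    forwarded_ok = False

# Dropping the stale header lets the same bytes through unchanged.
clean_headers = {
    k: v for k, v in upstream_headers.items() if k.lower() != "content-encoding"
}
fixed_body = client_decode(html, clean_headers)
print(forwarded_ok, fixed_body == html)
```

The client trusts the header, tries to decompress plain HTML, and gives up, which would be consistent with the decoding error curl reported.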

Subclassing with the code below fixed it:

from django.http import HttpResponse
from django_seo_js.backends import PrerenderIO
from django_seo_js.backends.base import RequestsBasedBackend, IGNORED_HEADERS

class FixedRequestsBasedBackend(RequestsBasedBackend):

    def build_django_response_from_requests_response(self, response):
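        # Key difference: "content-encoding" is excluded from the response headers
        r = HttpResponse(response.content)
        for k, v in response.headers.items():
            # Compare lowercased names so the check is case-insensitive
            if k.lower() not in IGNORED_HEADERS:
                r[k] = v
        # Recompute the length from the body actually being returned
        r['content-length'] = len(response.content)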
        r.status_code = response.status_code
        return r


class FixedPrerenderIO(FixedRequestsBasedBackend, PrerenderIO):
    pass
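
If you would rather not depend on what IGNORED_HEADERS happens to contain, the filtering step could drop Content-Encoding explicitly. A minimal sketch, with the header handling pulled into a plain function so it runs without Django; filter_upstream_headers and ALWAYS_DROP are names made up for this example and are not part of django-seo-js.

```python
# Headers that describe the upstream body's wire format, not the body we return.
ALWAYS_DROP = frozenset({"content-encoding", "content-length"})


def filter_upstream_headers(headers, ignored=frozenset()):
    """Return the headers that are safe to copy onto the proxied response.

    Names are compared case-insensitively, because HTTP header names are
    case-insensitive and upstream servers often send "Content-Encoding".
    """
    blocked = {name.lower() for name in ignored} | ALWAYS_DROP
    return {k: v for k, v in headers.items() if k.lower() not in blocked}


# Made-up upstream headers for illustration.
upstream = {
    "Content-Type": "text/html; charset=utf-8",
    "Content-Encoding": "gzip",
    "Content-Length": "312",
    "Connection": "keep-alive",
}
safe = filter_upstream_headers(upstream, ignored={"connection"})
print(safe)
```

Inside the subclass, the loop could then copy filter_upstream_headers(response.headers, IGNORED_HEADERS) onto the HttpResponse and set content-length from len(response.content) afterwards, as the code above already does.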
