don't show base64 data to user #92
Comments
Hi, do you have an example HAR? |
Yep! habr.ru.har.zip |
Thanks. How was this HAR generated? I don't recognise the following
or User Agent header: Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/538.1 (KHTML, like Gecko) server.py Safari/538.1 |
@gitgrimbo it was generated using Splash. Splash uses HAR as a data export format; it also embeds harviewer in a script debugging page. |
Thanks. I see what you mean. Pasting image here for reference. First row shows a base64-encoded HTML response in the Response tab. Second row shows a similar HTML response, but decoded in the Highlighted tab. To discuss a couple of your points:
|
Thanks for looking at it! Regarding Response tab: currently it doesn't show raw content of the webpage or raw response content, it shows data stored in HAR JSON as-is. This is not the same as raw response content because 'encoding' HAR argument is not handled (see http://www.softwareishard.com/blog/har-12-spec/#content). This is useful for debugging HAR files, but not for debugging received responses. This base64 encoding is a technical detail of how the data is stored in HAR, not something specific to a website. That's why I think Response tab should show response content; currently it doesn't show it. |
Yeah I think you're right. So is it true that whenever the I'm trying to think of any exception to that rule. |
Yeah, I think it is good to always decode text if an encoding is present. The exception could be an unknown encoding (only base64 is mentioned in the standard). Another tricky case is binary (or any non-UTF-8) data; it is not clear how to show it in a decoded form. |
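For reference, the rule discussed in this comment could be sketched roughly as follows. This is a minimal Python illustration and not harviewer's actual code: `decode_har_content` is a hypothetical helper, and the only thing assumed from the HAR 1.2 spec is that a `content` object has a `text` field and an optional `encoding` field. Trying UTF-8 first is also an assumption; a real viewer might consult the charset in `mimeType`.

```python
import base64


def decode_har_content(content):
    """Return (text, raw_bytes) for a HAR `content` object.

    Hypothetical helper. `text` is None when the body has no obvious
    textual form; `raw_bytes` is None when no decoding was attempted.
    """
    stored = content.get("text", "")
    encoding = content.get("encoding")

    if not encoding:
        # No encoding field: the stored text already is the response body.
        return stored, None

    if encoding != "base64":
        # Unknown encoding: fall back to showing the stored value as-is.
        return stored, None

    raw = base64.b64decode(stored)
    try:
        # Assumption: try UTF-8 first.
        return raw.decode("utf-8"), raw
    except UnicodeDecodeError:
        # Binary or non-UTF-8 data: leave it to the caller to decide.
        return None, raw
```

With this shape, the Response tab could show `text` when it is available and fall back to something else (a size, a hex dump, the stored base64) when only `raw_bytes` is.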
Hi @kmike, I've uploaded this branch for you to try, http://gitgrimbo.github.io/harviewer/issue-92/. It simply tries to decode every HAR entry for the Response tab. It does the right thing for the first two HTML entries in your example HAR. But the third seems to have charset issues; the title displays as follows: And now images and other binary files are also shown in their decoded raw state. I'm not sure if this is a good or bad thing to be honest. If you could take a look I'd appreciate it, and maybe think of any reasons why every entry should not be decoded in this way as I'm not sure I've thought about all the possibilities here. |
Using the tips from here, https://developer.mozilla.org/en/docs/Web/API/WindowBase64/Base64_encoding_and_decoding, I think the issue is a UTF8/UTF16 thing. Following the tips the text now renders correctly. |
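The fix in the branch is JavaScript and is not shown in this thread, so the following is only a Python illustration of the same class of problem, assuming the page is UTF-8: base64 decoding yields bytes, and mapping each byte to one character garbles any multi-byte text.

```python
import base64

# A Cyrillic title stored the way a HAR producer might store it.
stored = base64.b64encode("Хабр".encode("utf-8")).decode("ascii")
raw = base64.b64decode(stored)

# Treating every byte as one character gives mojibake (8 characters).
per_byte = raw.decode("latin-1")

# Decoding the bytes as UTF-8 recovers the original 4 characters.
as_utf8 = raw.decode("utf-8")
```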
Images should probably be encoded in base64 too. Is there a way to view the decoded base64 with the appropriate type? i.e. if it's a base64-ed image, let the user view the image, if it's base64-ed HTML, let the user view the resulting (decoded) page. |
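One way to picture the type-aware preview suggested here, as a rough Python sketch under stated assumptions: `preview_for` is a hypothetical function, not part of harviewer, and it relies only on the `mimeType`, `encoding` and `text` fields of a HAR `content` object. Images are turned into a `data:` URI that a browser can render directly; other bodies are decoded as text when they are valid UTF-8.

```python
import base64


def preview_for(content):
    """Return a (kind, value) pair describing how to present the body.

    Hypothetical helper; `kind` is one of "text", "image" or "binary".
    """
    mime = content.get("mimeType", "").split(";")[0].strip().lower()
    text = content.get("text", "")

    if content.get("encoding") != "base64":
        # Not base64-encoded: show the stored text unchanged.
        return "text", text

    if mime.startswith("image/"):
        # A data: URI lets the viewer display the image itself.
        return "image", f"data:{mime};base64,{text}"

    raw = base64.b64decode(text)
    try:
        return "text", raw.decode("utf-8")
    except UnicodeDecodeError:
        # No sensible textual form; report the size only.
        return "binary", f"{len(raw)} bytes"
```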
Hey,
I recently started to use base64-encoded HAR content: otherwise it is not possible to guarantee that content can be passed in JSON, even for content with HTML or JSON MIME types. HTML can use an encoding other than UTF-8, and even data sent with an application/json content type can be binary if the server wants.
But this switch to 'base64 by default' makes things harder for harviewer: e.g. for HTML, both the 'Response' and 'HTML' tabs display base64-encoded data. 'Highlighted' gets a decoded version, but for large HTML pages it is very slow. There is a similar issue for JSON files: the 'Response' tab displays confusing base64-encoded data. The 'Response' tab for images also shows the base64 version of the binary data.
I think it is better either to remove tabs with base64-encoded data or to try decoding it more aggressively. I'm not sure what the use case is for showing base64 to the user; the user may think it is a bug (which I think has already happened for Splash). Also, there is no visual distinction between base64-encoded data and non-base64-encoded data, so e.g. a true base64 response will look the same as an HTML response which the HAR-generating software encoded to base64 in order to store it without data loss.