-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Jinja2 autoescaping #1921
Use Jinja2 autoescaping #1921
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable. Please proceed with the Jinja2 autoescape approach 👍
6039713
to
3757368
Compare
Integration tests fail because the recently released filecheck 1.0.0 doesn't support |
3757368
to
d1a7be8
Compare
Sorry for the inconvenience. We are in the process of transferring the filecheck project to a group of developers who want to actively maintain and develop their own version of I have reported this here: AntonLydike/filecheck#19 (comment). |
d1a7be8
to
9b71ab0
Compare
The latest commit "Mark all template.render() output as Markup" is to be discussed: We have to flag every programmatically generated HTML string as "safe", otherwise autoescaping would wrongly process them. This notably applies to all I asked Jinja2 folks whether such a feature could be implemented, see pallets/jinja#2003. |
9b71ab0
to
92f45e1
Compare
...eqif/UC55_G1_validations/UC55_G1_T01_not_a_reqif_format/test_UC55_G1_T01_not_a_reqif_file.py
Show resolved
Hide resolved
I like the idea of I commented two times, asking about the possible redundancy of Markup / autoescape=True. I would like you to confirm that both mechanisms have to be used by us (Markup and autospace=True).
I could not understand what you are referring to in your example there:
How can it make the result of render automatically if the result is a simple >>> inner = Template('<b>{{ first_name }}</b> {{ last_name }}', autoescape=True).render(first_name='John', last_name="Doe")
>>> print(type(inner))
<class 'str'> While reviewing your PR, I considered an alternative approach that could be possible instead of doing the escaping with Jinja. I recently encountered several implicit conversions with Jinja, when an object is implicitly called An alternative could be that the whole StrictDoc uses a wrapper of a With this approach, the diff could be even bigger but in return we would get a very good control over the strings that we really want to use everywhere. It is in the same direction like the code of I have confidence that this approach could work just fine but I don't have enough data to compare the advantages/disadvantages of using Jinja's escaping magic vs packing that explicitly in The Another argument for maintaining And finally there was another source for me thinking in the direction of custom string wrappers. StrictDoc has many cases where we want to deal with a EDITED: These are only thoughts. Let's move forward with Jinja's autoescape, safe, Markup. The alternative approach could work but let's keep it aside. |
f9de42d
to
175b8a8
Compare
175b8a8
to
e198268
Compare
Sorry for not commenting on your alternative approach yet. It sounds interesting. I was quite focused here and wanted to get autoescaping in a workable state first. Will comment soon and suggest to hop back to #1920 for discussion to not interleave the autoescaping approach with the sdoc_str approach too much. |
strictdoc/export/html/generators/view_objects/document_screen_view_object.py
Show resolved
Hide resolved
Thanks a lot for contributing this! |
SourceFileViewHTMLGenerator escaping code had multiple issues: - When marking each line from range_start_pragma_processor as safe, we wrapped not only strings but also whole tuples in Markup(). - For tuple elements we didn't differentiate whether they from pygments (already escaped) or from source file (not yet escaped). Only the later must be explicitly escaped. Relates to strictdoc-project#1921.
SourceFileViewHTMLGenerator escaping code had multiple issues: - When marking each line from range_start_pragma_processor as safe, we wrapped not only strings but also whole tuples in Markup(). - For tuple elements we didn't differentiate whether they from pygments (already escaped) or from source file (not yet escaped). Only the later must be explicitly escaped. Some changes are there to make mypy happy. Relates to strictdoc-project#1921.
SourceFileViewHTMLGenerator escaping code had multiple issues: - When marking each line from range_start_pragma_processor as safe, we wrapped not only strings but also whole tuples in Markup(). - For tuple elements we didn't differentiate whether they from pygments (already escaped) or from source file (not yet escaped). Only the later must be explicitly escaped. Some changes are there to make mypy happy. Relates to strictdoc-project#1921.
SourceFileViewHTMLGenerator escaping code had multiple issues: - When marking each line from range_start_pragma_processor as safe, we wrapped not only strings but also whole tuples in Markup(). - For tuple elements we didn't differentiate whether they from pygments (already escaped) or from source file (not yet escaped). Only the later must be explicitly escaped. Some changes are there to make mypy happy. Relates to strictdoc-project#1921.
SourceFileViewHTMLGenerator escaping code had multiple issues: - When marking each line from range_start_pragma_processor as safe, we wrapped not only strings but also whole tuples in Markup(). - For tuple elements we didn't differentiate whether they from pygments (already escaped) or from source file (not yet escaped). Only the later must be explicitly escaped. Some changes are there to make mypy happy. Relates to strictdoc-project#1921.
This enables Jinja2 autoescaping everywhere, except for the diff screen which relies on pre generated HTML. The MR is not meant to be merged as-is.
It fixes the issues observed in #1920. Most integration tests pass. Tests fail for UI forms, presumably because they call
html.escape
manually and now are escaped twice.Jinja2 recommneds autoescaping:
Templates access way more functions and variables than just
enumerate_all_fields()
. All need to be escaped. This does it automatically without risk of forgetting some places.This MR changes some formerly explicit escaping:
enumerate_all_fields_escaped
was thehtml.escape()
variant ofenumerate_all_fields
. It's only evaluated in Jinja2 templates, therefore completely covered by autoescaping now. This also eliminates the need forget_text_value_escaped
, which was only used byenumerate_all_fields_escaped
.RequirementFormField.field_escaped_value
was thehtml.escape()
variant offield_unescaped_value
.field_escaped_value
is only evaluated in Jinja2 templates, therefore completely covered by autoescaping now.SearchScreenViewObject.search_value
was explictlyhtml.escape()
ed.search_value
is only read by Jinja2 templates, therefore completely covered by autoescaping now.SourceFileViewHTMLGenerator
changes some pygments output lines. pygements output is escaped, but we still have to escape our own changes. Changedhtml.escape
tomarkupsafe.escape
because that's what Jinja2 uses. It's better to use the same escaping function everywhere.TextToHtmlWriter
escaped plain text from multiline statements. We still need to do that. Changedhtml.escape
tomarkupsafe.escape
for same reason as above.Considerations about what to escape
render_*_link
make strings used as URL in href="xyz
" . It's derived from filenames and UIDs. UIDs are restricted by a grammar regex. But Unix files can have weird names likedocument"<injected>.sdoc
. So generated links should be URL encoded. HTML encoding is then not strictly necessary (wouldn't hurt either; could be considered as safety net). Suggestion: Do proper URL encoding in another PR; for now autoescaping prevents at least injection attacks.TODO:
| safe
withMarkup()
in codeTemplate.render()
programmatically. This output can be considered safe, yet we have to wrap it inMarkup
. Is there a way to automatically flagTemplate.render()
as safe? Should already be that way: "Jinja functions (macros, super, self.BLOCKNAME) always return template data that is marked as safe."