Replies: 3 comments
-
Hmm. Based on my understanding of Section 2.2 of XML 1.0 5th ed., control characters (other than tab, line feed, and carriage return) are not allowed in XML documents, even when represented by a character reference. Just in case I misinterpreted the spec, I tried parsing
I'd prefer not to relax this check, even via an option, since it seems like doing so would result in a parser that consumes invalid documents, and my goal with parse-xml is to conform to the spec as closely as is practical (minus the unsafe parts). Of course, you should feel free to maintain this change yourself in a fork if it would be useful to you!
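
For reference, the rule being discussed can be sketched as a small check. This is only an illustration, not parse-xml's actual code, and the function name `isXml10Char` is hypothetical. It encodes the `Char` production from Section 2.2 of XML 1.0, which Section 4.1 also applies to characters written as character references.

```typescript
// Hypothetical helper, not part of parse-xml: returns true when `codePoint`
// matches the `Char` production in Section 2.2 of XML 1.0 (5th ed.).
function isXml10Char(codePoint: number): boolean {
  return (
    codePoint === 0x9 || // tab
    codePoint === 0xa || // line feed
    codePoint === 0xd || // carriage return
    (codePoint >= 0x20 && codePoint <= 0xd7ff) ||
    (codePoint >= 0xe000 && codePoint <= 0xfffd) ||
    (codePoint >= 0x10000 && codePoint <= 0x10ffff)
  );
}
```

Under this sketch, a reference to U+0001 fails the check, while references to tab (U+0009) or to ordinary text characters pass.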
-
Thank you for your feedback. It seems that my files do conform to XML 1.1, since Section 2.2 of XML 1.1 expands the `Char` production to allow these additional control characters. My understanding is that XML 1.1 is not very common or widely implemented, so I'll just try some changes in my fork. Thanks again.
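
To make the XML 1.1 difference concrete, here is a rough sketch of the two relevant productions from Section 2.2 of XML 1.1. The function names are hypothetical and this is not code from parse-xml or from my fork. As I read the spec, XML 1.1 allows U+0001 through U+001F in `Char`, but the `RestrictedChar` subset may only appear as character references, and U+0000 is still excluded.

```typescript
// Hypothetical helpers illustrating Section 2.2 of XML 1.1.

// `Char` in XML 1.1: everything from U+0001 up, minus surrogates,
// U+FFFE, and U+FFFF. U+0000 remains disallowed.
function isXml11Char(codePoint: number): boolean {
  return (
    (codePoint >= 0x1 && codePoint <= 0xd7ff) ||
    (codePoint >= 0xe000 && codePoint <= 0xfffd) ||
    (codePoint >= 0x10000 && codePoint <= 0x10ffff)
  );
}

// `RestrictedChar` in XML 1.1: characters that may appear only as
// character references, never literally.
function isXml11RestrictedChar(codePoint: number): boolean {
  return (
    (codePoint >= 0x1 && codePoint <= 0x8) ||
    (codePoint >= 0xb && codePoint <= 0xc) ||
    (codePoint >= 0xe && codePoint <= 0x1f) ||
    (codePoint >= 0x7f && codePoint <= 0x84) ||
    (codePoint >= 0x86 && codePoint <= 0x9f)
  );
}
```

So a reference to U+0001 would be acceptable in a document declared as XML 1.1, even though the same reference is rejected under XML 1.0.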
-
Ah, interesting. My reason for not implementing an XML 1.1 parser is basically that I came across this old excerpt from the book Effective XML, which pretty strongly recommends not bothering with XML 1.1:
But I'm not completely opposed to the idea of adding XML 1.1 parsing capability as an opt-in feature via the
-
Hi there,
I've come across some XML files that contain hexadecimal character references to control characters, e.g.
<a>hello</a>
. Currently, these files fail to parse with this library because of the check on line 400. What are your thoughts on relaxing this check, or perhaps adding an explicit opt-in option, to allow character references in this control character range?
Thanks!