Unknown Html Tags
JensDiemer edited this page Apr 5, 2015 · 1 revision
There are several modes for handling unknown HTML tags when converting HTML to Creole:
- Raise NotImplementedError on unknown tags.
- Use the <<html>> macro to mask unknown tags.
- Escape all unknown tags.
- Remove all unknown tags.
By default, the last mode is used: unknown HTML tags are removed and only their content is kept.
To change the default behaviour, pass a callable as unknown_emit to the html2creole() function or to the CreoleEmitter class.
# Mode 1: raise NotImplementedError on unknown tags
from creole import html2creole
from creole.shared.unknown_tags import raise_unknown_node
print(html2creole(u"<unknown><strong>foo</strong></unknown>", unknown_emit=raise_unknown_node))
result:
Traceback (most recent call last):
...
NotImplementedError: Node from type 'unknown' is not implemented!
# Mode 2: mask unknown tags with the <<html>> macro
from creole import html2creole
from creole.shared.unknown_tags import use_html_macro
print(html2creole(u"<unknown><strong>foo</strong></unknown>", unknown_emit=use_html_macro))
result:
<<html>><unknown><</html>>**foo**<<html>></unknown><</html>>
# Mode 3: escape unknown tags
from creole import html2creole
from creole.shared.unknown_tags import escape_unknown_nodes
print(html2creole(u"<unknown><strong>foo</strong></unknown>", unknown_emit=escape_unknown_nodes))
result:
&lt;unknown&gt;**foo**&lt;/unknown&gt;
# Mode 4 (default): remove unknown tags, keep their content
from creole import html2creole
from creole.shared.unknown_tags import transparent_unknown_nodes
print(html2creole(u"<unknown><strong>foo</strong></unknown>", unknown_emit=transparent_unknown_nodes))
result:
**foo**
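The modes can also be combined, for example by trying the strict mode first and falling back to a lenient one when unknown tags are found. The following is only a minimal sketch: convert_with_fallback and its parameter names are hypothetical and not part of python-creole. It assumes nothing beyond what the examples above show, namely that the converter accepts an unknown_emit keyword and that the strict handler raises NotImplementedError. The converter is passed in as an argument, so the helper itself does not import the library.

```python
def convert_with_fallback(convert, html, strict_emit, fallback_emit):
    """Try a strict conversion first; retry leniently if unknown tags are found.

    `convert` is any callable with the shape convert(html, unknown_emit=...),
    such as html2creole. Returns a (creole_text, had_unknown_tags) tuple.
    """
    try:
        # Strict pass: succeeds only if every tag is known to the converter.
        return convert(html, unknown_emit=strict_emit), False
    except NotImplementedError:
        # At least one unknown tag was seen: convert again with the lenient handler.
        return convert(html, unknown_emit=fallback_emit), True


# Possible usage with the names imported in the examples above:
#   text, had_unknown = convert_with_fallback(
#       html2creole, some_html, raise_unknown_node, escape_unknown_nodes)
```

The boolean in the returned tuple lets the caller log or report input that contained unknown tags while still producing output.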
You can also pass the callable directly to the CreoleEmitter class:
from creole.html_parser.parser import HtmlParser
from creole.html2creole.emitter import CreoleEmitter
from creole.shared.unknown_tags import escape_unknown_nodes
# Parse the HTML into a document tree
h2c = HtmlParser(debug=False)
document_tree = h2c.feed(u"<unknown><strong>foo</strong></unknown>")
# Emit Creole markup from the tree, escaping unknown tags
emitter = CreoleEmitter(document_tree, debug=False, unknown_emit=escape_unknown_nodes)
print(emitter.emit())
result:
&lt;unknown&gt;**foo**&lt;/unknown&gt;
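To illustrate the idea behind a pluggable unknown_emit callable, here is a self-contained toy sketch in Python 3 that uses only the standard library. It is not python-creole's implementation: ToyConverter, toy_html2creole and the four handler functions are hypothetical names, the handler signature (tag_text, name) is an assumption made for this sketch, and only a handful of inline tags are treated as known.

```python
from html import escape
from html.parser import HTMLParser


# Toy stand-ins for the four modes. Each handler receives the raw tag text
# (e.g. "<unknown>") plus the tag name and returns the text to emit instead.
def raise_on_unknown(tag_text, name):
    raise NotImplementedError("Node from type %r is not implemented!" % name)


def mask_with_html_macro(tag_text, name):
    return "<<html>>" + tag_text + "<</html>>"


def escape_unknown(tag_text, name):
    return escape(tag_text)


def drop_unknown(tag_text, name):
    return ""  # the tag vanishes; its content is still emitted as data


class ToyConverter(HTMLParser):
    # Known tags and the Creole markup that opens and closes them.
    KNOWN = {"strong": "**", "b": "**", "em": "//", "i": "//"}

    def __init__(self, unknown_emit=drop_unknown):
        super().__init__(convert_charrefs=True)
        self.unknown_emit = unknown_emit
        self.parts = []

    def _emit_tag(self, name, tag_text):
        if name in self.KNOWN:
            self.parts.append(self.KNOWN[name])
        else:
            # Delegate the decision to the pluggable handler.
            self.parts.append(self.unknown_emit(tag_text, name))

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, self.get_starttag_text())

    def handle_endtag(self, tag):
        self._emit_tag(tag, "</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)


def toy_html2creole(html, unknown_emit=drop_unknown):
    parser = ToyConverter(unknown_emit)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = "<unknown><strong>foo</strong></unknown>"
    print(toy_html2creole(sample))                        # **foo**
    print(toy_html2creole(sample, mask_with_html_macro))  # <<html>><unknown><</html>>**foo**<<html>></unknown><</html>>
```

Because the converter only calls the handler and appends whatever it returns, a new policy for unknown tags can be added by writing one more function, without touching the converter itself.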