Allow using alternative markdown renderers #455
Shelling out to pandoc twice for each markdown cell can be very slow. Using a single-process, in-memory renderer for all markdown can be noticeably faster. For instance, builds of https://github.com/stellargraph/stellargraph's docs go from ~2 minutes to 40 seconds.
Hello @huonw! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:
I would point out: https://github.com/executablebooks/markdown-it-py, which has tables, among other extensions, and is already used in https://myst-parser.readthedocs.io, https://myst-nb.readthedocs.io/en/latest/ and https://jupyterbook.org 😁
Yes, definitely! Instead of converting from Markdown to reST and then from reST to a
Ideally this should be a generic extension point, so we wouldn't have to support a limited set of parsers.
Yes, I think it should have at least that level of generality. I'm not sure a class is the best API, though. I think a function should suffice. I think that's what
This means that, instead of Markdown "renderers" which generate reST code, we would need Markdown "parsers" that can append to a doctree. A very important feature of those parsers would be support for piecemeal parsing that doesn't destroy the section hierarchy. For example, one Markdown cell might "open" a section of a certain level. I don't know whether the existing Markdown/CommonMark parsers with "doctree" output support this feature. @chrisjsewell Can MyST-Parser parse partial Markdown documents? I think the most promising option for a parser would be something between markdown-it-py and MyST-Parser. Something that supports parsing into
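To make the "piecemeal parsing" requirement concrete, here is a minimal sketch of the bookkeeping involved. It is a toy, not a proposal for the real implementation: `SectionTracker` and its methods are hypothetical names, it only recognises ATX (`#`) headings, and it returns section paths instead of building docutils nodes. The point is only that the stack of open sections has to outlive a single cell.

```python
import re
from typing import List, Tuple

# ATX heading: 1-6 '#' characters, whitespace, title, optional closing '#'s.
_HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")


class SectionTracker:
    """Hypothetical helper: keeps the open-section stack alive between cells."""

    def __init__(self) -> None:
        # (level, title) for every section that is currently open.
        self._open: List[Tuple[int, str]] = []

    @property
    def current_path(self) -> Tuple[str, ...]:
        """Titles of the open sections, outermost first."""
        return tuple(title for _, title in self._open)

    def feed(self, cell_source: str) -> List[Tuple[str, ...]]:
        """Process one Markdown cell; return the path of each heading in it."""
        paths = []
        in_fence = False
        for line in cell_source.splitlines():
            if line.lstrip().startswith("```"):
                in_fence = not in_fence  # ignore '#' lines inside code fences
                continue
            match = None if in_fence else _HEADING.match(line)
            if match is None:
                continue
            level, title = len(match.group(1)), match.group(2)
            # A new heading closes every open section at the same or deeper level.
            while self._open and self._open[-1][0] >= level:
                self._open.pop()
            self._open.append((level, title))
            paths.append(self.current_path)
        return paths
```

Feeding `"# Intro"` in one cell and `"## Setup"` in a later cell nests the second heading under the first, because the tracker remembers that "Intro" is still open. A parser that treats each cell as a standalone document would lose that nesting.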
That issue has been open for a while; has there been more progress other than the comments on it? If not, is it worth pursuing this PR as an intermediate step? One concern that I guess you might have is migration: at the moment, switching to docutils directly would presumably be unobservable to a user (except for differences from Pandoc's rendering), whereas with this change (with the arbitrary-function version) it would be observable to users who have a custom
If this is generalised, would you want built-in support for any parsers other than Pandoc? At least with the current MD -> reST approach, the function required for other libraries is tiny (e.g. the core of the CommonMark one is two lines, and mistune would be similar).
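As a rough illustration of the "arbitrary function" version, the setting could accept either a registered name or any callable that maps Markdown text to reST text. Everything named here (`register_renderer`, `resolve_renderer`, the `"passthrough"` renderer) is hypothetical and not part of nbsphinx; the toy renderer just stands in for a real converter so the sketch stays dependency-free.

```python
from typing import Callable, Dict, Union

# A "renderer" is any function taking Markdown text and returning reST text.
MarkdownToRst = Callable[[str], str]

# Hypothetical registry of built-in renderers, looked up by name.
_BUILTIN_RENDERERS: Dict[str, MarkdownToRst] = {}


def register_renderer(name: str) -> Callable[[MarkdownToRst], MarkdownToRst]:
    """Decorator that registers a renderer function under a short name."""
    def decorator(func: MarkdownToRst) -> MarkdownToRst:
        _BUILTIN_RENDERERS[name] = func
        return func
    return decorator


@register_renderer("passthrough")
def passthrough(text: str) -> str:
    """Toy stand-in for a real converter: returns the cell source unchanged."""
    return text


def resolve_renderer(setting: Union[str, MarkdownToRst]) -> MarkdownToRst:
    """Accept either a registered name or a user-supplied function."""
    if callable(setting):
        return setting
    try:
        return _BUILTIN_RENDERERS[setting]
    except KeyError:
        raise ValueError(f"unknown Markdown renderer: {setting!r}") from None
```

With this shape, `resolve_renderer("passthrough")` and `resolve_renderer(my_function)` both yield a plain function, so built-in and user-provided renderers go through the same code path.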
@chrisjsewell I think this doesn't work with the current setup of nbsphinx, for the same reason as Python-Markdown (and mistune < 2): no reStructuredText rendering?
Not that I know of, except the things already discussed in this PR. We would need an appropriate Markdown parser before #36 can be tackled. The MyST/markdown-it-py project looks promising, but I think it doesn't have all the needed features (yet?).
No, not when it exposes an implementation detail like the intermediate reST representation. But if it just "invisibly" switches the Markdown parser, I'm all for it!
Exactly.
I guess there should be some parser selected by default. I don't care which one, as long as it has the right features.
You would still have to take care of the Markdown extensions used in Jupyter. Do you already have a feature-complete parser to replace Pandoc? I have no problem with switching from Pandoc to another parser, as long as we don't expose the intermediate reST representation.
Shelling out to pandoc twice for each markdown cell can be slow. Using a single-process, in-memory renderer for all markdown can be noticeably faster.
This adds support for using commonmark-py via `nbsphinx_markdown_renderer = "commonmark"`.
For instance, builds of https://github.com/stellargraph/stellargraph's docs go from ~2 minutes to 40 seconds (stellargraph/stellargraph#1517).
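With this PR applied, opting in would presumably be a one-line change in a project's Sphinx `conf.py`. The option name and value are the ones given above. The option exists only in this PR's branch, so treat this as a sketch, not as documentation of a released nbsphinx:

```python
# conf.py (sketch; `nbsphinx_markdown_renderer` is the option proposed in this PR)
extensions = ["nbsphinx"]

# Render Markdown cells in-process with commonmark-py instead of shelling out to pandoc.
nbsphinx_markdown_renderer = "commonmark"
```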
This is a draft PR because there are several open questions that would be good to collaborate on or clarify before I spend more effort on this:
- `nbconvert` [...] depends on version 0.8.4
- [...] `nbsphinx_markdown_renderer` to be set to any subclass of `nbsphinx.MarkdownRenderer` (in addition to a `str` name) to let users customise

(NB. the diff here is much smaller ignoring whitespace, to skip over the changed indentation of `markdown2rst`: https://github.com/spatialaudio/nbsphinx/pull/455/files?w=1)