Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Monkeypatching html.parser fails when importing with zipimporter #1132

Open
remyroy opened this issue May 12, 2021 · 6 comments
Open

Monkeypatching html.parser fails when importing with zipimporter #1132

remyroy opened this issue May 12, 2021 · 6 comments
Labels
bug Bug report. core Related to the core parser code. someday-maybe Approved low priority request.

Comments

@remyroy
Copy link

remyroy commented May 12, 2021

When trying to monkeypatch a copy of html.parser, markdown fails when html.parser comes from a zip file and an instance of zipimporter is used as the loader.

To reproduce

  1. Download an embeddable package version of Python like the latest Windows embeddable package (64-bit) (Python 3.9.5)
  2. Unzip it
  3. Copy the markdown package into the python folder
  4. Run python
  5. Import markdown
  6. Notice the following traceback:
$ python
Python 3.9.5 (tags/v3.9.5:0a7dcbd, May  3 2021, 17:27:52) [MSC v.1928 64 bit (AMD64)] on win32
>>> import markdown
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Rémy Roy\Projects\CDDA-Game-Launcher\build\test\markdown\__init__.py", line 29, in <module>
    from .core import Markdown, markdown, markdownFromFile  # noqa: E402
  File "C:\Users\Rémy Roy\Projects\CDDA-Game-Launcher\build\test\markdown\core.py", line 27, in <module>
    from .preprocessors import build_preprocessors
  File "C:\Users\Rémy Roy\Projects\CDDA-Game-Launcher\build\test\markdown\preprocessors.py", line 29, in <module>
    from .htmlparser import HTMLExtractor
  File "C:\Users\Rémy Roy\Projects\CDDA-Game-Launcher\build\test\markdown\htmlparser.py", line 31, in <module>
    spec.loader.exec_module(htmlparser)
AttributeError: 'zipimporter' object has no attribute 'exec_module'

The main issue is that the zipimporter loader does not have an exec_module method.

@remyroy
Copy link
Author

remyroy commented May 12, 2021

This is related to the work of @waylan in #803

@waylan
Copy link
Member

waylan commented May 13, 2021

This puts us in a weird jam. We need to monkeypatch the html parser. However, we should never be changing the behavior of the default parser. For example, a user of Python-Markdown could import and run markdown and in the same script import and use an instance of the html parser from the standard lib. In that case, the user should not get our modified behavior. To avoid that, we specifically import the lib in a way that Python sees it as a separate package than the normally imported package. Is there a way to accomplish that which also works with zipimporter? If not, then we can't support zipimporter.

@waylan waylan added the needs-decision A decision needs to be made regarding request. label May 13, 2021
@remyroy
Copy link
Author

remyroy commented May 13, 2021

I'm not familiar enough with the importing internals. My guess is that there should be way to support that with the zipimporter since it's just the same source code in a zip file.

@waylan waylan added the core Related to the core parser code. label Nov 3, 2021
@waylan waylan added bug Bug report. more-info-needed More information needs to be provided. labels Aug 22, 2022
@waylan
Copy link
Member

waylan commented Aug 22, 2022

I just did some digging on this issue and it appears that this has been fixed upstream (in Python's zipimport module) in https://bugs.python.org/issue42131 (see the changes in d2e94bb0). It would appear that the changes are available in Python 3.10.

We could close this as an upstream issue. However, that does not address the issue for Python < 3.10. If someone was to provide a patch for that it would be considered. Therefore, I will leave this open for the time being.

@waylan waylan removed the more-info-needed More information needs to be provided. label Aug 22, 2022
@Alexey-T
Copy link

Alexey-T commented Aug 22, 2022

@waylan

Unfortunately, I am not aware of any workarounds.

i suggest this workaround.
you get patched version of zipimport and put it near the MD module. you rename it a little - eg to zipimport_fixed. you use renamed module instead. use relative import please, from . import zipimport_fixed

@veksha FYI

@waylan
Copy link
Member

waylan commented Aug 22, 2022

@Alexey-T we have certainly done that in the past. However, we don't call the zipimport lib at all. However, packages which use embeddable Python do. Therefore, the package which explicitly imports zipimport would need to take the steps you recommend.

By way of explanation, we use the new standard API defined in PEP 451 as found in the importlib module of the Python Standard Library. However, embeddable Python replaces calls to importlib with zipimport, which up until Python 3.10 still only defined the old API from PEP 302. Therefore, we don't have any control over which lib from the standard library is called.

I'm sure there is still some workaround that would address the issue, but it is not immediately clear what that would be and it would likely require extensive testing on a system we don't use (and are not familiar with). Therefore, it is not a high priority issue for us. However, if someone who is familiar with and uses a relevant system were to work up and submit a fix with tests, then we would be willing to review it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug report. core Related to the core parser code. someday-maybe Approved low priority request.
Projects
None yet
Development

No branches or pull requests

3 participants