premailer doesn't appear to work on m1 macs #249

davidfwatson · 2021-02-23T01:39:46Z

Running premailer on pretty standard HTML content works correctly on my 2019 iMac, but on my M1 Mac it results in garbage output:

❯ python -m premailer -f example_html.html

h t m l l a n g = " e n " >

%

gdvalderrama · 2021-06-17T07:58:20Z

@davidfwatson could you add the content of example_html.html or some other file that reproduces the problem?

davidfwatson · 2021-06-19T17:57:50Z

I think the issue is unicode characters! I've created an example, and attached it.
example_html.html.zip

peterbe · 2021-06-21T13:19:03Z

> I think the issue is unicode characters! I've created an example, and attached it.
> example_html.html.zip

import premailer

with open('/Users/peterbe/Downloads/example_html.html/example_html.html') as f:
    html = f.read()

out = premailer.transform(
    html,
)
print(out)

outputs:

<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html charset=UTF-8">
        <title>Example title</title>
    </head>
    <body>
🌐
    </body>
</html>

By the way, it's supposed to be:

-<meta http-equiv="Content-Type" content="text/html charset=UTF-8" />
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

in case that matters to your other tooling.

davidfwatson · 2021-06-23T19:29:05Z

So, on my M1 Mac, I don't get that output. I get this:

❯ python3 -m premailer -f example_html.html
<html><head></head><body><p>h   t   m   l       l   a   n   g   =   "   e   n   "   &gt;
                   </p></body></html>%

peterbe · 2021-06-23T20:47:06Z

Here's what I get:

▶ python3 -m premailer -f /Users/peterbe/Downloads/example_html.html/example_html.html ; echo
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html charset=UTF-8">
        <title>Example title</title>
    </head>
    <body>
🌐
    </body>
</html>

I have...:

▶ python3 --version
Python 3.8.1
▶ pip list | rg lxml
lxml              4.5.0

What do you have?

And do you have the file program to check the encoding of the file? This is what I get:

▶ file /Users/peterbe/Downloads/example_html.html/example_html.html
/Users/peterbe/Downloads/example_html.html/example_html.html: HTML document text, UTF-8 Unicode text

davidfwatson · 2021-06-23T21:01:22Z

❯ python3 --version
Python 3.9.2
❯ pip list | grep lxml
lxml                     4.6.2
❯ file example_html.html
example_html.html: HTML document text, UTF-8 Unicode text
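The version details exchanged above can be gathered in one go, along with the CPU architecture, which is the variable this thread is really about. This is only a sketch: `environment_report` is a made-up helper name, and it assumes Python 3.8+ for `importlib.metadata`. Note that `platform.machine()` may report `x86_64` on an M1 Mac if the interpreter itself runs under Rosetta.

```python
import platform
from importlib import metadata


def environment_report():
    """Collect the details compared in this thread, plus the CPU architecture."""
    report = {
        "python": platform.python_version(),
        "system": platform.system(),
        # Typically "arm64" on Apple silicon and "x86_64" on Intel.
        "machine": platform.machine(),
    }
    for package in ("lxml", "premailer"):
        try:
            report[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
    try:
        from lxml import etree
    except ImportError:
        pass
    else:
        # Version of the libxml2 C library that lxml is running against.
        report["libxml2"] = ".".join(str(n) for n in etree.LIBXML_VERSION)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Posting this output from both a working and a failing machine makes it easier to see whether anything other than the architecture differs.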

peterbe · 2021-06-24T12:43:13Z

@davidfwatson Just for sanity checking, what do you get when you run:

cat example_html.html

davidfwatson · 2021-06-25T04:57:54Z

❯ cat example_html.html
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <title>Example title</title>
    </head>
    <body>
🌐
    </body>
</html>%

peterbe · 2021-06-25T18:50:09Z

I'm at a loss then. I don't know what could be going on.

I have seen really strange behaviors coming out of lxml before when emojis are involved.
It would be nice to understand at what point that HTML string, after being read in, becomes garbled, and whether that happens in premailer or somewhere else.
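One way to narrow this down is to check each stage separately: first that the file decodes cleanly as UTF-8, then what lxml alone produces from the decoded string, with premailer out of the picture. The sketch below is an illustration, not something from premailer itself; `describe_file` and `lxml_round_trip` are hypothetical helper names, and the lxml call is meant to mirror the `etree.fromstring(..., HTMLParser())` call that shows up in tracebacks from premailer.

```python
from pathlib import Path


def describe_file(path):
    """Stage 1: confirm the bytes on disk decode cleanly as UTF-8."""
    raw = Path(path).read_bytes()
    text = raw.decode("utf-8")  # raises UnicodeDecodeError if the file is not UTF-8
    return {
        "bytes": len(raw),
        "chars": len(text),
        "non_ascii": sorted({ch for ch in text if ord(ch) > 127}),
    }


def lxml_round_trip(text):
    """Stage 2: parse and re-serialize with lxml only, without premailer."""
    try:
        from lxml import etree
    except ImportError:
        return "lxml is not installed"
    root = etree.fromstring(text, etree.HTMLParser())
    if root is None:
        return "parser returned None"
    return etree.tostring(root, method="html", encoding="unicode")


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        path = sys.argv[1]
        print(describe_file(path))
        print(lxml_round_trip(Path(path).read_text(encoding="utf-8")))
```

If stage 1 looks right but stage 2 already returns garbled markup or `None`, the problem sits in lxml (or the libxml2 it links against) rather than in premailer.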

davidfwatson · 2021-06-25T20:35:07Z

So you’re getting good results on an m1 Mac?


peterbe · 2021-06-28T13:36:53Z

I don't have an M1 Mac :( It would be interesting to install that exact version of lxml on mine to see if it matters. Or you could try upgrading or downgrading lxml to see if the problem goes away.

davidfwatson · 2021-06-28T16:43:11Z

I can give that a shot, but just to be clear, I ran it on my intel mac with identical versions and it did work. I'll try to find time to match your versions and rerun today, but I suspect the result will be the same.

securibee · 2021-07-05T16:22:58Z

I'm running into this identical issue on an M1 Mac as well.

python3 --version
Python 3.8.2
pip list | grep lxml
lxml 4.6.3

peterbe · 2021-07-06T11:33:07Z

Can we try to figure out if premailer is using lxml in a way that can be fixed for m1 macs? Or is it a hard bug in lxml and if so do we have a tracker URL for that?

securibee · 2021-07-17T21:17:47Z

I downgraded lxml and it's working for me with version 4.5.0. Give that a go, @davidfwatson.

davidfwatson · 2021-07-24T06:46:16Z

Sorry to report, but it doesn't appear to have made a difference for me:

❯ pip3 list | grep lxml

lxml       4.5.0

❯ cat example_html.html
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <title>Example title</title>
    </head>
    <body>
🌐
    </body>
</html>%

❯ python3 -m premailer -f example_html.html
<html><head></head><body><p>h   t   m   l       l   a   n   g   =   "   e   n   "   &gt;
                   </p></body></html>%

laurikari · 2022-05-16T18:40:21Z

This does appear to be a bug in lxml, and the most recent version 4.8.0 is still affected.

Here's the lxml bug: https://bugs.launchpad.net/lxml/+bug/1949271

The bug report reveals that a workaround is to use UTF-16 or UTF-32 instead of UTF-8.

Instead of

html = transform(html)

this works for me:

from lxml import etree
from premailer import transform

parsed = etree.fromstring(html.strip().encode('utf-32'), etree.HTMLParser())
html = etree.tostring(transform(parsed), method='html', encoding='utf-8').decode()

It ain't pretty, but at least it works.

medmunds · 2023-05-10T01:13:14Z

I'm able to use html entity encoding as a workaround. (I guess anything that avoids lxml having to deal with utf-8 input...)

>>> from premailer import transform
>>> html = "<p>🌐</p>"

>>> transform(html)
Traceback (most recent call last):
  ...
  File ".../python3.11/site-packages/premailer/premailer.py", line 353, in transform
    tree = etree.fromstring(stripped, parser).getroottree()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'getroottree'

>>> html.encode("ascii", "xmlcharrefreplace").decode("ascii")
'<p>&#127760;</p>'

>>> transform(html.encode("ascii", "xmlcharrefreplace").decode("ascii"))
'<html>\n<head></head>\n<body><p>🌐</p></body>\n</html>\n'

Premailer 3.10.0, lxml 4.9.2, Python 3.11.1
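The entity-encoding workaround above can be wrapped in a small helper so it is applied consistently before every call. This is a minimal sketch, not part of premailer: `escape_non_ascii` and `transform_ascii_safe` are hypothetical names, and only `premailer.transform` is the real entry point. One caveat to keep in mind is that numeric character references are not interpreted inside `<style>` or `<script>` content, so non-ASCII characters in a stylesheet would need different handling.

```python
def escape_non_ascii(html):
    """Replace every non-ASCII character with a numeric character reference."""
    return html.encode("ascii", "xmlcharrefreplace").decode("ascii")


def transform_ascii_safe(html, **kwargs):
    """Hypothetical wrapper: escape first, then hand the result to premailer."""
    # Imported lazily so escape_non_ascii stays usable without premailer installed.
    from premailer import transform

    return transform(escape_non_ascii(html), **kwargs)


if __name__ == "__main__":
    print(escape_non_ascii("<p>🌐</p>"))  # <p>&#127760;</p>
```

Calling `transform_ascii_safe(html)` in place of `transform(html)` keeps the rest of the calling code unchanged, and the wrapper can be dropped once the underlying lxml issue no longer affects your environment.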
