I believe that BiDi control characters (that includes 200E and 200F, along with 202A-202E and 2066-2069) are handled entirely in the text shaping and layout engine, and do not need cmap entries in the font.
You don't normally have explicit glyphs in the font for control characters, since they are handled higher up the text-processing stack and won't appear in runs to be shaped.
It would be useful to add a check for control characters in this range and flag at WARN level. It does not violate spec, but appears to be unnecessary and (likely very slightly) increases file size.
davelab6 changed the title from "Confirm that there are no bidi control character cmap entries in fonts" to "Confirm that there are blank control character cmap entries in fonts" on Mar 18, 2021
This is the list of code points based on the current state of the conversation:
0x00AD // SOFT HYPHEN
0x034F // COMBINING GRAPHEME JOINER
0x061C // ARABIC LETTER MARK
(0x200C <= c && c <= 0x200F) // ZERO WIDTH NON-JOINER..RIGHT-TO-LEFT MARK
(0x202A <= c && c <= 0x202E) // LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE
(0x2066 <= c && c <= 0x2069) // LEFT-TO-RIGHT ISOLATE..POP DIRECTIONAL ISOLATE
0xFEFF // BYTE ORDER MARK
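For anyone who wants to try this in Python, the list above can be sketched as a small predicate. This is only an illustration of the code points listed so far; the function name `is_listed_control_char` is made up here and is not part of any existing tool.

```python
def is_listed_control_char(c: int) -> bool:
    """Return True if code point c is in the list from this thread."""
    return (
        # SOFT HYPHEN, COMBINING GRAPHEME JOINER, ARABIC LETTER MARK, U+FEFF
        c in (0x00AD, 0x034F, 0x061C, 0xFEFF)
        # ZERO WIDTH NON-JOINER..RIGHT-TO-LEFT MARK
        or 0x200C <= c <= 0x200F
        # LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE
        or 0x202A <= c <= 0x202E
        # LEFT-TO-RIGHT ISOLATE..POP DIRECTIONAL ISOLATE
        or 0x2066 <= c <= 0x2069
    )
```

If the list changes as the conversation continues, only this one function would need updating.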
So, it seems to me we ought to map which scripts should include which (empty) control characters, then detect the script of the font and, per script, check whether those characters exist in cmap with no ink data.
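A rough sketch of what such a check might look like, assuming we already have the font's cmap as a dict of code point to glyph name and some way to ask whether a glyph has ink. Every name here (`CONTROL_CODEPOINTS`, `find_blank_control_entries`, `has_ink`, `expected`) is hypothetical, and the per-script `expected` set stands in for the script mapping, which does not exist yet.

```python
from typing import Callable, Dict, Iterable, List, Set

# Code points listed in this thread, with ranges expanded.
CONTROL_CODEPOINTS: Set[int] = (
    {0x00AD, 0x034F, 0x061C, 0xFEFF}
    | set(range(0x200C, 0x2010))   # U+200C..U+200F
    | set(range(0x202A, 0x202F))   # U+202A..U+202E
    | set(range(0x2066, 0x206A))   # U+2066..U+2069
)


def find_blank_control_entries(
    cmap: Dict[int, str],
    has_ink: Callable[[str], bool],
    expected: Iterable[int] = (),
) -> List[int]:
    """Return control code points mapped in cmap to glyphs with no ink.

    Code points in `expected` (for example ZWJ/ZWNJ for a script that
    needs them) are skipped. The result is sorted by code point.
    """
    allowed = set(expected)
    flagged = []
    for cp in sorted(CONTROL_CODEPOINTS & set(cmap)):
        if cp in allowed:
            continue  # this script is assumed to want the entry
        if not has_ink(cmap[cp]):
            flagged.append(cp)  # candidate for a WARN
    return flagged
```

With fontTools, the cmap could come from `TTFont(path).getBestCmap()`, and for TrueType outlines an ink test could probably be built on the contour counts in the `glyf` table. Both are left out of the sketch so that it runs without a font file.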
Behdad said he isn't aware of such a mapping. A harfbuzz community member contributed some interesting information to the hb wiki last month, that might help create such a mapping: harfbuzz/harfbuzz#2862
He also offered this tip:
ZWJ/ZWNJ is useful in all Arabic-joining-like and all Brahmi-like scripts. That's everything going to Arabic, Indic, Khmer, Myanmar, and USE shapers in HarfBuzz. The mapping is in:
We recognized that some Noto fonts (e.g., Noto Sans Hebrew) include bidi control characters.
According to @raphlinus:
and @simoncozens:
Related https://github.com/googlefonts/noto-fonts/issues/2036