-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split numbering systems out of decimal data #5822
Conversation
Digits becomes much larger in "all" mode. I've added code for that but not hooked it in yet.
|
455e5e9
to
6c46fc5
Compare
6c46fc5
to
73fe997
Compare
I don't really understand the question? CompactDecimalFormatter depends on FixedDecimalFormatter and so it should already be using the new data markers. |
Ah, we lose out on some of our wins when we handle the fact that the symbols data can differ for a given locale between numbering systems.
|
73fe997
to
1c0321e
Compare
ae2f69b
to
6e99b6d
Compare
4e2f9b6
to
e8eaea9
Compare
2df57b9
to
a047e62
Compare
Depends on #5822 Minor win: ``` Before: decimal/symbols@2, <total>, 2740B, 1368B, 49 unique payloads After: decimal/symbols@2, <total>, 2657B, 1285B, 49 unique payloads ``` And an 8 byte win on FDF stack size. <!-- Thank you for your pull request to ICU4X! Reminder: try to use [Conventional Comments](https://conventionalcomments.org/) to make comments clearer. Please see https://github.com/unicode-org/icu4x/blob/main/CONTRIBUTING.md for general information on contributing to ICU4X. -->
Fixes #5818
Before:
After:
Saving ~1.5kB, a good third of the data size. A lot of the wins are just in deduplication.
I'm also going to try moving the tinystr into the VarZeroVec and seeing what happens.
I may also try and store the digits more compactly as an
enum { Sequential(char), Many(ZeroVec<char>) }
. A downside of this is that the Sequential case would need UTF8 validation every time, though we could make it so that that's just the wire format and we expand to a digit array on data load.Todo: add configurability for this.
@sffc In the long run CompactDecimal / etc should also be using this data. In that case, should we just always generate all known decimal systems? How would the unification work across keys?