Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary size benchmarking: Rust script to measure size of the ICU4X ex… #871

Merged
merged 2 commits into from
Jul 29, 2021

Conversation

gnrunge
Copy link
Contributor

@gnrunge gnrunge commented Jul 21, 2021

…amples

compiled into wasm binaries. Set up GHA to build wasm binaries, measure
file sizes, push results into benchmark dashboard .

Resolves ticket #140.

@gnrunge gnrunge requested a review from a team as a code owner July 21, 2021 16:35
@codecov-commenter
Copy link

codecov-commenter commented Jul 21, 2021

Codecov Report

Merging #871 (ab01351) into main (e5f7a5e) will decrease coverage by 0.10%.
The diff coverage is 3.84%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #871      +/-   ##
==========================================
- Coverage   74.30%   74.20%   -0.11%     
==========================================
  Files         206      207       +1     
  Lines       13038    12371     -667     
==========================================
- Hits         9688     9180     -508     
+ Misses       3350     3191     -159     
Impacted Files Coverage Δ
tools/benchmark/binsize/src/main.rs 3.84% <3.84%> (ø)
components/datetime/src/provider/gregory.rs 63.01% <0.00%> (-12.33%) ⬇️
provider/core/src/erased.rs 77.41% <0.00%> (-5.47%) ⬇️
provider/core/src/dynutil.rs 37.14% <0.00%> (-2.86%) ⬇️
utils/yoke/src/zero_copy_from.rs 75.00% <0.00%> (-2.78%) ⬇️
provider/blob/src/export/blob_exporter.rs 94.44% <0.00%> (-2.78%) ⬇️
components/uniset/src/uniset.rs 98.93% <0.00%> (-1.07%) ⬇️
provider/core/src/serde.rs 43.39% <0.00%> (-1.05%) ⬇️
components/datetime/src/skeleton.rs 81.96% <0.00%> (-0.06%) ⬇️
ffi/capi/src/locale.rs 0.00% <0.00%> (ø)
... and 65 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e5f7a5e...ab01351. Read the comment docs.

@coveralls
Copy link

coveralls commented Jul 21, 2021

Pull Request Test Coverage Report for Build d8c22f5e10e5328bf95eb47f89b472e40a6fe45f-PR-871

  • 1 of 26 (3.85%) changed or added relevant lines in 1 file are covered.
  • 129 unchanged lines in 14 files lost coverage.
  • Overall coverage decreased (-0.2%) to 74.274%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tools/benchmark/binsize/src/main.rs 1 26 3.85%
Files with Coverage Reduction New Missed Lines %
experimental/provider_ppucd/src/parse_ppucd.rs 1 93.29%
provider/core/src/struct_provider.rs 1 85.71%
provider/fs/src/fs_data_provider.rs 1 89.66%
components/decimal/src/grouper.rs 2 93.1%
utils/yoke/src/yoke.rs 5 91.8%
components/datetime/src/provider/time_zones.rs 7 47.5%
provider/cldr/src/transform/time_zones/mod.rs 8 75.93%
components/datetime/src/provider/gregory.rs 9 63.01%
provider/core/src/data_provider/test.rs 9 91.95%
provider/core/src/dynutil.rs 9 37.14%
Totals Coverage Status
Change from base Build f5bc1660a2efa1c88f6ab3ee7a6e68af92498ffd: -0.2%
Covered Lines: 9311
Relevant Lines: 12536

💛 - Coveralls

@sffc sffc linked an issue Jul 21, 2021 that may be closed by this pull request
@sffc sffc requested review from echeran and gregtatum and removed request for a team July 21, 2021 19:19
.github/workflows/build-test.yml Outdated Show resolved Hide resolved
.github/workflows/build-test.yml Outdated Show resolved Hide resolved
.github/workflows/build-test.yml Outdated Show resolved Hide resolved
tools/benchmark/binsize/Cargo.toml Outdated Show resolved Hide resolved
tools/benchmark/binsize/Cargo.toml Outdated Show resolved Hide resolved
Copy link
Member

@gregtatum gregtatum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to Elango's review of the CI side of things, but I looked mainly at the bincode measurements and binsize tool. I'm marking as "Request changes" as there are a bunch of questions, and suggestions I left that would be good to have a review cycle on.

tools/benchmark/binsize/src/main.rs Show resolved Hide resolved
}

fn main() {
let _rc = wasm_filesize("wasmpkg");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Is this path correct? I see a wasm-opt directory, and it seems like we should track optimized code as the binsize metric.

~/dev/icu4x/wasmpkg on icu4x/main at 16:00:20
➤ tree
.
├── borrowed_pattern.d
├── borrowed_pattern.wasm
├── code_line_diff.d
├── code_line_diff.wasm
├── derive.d
├── derive.wasm
├── elevator_floors.d
├── elevator_floors.wasm
├── filter_langids.d
├── filter_langids.wasm
├── language_names_hash_map.d
├── language_names_hash_map.wasm
├── language_names_lite_map.d
├── language_names_lite_map.wasm
├── owned_pattern.d
├── owned_pattern.wasm
├── permyriad.d
├── permyriad.wasm
├── simple.d
├── simple.wasm
├── syntatically_canonicalize_locales.d
├── syntatically_canonicalize_locales.wasm
├── tui.d
├── tui.wasm
├── twiggy
├── unicode_bmp_blocks_selector.d
├── unicode_bmp_blocks_selector.wasm
├── unread_emails.d
├── unread_emails.wasm
├── wasm-decompile
│   ├── borrowed_pattern.dcmp
│   ├── code_line_diff.dcmp
│   ├── derive.dcmp
│   ├── elevator_floors.dcmp
│   ├── filter_langids.dcmp
│   ├── language_names_hash_map.dcmp
│   ├── language_names_lite_map.dcmp
│   ├── owned_pattern.dcmp
│   ├── permyriad.dcmp
│   ├── simple.dcmp
│   ├── syntatically_canonicalize_locales.dcmp
│   ├── tui.dcmp
│   ├── unicode_bmp_blocks_selector.dcmp
│   ├── unread_emails.dcmp
│   ├── work_log.dcmp
│   └── writeable_message.dcmp
├── wasm-opt
│   ├── borrowed_pattern+opt.wasm
│   ├── code_line_diff+opt.wasm
│   ├── derive+opt.wasm
│   ├── elevator_floors+opt.wasm
│   ├── filter_langids+opt.wasm
│   ├── language_names_hash_map+opt.wasm
│   ├── language_names_lite_map+opt.wasm
│   ├── owned_pattern+opt.wasm
│   ├── permyriad+opt.wasm
│   ├── simple+opt.wasm
│   ├── syntatically_canonicalize_locales+opt.wasm
│   ├── tui+opt.wasm
│   ├── unicode_bmp_blocks_selector+opt.wasm
│   ├── unread_emails+opt.wasm
│   ├── work_log+opt.wasm
│   └── writeable_message+opt.wasm
├── wat
│   ├── borrowed_pattern.wat
│   ├── code_line_diff.wat
│   ├── derive.wat
│   ├── elevator_floors.wat
│   ├── filter_langids.wat
│   ├── language_names_hash_map.wat
│   ├── language_names_lite_map.wat
│   ├── owned_pattern.wat
│   ├── permyriad.wat
│   ├── simple.wat
│   ├── syntatically_canonicalize_locales.wat
│   ├── tui.wat
│   ├── unicode_bmp_blocks_selector.wat
│   ├── unread_emails.wat
│   ├── work_log.wat
│   └── writeable_message.wat
├── work_log.d
├── work_log.wasm
├── writeable_message.d
└── writeable_message.wasm

The binsize difference is pretty substantial.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the path to wasmpkg/wasm-opt. The optimized binaries may not contain the additional information required for analyzing and debugging size regression, though.

}

fn main() {
let _rc = wasm_filesize("wasmpkg");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Does it make sense tracking the gzipped size of the wasm as well? If download size is a concern, the compression rate is important.

e.g.

[ 18K]  ./borrowed_pattern.wasm
[ 13K]  ./wasm-opt/borrowed_pattern+opt.wasm
[6.2K]  ./wasm-opt/borrowed_pattern+opt.wasm.gz

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I don't have an answer, the underlying ticket doesn't mention it.
Will put this up for 'Discussion' in the next meeting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ticket#912: #912

@@ -0,0 +1,7 @@
# icu_benchmark_binsize [![crates.io](http://meritbadge.herokuapp.com/icu_benchmark_binsize)](https://crates.io/crates/icu_benchmark_binsize)


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Please document this feature here, and in ` tools/benchmark/README.md

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented in tools/benchmark/README.md and linked to it in tools/benchmark/binsize/README.md.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, a subsequent 'cargo make generate-readmes' threw out the change I made to benchmark/binsize/README.md.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (follow-up, optional): Ah, I believe the generate-readmes tool reads the mod.rs to generate this file. I'm not sure about this since it's a main.rs. It would probably be worth figuring out, but shouldn't block landing. It's probably fine for now that tools/benchmark/README.md has the documentation, plus this was an existing issue.

tools/benchmark/binsize/src/main.rs Show resolved Hide resolved
authors = ["The ICU4X Project Developers"]
edition = "2018"

include = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: I don't think these include files are required, as there's no intention of publishing this separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, removed.

tools/benchmark/binsize/Cargo.toml Outdated Show resolved Hide resolved
}

fn main() {
let _rc = wasm_filesize("wasmpkg");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion (optional): It might be nice to hook this up to CI arguments rather than hard coding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@sffc sffc added this to the ICU4X 0.3 milestone Jul 22, 2021
@Manishearth Manishearth reopened this Jul 22, 2021
@Manishearth
Copy link
Member

kicking CI after a branch protection change

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • .github/workflows/build-test.yml is different
  • Cargo.lock is different
  • tools/benchmark/binsize/Cargo.toml is different
  • tools/benchmark/binsize/README.md is different
  • tools/benchmark/binsize/src/main.rs is different
  • tools/benchmark/README.md is now changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Copy link
Contributor Author

@gnrunge gnrunge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review feedback factored in.

…amples

compiled into wasm binaries. Set up GHA to build wasm binaries, measure
file sizes, push results into benchmark dashboard .

Resolves ticket unicode-org#140.

Factor in review feedback.

Fix tidy clippy findings.
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • tools/benchmark/binsize/README.md is different
  • tools/benchmark/binsize/src/main.rs is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@gnrunge gnrunge requested review from gregtatum and echeran July 26, 2021 20:39
echeran
echeran previously approved these changes Jul 26, 2021
Copy link
Contributor

@echeran echeran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github Actions side of things LGTM.

If the struct + serde for the Rust code is post-0.3 followup, then create an issue to track that.

@gnrunge
Copy link
Contributor Author

gnrunge commented Jul 26, 2021

Github Actions side of things LGTM.

If the struct + serde for the Rust code is post-0.3 followup, then create an issue to track that.

Ticket #903

gregtatum
gregtatum previously approved these changes Jul 27, 2021
Copy link
Member

@gregtatum gregtatum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to approve this with the changes.

thought: One thing to keep in mind that it will be extra work to later change this to track compressed binary sizes, as the data will already be collected in CI. I think we want this sooner than later, so it's probably not worth blocking on the discussion you filed.

tools/benchmark/binsize/src/main.rs Outdated Show resolved Hide resolved
@@ -0,0 +1,7 @@
# icu_benchmark_binsize [![crates.io](http://meritbadge.herokuapp.com/icu_benchmark_binsize)](https://crates.io/crates/icu_benchmark_binsize)


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (follow-up, optional): Ah, I believe the generate-readmes tool reads the mod.rs to generate this file. I'm not sure about this since it's a main.rs. It would probably be worth figuring out, but shouldn't block landing. It's probably fine for now that tools/benchmark/README.md has the documentation, plus this was an existing issue.

cargo run --package icu_benchmark_binsize -- wasmpkg/wasm-opt
```

Benchmark output is written in ndjson format, e.g.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Thanks for the documentation, this will help us support it better in the future.

- name: Build wasm executables and measure size
run: |
cargo make wasm-examples
cargo run --package icu_benchmark_binsize -- wasmpkg/wasm-opt| tee benchmarks/binsize/output.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: 👍 thanks this make this more obvious as to what's going on.

@gnrunge gnrunge dismissed stale reviews from gregtatum and echeran via ab01351 July 27, 2021 23:00
@gnrunge gnrunge requested review from gregtatum and echeran July 27, 2021 23:00
@gnrunge
Copy link
Contributor Author

gnrunge commented Jul 27, 2021 via email

@sffc sffc merged commit 8d99c16 into unicode-org:main Jul 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add binary size testing
7 participants