-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Binary size benchmarking: Rust script to measure size of the ICU4X ex… #871
Conversation
Codecov Report
@@ Coverage Diff @@
## main #871 +/- ##
==========================================
- Coverage 74.30% 74.20% -0.11%
==========================================
Files 206 207 +1
Lines 13038 12371 -667
==========================================
- Hits 9688 9180 -508
+ Misses 3350 3191 -159
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll defer to Elango's review of the CI side of things, but I looked mainly at the bincode measurements and binsize
tool. I'm marking as "Request changes" as there are a bunch of questions, and suggestions I left that would be good to have a review cycle on.
tools/benchmark/binsize/src/main.rs
Outdated
} | ||
|
||
fn main() { | ||
let _rc = wasm_filesize("wasmpkg"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Is this path correct? I see a wasm-opt
directory, and it seems like we should track optimized code as the binsize metric.
~/dev/icu4x/wasmpkg on icu4x/main at 16:00:20
➤ tree
.
├── borrowed_pattern.d
├── borrowed_pattern.wasm
├── code_line_diff.d
├── code_line_diff.wasm
├── derive.d
├── derive.wasm
├── elevator_floors.d
├── elevator_floors.wasm
├── filter_langids.d
├── filter_langids.wasm
├── language_names_hash_map.d
├── language_names_hash_map.wasm
├── language_names_lite_map.d
├── language_names_lite_map.wasm
├── owned_pattern.d
├── owned_pattern.wasm
├── permyriad.d
├── permyriad.wasm
├── simple.d
├── simple.wasm
├── syntatically_canonicalize_locales.d
├── syntatically_canonicalize_locales.wasm
├── tui.d
├── tui.wasm
├── twiggy
├── unicode_bmp_blocks_selector.d
├── unicode_bmp_blocks_selector.wasm
├── unread_emails.d
├── unread_emails.wasm
├── wasm-decompile
│ ├── borrowed_pattern.dcmp
│ ├── code_line_diff.dcmp
│ ├── derive.dcmp
│ ├── elevator_floors.dcmp
│ ├── filter_langids.dcmp
│ ├── language_names_hash_map.dcmp
│ ├── language_names_lite_map.dcmp
│ ├── owned_pattern.dcmp
│ ├── permyriad.dcmp
│ ├── simple.dcmp
│ ├── syntatically_canonicalize_locales.dcmp
│ ├── tui.dcmp
│ ├── unicode_bmp_blocks_selector.dcmp
│ ├── unread_emails.dcmp
│ ├── work_log.dcmp
│ └── writeable_message.dcmp
├── wasm-opt
│ ├── borrowed_pattern+opt.wasm
│ ├── code_line_diff+opt.wasm
│ ├── derive+opt.wasm
│ ├── elevator_floors+opt.wasm
│ ├── filter_langids+opt.wasm
│ ├── language_names_hash_map+opt.wasm
│ ├── language_names_lite_map+opt.wasm
│ ├── owned_pattern+opt.wasm
│ ├── permyriad+opt.wasm
│ ├── simple+opt.wasm
│ ├── syntatically_canonicalize_locales+opt.wasm
│ ├── tui+opt.wasm
│ ├── unicode_bmp_blocks_selector+opt.wasm
│ ├── unread_emails+opt.wasm
│ ├── work_log+opt.wasm
│ └── writeable_message+opt.wasm
├── wat
│ ├── borrowed_pattern.wat
│ ├── code_line_diff.wat
│ ├── derive.wat
│ ├── elevator_floors.wat
│ ├── filter_langids.wat
│ ├── language_names_hash_map.wat
│ ├── language_names_lite_map.wat
│ ├── owned_pattern.wat
│ ├── permyriad.wat
│ ├── simple.wat
│ ├── syntatically_canonicalize_locales.wat
│ ├── tui.wat
│ ├── unicode_bmp_blocks_selector.wat
│ ├── unread_emails.wat
│ ├── work_log.wat
│ └── writeable_message.wat
├── work_log.d
├── work_log.wasm
├── writeable_message.d
└── writeable_message.wasm
The binsize difference is pretty substantial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the path to wasmpkg/wasm-opt. The optimized binaries may not contain the additional information required for analyzing and debugging size regression, though.
tools/benchmark/binsize/src/main.rs
Outdated
} | ||
|
||
fn main() { | ||
let _rc = wasm_filesize("wasmpkg"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Does it make sense tracking the gzipped size of the wasm as well? If download size is a concern, the compression rate is important.
e.g.
[ 18K] ./borrowed_pattern.wasm
[ 13K] ./wasm-opt/borrowed_pattern+opt.wasm
[6.2K] ./wasm-opt/borrowed_pattern+opt.wasm.gz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I don't have an answer, the underlying ticket doesn't mention it.
Will put this up for 'Discussion' in the next meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ticket#912: #912
@@ -0,0 +1,7 @@ | |||
# icu_benchmark_binsize [![crates.io](http://meritbadge.herokuapp.com/icu_benchmark_binsize)](https://crates.io/crates/icu_benchmark_binsize) | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Please document this feature here, and in ` tools/benchmark/README.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documented in tools/benchmark/README.md and linked to it in tools/benchmark/binsize/README.md.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, a subsequent 'cargo make generate-readmes' threw out the change I made to benchmark/binsize/README.md.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (follow-up, optional): Ah, I believe the generate-readmes
tool reads the mod.rs to generate this file. I'm not sure about this since it's a main.rs
. It would probably be worth figuring out, but shouldn't block landing. It's probably fine for now that tools/benchmark/README.md
has the documentation, plus this was an existing issue.
tools/benchmark/binsize/Cargo.toml
Outdated
authors = ["The ICU4X Project Developers"] | ||
edition = "2018" | ||
|
||
include = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: I don't think these include files are required, as there's no intention of publishing this separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, removed.
tools/benchmark/binsize/src/main.rs
Outdated
} | ||
|
||
fn main() { | ||
let _rc = wasm_filesize("wasmpkg"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion (optional): It might be nice to hook this up to CI arguments rather than hard coding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
kicking CI after a branch protection change |
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review feedback factored in.
…amples compiled into wasm binaries. Set up GHA to build wasm binaries, measure file sizes, push results into benchmark dashboard . Resolves ticket unicode-org#140. Factor in review feedback. Fix tidy clippy findings.
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Github Actions side of things LGTM.
If the struct + serde for the Rust code is post-0.3 followup, then create an issue to track that.
Ticket #903 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to approve this with the changes.
thought: One thing to keep in mind that it will be extra work to later change this to track compressed binary sizes, as the data will already be collected in CI. I think we want this sooner than later, so it's probably not worth blocking on the discussion you filed.
@@ -0,0 +1,7 @@ | |||
# icu_benchmark_binsize [![crates.io](http://meritbadge.herokuapp.com/icu_benchmark_binsize)](https://crates.io/crates/icu_benchmark_binsize) | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (follow-up, optional): Ah, I believe the generate-readmes
tool reads the mod.rs to generate this file. I'm not sure about this since it's a main.rs
. It would probably be worth figuring out, but shouldn't block landing. It's probably fine for now that tools/benchmark/README.md
has the documentation, plus this was an existing issue.
cargo run --package icu_benchmark_binsize -- wasmpkg/wasm-opt | ||
``` | ||
|
||
Benchmark output is written in ndjson format, e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Thanks for the documentation, this will help us support it better in the future.
- name: Build wasm executables and measure size | ||
run: | | ||
cargo make wasm-examples | ||
cargo run --package icu_benchmark_binsize -- wasmpkg/wasm-opt| tee benchmarks/binsize/output.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: 👍 thanks this make this more obvious as to what's going on.
Co-authored-by: Greg Tatum <[email protected]>
Thanks. I committed your suggestion about the type (wih --> with), which
then reset the approvals.
…On Tue, Jul 27, 2021 at 1:51 PM Greg Tatum ***@***.***> wrote:
***@***.**** approved this pull request.
I'm happy to approve this with the changes.
thought: One thing to keep in mind that it will be extra work to later
change this to track compressed binary sizes, as the data will already be
collected in CI. I think we want this sooner than later, so it's probably
not worth blocking on the discussion you filed.
------------------------------
In tools/benchmark/binsize/src/main.rs
<#871 (comment)>:
> @@ -0,0 +1,47 @@
+// This file is part of ICU4X. For terms of use, please see the file
+// called LICENSE at the top level of the ICU4X source tree
+// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).
+
+// This module takes as argument a directory path, looks for wasm
+// executables in the directory (i.e., extension .wasm), takes the size in
+// bytes of each, and writes the result in ndjson format to stdout.
+// Note that the ICU4X wasm executables have to be build prior to
+// execution of this module. Use 'cargo make wasm-example'.
+use std::env;
+use std::fs;
+use std::process;
+
+fn wasm_filesize(dir: &str) -> Result<u64, std::io::Error> {
+ let paths = fs::read_dir(dir).expect("Directory wih wasm binaries not found!");
⬇️ Suggested change
- let paths = fs::read_dir(dir).expect("Directory wih wasm binaries not found!");
+ let paths = fs::read_dir(dir).expect("Directory with wasm binaries not found!");
------------------------------
In tools/benchmark/README.md
<#871 (comment)>:
> +This is a continuous integration tool to monitor the size of ICU4X binaries.
+It is invoked by GitHub Action for each PR, measures the size of the ICU4X demo
+files compiled to wasm (WebAssembly) format, and stores the result in a
+branch of the repository for subsequent display.
+
+### Usage and Output
+
+```sh
+# Prerequisite: build wasm binaries.
+cargo make wasm-examples
+
+# Execute binsize benchmark
+cargo run --package icu_benchmark_binsize -- wasmpkg/wasm-opt
+```
+
+Benchmark output is written in ndjson format, e.g.
praise: Thanks for the documentation, this will help us support it better
in the future.
------------------------------
In .github/workflows/build-test.yml
<#871 (comment)>:
> + - name: Install prerequisites for wasm build
+ run: |
+ rustup component add rust-src
+ rustup toolchain list
+ rustup toolchain install nightly-2021-02-28
+ sudo npm install -g wasm-opt --unsafe-perm
+ sudo npm install -g wabt
+ cargo install twiggy
+
+ - name: Setup output data directory
+ run: mkdir -p benchmarks/binsize
+
+ - name: Build wasm executables and measure size
+ run: |
+ cargo make wasm-examples
+ cargo run --package icu_benchmark_binsize -- wasmpkg/wasm-opt| tee benchmarks/binsize/output.txt
praise: 👍 thanks this make this more obvious as to what's going on.
------------------------------
In tools/benchmark/binsize/README.md
<#871 (comment)>:
> @@ -0,0 +1,7 @@
+# icu_benchmark_binsize [![crates.io](http://meritbadge.herokuapp.com/icu_benchmark_binsize)](https://crates.io/crates/icu_benchmark_binsize)
+
+
suggestion (follow-up, optional): Ah, I believe the generate-readmes tool
reads the mod.rs to generate this file. I'm not sure about this since
it's a main.rs. It would probably be worth figuring out, but shouldn't
block landing. It's probably fine for now that tools/benchmark/README.md
has the documentation, plus this was an existing issue.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#871 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AJZZMHMHDLSUX45PZPIYQFDTZ4L4VANCNFSM5AYKFG6Q>
.
|
…amples
compiled into wasm binaries. Set up GHA to build wasm binaries, measure
file sizes, push results into benchmark dashboard .
Resolves ticket #140.