Add Dockerfile for consistent isolated benchmark runs #80
Merged
gonzedge force-pushed the `gonzedge/docker` branch from `2ad4e40` to `fe7669e` (December 2, 2024 04:22)
Code Climate has analyzed commit `fe7669e` and detected 0 issues on this pull request. The test coverage on the diff in this pull request is 100.0% (95% is the threshold). This pull request will bring the total coverage in the repository to 100.0% (0.0% change).
gonzedge added a commit that referenced this pull request (Dec 2, 2024):
…n benchmarks for any given version (#81)

## Solution

This PR:

1. Adds a benchmarking script leveraging the new `Dockerfile.benchmark` added in #80
2. Ensures all benchmarks are run with garbage collection stopped via

   ```ruby
   ::GC.start # <= trigger garbage collection
   ::GC.disable # disable before executing benchmark
   yield
   ::GC.enable # enable after executing benchmark
   ```

   so that it doesn't interfere with benchmark calculations

## Background

After #80, I've been re-running some of the benchmarks. For the last few hours, I suspected a change in #71 slowed things down in many methods. After some tinkering, it looked to me like the main culprit was the block `do/end` => `{ }` change. However, I continued to get wildly different results, and sometimes `{ }` would be faster than `do/end`, or just as fast. This was extremely confusing because, even in a benchmark with a bunch of nested blocks and closure variables, `do/end` and `{ }` report as the same-ish:

```
# result from docker run -it rambling-trie:latest bash -c 'bundle exec rake ips:nested_do_end_vs_brackets'
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) [aarch64-linux]
Warming up --------------------------------------
  do/end   312.000 i/100ms
     { }   312.000 i/100ms
Calculating -------------------------------------
  do/end   3.113k (± 1.2%) i/s (321.20 μs/i) - 62.400k in 20.046043s
     { }   3.107k (± 1.7%) i/s (321.84 μs/i) - 62.400k in 20.090024s

Comparison:
  do/end:  3113.4 i/s
     { }:  3107.2 i/s - same-ish: difference falls within error
```

After some research and additional benchmarking, I finally figured it out: THE VARIATIONS COME FROM GARBAGE COLLECTION!!!

Initially, this PR changed `{ }` blocks back to `do/end`, but after running benchmarks without GC before (`aec608b`) and after (`5e94d22`), it was clear that they are equivalent, as they resulted in this `diff tmp/aec608b.benchmark tmp/5e94d22.benchmark`:

```diff
  ==> Creation - `Rambling::Trie.create`
  5 iterations -
- 2.528858 0.292029 2.820887 ( 2.824440)
+ 2.529208 0.153897 2.683105 ( 2.689189)
  ==> Compression - `compress!`
  5 iterations -
- 1.685353 0.017966 1.703319 ( 1.703426)
+ 1.648239 0.016017 1.664256 ( 1.664490)
  ==> Serialization (raw trie) - `Rambling::Trie.load`
  5 iterations -
- 1.622794 0.037990 1.660784 ( 1.660889)
+ 1.659586 0.028994 1.688580 ( 1.688839)
  ==> Serialization (compressed trie) - `Rambling::Trie.load`
  5 iterations -
- 0.942562 0.009981 0.952543 ( 0.952638)
+ 0.962076 0.010007 0.972083 ( 0.972178)
  ==> Lookups (raw trie) - `word?`
  200000 iterations - hi true
- 0.104858 0.001034 0.105892 ( 0.105895)
+ 0.104117 0.003002 0.107119 ( 0.107141)
  200000 iterations - help true
- 0.180267 0.005861 0.186128 ( 0.186132)
+ 0.179648 0.008990 0.188638 ( 0.188668)
  200000 iterations - beautiful true
- 0.367618 0.010024 0.377642 ( 0.377670)
+ 0.371669 0.010991 0.382660 ( 0.382680)
  200000 iterations - impressionism true
- 0.507926 0.014886 0.522812 ( 0.522844)
+ 0.511100 0.017001 0.528101 ( 0.528137)
  200000 iterations - anthropological true
- 0.591033 0.018848 0.609881 ( 0.615617)
+ 0.605972 0.031914 0.637886 ( 0.642350)
  ==> Lookups (compressed trie) - `word?`
  200000 iterations - hi true
- 0.158066 0.000003 0.158069 ( 0.158072)
+ 0.160409 0.000013 0.160422 ( 0.160439)
  200000 iterations - help true
- 0.279116 0.000005 0.279121 ( 0.279139)
+ 0.281002 0.000003 0.281005 ( 0.281062)
  200000 iterations - beautiful true
- 0.553455 0.008983 0.562438 ( 0.562483)
+ 0.552547 0.015987 0.568534 ( 0.568731)
  200000 iterations - impressionism true
- 0.814393 0.027978 0.842371 ( 0.842468)
+ 0.815575 0.029985 0.845560 ( 0.845660)
  200000 iterations - anthropological true
- 0.908511 0.037019 0.945530 ( 0.945563)
+ 0.904403 0.034061 0.938464 ( 0.938576)
  ==> Lookups (raw trie) - `partial_word?`
  200000 iterations - hi true
- 0.096137 0.000005 0.096142 ( 0.096150)
+ 0.097516 0.000000 0.097516 ( 0.097540)
  200000 iterations - help true
- 0.169705 0.000000 0.169705 ( 0.169734)
+ 0.171188 0.000000 0.171188 ( 0.171201)
  200000 iterations - beautiful true
- 0.345812 0.000000 0.345812 ( 0.345840)
+ 0.348937 0.000001 0.348938 ( 0.348960)
  200000 iterations - impressionism true
- 0.478373 0.000000 0.478373 ( 0.478378)
+ 0.487952 0.000029 0.487981 ( 0.488129)
  200000 iterations - anthropological true
- 0.540492 0.000000 0.540492 ( 0.540561)
+ 0.550833 0.000002 0.550835 ( 0.550948)
  ==> Lookups (compressed trie) - `partial_word?`
  200000 iterations - hi true
- 0.178854 0.000000 0.178854 ( 0.178858)
+ 0.179354 0.000010 0.179364 ( 0.179435)
  200000 iterations - help true
- 0.333131 0.001003 0.334134 ( 0.334171)
+ 0.338368 0.001989 0.340357 ( 0.340381)
  200000 iterations - beautiful true
- 0.693562 0.043985 0.737547 ( 0.737578)
+ 0.691860 0.052987 0.744847 ( 0.744894)
  200000 iterations - impressionism true
- 1.037349 0.084013 1.121362 ( 1.121450)
+ 1.020308 0.058002 1.078310 ( 1.078376)
  200000 iterations - anthropological true
- 1.046859 0.121995 1.168854 ( 1.168987)
+ 1.025329 0.055977 1.081306 ( 1.081383)
  ==> Lookups (raw trie) - `scan`
  1000 iterations - hi 495
- 1.107769 0.000000 1.107769 ( 1.108111)
+ 1.090106 0.009922 1.100028 ( 1.103858)
  100000 iterations - help 20
- 5.051393 0.315010 5.366403 ( 5.366820)
+ 4.994424 0.170831 5.165255 ( 5.165899)
  100000 iterations - beautiful 6
- 2.432588 0.241986 2.674574 ( 2.674782)
+ 2.400872 0.084991 2.485863 ( 2.486011)
  200000 iterations - impressionism 2
- 2.222437 0.234891 2.457328 ( 2.459556)
+ 2.188768 0.095923 2.284691 ( 2.289520)
  200000 iterations - anthropological 2
- 2.732152 0.266948 2.999100 ( 2.999282)
+ 2.720354 0.165952 2.886306 ( 2.886691)
  ==> Lookups (compressed trie) - `scan`
  1000 iterations - hi 495
- 0.736214 0.000014 0.736228 ( 0.736311)
+ 0.726488 0.000026 0.726514 ( 0.726575)
  100000 iterations - help 20
- 2.795148 0.000000 2.795148 ( 2.795292)
+ 2.771712 0.000023 2.771735 ( 2.771937)
  100000 iterations - beautiful 6
- 1.603907 0.089967 1.693874 ( 1.694424)
+ 1.580089 0.087958 1.668047 ( 1.668791)
  200000 iterations - impressionism 2
- 2.231390 0.131968 2.363358 ( 2.363509)
+ 2.212870 0.162965 2.375835 ( 2.375971)
  200000 iterations - anthropological 2
- 2.408396 0.115020 2.523416 ( 2.523546)
+ 2.430944 0.090996 2.521940 ( 2.522128)
  ==> Lookups (raw trie) - `words_within`
  100000 iterations - ifdxawesome45someword319
- 3.547659 0.027805 3.575464 ( 3.592132)
+ 3.531690 0.009991 3.541681 ( 3.543942)
  100000 iterations - ifdx45someword3awesome19
- 3.463826 0.013968 3.477794 ( 3.478193)
+ 3.502160 0.014036 3.516196 ( 3.519002)
  ==> Lookups (compressed trie) - `words_within`
  100000 iterations - ifdxawesome45someword319
- 4.679733 0.140954 4.820687 ( 4.825155)
+ 4.660395 0.129986 4.790381 ( 4.790821)
  100000 iterations - ifdx45someword3awesome19
- 5.319253 0.204972 5.524225 ( 5.524799)
+ 5.337981 0.212922 5.550903 ( 5.559558)
```

So, instead, this PR gives us a more reliable set of benchmarks that can be used across versions.
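For readers who want to try the GC-paused measurement outside the repo, here is a minimal sketch using only Ruby's standard library. The names `without_gc` and `nested_sum` are hypothetical and do not come from the PR. Wrapping the re-enable in `ensure` is an addition in this sketch, not something the commit message shows.

```ruby
require 'benchmark'

# Hypothetical helper (not from the PR): runs the given block with GC paused.
def without_gc
  ::GC.start   # collect garbage left over from earlier work
  ::GC.disable # keep the collector from running during the measurement
  yield
ensure
  ::GC.enable  # restore GC even if the block raises (sketch-only addition)
end

# Hypothetical workload: nested blocks mutating a closure variable,
# mixing do/end and { } styles as in the comparison discussed above.
def nested_sum(size)
  total = 0
  size.times do |i|
    size.times { |j| total += i * j }
  end
  total
end

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
elapsed = without_gc { Benchmark.realtime { nested_sum(200) } }
puts format('nested_sum(200): %.6fs', elapsed)
```

Disabling GC lets memory grow for the duration of the block, so this pattern suits short, bounded benchmark bodies rather than long-running workloads.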
- `Dockerfile` with all relevant files, including `sig/` and `spec/`, to run any commands
- `Dockerfile.benchmark` with only what is required to run a benchmark (minimal `Gemfile` and minimal `Rakefile`) and with the ability to run on a specific `rambling-trie` version, even from the git repo(!)
- `pathname`, required explicitly in `tasks/helpers/path.rb` so that the benchmark can actually run with the minimal config
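The last point can be illustrated with a small sketch. The contents of `tasks/helpers/path.rb` are not shown on this page, so the module and method below (`PathSketch.path`) are hypothetical stand-ins. The idea is only that a helper using `Pathname` should load it itself instead of relying on some other dependency to have loaded it, which a minimal `Gemfile` and `Rakefile` may no longer do.

```ruby
# Load Pathname explicitly instead of assuming another library already did.
require 'pathname'

# Hypothetical helper module; the real tasks/helpers/path.rb may look different.
module PathSketch
  # Joins the given segments under a root directory (defaults to the
  # current working directory) and returns the result as a String.
  def self.path(*segments, root: Dir.pwd)
    Pathname.new(root).join(*segments).to_s
  end
end

puts PathSketch.path('tmp', 'aec608b.benchmark')
```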