Tracking issue for performance #1294
Comments
** Some AArch64 relevant backports are not yet completed
Latest results (raw times are a bit faster across the board because I'm benchmarking in a quieter OS environment; I redid the baselines for consistency):
8-bit Chimera:
10-bit Chimera:
** Full LTO for the C code seems to make performance slightly worse, if anything. I'm surprised by this but the measurements are consistent on my machine.
I've been looking into what the remaining sources of overhead might be and wanted to chime in with some of my findings. One major discrepancy I noticed between dav1d and rav1d when running benchmarks was an order of magnitude difference in the number of page-faults. Digging into this, I found the majority (~82%) of the page-faults in rav1d are coming from
I believe that corresponds to this closure: Lines 5223 to 5227 in 7d72409
This is the equivalent operation in dav1d: Lines 3623 to 3624 in 7d72409
Here The switch from using pooled memory in rav1d looks to have been introduced as part of 6420e5a, PR #984.
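To illustrate why pooling matters for page-faults, here is a minimal sketch of a buffer pool. `BufPool`, `take`, and `give_back` are hypothetical names chosen for illustration; this is not rav1d's or dav1d's actual pool API. The idea is that a buffer handed back to the pool keeps its already-mapped pages, so the next request of similar size does not need a fresh allocation whose pages would fault on first touch.

```rust
use std::sync::Mutex;

/// Hypothetical buffer pool for illustration only (not rav1d's real API).
/// Returned buffers are kept alive so their backing memory can be reused.
#[derive(Default)]
struct BufPool {
    free: Mutex<Vec<Vec<u8>>>,
}

impl BufPool {
    /// Take a zeroed buffer of `len` bytes, reusing a pooled allocation
    /// when one is available and falling back to an empty `Vec` otherwise.
    fn take(&self, len: usize) -> Vec<u8> {
        let mut buf = self.free.lock().unwrap().pop().unwrap_or_default();
        // `clear` keeps the capacity; `resize` only reallocates if `len`
        // exceeds the capacity of the reused buffer.
        buf.clear();
        buf.resize(len, 0);
        buf
    }

    /// Hand a buffer back so a later `take` can reuse its allocation.
    fn give_back(&self, buf: Vec<u8>) {
        self.free.lock().unwrap().push(buf);
    }
}
```

With this sketch, calling `take`, then `give_back`, then `take` again with a length no larger than the first returns a buffer backed by the same allocation. A non-pooled version would allocate and free on every call, and for large buffers the allocator may return that memory to the OS, which is one plausible way to end up with many more page-faults.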
@ivanloz, thanks for finding this! That's definitely something we changed. I had thought we hadn't seen a performance impact outside of the picture allocator, which we kept pooled, but maybe we missed it or it behaves differently on different systems. Could you put your comment above in its own issue? We'll work on fixing it. It is tricky due to the lifetimes involved (the picture pool got around this because it already has to go through an unsafe C API for
Done -- see #1358, thanks!
This issue is intended to aggregate, track progress, and discuss performance optimization for rav1d.
Unless otherwise noted, the following conditions apply to these measurements:
--threads 8