sha2: explore addition of SSE and AVX2 backends for SHA-256 #327
Comments
Intel has a paper describing various SIMD implementations of SHA-256 here: Notably they describe these two variants:
I'm not sure if there are newer/better methods available for implementing SHA-256 with AVX2, but that's the best I was able to find. |
Some code can be found here: or here: https://github.com/intel/isa-l_crypto/tree/master/sha256_mb (links taken from https://stackoverflow.com/questions/18546244/sha256-performance-optimization-in-c) |
I see that Python 3's implementation of SHA-256 is 2x faster than the Rust one from the sha2 crate. The consequences are big enough that this makes a strong argument for not using Rust at all (see reasons below). Tests were done on Debian Linux sid in Docker on a Purism Librem, target x86_64-unknown-linux-gnu.
#!/usr/bin/env python3
import io, hashlib
with open("/home/user/dedup-bench/sto/00", "rb") as f:
    digest = hashlib.file_digest(f, "sha256")
print(digest.hexdigest())
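Note that `hashlib.file_digest` exists only from Python 3.11 onward. On older interpreters the same digest can be computed by feeding the file to the hasher in fixed-size chunks. Below is a minimal sketch of that approach; the helper name `sha256_of_stream` and the 1 MiB chunk size are illustrative assumptions, not something taken from the report above.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in fixed-size chunks (hypothetical helper)."""
    hasher = hashlib.sha256()
    # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


if __name__ == "__main__":
    # An in-memory stream stands in for a real file opened with "rb".
    demo = io.BytesIO(b"abc" * 1000)
    print(sha256_of_stream(demo))
```

Chunked reading also avoids loading the whole file into memory, which matters when the input is several gigabytes.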
Here is the Cargo manifest:
[package]
name = "s"
version = "0.1.0"
edition = "2021"
[dependencies]
hex = "0.4.3"
#sha2 = { version = "0.10.7", features = ["asm"] }
sha2 = { version = "0.10.7", features = [] }
fn main() {
    use sha2::Digest;
    use std::io::Read;
    let mut hasher = sha2::Sha256::new();
    let mut buf = vec![];
    std::fs::File::open("/home/user/dedup-bench/sto/00").unwrap().read_to_end(&mut buf).unwrap();
    hasher.update(&buf);
    println!("{}", hex::encode(hasher.finalize()));
}
Here are results:
Results with
Here is
Both crate cpufeatures and
So the SHA extension is not supported, and Python beats Rust here. I think this report belongs to this issue. I discovered this problem when I compared borg performance with my own simple Rust program: borgbackup/borg#7674. Hash computation is the main reason my Rust implementation is slow. In Rust it is impossible to beat borg's speed in sha256 mode. So if for some reason I needed to create a borg replacement with sha256, it would simply be impossible (with the current state of the ecosystem) to create a proper alternative in Rust. So currently the poor sha2 performance is a valid reason not to choose Rust at all. So please raise the importance of this issue. |
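For readers who want to check which relevant extensions their own CPU reports, here is a rough sketch that parses `/proc/cpuinfo`. It assumes x86 Linux, where the kernel lists the SHA extension as `sha_ni`; the function names are hypothetical, and this is not how the cpufeatures crate performs detection.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def sha256_simd_support(flags):
    """Report which extensions discussed in this thread appear in the flag set."""
    # Flag names as reported by the Linux kernel on x86 (assumption: x86 only).
    return {name: name in flags for name in ("sha_ni", "avx2", "ssse3")}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(sha256_simd_support(cpu_flags(f.read())))
    except OSError:
        print("no /proc/cpuinfo on this platform")
```

If `sha_ni` is absent, a library that only has software and SHA-NI backends falls back to the software path, which is the situation described in this comment.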
Also please note: I personally don't need fast sha256 in Rust. I will simply choose blake3 for my application. But I still think that this problem is important for others |
I initially thought the discrepancy was partly due to memory allocation, but no: python3 (actually OpenSSL, because that is what's used underneath) is 2x faster. Here I try to measure only the actual "sha256" compute part, not the read/memory allocation part. /tmp/random contains 2 GB of random data.
import hashlib
import time
import sys
data = open(sys.argv[1], 'rb').read()
s = hashlib.sha256()
begin = time.time()
s.update(data)
end = time.time()
print(f"{s.hexdigest()} - took {end - begin}")
results:
rust counterpart:
no feature:
"asm" feature
|
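A single wall-clock measurement like the one above can be noisy. A slightly more robust sketch is shown below: it times only the hashing of in-memory data, repeats the run, and reports the best throughput. The helper name `sha256_throughput`, the repeat count, and the zero-filled 64 MiB payload are illustrative assumptions, not the setup used in this comment.

```python
import hashlib
import time


def sha256_throughput(data, repeats=3):
    """Return (hex digest, best MB/s over `repeats` runs) for in-memory data."""
    best = float("inf")
    digest = ""
    for _ in range(repeats):
        hasher = hashlib.sha256()
        begin = time.perf_counter()  # monotonic, finer-grained than time.time()
        hasher.update(data)
        digest = hasher.hexdigest()
        best = min(best, time.perf_counter() - begin)
    # Guard against a zero interval on timers with coarse resolution.
    return digest, len(data) / max(best, 1e-9) / 1e6


if __name__ == "__main__":
    payload = bytes(64 * 1024 * 1024)  # 64 MiB of zeros; stand-in for /tmp/random
    digest, mb_per_s = sha256_throughput(payload)
    print(f"{digest} - {mb_per_s:.0f} MB/s")
```

Reporting MB/s rather than elapsed seconds makes results from differently sized inputs easier to compare across machines.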
I just checked the "openssl" crate. It (predictably) has nearly the same speed as Python. So my claim that "bad sha2 performance is a valid reason not to choose Rust at all" was too bold. But, of course, having a fast Rust-native sha256 would be a good thing. |
We would like to eventually integrate OpenSSL's assembly (see RustCrypto/asm-hashes#5), but it requires a fair amount of work since we do not want to rely on Perl and external compilers. |
@safinaskar @mat-gas
Without ASM:
With ASM:
By the way, you don't need a separate dependency for hex, there is
|
You are likely getting results for the SHA-NI backend. Enabling the
Note that we eventually plan to migrate to const generics, and such code will stop working. |
Ah, I see. I thought that it was part of SSE; I didn't realize there was a new SHA-NI extension added to processors a few years back. I tried it on an older computer and, as you said, the results were 2x slower :( |
FWIW, we've been added to SUPERCOP; you can see the results across several CPUs here: https://bench.cr.yp.to/impl-hash/sha256.html We're
Currently we only have software and SHA-NI backends for SHA-256.