Add Keccak X4 interface #62
@cothan Can you fix the CI, clean up the history, and mark the PR as ready for review when you think it's in good shape?
Sure, @hanno-becker, I was waiting for a few PRs to land on main first.
I added a few comments and questions.
For the noise vectors, you can also use the Keccakx4 interface.
Please have a look here and add: https://github.com/pq-crystals/kyber/blob/main/avx2/poly.c
Thanks @cothan. I'm happy with this now.
@hanno-becker?
Thank you @cothan.
I think we may need to tweak the interfaces further to reduce the amount of copying. However, this can be a follow-up.
The batched interfaces do input validation that is not present in the non-batched interface. I think it should be uniform, and would opt to remove the validation in the batched functions for now.
Otherwise, gen_matrix() can be slightly improved by hoisting out the preparation of the bulk of the seed array.
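A minimal sketch of the hoisting idea, not the code in this PR: the 32 seed bytes are the same for every matrix entry, so they can be copied into the four per-lane buffers once, and only the two trailing index bytes need rewriting per batch. The names gen_matrix_seeds and absorb_x4_stub, the constants, and the index-byte order (assumed to follow the reference-style convention) are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SYMBYTES 32        /* seed length in bytes */
#define EXT (SYMBYTES + 2) /* seed plus two index bytes */
#define K 4                /* hypothetical matrix dimension */
#define WAYS 4             /* lanes handled per batched call */

/* Hypothetical stand-in for the batched absorb step: it only records
 * what it was given so the seed layout can be inspected. */
static uint8_t last_batch[WAYS][EXT];
static unsigned batch_count;

static void absorb_x4_stub(uint8_t seeds[WAYS][EXT])
{
    memcpy(last_batch, seeds, sizeof last_batch);
    batch_count++;
}

static void gen_matrix_seeds(const uint8_t seed[SYMBYTES], int transposed)
{
    uint8_t ext[WAYS][EXT];
    unsigned i, j;

    /* This sketch has no tail handling, so K*K must fill whole batches. */
    assert((K * K) % WAYS == 0);

    /* Hoisted: copy the shared seed bytes once, outside the main loop. */
    for (j = 0; j < WAYS; j++)
        memcpy(ext[j], seed, SYMBYTES);

    for (i = 0; i < K * K; i += WAYS) {
        for (j = 0; j < WAYS; j++) {
            uint8_t row = (uint8_t)((i + j) / K);
            uint8_t col = (uint8_t)((i + j) % K);
            /* Only the two index bytes differ between matrix entries. */
            ext[j][SYMBYTES]     = transposed ? row : col;
            ext[j][SYMBYTES + 1] = transposed ? col : row;
        }
        absorb_x4_stub(ext);
    }
}
```

With K = 4, calling gen_matrix_seeds(seed, 0) issues four batched calls, and each call performs two byte stores per lane instead of a 32-byte copy.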
Signed-off-by: Duc Tri Nguyen <[email protected]>
rename to x4
add shake256x4 interface
add shake256x4
add batch getnoise sampling
Signed-off-by: Duc Tri Nguyen <[email protected]>
unroll prf to shake256x4
Signed-off-by: Duc Tri Nguyen <[email protected]>
Signed-off-by: Duc Tri Nguyen <[email protected]>
fix space
Signed-off-by: Duc Tri Nguyen <[email protected]>
assume input pointers are valid, thus, remove conditions.
move memcpy outside of the loop
Add Keccak X4 interface.
Potential optimization:
When rej_uniform fails to sample the full vector size, it continues by squeezing a single Keccak lane. By rewriting this logic to combine the single Keccak calls into one X-way Keccak call, CPU cycles can be saved. For simplicity, it is currently implemented as a single pass.
Next Steps:
keccak_absorb_x4, keccak_squeezeblocks_x4 in fips202x.c.
Fixes #35
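The single-pass behaviour described under "Potential optimization" can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: rej_uniform follows the well-known reference-style 12-bit rejection sampler, while toy_xof, toy_squeeze, sample_poly, and the first_blocks parameter are hypothetical. The toy generator is a plain LCG standing in for a SHAKE128 squeeze; it is not Keccak and not cryptographic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define Q 3329             /* ML-KEM modulus */
#define N 256              /* coefficients per polynomial */
#define RATE 168           /* SHAKE128 rate in bytes */
#define MAX_FIRST_BLOCKS 3 /* hypothetical cap on the first pass */

/* Toy byte stream standing in for a XOF squeeze (NOT Keccak). */
typedef struct { uint32_t state; } toy_xof;

static void toy_squeeze(toy_xof *x, uint8_t *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        x->state = x->state * 1664525u + 1013904223u;
        out[i] = (uint8_t)(x->state >> 24);
    }
}

/* Parse 3 bytes into two 12-bit candidates; keep those below Q. */
static unsigned rej_uniform(int16_t *r, unsigned len,
                            const uint8_t *buf, unsigned buflen)
{
    unsigned ctr = 0, pos = 0;
    while (ctr < len && pos + 3 <= buflen) {
        uint16_t v0 = (uint16_t)((buf[pos] | (buf[pos + 1] << 8)) & 0xFFF);
        uint16_t v1 = (uint16_t)(((buf[pos + 1] >> 4) | (buf[pos + 2] << 4)) & 0xFFF);
        pos += 3;
        if (v0 < Q) r[ctr++] = (int16_t)v0;
        if (ctr < len && v1 < Q) r[ctr++] = (int16_t)v1;
    }
    return ctr;
}

/* First pass over a multi-block buffer (the part an x4 squeeze would
 * fill for one lane), then a one-block-at-a-time fallback for this lane.
 * Returns the number of extra single-lane squeezes that were needed. */
static unsigned sample_poly(int16_t r[N], toy_xof *x, unsigned first_blocks)
{
    uint8_t buf[MAX_FIRST_BLOCKS * RATE];
    unsigned ctr, extra = 0;

    assert(first_blocks >= 1 && first_blocks <= MAX_FIRST_BLOCKS);

    toy_squeeze(x, buf, (size_t)first_blocks * RATE);
    ctr = rej_uniform(r, N, buf, first_blocks * RATE);

    while (ctr < N) { /* fallback: this lane squeezes on its own */
        toy_squeeze(x, buf, RATE);
        ctr += rej_uniform(r + ctr, N - ctr, buf, RATE);
        extra++;
    }
    return extra;
}
```

The fallback loop is where the suggested optimization would apply: if several lanes fall short after the first pass, their extra squeezes could in principle be served by one batched call rather than one single-lane call each.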