Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Keccak X4 interface #62

Merged
merged 1 commit into from
Jul 19, 2024

Conversation

cothan
Copy link
Contributor

@cothan cothan commented Jun 13, 2024

Add Keccak X4 interface.

Potential optimization:

  • Multiple Pass Squeezing: Currently, if rej_uniform fails to sample the full vector size, it continues by squeezing a single Keccak lane. By rewriting this logic to append single Keccak calls into a single Keccak X-way call, CPU cycles can be saved. For simplicity, it is currently implemented as a single pass.

Next Steps:

  • Add Keccak_X4 assembly to replace the keccak_absorb_x4keccak_squeezeblocks_x4 in fips202x.c.

Fixes #35

mlkem/indcpa.c Outdated Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
fips202/fips202x.h Outdated Show resolved Hide resolved
fips202/fips202x.h Outdated Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
@cothan cothan changed the title Add Keccak X interface Add Keccak X4 interface Jun 14, 2024
@hanno-becker
Copy link
Contributor

hanno-becker commented Jun 17, 2024

@cothan Can you fix the CI, clean-up the history, and mark the PR as ready-for-review when you think it's in a good shape?

@cothan cothan force-pushed the add_keccakx_interface branch 2 times, most recently from edf8538 to bf6f718 Compare June 17, 2024 15:39
@cothan cothan marked this pull request as ready for review June 17, 2024 15:54
@cothan cothan requested a review from a team June 17, 2024 15:54
@cothan
Copy link
Contributor Author

cothan commented Jun 17, 2024

Sure, @hanno-becker , I was waiting for a few PR to land on main first.
PR is marked as ready.

Copy link
Contributor

@mkannwischer mkannwischer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few comments and questions.

For the noise vectors, you can also use the Keccakx4 interface.
Please have a look here and add: https://github.com/pq-crystals/kyber/blob/main/avx2/poly.c

fips202/fips202x.c Outdated Show resolved Hide resolved
mlkem/indcpa.c Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
fips202/fips202.h Outdated Show resolved Hide resolved
mlkem/polyvec.c Outdated Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
@cothan cothan force-pushed the add_keccakx_interface branch 2 times, most recently from 10e5fd1 to d703199 Compare June 19, 2024 03:32
fips202/fips202x4.h Outdated Show resolved Hide resolved
fips202/fips202x4.h Outdated Show resolved Hide resolved
mlkem/poly.c Outdated Show resolved Hide resolved
mlkem/poly.c Outdated Show resolved Hide resolved
mlkem/poly.c Outdated Show resolved Hide resolved
@cothan cothan force-pushed the add_keccakx_interface branch from 6c716d1 to 7d8250c Compare June 28, 2024 20:52
@cothan cothan force-pushed the add_keccakx_interface branch from 7d8250c to 1329895 Compare July 14, 2024 16:11
@cothan cothan requested a review from mkannwischer July 14, 2024 16:12
@mkannwischer mkannwischer added the benchmark this PR should be benchmarked in CI label Jul 15, 2024
Copy link
Contributor

@mkannwischer mkannwischer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cothan. I'm happy with this now.
@hanno-becker?

@cothan cothan force-pushed the add_keccakx_interface branch from d48a79a to bc9bed9 Compare July 15, 2024 14:11
fips202/fips202x4.c Outdated Show resolved Hide resolved
mlkem/indcpa.c Outdated Show resolved Hide resolved
Copy link
Contributor

@hanno-becker hanno-becker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @cothan.

I think we may need to tweak the interfaces further to reduce the amount of copying. However, this can be a follow-up.

The batched interfaces does input validation not present in the non-batched interface. I think it should be uniform, and would opt to remove the validation in the batched functions for now.

Otherwise, gen_matrix() can be slightly improved by hoisting out the prepraration of the bulk of the seed array.

Signed-off-by: Duc Tri Nguyen <[email protected]>

rename to x4

add shake256x4 interface

add shake256x4

add batch getnoise sampling

Signed-off-by: Duc Tri Nguyen <[email protected]>

unroll prf to shake256x4

Signed-off-by: Duc Tri Nguyen <[email protected]>
Signed-off-by: Duc Tri Nguyen <[email protected]>

fix space

Signed-off-by: Duc Tri Nguyen <[email protected]>

assume input pointers are valid, thus, remove conditions.

move memcpy outside of the loop
@cothan cothan force-pushed the add_keccakx_interface branch from bc9bed9 to 30054f1 Compare July 17, 2024 17:20
@cothan cothan requested a review from hanno-becker July 17, 2024 17:29
@hanno-becker hanno-becker merged commit 88fcf89 into pq-code-package:main Jul 19, 2024
6 checks passed
@cothan cothan deleted the add_keccakx_interface branch July 19, 2024 03:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
benchmark this PR should be benchmarked in CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add internal interface to 4-way Keccak and implement C-fallback
3 participants