Add experimental SLOTHY-optimized NTT #143

Merged
merged 1 commit into from
Sep 23, 2024

@hanno-becker hanno-becker commented Sep 19, 2024

This needs further refinement, but it's a start.

@hanno-becker hanno-becker requested a review from a team September 19, 2024 12:26
@hanno-becker hanno-becker added the benchmark this PR should be benchmarked in CI label Sep 19, 2024
@hanno-becker hanno-becker force-pushed the slothy_ntt branch 3 times, most recently from c54faaf to 6d9391a Compare September 19, 2024 13:50
@hanno-becker hanno-becker added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Sep 19, 2024
@hanno-becker hanno-becker added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Sep 19, 2024
@mkannwischer mkannwischer added benchmark this PR should be benchmarked in CI and removed benchmark this PR should be benchmarked in CI labels Sep 23, 2024

@mkannwischer mkannwischer left a comment

Thanks, this looks great. I tested this on my x86 machine and it works fine.
I also looked at the outputs of the Raspberry Pis in CI and it looks good.

One small nit, then you can go ahead and merge it.

mlkem/poly.c (outdated, resolved)
So far, the clean ASM is optimized only for the Cortex-A55 model,
and we do not yet explore algorithmic variations from

  Fast and Clean: Auditable high-performance
  assembly via constraint solving

  https://eprint.iacr.org/2022/1303

This will come at a later point.

Both the clean and the optimized code are added to the repository,
as well as the SLOTHY script.

Signed-off-by: Hanno Becker <[email protected]>
@hanno-becker hanno-becker merged commit a1fb12f into main Sep 23, 2024
3 checks passed
@hanno-becker hanno-becker deleted the slothy_ntt branch September 23, 2024 18:35
Labels
benchmark this PR should be benchmarked in CI
2 participants