Introduce global config and reorganize backends #535
84ab059
to
a462f02
Compare
Arm Cortex-A76 (Raspberry Pi 5) benchmarks
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 29186 cycles | 29184 cycles | 1.00 |
ML-KEM-512 encaps | 35547 cycles | 35554 cycles | 1.00 |
ML-KEM-512 decaps | 46136 cycles | 46098 cycles | 1.00 |
ML-KEM-768 keypair | 49229 cycles | 49232 cycles | 1.00 |
ML-KEM-768 encaps | 55399 cycles | 55386 cycles | 1.00 |
ML-KEM-768 decaps | 70203 cycles | 70236 cycles | 1.00 |
ML-KEM-1024 keypair | 72171 cycles | 72218 cycles | 1.00 |
ML-KEM-1024 encaps | 81016 cycles | 81129 cycles | 1.00 |
ML-KEM-1024 decaps | 100875 cycles | 100871 cycles | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 4th gen (c7i)
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 13523 cycles | 13505 cycles | 1.00 |
ML-KEM-512 encaps | 17257 cycles | 17476 cycles | 0.99 |
ML-KEM-512 decaps | 22741 cycles | 22734 cycles | 1.00 |
ML-KEM-768 keypair | 22524 cycles | 22497 cycles | 1.00 |
ML-KEM-768 encaps | 24529 cycles | 24466 cycles | 1.00 |
ML-KEM-768 decaps | 32530 cycles | 32453 cycles | 1.00 |
ML-KEM-1024 keypair | 31376 cycles | 31374 cycles | 1.00 |
ML-KEM-1024 encaps | 34922 cycles | 34930 cycles | 1.00 |
ML-KEM-1024 decaps | 45729 cycles | 45768 cycles | 1.00 |
AMD EPYC 3rd gen (c6a)
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 18158 cycles | 18216 cycles | 1.00 |
ML-KEM-512 encaps | 23194 cycles | 23145 cycles | 1.00 |
ML-KEM-512 decaps | 30526 cycles | 30478 cycles | 1.00 |
ML-KEM-768 keypair | 31067 cycles | 31108 cycles | 1.00 |
ML-KEM-768 encaps | 34152 cycles | 34212 cycles | 1.00 |
ML-KEM-768 decaps | 44834 cycles | 44770 cycles | 1.00 |
ML-KEM-1024 keypair | 44603 cycles | 44518 cycles | 1.00 |
ML-KEM-1024 encaps | 49935 cycles | 49892 cycles | 1.00 |
ML-KEM-1024 decaps | 64438 cycles | 64383 cycles | 1.00 |
Intel Xeon 3rd gen (c6i)
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 20331 cycles | 20332 cycles | 1.00 |
ML-KEM-512 encaps | 27000 cycles | 27002 cycles | 1.00 |
ML-KEM-512 decaps | 35809 cycles | 35838 cycles | 1.00 |
ML-KEM-768 keypair | 34889 cycles | 34882 cycles | 1.00 |
ML-KEM-768 encaps | 38113 cycles | 38175 cycles | 1.00 |
ML-KEM-768 decaps | 50917 cycles | 50904 cycles | 1.00 |
ML-KEM-1024 keypair | 47988 cycles | 47974 cycles | 1.00 |
ML-KEM-1024 encaps | 54179 cycles | 54135 cycles | 1.00 |
ML-KEM-1024 decaps | 71615 cycles | 71703 cycles | 1.00 |
AMD EPYC 4th gen (c7a)
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 15086 cycles | 15078 cycles | 1.00 |
ML-KEM-512 encaps | 19688 cycles | 19663 cycles | 1.00 |
ML-KEM-512 decaps | 26339 cycles | 26313 cycles | 1.00 |
ML-KEM-768 keypair | 25688 cycles | 25609 cycles | 1.00 |
ML-KEM-768 encaps | 28250 cycles | 28179 cycles | 1.00 |
ML-KEM-768 decaps | 37912 cycles | 37856 cycles | 1.00 |
ML-KEM-1024 keypair | 35680 cycles | 35661 cycles | 1.00 |
ML-KEM-1024 encaps | 41087 cycles | 40971 cycles | 1.00 |
ML-KEM-1024 decaps | 54544 cycles | 54496 cycles | 1.00 |
Intel Xeon 4th gen (c7i) (no-opt)
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 34852 cycles | 34833 cycles | 1.00 |
ML-KEM-512 encaps | 45035 cycles | 45065 cycles | 1.00 |
ML-KEM-512 decaps | 58906 cycles | 58937 cycles | 1.00 |
ML-KEM-768 keypair | 59125 cycles | 59101 cycles | 1.00 |
ML-KEM-768 encaps | 71782 cycles | 71728 cycles | 1.00 |
ML-KEM-768 decaps | 89196 cycles | 89239 cycles | 1.00 |
ML-KEM-1024 keypair | 87509 cycles | 87839 cycles | 1.00 |
ML-KEM-1024 encaps | 104651 cycles | 104235 cycles | 1.00 |
ML-KEM-1024 decaps | 127490 cycles | 126864 cycles | 1.00 |
Graviton3
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 19000 cycles | 18993 cycles | 1.00 |
ML-KEM-512 encaps | 23609 cycles | 23579 cycles | 1.00 |
ML-KEM-512 decaps | 30774 cycles | 30754 cycles | 1.00 |
ML-KEM-768 keypair | 32288 cycles | 32245 cycles | 1.00 |
ML-KEM-768 encaps | 35729 cycles | 35712 cycles | 1.00 |
ML-KEM-768 decaps | 45855 cycles | 45882 cycles | 1.00 |
ML-KEM-1024 keypair | 46847 cycles | 46847 cycles | 1 |
ML-KEM-1024 encaps | 52618 cycles | 52634 cycles | 1.00 |
ML-KEM-1024 decaps | 66482 cycles | 66481 cycles | 1.00 |
Graviton4
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 18191 cycles | 18202 cycles | 1.00 |
ML-KEM-512 encaps | 22232 cycles | 22232 cycles | 1 |
ML-KEM-512 decaps | 28966 cycles | 28996 cycles | 1.00 |
ML-KEM-768 keypair | 30676 cycles | 30680 cycles | 1.00 |
ML-KEM-768 encaps | 33721 cycles | 33736 cycles | 1.00 |
ML-KEM-768 decaps | 43285 cycles | 43315 cycles | 1.00 |
ML-KEM-1024 keypair | 44360 cycles | 44368 cycles | 1.00 |
ML-KEM-1024 encaps | 49783 cycles | 49789 cycles | 1.00 |
ML-KEM-1024 decaps | 62851 cycles | 62848 cycles | 1.00 |
AMD EPYC 3rd gen (c6a) (no-opt)
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 52423 cycles | 52148 cycles | 1.01 |
ML-KEM-512 encaps | 65446 cycles | 65745 cycles | 1.00 |
ML-KEM-512 decaps | 88563 cycles | 88346 cycles | 1.00 |
ML-KEM-768 keypair | 84288 cycles | 84709 cycles | 1.00 |
ML-KEM-768 encaps | 102049 cycles | 101766 cycles | 1.00 |
ML-KEM-768 decaps | 131287 cycles | 132010 cycles | 0.99 |
ML-KEM-1024 keypair | 124608 cycles | 124006 cycles | 1.00 |
ML-KEM-1024 encaps | 145088 cycles | 145709 cycles | 1.00 |
ML-KEM-1024 decaps | 182854 cycles | 183602 cycles | 1.00 |
Graviton2
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 29191 cycles | 29193 cycles | 1.00 |
ML-KEM-512 encaps | 35560 cycles | 35560 cycles | 1 |
ML-KEM-512 decaps | 46144 cycles | 46109 cycles | 1.00 |
ML-KEM-768 keypair | 49208 cycles | 49229 cycles | 1.00 |
ML-KEM-768 encaps | 55410 cycles | 55407 cycles | 1.00 |
ML-KEM-768 decaps | 70226 cycles | 70223 cycles | 1.00 |
ML-KEM-1024 keypair | 72208 cycles | 72358 cycles | 1.00 |
ML-KEM-1024 encaps | 81022 cycles | 81166 cycles | 1.00 |
ML-KEM-1024 decaps | 100891 cycles | 100836 cycles | 1.00 |
AMD EPYC 4th gen (c7a) (no-opt)
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 45797 cycles | 45723 cycles | 1.00 |
ML-KEM-512 encaps | 56953 cycles | 56858 cycles | 1.00 |
ML-KEM-512 decaps | 76303 cycles | 76248 cycles | 1.00 |
ML-KEM-768 keypair | 74600 cycles | 74537 cycles | 1.00 |
ML-KEM-768 encaps | 88666 cycles | 88586 cycles | 1.00 |
ML-KEM-768 decaps | 114607 cycles | 114435 cycles | 1.00 |
ML-KEM-1024 keypair | 109558 cycles | 109413 cycles | 1.00 |
ML-KEM-1024 encaps | 127641 cycles | 127490 cycles | 1.00 |
ML-KEM-1024 decaps | 160203 cycles | 160181 cycles | 1.00 |
Intel Xeon 3rd gen (c6i) (no-opt)
Benchmark suite | Current: 8dff47b | Previous: 668dbab | Ratio |
---|---|---|---|
ML-KEM-512 keypair | 56596 cycles | 56618 cycles | 1.00 |
ML-KEM-512 encaps | 69451 cycles | 69458 cycles | 1.00 |
ML-KEM-512 decaps | 91407 cycles | 91377 cycles | 1.00 |
ML-KEM-768 keypair | 91821 cycles | 91849 cycles | 1.00 |
ML-KEM-768 encaps | 107799 cycles | 107762 cycles | 1.00 |
ML-KEM-768 decaps | 136350 cycles | 136305 cycles | 1.00 |
ML-KEM-1024 keypair | 134787 cycles | 134894 cycles | 1.00 |
ML-KEM-1024 encaps | 155210 cycles | 155288 cycles | 1.00 |
ML-KEM-1024 decaps | 191544 cycles | 191611 cycles | 1.00 |
3659da5
to
4cdc754
Compare
4cdc754
to
7944b6a
Compare
This commit introduces a global configuration file `mlkem/config.h` which should contain all user-configurable parameters. With this commit, it contains:

- MLKEM_K
- MLKEM_NAMESPACE
- FIPS202_NAMESPACE
- MLKEM_USE_NATIVE
- MLKEM_NATIVE_ARITH_BACKEND
- MLKEM_NATIVE_FIPS202_BACKEND

The backends have been reorganized to follow a simpler file structure: Every backend profile is identified by a metadata file in the toplevel directory of the backend. For example, `aarch64` has `opt.h` and `clean.h`.

Those metadata files so far only set the name of the backend and point to the actual implementation. The metadata file and the implementation are kept separate so that assembly files can include the metadata file and know whether they should be assembled: For example, `aarch64/opt.h` sets `MLKEM_NATIVE_ARITH_BACKEND_AARCH64_OPT`, which all relevant files are guarded by; similarly for clean. Previously, they were all guarded more coarsely by `MLKEM_USE_NATIVE_AARCH64` or `MLKEM_USE_NATIVE_X86_64`; those have been removed.

The source code of the backends has been moved into `src` directories.

Ultimately, we may want to split `aarch64` into `aarch64_opt` and `aarch64_clean`, so the distinction between profile and backend goes away, but this is not yet attempted.

Signed-off-by: Hanno Becker <[email protected]>
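To make the parameter list above concrete, here is a rough sketch of what such a configuration header could look like. Only the six macro names come from the commit message. The values, the header paths, the include guard name, and the way the namespace macros are spelled are all assumptions made for illustration and may not match the actual `mlkem/config.h`.

```c
/* Hypothetical sketch of a global configuration header.
 * Macro names are taken from the commit message; every value below
 * is a placeholder chosen for illustration. */
#ifndef SKETCH_MLKEM_CONFIG_H /* guard name is hypothetical */
#define SKETCH_MLKEM_CONFIG_H

/* Parameter set: ML-KEM-512, -768 and -1024 use k = 2, 3 and 4. */
#define MLKEM_K 3

/* Prefixes for exported symbols (the exact form is an assumption). */
#define MLKEM_NAMESPACE my_mlkem768_
#define FIPS202_NAMESPACE my_fips202_

/* Opt in to native backends; leave undefined for portable C only. */
#define MLKEM_USE_NATIVE

/* Backend selection, assumed here to name a profile's metadata file.
 * The paths are placeholders, not real repository paths. */
#define MLKEM_NATIVE_ARITH_BACKEND "path/to/arith/aarch64/opt.h"
#define MLKEM_NATIVE_FIPS202_BACKEND "path/to/fips202/aarch64/opt.h"

#endif /* SKETCH_MLKEM_CONFIG_H */
```

Keeping every user-facing knob in one header means a consumer only has to review or replace a single file to retarget the build.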
85c1cdf
to
8dff47b
Compare
67c5ca2
to
c6196c7
Compare
This commit adds another minimal example to `examples/`, demonstrating how to use a custom configuration file and a custom FIPS-202 backend. Signed-off-by: Hanno Becker <[email protected]>
Signed-off-by: Hanno Becker <[email protected]>
c6196c7
to
d170c4e
Compare
LGTM. Thanks Hanno.
This commit introduces a global configuration file `mlkem/config.h` which should contain all user-configurable parameters. With this commit, it contains:

- MLKEM_K
- MLKEM_NAMESPACE
- FIPS202_NAMESPACE
- MLKEM_USE_NATIVE
- MLKEM_NATIVE_ARITH_BACKEND
- MLKEM_NATIVE_FIPS202_BACKEND

The backends have been reorganized to follow a simpler file structure: Every backend profile is identified by a metadata file in the toplevel directory of the backend. For example, `aarch64` has `opt.h` and `clean.h`.

Those metadata files so far only set the name of the backend and point to the actual implementation. The metadata file and the implementation are kept separate so that assembly files can include the metadata file and know whether they should be assembled: For example, `aarch64/opt.h` sets `MLKEM_NATIVE_ARITH_BACKEND_AARCH64_OPT`, which all relevant files are guarded by; similarly for clean. Previously, they were all guarded more coarsely by `MLKEM_USE_NATIVE_AARCH64` or `MLKEM_USE_NATIVE_X86_64`; those have been removed.

The source code of the backends has been moved into `src` directories.

Ultimately, we may want to split `aarch64` into `aarch64_opt` and `aarch64_clean`, so the distinction between profile and backend goes away, but this is not yet attempted.
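The guarding scheme described above can be sketched in a few lines of C. This is a simplified illustration, not the code from this PR: the metadata file and a backend source file are collapsed into one translation unit, and all `sketch_` names are hypothetical. Only `MLKEM_NATIVE_ARITH_BACKEND_AARCH64_OPT` and the file name `aarch64/opt.h` come from the description.

```c
/* Sketch only: two files collapsed into one for illustration.
 * Names starting with sketch_ are hypothetical. */

/* (1) Stand-in for including the metadata file aarch64/opt.h.
 *     Per the description, that file names the selected profile. */
#define MLKEM_NATIVE_ARITH_BACKEND_AARCH64_OPT

/* (2) Stand-in for a backend source file. Its body is guarded by the
 *     profile macro, so it contributes no code when another profile
 *     (for example clean) is selected instead. */
#if defined(MLKEM_NATIVE_ARITH_BACKEND_AARCH64_OPT)
int sketch_opt_kernel(int x)
{
  return 2 * x; /* placeholder for an optimized routine */
}
#endif

/* (3) A caller can test the same macro to see which profile is active. */
int sketch_uses_opt_profile(void)
{
#if defined(MLKEM_NATIVE_ARITH_BACKEND_AARCH64_OPT)
  return 1;
#else
  return 0;
#endif
}
```

Because the guard is a plain preprocessor macro, the same check works in assembly files that are run through the C preprocessor, which is the use case the description gives for keeping metadata separate from implementation.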
====
An example in `examples/` is added which demonstrates how to use a custom backend and custom config.