Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate Kyber from libjade #1745

Merged
merged 84 commits into from
Aug 18, 2024
Merged

Integrate Kyber from libjade #1745

merged 84 commits into from
Aug 18, 2024

Conversation

praveksharma
Copy link
Member

@praveksharma praveksharma commented Apr 3, 2024

This PR integrates Kyber512 and Kyber768 from libjade for x86_64 with and without AVX2 optimizations as described in #1466. Once Kyber1024 is made available in libjade the updated copy_from_upstream.py script will pull and integrate that code into liboqs as well, requiring only modifications to copy_from_libjade.yml and copy_from_upstream.yml.

This PR modifies copy_from_upstream.py and adds a 'libjade' mode in addition 'copy' and 'verify. This new 'libjade' functions independent of 'copy' mode and copies code for the KEM algorithms specified in copy_from_libjade.yml and updates relevant documentation and the library's CBOM. While this additional 'libjade' copy_from_upstream infrastructure can also pull signature algorithms it would likely require minor tweaks to the various templates used by copy_from_upstream.py; since libjade doesn't provide full support for any signature algorithm at the moment, those changes are best dealt with a separate PR.

UPDATE (from comment below):

At the moment, I've turned off weekly constant time tests when building with -DOQS_LIBJADE_BUILD=ON. Jasmin compiler doesn't include debug information in the x86 assembly it produces so Valgrind is unable to resolve specific function names and locations in source. I've attached output from memcheck for reference: Kyber768.txt Kyber512.txt

There is a closed PR in the Jasmin lang repo which added a feature to include DWARF2 information in the generated assembly but this update is not included in the latest release. Once merged, I can create an issue to track this and update the versions of Jasmin compiler used in CI after that feature is released.

[No] Does this PR change the input/output behaviour of a cryptographic algorithm (i.e., does it change known answer test values)? (If so, a version bump will be required from x.y.z to x.(y+1).0.)
[No] Does this PR change the list of algorithms available -- either adding, removing, or renaming? Does this PR otherwise change an API? (If so, PRs in fully supported downstream projects dependent on these, i.e., oqs-provider will also need to be ready for review and merge by the time this is merged.)

@praveksharma
Copy link
Member Author

The CBOM generation script doesn't account for the BOM including multiple implementations of the same algorithm targeting the same platform. I've tweaked the script to include additional information about the implementation in the bom-ref for components pulled from libjade to avoid clashes (see /docs/cbom.json). Since the Cyclone DX CBOM guide only list a uniqueness criterion that any component's bom-ref must satisfy I think this should be okay; I'd appreciate your thoughts on this @bhess.

@praveksharma praveksharma requested a review from SWilson4 April 9, 2024 23:04
@praveksharma praveksharma marked this pull request as ready for review April 9, 2024 23:04
@bhess
Copy link
Member

bhess commented Apr 10, 2024

The CBOM generation script doesn't account for the BOM including multiple implementations of the same algorithm targeting the same platform. I've tweaked the script to include additional information about the implementation in the bom-ref for components pulled from libjade to avoid clashes (see /docs/cbom.json). Since the Cyclone DX CBOM guide only list a uniqueness criterion that any component's bom-ref must satisfy I think this should be okay; I'd appreciate your thoughts on this @bhess.

I agree @praveksharma, for this use case this should be fine, only the bom-ref needs to be unique. Thanks for updating the script. I've also created issue #1753 since the CBOM spec was just merged to the official CycloneDX 1.6 release.

Do I understand it correctly that the libjade version of Kyber would co-exist with the pq-crystals implementation, but implementing the same algorithm for the same target? Is there a use case to have both implementations in liboqs?

@praveksharma
Copy link
Member Author

Do I understand it correctly that the libjade version of Kyber would co-exist with the pq-crystals implementation, but implementing the same algorithm for the same target? Is there a use case to have both implementations in liboqs?

This is partially correct. Unfortunately, the jasmin compiler produces AT&T assembly so libjade code isn't compatible with Microsoft Visual C++ which uses it's own assembly format. So liboqs would need to have PQClean ref code in addition to libjade's ref and avx2 code. The only redundant implementation is PQClean avx2.

At the moment, liboqs would also need to include Kyber1024 PQClean avx2 since libjade doesn't include Kyber1024. This would lead to a mixed set of implementations for Kyber:

  1. Kyber512: PQClean ref, libjade ref, libjade avx2
  2. Kyber768: PQClean ref, libjade ref, libjade avx2
  3. Kyber1024: PQClean ref, PQClean avx2

My thought was that this might lead to additional maintenance burden for an algorithm that we might want to drop in favour of ML-KEM soon (this summer?) when we don't need Kyber for CIRCL interop.

Aside from Kyber, here are some half-formed thoughts as to why we may want to include regular C99 implementations for algorithms in addition to formally verified assembly implementations from libjade (or similar projects):

  1. My understanding is that the formal verification process leads to a larger turnaround between changing jasmin code and having provably correct/secure assembly. In the event of a vulnerability it may be quicker to patch C99 code (either in liboqs or accept patches from upstream) than to wait for the formally verified code to be fixed. I'm not sure how practical this concern is.
  2. There may be projects unwilling to include assembly code due organisational/compliance reasons. Again, I'm not sure how practical this concern is.

@dstebila dstebila added this to the 0.11.0 milestone Apr 16, 2024
@praveksharma
Copy link
Member Author

praveksharma commented Apr 19, 2024

The libjade code wasn't accessible due a error in one of the templates. Thank you for pointing that out @SWilson4!

At the moment, I've turned off weekly constant time tests when building with -DOQS_LIBJADE_BUILD=ON. Jasmin compiler doesn't include debug information in the x86 assembly it produces so Valgrind is unable to resolve specific function names and locations in source. I've attached output from memcheck for reference: Kyber768.txt Kyber512.txt

There is a closed PR in the Jasmin lang repo which added a feature to include DWARF2 information in the generated assembly but this update is not included in the latest release. Once merged, I can create an issue to track this and update the versions of Jasmin compiler used in CI after that feature is released.

Copy link
Member

@baentsch baentsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for this large chunk of work, @praveksharma ! Am I overlooking CI actually activating and running this code?

Copy link
Member

@SWilson4 SWilson4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's still one templating bug to iron out. Other than that, everything looks great to me. I didn't carefully review the assembly code (it's formally verified anyway, right?) or the Python scripts.

docs/algorithms/kem/kyber.yml Outdated Show resolved Hide resolved
scripts/copy_from_upstream/copy_from_upstream.py Outdated Show resolved Hide resolved
src/kem/kyber/kem_kyber_768.c Outdated Show resolved Hide resolved
src/kem/kyber/kem_kyber_768.c Outdated Show resolved Hide resolved
docs/algorithms/kem/kyber.yml Outdated Show resolved Hide resolved
src/kem/kyber/kem_kyber_512.c Outdated Show resolved Hide resolved
src/kem/kyber/kem_kyber_768.c Outdated Show resolved Hide resolved
src/kem/kyber/kem_kyber_768.c Outdated Show resolved Hide resolved
src/kem/kyber/kem_kyber_768.c Outdated Show resolved Hide resolved
src/kem/kyber/libjade_kyber512_avx2/api.c Show resolved Hide resolved
@SWilson4
Copy link
Member

A few comments from an older review (addressed in-person with @praveksharma) slipped in. I think I've cleaned them all up; sorry for any confusion.

@praveksharma
Copy link
Member Author

@baentsch @SWilson4, thank you for your reviews! I think I've addressed all of them.

Am I overlooking CI actually activating and running this code?

You weren't Michael, I've fixed that. The code is being activated by the GitHub Actions workflows but not CircleCI (specifically the ubuntu-bionic-i386 image which from what I can tell isn't being used in GitHub Actions workflows). However, the code is being tested on both Linux and Darwin which are they only platforms libjade supports at the moment.

Copy link
Member

@SWilson4 SWilson4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of questions around config for ARM-based systems.

.CMake/alg_support.cmake Outdated Show resolved Hide resolved
.CMake/alg_support.cmake Outdated Show resolved Hide resolved
docs/algorithms/kem/kyber.yml Outdated Show resolved Hide resolved
src/common/CMakeLists.txt Outdated Show resolved Hide resolved
src/kem/kyber/kem_kyber_512.c Outdated Show resolved Hide resolved
.github/workflows/unix.yml Show resolved Hide resolved
src/kem/kyber/kem_kyber_768.c Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
@SWilson4 SWilson4 self-requested a review April 26, 2024 13:33
@dstebila
Copy link
Member

libjade has now been relicensed to be CC0 or Apache 2: formosa-crypto/libjade#121

@baentsch
Copy link
Member

baentsch commented May 7, 2024

@baentsch @SWilson4, thank you for your reviews! I think I've addressed all of them.

Am I overlooking CI actually activating and running this code?

You weren't Michael, I've fixed that. The code is being activated by the GitHub Actions workflows but not CircleCI (specifically the ubuntu-bionic-i386 image which from what I can tell isn't being used in GitHub Actions workflows). However, the code is being tested on both Linux and Darwin which are they only platforms libjade supports at the moment.

Thanks for this improvement. So this means that if this is merged (and activated), Linux and MacOS will run automatically run libjade Kyber code and all else (Windows, Raspberry, IBM platforms, etc.) will run reference code? If so, is this already documented? Should it be? How?

Copy link
Member

@bhess bhess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work adding Kyber from libjade!

Two points I noticed:

  • When running copy_from_upstream.py libjade I seem to be missing some dependencies. Should they be documented?

  • If I compile with -DOQS_LIBJADE_BUILD=ON and run, e.g., test_kem Kyber768, the OQS_LIBJADE_BUILD flag isn't reported under "OQS build flags". I think it would be useful to know the flag is activated by reporting it in system_info.c.

Now it looks like this:

Configuration info
==================
Target platform:  x86_64-Linux-6.1.19
Compiler:         gcc (11.4.0)
Compile options:  [-Wa,--noexecstack;-O3;-fomit-frame-pointer;-fdata-sections;-ffunction-sections;-Wl,--gc-sections;-Wbad-function-cast]
OQS version:      0.10.1-dev
Git commit:       864be05738585e975540dd3ee1095d580904517f (+ local modifications)
OpenSSL enabled:  Yes (OpenSSL 3.0.2 15 Mar 2022)
AES:              NI
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  OQS_DIST_BUILD OQS_OPT_TARGET=generic CMAKE_BUILD_TYPE=Release 
CPU exts active:  ADX AES AVX AVX2 AVX512 BMI1 BMI2 PCLMULQDQ VPCLMULQDQ POPCNT SSE SSE2 SSE3

src/kem/kyber/libjade_kyber768_ref/api.c Show resolved Hide resolved
@praveksharma
Copy link
Member Author

The license files have been updated and the PR is ready to merge.

CONFIGURE.md Outdated Show resolved Hide resolved
@@ -378,7 +378,7 @@ TOC_INCLUDE_HEADINGS = 0
# The default value is: DOXYGEN.
# This tag requires that the tag MARKDOWN_SUPPORT is set to YES.

MARKDOWN_ID_STYLE = DOXYGEN
MARKDOWN_ID_STYLE = GITHUB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary/related to a libjade integration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't. I likely made this change while debugging documentation generation and this option fixes some issues I was facing with GH style anchors.

Copy link
Member

@baentsch baentsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually LGTM, but didn't have enough time to thoroughly look at it all and will be on the road from tomorrow until next Wed, so please rely on other's opinions. Thanks for the work @praveksharma !

Signed-off-by: Pravek Sharma <[email protected]>
Copy link
Member

@SWilson4 SWilson4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment about docs that may require updating. Otherwise LGTM!

README.md Outdated Show resolved Hide resolved
src/common/libjade_shims/libjade_randombytes.h Outdated Show resolved Hide resolved
src/kem/kyber/libjade_kyber512_avx2/api.c Show resolved Hide resolved
src/kem/kyber/libjade_kyber512_avx2/api.h Show resolved Hide resolved
src/kem/kyber/libjade_kyber512_ref/api.c Show resolved Hide resolved
src/kem/kyber/libjade_kyber512_ref/api.h Show resolved Hide resolved
src/kem/kyber/libjade_kyber768_avx2/api.c Show resolved Hide resolved
src/kem/kyber/libjade_kyber768_avx2/api.h Show resolved Hide resolved
src/kem/kyber/libjade_kyber768_ref/api.c Show resolved Hide resolved
src/kem/kyber/libjade_kyber768_ref/api.h Show resolved Hide resolved
@praveksharma
Copy link
Member Author

Merging despite Travis failing, please see #1888.

@praveksharma praveksharma merged commit e520ec1 into main Aug 18, 2024
46 of 48 checks passed
@SWilson4
Copy link
Member

@baentsch
Copy link
Member

@SWilson4 @praveksharma May I ask a question: Does this mean that when a GH CI job is incorrectly formatted, it simply does not run and no UI information is given so one is tempted to (and in this case did) merge even though CI doesn't run? Second question: Is there no way one can check "at a glance" whether all configured GH jobs ran (and are configured) OK (along the lines of the Travis CI badge)? Worthwhile checking back with GH?

@SWilson4
Copy link
Member

@SWilson4 @praveksharma May I ask a question: Does this mean that when a GH CI job is incorrectly formatted, it simply does not run and no UI information is given so one is tempted to (and in this case did) merge even though CI doesn't run? Second question: Is there no way one can check "at a glance" whether all configured GH jobs ran (and are configured) OK (along the lines of the Travis CI badge)? Worthwhile checking back with GH?

I believe it does shows up, but not as a "red" failure (simply greyed out iirc). In this case it was probably even less obvious because of the Travis failures. I think this will be improved by #1880.

@SWilson4
Copy link
Member

SWilson4 commented Aug 21, 2024

I believe it does shows up, but not as a "red" failure (simply greyed out iirc). In this case it was probably even less obvious because of the Travis failures. I think this will be improved by #1880.

Never mind, it doesn't show up at all on the interface next to the merge button. :( See below for an example of a PR which inherits the same failure. The two "action required" checks are Travis failures.

Screenshot from 2024-08-21 15-19-03

@baentsch
Copy link
Member

Never mind, it doesn't show up at all on the interface next to the merge button.

So I thought. This is not good: We start to regularly disregard CI failures (Travis) and don't see (look for) badly configured CI jobs, too (not exactly surprising as we already seem to expect CI to fail) -- and on top violate the least privilege principle wrt controlling who can do what to a project -- and discuss this all in closed-and-merged PRs, i.e., after the fox has left the barn (or whatever the proverb is :).

I think we ought to discuss how to improve things here. Any chance we can take some minutes for this in the next status call, @dstebila ? If not, my proposal below:

  • Do not accept PRs failing any CI but rather remove deemed irrelevant CI for good, creating an issue to activate it when possible; possibly document changes thus caused in PLATFORMS.md so this is becoming visible to the community and possibly creating motivation for the community to help solve things.
  • Add testing to highlight incorrect CI scripts and do not allow PRs before such errors are fixed, in effect making them a priority
  • PRs changing GH access permissions should be reviewed by at least as many people as changes to the logic of liboqs as errors here have the potential for more damage than simple deletion of a project, namely the uncontrolled integration of code into a code base that aims to be trusted (at least in my view).

@dstebila
Copy link
Member

I think we ought to discuss how to improve things here. Any chance we can take some minutes for this in the next status call, @dstebila ?

Done! https://github.com/open-quantum-safe/tsc/blob/main/oqs-status-meetings/2024-08-27%20-%20OQS%20status%20meeting%20-%20agenda.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants