-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CBMC: Proof of type safety and memory safety for toplevel MLKEM KeyGen (modulo NTT+SHA3) #370
Conversation
9e6a575
to
cb061bc
Compare
3f16ec4
to
213e730
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice @hanno-becker. Congratulations to the first proof of a high-level function! Looks like things are coming together.
Some suggestions and questions.
33b67ac
to
eb5e55b
Compare
Attempts to verify the toplevel MLKEM-INDCPA API have showed that the matrix-vector multiplication embedded in the keypair generation (and, likely, also the one in the encapsulation) are causing performance issues for CBMC verification. This commit hoists the matrix-vector multiplication loop, excluding the preceeding mulcache computation, into a helper function `matvec_mul`. This helper function is specified and proved independently. To make this work, the call to `poly_tomont()` in the matvec multiplication loop inside key generation needs to be hoisted out of the loop and replaced by a call to the new function `polyvec_tomont()`, which is also given the obvious spec+proof. Signed-off-by: Hanno Becker <[email protected]>
Previously, the spec for polyvec_basemul_acc_montgomery_cached() would mark the entire destination object around the target polynomial as tainted. This is too strong if the caller passes a polynomial which is part of a polynomial vector -- which is the case during matrix vector multiplication. This commit strengthens the assignment clause for polyvec_basemul_acc_montgomery_cached to mark only a poly-sized slice as tainted. Similarly, the specification of poly_tomont() is strengthened. Signed-off-by: Hanno Becker <[email protected]>
Our type+memory safety proof in CBMC does not require any specific knowledge of what the NTT does. We do, however, need a bound on the coefficients of the output polynomial, in order to prove type safety (non-overflow) when the caller subsequently adds to the output. The definition of the NTT and invNTT bounds are, likely temporarily, moved to poly.h, to avoid a cyclic dependency between headers. Signed-off-by: Hanno Becker <[email protected]>
Previously, the assign clauses in poly_cbd_eta{1,2} would see the entire object containing the target polynomial as tainted. This is too strong if the target polynoimal is part of a polynomial vector. This commit strengthens the specifications to only taint the poly-sized slice spanned by the target polynomial. Signed-off-by: Hanno Becker <[email protected]>
Previously, poly_getnoise_eta1_4x() was specified as requiring four output polynomials satisfying IS_FRESH(...). While intuitively we only want to express that the output polynomials are disjoint, the obligation generated for a caller is, in fact, much stronger: The target polynomials cannot belong to the same object. In particular, the current specification would not allow to pass in four consecutive slices of a polyvec, or say 2 consecutive polyvecs in the first and last two arguments each. Those limitations will prevent verification of indcpa_keypair_derand, which does indeed pass slices of polyvecs to the getnoise functions. To accommodate, this commit introduces multiple preconditions varying on in the amount of object-sharing that's permitted: For MLKEM_K=4, the output polynomials must be consecutive slices of the same object. For MLKEM_K=2, the first and last two must be consecutive, but belong to separate objects. For MLKEM_K=3, the first three must be consecutive, and the forth is unused and must be NULL. For MLKEM_K=3, which needs sampling of 6 polynomials during keygen, the code previously used 2 dummy buffers for the two unused output slots. This commit changes this behaviour and instead passes NULL as the output buffer, and drops the respective write. This is both cleaner and faster. Object sharing in the sibling functions poly_getnoise_eta2_4x and poly_getnoise_eta1122_4x will be addressed once they are needed. One can attempt to use a single REQUIRES(...) clause with a large disjunction, which proves OK but seems to cause issues at call-time; this needs further investigation. Signed-off-by: Hanno Becker <[email protected]>
This commit adds a tentative specification, but no proof, for the hash_g/sha3_512 function used by the toplevel INDCPA API. It appears that hash_g is called in two versions: - With input and output equal - With input and output disjoint The tentative spec aims to accommodate both cases. Signed-off-by: Hanno Becker <[email protected]>
This commit brings together the previous workstream of low-level CBMC proofs, into the proof of type safety and memory safety for the MLKEM-INDCPA-KeyGen function `indcpa_keypair_derand()`. In terms of type safety, what the proof in particular shows is that the input/output bounds of the various arithmetic functions are coherent, and that there is hence no risk of arithmetic overflow. Signed-off-by: Hanno Becker <[email protected]>
This commit brings together the numerous low-level proofs to a proof of type-safety and memory-safety of the top-level MLKEM key generation API `crypto_kem_keypair_derand()`. Signed-off-by: Hanno Becker <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now - thanks @hanno-becker for making the changes.
This commit disables CBMC coverage generation in CI, which is very time-consuming and not what we are interested in currently. Signed-off-by: Hanno Becker <[email protected]>
This PR brings together the work of the past weeks into a top-level type+memory safety proof of
crypto_kem_keypair_derand()
-- with only NTT and SHA3 axiomatized, but all other lower level functions specified and proved.A few changes had to be made to the existing lower level proofs -- see the individual commit messages for details.