Skip to content

Commit

Permalink
Enabling DIT flag in AArch64. (#1687)
Browse files Browse the repository at this point in the history
- Add a compile flag to enable it.
- DIT APIs and macro are available for external use.
- Add an option `-dit` to the speed test that's available in a build that used the compile flag. 
Enabling it makes the benchmarks closer to not having DIT enabled. 
This is to suggest that the macro can be used by AWS-LC callers in their own scope 
around multiple calls to the library to improve the performance since, 
in that case, DIT would not switch on and off at the entry/exit of the library functions.
  • Loading branch information
nebeid authored Aug 9, 2024
1 parent 8d5e8c1 commit 697b277
Show file tree
Hide file tree
Showing 26 changed files with 433 additions and 53 deletions.
46 changes: 40 additions & 6 deletions BUILDING.md
Original file line number Diff line number Diff line change
Expand Up @@ -218,7 +218,7 @@ Both sets of tests may also be run with `ninja -C build run_tests`, but CMake

If your project is unable to take on a Go or Perl dependency, the AWS-LC repository
provides generated build files. These can be used in place of the files that would
normally be generated by these dependencies.
normally be generated by these dependencies.

It is still recommended to have both Go and Perl installed to be able to run the full
range of unit tests, as well as running valgrind and SDE tests. Building without Go now
Expand All @@ -228,12 +228,46 @@ More information on this can be found in [INCORPORATING.md](/INCORPORATING.md).

# Snapsafe Detection

AWS-LC supports Snapsafe-type uniqueness breaking event detection
on Linux using SysGenID (https://lkml.org/lkml/2021/3/8/677). This mechanism
is used for security hardening. If a SysGenID interface is not found, then the
mechanism is ignored.
AWS-LC supports Snapsafe-type uniqueness breaking event detection
on Linux using SysGenID (https://lkml.org/lkml/2021/3/8/677). This mechanism
is used for security hardening. If a SysGenID interface is not found, then the
mechanism is ignored.

## Snapsafe Prerequisites

Snapshots taken on active hosts can potentially be unsafe to use.
Snapshots taken on active hosts can potentially be unsafe to use.
See "Snapshot Safety Prerequisites" here: https://lkml.org/lkml/2021/3/8/677

# Data Independent Timing on AArch64

The Data Independent Timing (DIT) flag on Arm64 processors, when
enabled, ensures the following as per [Arm A-profile Architecture
Registers
Document](https://developer.arm.com/documentation/ddi0601/2023-12/AArch64-Registers/DIT--Data-Independent-Timing):
- The timing of every load and store instruction is insensitive to the
value of the data being loaded or stored.
- For certain data processing instructions, the instruction takes a
time which is independent of the data in the registers and the NZCV
flags.

It is also expected to disable the Data Memory-dependent Prefetcher
(DMP) feature of Apple M-series CPUs starting at M3 as per [this
article](https://appleinsider.com/articles/24/03/21/apple-silicon-vulnerability-leaks-encryption-keys-and-cant-be-patched-easily).

Building with the option `-DENABLE_DATA_INDEPENDENT_TIMING_AARCH64=ON`
will enable the macro `SET_DIT_AUTO_DISABLE`. This macro is present at
the entry of functions that process/load/store secret data to enable
the DIT flag and then set it to its original value on entry. With
this build option, there is an effect on performance that varies by
function and by processor architecture. The effect is mostly due to
enabling and disabling the DIT flag. If it remains enabled over many
calls, the effect can be largely mitigated. Hence, the macro can be
inserted in the caller's application at the beginning of the code
scope that makes repeated calls to AWS-LC cryptographic
functions. Alternatively, the functions `armv8_enable_dit` and
`armv8_restore_dit` can be placed at the beginning and the end of
the code section, respectively.
An example of that usage is present in the benchmarking function
`Speed()` in `tool/speed.cc` when the `-dit` option is used

./tool/bssl speed -dit
5 changes: 5 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ option(ENABLE_DILITHIUM "Enable Dilithium signatures in the EVP API" OFF)
option(DISABLE_PERL "Disable Perl for AWS-LC" OFF)
option(DISABLE_GO "Disable Go for AWS-LC" OFF)
option(ENABLE_FIPS_ENTROPY_CPU_JITTER "Enable FIPS entropy source: CPU Jitter" OFF)
option(ENABLE_DATA_INDEPENDENT_TIMING_AARCH64 "Enable Data-Independent Timing (DIT) flag on Arm64" OFF)
include(cmake/go.cmake)

enable_language(C)
Expand Down Expand Up @@ -812,6 +813,10 @@ else()
set(ARCH "generic")
endif()

if(ENABLE_DATA_INDEPENDENT_TIMING_AARCH64)
add_definitions(-DMAKE_DIT_AVAILABLE)
endif()

if(USE_CUSTOM_LIBCXX)
if(NOT CLANG)
message(FATAL_ERROR "USE_CUSTOM_LIBCXX only supported with Clang")
Expand Down
6 changes: 6 additions & 0 deletions crypto/curve25519/curve25519.c
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@

#include "internal.h"
#include "../internal.h"
#include "../fipsmodule/cpucap/internal.h"

// X25519 [1] and Ed25519 [2] is an ECDHE protocol and signature scheme,
// respectively. This file contains an implementation of both using two
Expand Down Expand Up @@ -100,6 +101,7 @@ void ED25519_keypair_from_seed(uint8_t out_public_key[ED25519_PUBLIC_KEY_LEN],

void ED25519_keypair(uint8_t out_public_key[ED25519_PUBLIC_KEY_LEN],
uint8_t out_private_key[ED25519_PRIVATE_KEY_LEN]) {
SET_DIT_AUTO_DISABLE;

// Ed25519 key generation: rfc8032 5.1.5
// Private key is 32 octets of random data.
Expand Down Expand Up @@ -127,6 +129,7 @@ int ED25519_sign(uint8_t out_sig[ED25519_SIGNATURE_LEN],
// seed = private_key[0:31]
// A = private_key[32:61] (per 5.1.5.4)
// Compute az = SHA512(seed).
SET_DIT_AUTO_DISABLE;
uint8_t az[SHA512_DIGEST_LENGTH];
SHA512(private_key, ED25519_PRIVATE_KEY_SEED_LEN, az);
// s = az[0:31]
Expand Down Expand Up @@ -217,6 +220,7 @@ int ED25519_verify(const uint8_t *message, size_t message_len,
void X25519_public_from_private(
uint8_t out_public_value[X25519_PUBLIC_VALUE_LEN],
const uint8_t private_key[X25519_PRIVATE_KEY_LEN]) {
SET_DIT_AUTO_DISABLE;

#if defined(CURVE25519_S2N_BIGNUM_CAPABLE)
x25519_public_from_private_s2n_bignum(out_public_value, private_key);
Expand All @@ -229,6 +233,7 @@ void X25519_public_from_private(

void X25519_keypair(uint8_t out_public_value[X25519_PUBLIC_VALUE_LEN],
uint8_t out_private_key[X25519_PRIVATE_KEY_LEN]) {
SET_DIT_AUTO_DISABLE;

RAND_bytes(out_private_key, X25519_PRIVATE_KEY_LEN);

Expand Down Expand Up @@ -256,6 +261,7 @@ int X25519(uint8_t out_shared_key[X25519_SHARED_KEY_LEN],
const uint8_t private_key[X25519_PRIVATE_KEY_LEN],
const uint8_t peer_public_value[X25519_PUBLIC_VALUE_LEN]) {

SET_DIT_AUTO_DISABLE;
static const uint8_t kZeros[X25519_SHARED_KEY_LEN] = {0};

#if defined(CURVE25519_S2N_BIGNUM_CAPABLE)
Expand Down
4 changes: 4 additions & 0 deletions crypto/fipsmodule/aes/aes.c
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,7 @@
// code, above, is incompatible with the |aes_hw_*| functions.

void AES_encrypt(const uint8_t *in, uint8_t *out, const AES_KEY *key) {
SET_DIT_AUTO_DISABLE;
if (hwaes_capable()) {
aes_hw_encrypt(in, out, key);
} else if (vpaes_capable()) {
Expand All @@ -70,6 +71,7 @@ void AES_encrypt(const uint8_t *in, uint8_t *out, const AES_KEY *key) {
}

void AES_decrypt(const uint8_t *in, uint8_t *out, const AES_KEY *key) {
SET_DIT_AUTO_DISABLE;
if (hwaes_capable()) {
aes_hw_decrypt(in, out, key);
} else if (vpaes_capable()) {
Expand All @@ -80,6 +82,7 @@ void AES_decrypt(const uint8_t *in, uint8_t *out, const AES_KEY *key) {
}

int AES_set_encrypt_key(const uint8_t *key, unsigned bits, AES_KEY *aeskey) {
SET_DIT_AUTO_DISABLE;
if (bits != 128 && bits != 192 && bits != 256) {
return -2;
}
Expand All @@ -93,6 +96,7 @@ int AES_set_encrypt_key(const uint8_t *key, unsigned bits, AES_KEY *aeskey) {
}

int AES_set_decrypt_key(const uint8_t *key, unsigned bits, AES_KEY *aeskey) {
SET_DIT_AUTO_DISABLE;
if (bits != 128 && bits != 192 && bits != 256) {
return -2;
}
Expand Down
5 changes: 5 additions & 0 deletions crypto/fipsmodule/cipher/aead.c
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,7 @@ int EVP_AEAD_CTX_init_with_direction(EVP_AEAD_CTX *ctx, const EVP_AEAD *aead,
const uint8_t *key, size_t key_len,
size_t tag_len,
enum evp_aead_direction_t dir) {
SET_DIT_AUTO_DISABLE;
if (key_len != aead->key_len) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_UNSUPPORTED_KEY_SIZE);
ctx->aead = NULL;
Expand Down Expand Up @@ -124,6 +125,7 @@ int EVP_AEAD_CTX_seal(const EVP_AEAD_CTX *ctx, uint8_t *out, size_t *out_len,
size_t max_out_len, const uint8_t *nonce,
size_t nonce_len, const uint8_t *in, size_t in_len,
const uint8_t *ad, size_t ad_len) {
SET_DIT_AUTO_DISABLE;
if (in_len + ctx->aead->overhead < in_len /* overflow */) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_TOO_LARGE);
goto error;
Expand Down Expand Up @@ -162,6 +164,7 @@ int EVP_AEAD_CTX_seal_scatter(const EVP_AEAD_CTX *ctx, uint8_t *out,
size_t in_len, const uint8_t *extra_in,
size_t extra_in_len, const uint8_t *ad,
size_t ad_len) {
SET_DIT_AUTO_DISABLE; //check that it was preserved
// |in| and |out| may alias exactly, |out_tag| may not alias.
if (!check_alias(in, in_len, out, in_len) ||
buffers_alias(out, in_len, out_tag, max_out_tag_len) ||
Expand Down Expand Up @@ -194,6 +197,7 @@ int EVP_AEAD_CTX_open(const EVP_AEAD_CTX *ctx, uint8_t *out, size_t *out_len,
size_t max_out_len, const uint8_t *nonce,
size_t nonce_len, const uint8_t *in, size_t in_len,
const uint8_t *ad, size_t ad_len) {
SET_DIT_AUTO_DISABLE;
if (!check_alias(in, in_len, out, max_out_len)) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_OUTPUT_ALIASES_INPUT);
goto error;
Expand Down Expand Up @@ -241,6 +245,7 @@ int EVP_AEAD_CTX_open_gather(const EVP_AEAD_CTX *ctx, uint8_t *out,
const uint8_t *in, size_t in_len,
const uint8_t *in_tag, size_t in_tag_len,
const uint8_t *ad, size_t ad_len) {
SET_DIT_AUTO_DISABLE;
if (!check_alias(in, in_len, out, in_len)) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_OUTPUT_ALIASES_INPUT);
goto error;
Expand Down
9 changes: 8 additions & 1 deletion crypto/fipsmodule/cipher/cipher.c
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,7 @@ void EVP_CIPHER_CTX_free(EVP_CIPHER_CTX *ctx) {
}

int EVP_CIPHER_CTX_copy(EVP_CIPHER_CTX *out, const EVP_CIPHER_CTX *in) {
SET_DIT_AUTO_DISABLE;
if (in == NULL || in->cipher == NULL) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_INPUT_NOT_INITIALIZED);
return 0;
Expand Down Expand Up @@ -145,6 +146,7 @@ int EVP_CIPHER_CTX_reset(EVP_CIPHER_CTX *ctx) {
int EVP_CipherInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *cipher,
ENGINE *engine, const uint8_t *key, const uint8_t *iv,
int enc) {
SET_DIT_AUTO_DISABLE;
GUARD_PTR(ctx);
if (enc == -1) {
enc = ctx->encrypt;
Expand Down Expand Up @@ -262,6 +264,7 @@ static int block_remainder(const EVP_CIPHER_CTX *ctx, int len) {

int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len,
const uint8_t *in, int in_len) {
SET_DIT_AUTO_DISABLE;
GUARD_PTR(ctx);
if (ctx->poisoned) {
OPENSSL_PUT_ERROR(CIPHER, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED);
Expand Down Expand Up @@ -354,6 +357,7 @@ int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len,
}

int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len) {
SET_DIT_AUTO_DISABLE;
int n;
unsigned int i, b, bl;
GUARD_PTR(ctx);
Expand Down Expand Up @@ -408,6 +412,7 @@ int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len) {

int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len,
const uint8_t *in, int in_len) {
SET_DIT_AUTO_DISABLE;
GUARD_PTR(ctx);
if (ctx->poisoned) {
OPENSSL_PUT_ERROR(CIPHER, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED);
Expand Down Expand Up @@ -474,6 +479,7 @@ int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len,
}

int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *out_len) {
SET_DIT_AUTO_DISABLE;
int i, n;
unsigned int b;
*out_len = 0;
Expand All @@ -482,7 +488,7 @@ int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *out_len) {
// |ctx->cipher->cipher| calls the static aes encryption function way under
// the hood instead of |EVP_Cipher|, so the service indicator does not need
// locking here.

if (ctx->poisoned) {
OPENSSL_PUT_ERROR(CIPHER, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED);
return 0;
Expand Down Expand Up @@ -546,6 +552,7 @@ int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *out_len) {

int EVP_Cipher(EVP_CIPHER_CTX *ctx, uint8_t *out, const uint8_t *in,
size_t in_len) {
SET_DIT_AUTO_DISABLE;
GUARD_PTR(ctx);
GUARD_PTR(ctx->cipher);
const int ret = ctx->cipher->cipher(ctx, out, in, in_len);
Expand Down
40 changes: 40 additions & 0 deletions crypto/fipsmodule/cpucap/cpu_aarch64.c
Original file line number Diff line number Diff line change
Expand Up @@ -49,4 +49,44 @@ void handle_cpu_env(uint32_t *out, const char *in) {
}
}

#if defined(MAKE_DIT_AVAILABLE) && !defined(OPENSSL_WINDOWS)
// "DIT" is not recognised as a register name by clang-10 (at least)
// Register's encoded name is from e.g.
// https://github.com/ashwio/arm64-sysreg-lib/blob/d421e249a026f6f14653cb6f9c4edd8c5d898595/include/sysreg/dit.h#L286
#define DIT_REGISTER s3_3_c4_c2_5

static uint64_t armv8_get_dit(void) {
uint64_t val = 0;
__asm__ volatile("mrs %0, s3_3_c4_c2_5" : "=r" (val));
return (val >> 24) & 1;
}

// See https://github.com/torvalds/linux/blob/53eaeb7fbe2702520125ae7d72742362c071a1f2/arch/arm64/include/asm/sysreg.h#L82
// As per Arm ARM for v8-A, Section "C.5.1.3 op0 == 0b00, architectural hints,
// barriers and CLREX, and PSTATE access", ARM DDI 0487 J.a, system instructions
// for accessing PSTATE fields have the following encoding
// and C5.2.4 DIT, Data Independent Timing:
// Op0 = 0, CRn = 4
// Op1 (3 for DIT) , Op2 (5 for DIT) encodes the PSTATE field modified and defines the constraints.
// CRm = Imm4 (#0 or #1 below)
// Rt = 0x1f
uint64_t armv8_enable_dit(void) {
if (CRYPTO_is_ARMv8_DIT_capable()) {
uint64_t original_dit = armv8_get_dit();
// Encoding of "msr dit, #1"
__asm__ volatile(".long 0xd503415f");
return original_dit;
} else {
return 0;
}
}

void armv8_restore_dit(volatile uint64_t *original_dit) {
if (CRYPTO_is_ARMv8_DIT_capable() && *original_dit != 1) {
// Encoding of "msr dit, #0"
__asm__ volatile(".long 0xd503405f");
}
}
#endif // MAKE_DIT_AVAILABLE && !OPENSSL_WINDOWS

#endif // OPENSSL_AARCH64 && !OPENSSL_STATIC_ARMCAP
6 changes: 6 additions & 0 deletions crypto/fipsmodule/cpucap/cpu_aarch64_apple.c
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,12 @@ void OPENSSL_cpuid_setup(void) {
OPENSSL_armcap_P |= ARMV8_APPLE_M1;
}

#if defined(MAKE_DIT_AVAILABLE)
if (has_hw_feature("hw.optional.arm.FEAT_DIT")) {
OPENSSL_armcap_P |= ARMV8_DIT;
}
#endif // MAKE_DIT_AVAILABLE

// OPENSSL_armcap is a 32-bit, unsigned value which may start with "0x" to
// indicate a hex value. Prior to the 32-bit value, a '~' or '|' may be given.
//
Expand Down
8 changes: 8 additions & 0 deletions crypto/fipsmodule/cpucap/cpu_aarch64_linux.c
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,14 @@ void OPENSSL_cpuid_setup(void) {
}
}

#if defined(MAKE_DIT_AVAILABLE)
static const unsigned long kDIT = 1 << 24;
// Before enabling/disabling the DIT flag, check it's available in HWCAP
if (hwcap & kDIT) {
OPENSSL_armcap_P |= ARMV8_DIT;
}
#endif // MAKE_DIT_AVAILABLE

// OPENSSL_armcap is a 32-bit, unsigned value which may start with "0x" to
// indicate a hex value. Prior to the 32-bit value, a '~' or '|' may be given.
//
Expand Down
4 changes: 3 additions & 1 deletion crypto/fipsmodule/cpucap/internal.h
Original file line number Diff line number Diff line change
Expand Up @@ -235,7 +235,9 @@ OPENSSL_INLINE int CRYPTO_is_ARMv8_wide_multiplier_capable(void) {
(OPENSSL_armcap_P & ARMV8_APPLE_M1) != 0;
}


OPENSSL_INLINE int CRYPTO_is_ARMv8_DIT_capable(void) {
return (OPENSSL_armcap_P & ARMV8_DIT) != 0;
}

#endif // OPENSSL_ARM || OPENSSL_AARCH64

Expand Down
Loading

0 comments on commit 697b277

Please sign in to comment.