OTBN-PQ: Enabling Lattice-Based Post-Quantum Cryptography on the OpenTitan Platform

This code has been published as part of the following conference paper: Enabling Lattice-Based Post-Quantum Cryptography on the OpenTitan Platform. Further below you can find the original README of the OTBN.

This version of OTBN with the PQ extension works with the Earlgrey-PROD-M3-RC1 release (commit 49d4e53) of OpenTitan.

To check out this commit/tag, use the following command:

git checkout 49d4e53

Synthesize OTBN-PQ Standalone

The following commands generate a Vivado build script for a standalone test synthesis of OTBN-PQ:

fusesoc --cores-root . run --flag=fileset_top --target=synth --no-export --setup aisec:ip:otbn_pq:0.1
cd build/aisec_ip_otbn_pq_0.1/synth-vivado/
. /tools/Xilinx/Vivado/2020.2/settings64.sh
vivado

Within Vivado, execute the following command to generate the project:

source aisec_ip_otbn_pq_0.1.tcl

Results

Test synthesis results (Vivado 2020.2 - 21.06.2023)

Module LUTs FFs BRAMs DSPs
OTBN-PQ 55,409 16,575 14.5 49
PQ-ALU 1,830 0 0 11
Twiddle Update Unit 1,939 904 0 22
Register Address Unit 118 47 0 0
Keccak Lane Unit 992 0 0 0
Keccak Plane Unit 320 0 0 0

Simulate OTBN-PQ Standalone

The following commands generate a Vivado build script for a standalone RTL simulation environment for OTBN-PQ:

fusesoc --cores-root . run --flag=fileset_top --target=sim --no-export --setup aisec:ip:otbn_pq:0.1
cd build/aisec_ip_otbn_pq_0.1/sim-vivado/
. /tools/Xilinx/Vivado/2020.2/settings64.sh
vivado

Within Vivado, execute the following commands to generate the project and configure the simulator:

source aisec_ip_otbn_pq_0.1.tcl
set_property top tb_otbn [get_filesets sim_1]
set_property top_lib xil_defaultlib [get_filesets sim_1]
set_property -name {xsim.simulate.runtime} -value {5000000ns} -objects [get_filesets sim_1]

To generate the IMEM and DMEM contents for the RTL-testbench, execute the following bash script:

cd hw/vendor/aisec_otbn_pq
dv/sv/gen_mem_files.sh

Within dv/sv/tb_otbn.sv the paths to the tests and the log file have to be set accordingly:

localparam string                 log_path = "/home/user/projects/aisec/opentitan/hw/vendor/aisec_otbn_pq/dv/sv/log/";
localparam string                 mem_path = "/home/user/projects/aisec/opentitan/hw/vendor/aisec_otbn_pq/dv/sv/";

When the simulation runs, the memory files generated by the dv/sv/gen_mem_files.sh script are loaded into DMEM and IMEM. These memory files correspond to the applications within the sw/ directory. To help with debugging the applications, all bus accesses are logged to dv/sv/log. Furthermore, the clock cycles needed to execute an application are counted (see Results).

Troubleshooting

The simulation might not run out of the box with the Vivado simulator. In that case, the following two files may have to be modified as shown:

  • tlul_rsp_intg_chk:
  logic [63:0] test1;
  logic [63:0] test2;  
  assign test1 = {tl_i.d_user.rsp_intg, D2HRspMaxWidth'(rsp)};
  assign test2 = {tl_i.d_user.data_intg, DataMaxWidth'(tl_i.d_data)};

  prim_secded_inv_64_57_dec u_chk (
    .data_i(test1),
    .data_o(),
    .syndrome_o(),
    .err_o(rsp_err)
  );

  logic rsp_data_err;
  if (EnableRspDataIntgCheck) begin : gen_rsp_data_intg_check
    tlul_data_integ_dec u_tlul_data_integ_dec (
      .data_intg_i(test2),
      .data_err_o(rsp_data_err)
    );
  • tlul_cmd_intg_chk:
    logic[63:0] test1;
    assign test1 = {tl_i.a_user.data_intg, DataMaxWidth'(tl_i.a_data)};
    logic[63:0] test2;
    assign test2 = {tl_i.a_user.cmd_intg, H2DCmdMaxWidth'(cmd)};

    prim_secded_inv_64_57_dec u_chk (
      .data_i(test2),
      .data_o(),
      .syndrome_o(),
      .err_o(err)
    );

    tlul_data_integ_dec u_tlul_data_integ_dec (
      .data_intg_i(test1),
      .data_err_o(data_err)
    );

Results

Simulation results (Vivado 2020.2 - 21.06.2023)

Test CC Errors
Keccak 1,204 0
Kyber-NTT 1,642 0
Kyber-INTT 1,904 0
Kyber-BaseMul 1,626 0
Dilithium-NTT 2,160 0
Dilithium-INTT 2,422 0
Dilithium-Mul 956 0
Dilithium-II Verify 403,592 0
Dilithium-III Verify 683,674 0
Dilithium-V Verify 1,194,032 0
These results include the CCs for the secure wipe at the end of an application.

Citation

To cite this work, please use the following BibTeX entry:

@inproceedings{10.1145/3605769.3623993,
author = {Stelzer, Tobias and Oberhansl, Felix and Schupp, Jonas and Karl, Patrick},
title = {Enabling Lattice-Based Post-Quantum Cryptography on the OpenTitan Platform},
year = {2023},
isbn = {9798400702624},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3605769.3623993},
doi = {10.1145/3605769.3623993},
booktitle = {Proceedings of the 2023 Workshop on Attacks and Solutions in Hardware Security},
pages = {51–60},
numpages = {10},
keywords = {post-quantum cryptography, digital signatures, hardware/software co-design, lattice-based cryptography},
location = {Copenhagen, Denmark},
series = {ASHES '23}
}

License

Apache License Version 2.0

This repository includes code from the following third party libraries:

OpenTitan: Apache License 2.0

OpenTitan Big Number Accelerator (OTBN) Technical Specification

Overview

This document specifies functionality of the OpenTitan Big Number Accelerator, or OTBN. OTBN is a coprocessor for asymmetric cryptographic operations like RSA or Elliptic Curve Cryptography (ECC).

This module conforms to the Comportable guideline for peripheral functionality. See that document for integration overview within the broader top level system.

Features

  • Processor optimized for wide integer arithmetic
  • 32b wide control path with 32 32b wide registers
  • 256b wide data path with 32 256b wide registers
  • Full control-flow support with conditional branch and unconditional jump instructions, hardware loops, and hardware-managed call/return stacks.
  • Reduced, security-focused instruction set architecture for easier verification and the prevention of data leaks.
  • Built-in access to random numbers.

Description

OTBN is a processor specialized for the execution of security-sensitive asymmetric (public-key) cryptography code, such as RSA or ECC. Such algorithms are dominated by wide integer arithmetic, which is supported by OTBN's 256b wide data path, registers, and instructions that operate on these wide data words. On the other hand, the control flow is clearly separated from the data, and reduced to a minimum to avoid data leakage.

The data OTBN processes is security-sensitive, and the processor design centers around that. The design is kept as simple as possible to reduce the attack surface and aid verification and testing. For example, no interrupts or exceptions are included in the design, and all instructions are designed to be executable within a single cycle.

OTBN is designed as a self-contained co-processor with its own instruction and data memory, which is accessible as a bus device.

Compatibility

OTBN is not designed to be compatible with other cryptographic accelerators. It received some inspiration from assembly code available from the Chromium EC project, which has been formally verified within the Fiat Crypto project.

Instruction Set

OTBN is a processor with a custom instruction set. The full ISA description can be found in our ISA manual. The instruction set is split into two groups:

  • The base instruction subset operates on the 32b General Purpose Registers (GPRs). Its instructions are used for the control flow of an OTBN application. The base instructions are inspired by RISC-V's RV32I instruction set, but not compatible with it.
  • The big number instruction subset operates on 256b Wide Data Registers (WDRs). Its instructions are used for data processing.

Processor State

General Purpose Registers (GPRs)

OTBN has 32 General Purpose Registers (GPRs), each of which is 32b wide. The GPRs are defined in line with RV32I and are mainly used for control flow. They are accessed through the base instruction subset. GPRs aren't used by the main data path; this operates on the Wide Data Registers, a separate register file, controlled by the big number instructions.

x0 Zero register. Reads as 0; writes are ignored.
x1 Access to the call stack
x2 ... x31 General purpose registers

Note: Currently, OTBN has no "standard calling convention," and GPRs other than x0 and x1 can be used for any purpose. If a calling convention is needed at some point, it is expected to be aligned with the RISC-V standard calling conventions, and the roles assigned to registers in that convention. Even without an agreed-on calling convention, software authors are encouraged to follow the RISC-V calling convention where it makes sense. For example, good choices for temporary registers are x6, x7, x28, x29, x30, and x31.

Call Stack

OTBN has an in-built call stack which is accessed through the x1 GPR. This is intended to be used as a return address stack, containing return addresses for the current stack of function calls. See the documentation for {{#otbn-insn-ref JAL}} and {{#otbn-insn-ref JALR}} for a description of how to use it for this purpose.

The call stack has a maximum depth of 8 elements. Each instruction that reads from x1 pops a single element from the stack. Each instruction that writes to x1 pushes a single element onto the stack. An instruction that reads from an empty stack or writes to a full stack causes a CALL_STACK software error.

A single instruction can both read from and write to the stack. In this case, the read is ordered before the write. Provided the stack has at least one element, this is allowed, even if the stack is full.
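The push/pop rules above can be illustrated with a small software model. This is only a sketch written for this README, not part of the OTBN code base: the class name `CallStack`, the method `execute`, and the exception `CallStackError` (standing in for the CALL_STACK software error) are hypothetical.

```python
class CallStackError(Exception):
    """Hypothetical stand-in for OTBN's CALL_STACK software error."""


class CallStack:
    """Toy model of the x1 call stack: 8 entries, read ordered before write."""

    DEPTH = 8

    def __init__(self):
        self._entries = []

    def __len__(self):
        return len(self._entries)

    def execute(self, reads_x1=False, write_value=None):
        """Model one instruction. Returns the value read from x1, or None."""
        popped = None
        if reads_x1:
            # A read of x1 pops one element; an empty stack is an error.
            if not self._entries:
                raise CallStackError("read from empty call stack")
            popped = self._entries.pop()
        if write_value is not None:
            # A write to x1 pushes one element; a full stack is an error.
            if len(self._entries) >= self.DEPTH:
                raise CallStackError("write to full call stack")
            self._entries.append(write_value)
        return popped


stack = CallStack()
for address in range(0, 32, 4):      # push 8 return addresses: stack is full
    stack.execute(write_value=address)

# Read and write in one instruction: allowed even though the stack is full,
# because the pop happens before the push.
top = stack.execute(reads_x1=True, write_value=0x100)
print(hex(top), len(stack))
```

In this model, a write-only instruction on the full stack raises `CallStackError`, as does a read on an empty stack.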

Control and Status Registers (CSRs)

Control and Status Registers (CSRs) are 32b wide registers used for "special" purposes, as detailed in their description; they are not related to the GPRs. CSRs can be accessed through dedicated instructions, {{#otbn-insn-ref CSRRS}} and {{#otbn-insn-ref CSRRW}}. Writes to read-only (RO) registers are ignored; they do not signal an error. All read-write (RW) CSRs are set to 0 when OTBN starts an operation (when 1 is written to CMD.start).

Number Access Name Description
0x7C0 RW FG0 Wide arithmetic flag group 0. This CSR provides access to flag group 0 used by wide integer arithmetic. *FLAGS*, *FG0* and *FG1* provide different views on the same underlying bits.
Bit Description
0 Carry of flag group 0
1 MSb of flag group 0
2 LSb of flag group 0
3 Zero of flag group 0
0x7C1 RW FG1 Wide arithmetic flag group 1. This CSR provides access to flag group 1 used by wide integer arithmetic. *FLAGS*, *FG0* and *FG1* provide different views on the same underlying bits.
Bit Description
0 Carry of flag group 1
1 MSb of flag group 1
2 LSb of flag group 1
3 Zero of flag group 1
0x7C8 RW FLAGS Wide arithmetic flag groups. This CSR provides access to both flag groups used by wide integer arithmetic. *FLAGS*, *FG0* and *FG1* provide different views on the same underlying bits.
Bit Description
0 Carry of flag group 0
1 MSb of flag group 0
2 LSb of flag group 0
3 Zero of flag group 0
4 Carry of flag group 1
5 MSb of flag group 1
6 LSb of flag group 1
7 Zero of flag group 1
0x7D0 RW MOD0 Bits [31:0] of the modulus operand, used in the {{#otbn-insn-ref BN.ADDM}}/{{#otbn-insn-ref BN.SUBM}} instructions. This CSR is mapped to the MOD WSR.
0x7D1 RW MOD1 Bits [63:32] of the modulus operand, used in the {{#otbn-insn-ref BN.ADDM}}/{{#otbn-insn-ref BN.SUBM}} instructions. This CSR is mapped to the MOD WSR.
0x7D2 RW MOD2 Bits [95:64] of the modulus operand, used in the {{#otbn-insn-ref BN.ADDM}}/{{#otbn-insn-ref BN.SUBM}} instructions. This CSR is mapped to the MOD WSR.
0x7D3 RW MOD3 Bits [127:96] of the modulus operand, used in the {{#otbn-insn-ref BN.ADDM}}/{{#otbn-insn-ref BN.SUBM}} instructions. This CSR is mapped to the MOD WSR.
0x7D4 RW MOD4 Bits [159:128] of the modulus operand, used in the {{#otbn-insn-ref BN.ADDM}}/{{#otbn-insn-ref BN.SUBM}} instructions. This CSR is mapped to the MOD WSR.
0x7D5 RW MOD5 Bits [191:160] of the modulus operand, used in the {{#otbn-insn-ref BN.ADDM}}/{{#otbn-insn-ref BN.SUBM}} instructions. This CSR is mapped to the MOD WSR.
0x7D6 RW MOD6 Bits [223:192] of the modulus operand, used in the {{#otbn-insn-ref BN.ADDM}}/{{#otbn-insn-ref BN.SUBM}} instructions. This CSR is mapped to the MOD WSR.
0x7D7 RW MOD7 Bits [255:224] of the modulus operand, used in the {{#otbn-insn-ref BN.ADDM}}/{{#otbn-insn-ref BN.SUBM}} instructions. This CSR is mapped to the MOD WSR.
0x7D8 RW RND_PREFETCH Write to this CSR to begin a request to fill the RND cache. Always reads as 0.
0xFC0 RO RND An AIS31-compliant class PTG.3 random number with guaranteed entropy and forward and backward secrecy. Primarily intended to be used for key generation.
The number is sourced from the EDN via a single-entry cache. Reads when the cache is empty will cause OTBN to be stalled until a new random number is fetched from the EDN.
0xFC1 RO URND A random number without guaranteed secrecy properties or specific statistical properties. Intended for use in masking and blinding schemes. Use RND for high-quality randomness.
The number is sourced from a local PRNG. Reads never stall.
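Since FLAGS, FG0 and FG1 are different views on the same underlying bits, a write through one CSR is visible through the others. The following sketch models this with a single 8-bit value, using the CSR numbers and bit positions from the table above. The class `FlagsCsr` and its methods are hypothetical names chosen for illustration only.

```python
class FlagsCsr:
    """Toy model of FG0, FG1 and FLAGS as views on one 8-bit state."""

    FG0, FG1, FLAGS = 0x7C0, 0x7C1, 0x7C8

    def __init__(self):
        # Bits [3:0] hold flag group 0, bits [7:4] hold flag group 1.
        self._bits = 0

    def read(self, csr):
        if csr == self.FG0:
            return self._bits & 0xF
        if csr == self.FG1:
            return (self._bits >> 4) & 0xF
        if csr == self.FLAGS:
            return self._bits & 0xFF
        raise ValueError(f"not a flag CSR: {csr:#x}")

    def write(self, csr, value):
        if csr == self.FG0:
            self._bits = (self._bits & 0xF0) | (value & 0xF)
        elif csr == self.FG1:
            self._bits = (self._bits & 0x0F) | ((value & 0xF) << 4)
        elif csr == self.FLAGS:
            self._bits = value & 0xFF
        else:
            raise ValueError(f"not a flag CSR: {csr:#x}")


flags = FlagsCsr()
flags.write(FlagsCsr.FG0, 0b1001)   # set Carry and Zero of flag group 0
flags.write(FlagsCsr.FG1, 0b0010)   # set MSb of flag group 1
print(bin(flags.read(FlagsCsr.FLAGS)))
```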

Wide Data Registers (WDRs)

In addition to the 32b wide GPRs, OTBN has a second "wide" register file, which is used by the big number instruction subset. This register file consists of NWDR = 32 Wide Data Registers (WDRs). Each WDR is WLEN = 256b wide.

Wide Data Registers (WDRs) and the 32b General Purpose Registers (GPRs) are separate register files. They are only accessible through their respective instruction subset: GPRs are accessible from the base instruction subset, and WDRs are accessible from the big number instruction subset (BN instructions).

Register
w0
w1
...
w31

Wide Special Purpose Registers (WSRs)

OTBN has 256b Wide Special purpose Registers (WSRs). These are analogous to the 32b CSRs, but are used by big number instructions. They can be accessed with the {{#otbn-insn-ref BN.WSRR}} and {{#otbn-insn-ref BN.WSRW}} instructions. Writes to read-only (RO) registers are ignored; they do not signal an error. All read-write (RW) WSRs are set to 0 when OTBN starts an operation (when 1 is written to CMD.start).

Number Access Name Description
0x0 RW MOD The modulus used by the {{#otbn-insn-ref BN.ADDM}} and {{#otbn-insn-ref BN.SUBM}} instructions. This WSR is also visible as CSRs `MOD0` through to `MOD7`.
0x1 RO RND An AIS31-compliant class PTG.3 random number with guaranteed entropy and forward and backward secrecy. Primarily intended to be used for key generation.
The number is sourced from the EDN via a single-entry cache. Reads when the cache is empty will cause OTBN to be stalled until a new random number is fetched from the EDN.
0x2 RO URND A random number without guaranteed secrecy properties or specific statistical properties. Intended for use in masking and blinding schemes. Use RND for high-quality randomness.
The number is sourced from a local PRNG. Reads never stall.
0x3 RW ACC The accumulator register used by the {{#otbn-insn-ref BN.MULQACC}} instruction.
0x4 RO KEY_S0_L Bits [255:0] of share 0 of the 384b OTBN sideload key provided by the [Key Manager](../keymgr/README.md).
A `KEY_INVALID` software error is raised on read if the Key Manager has not provided a valid key.
0x5 RO KEY_S0_H Bits [255:128] of this register are always zero. Bits [127:0] contain bits [383:256] of share 0 of the 384b OTBN sideload key provided by the [Key Manager](../keymgr/README.md).
A `KEY_INVALID` software error is raised on read if the Key Manager has not provided a valid key.
0x6 RO KEY_S1_L Bits [255:0] of share 1 of the 384b OTBN sideload key provided by the [Key Manager](../keymgr/README.md).
A `KEY_INVALID` software error is raised on read if the Key Manager has not provided a valid key.
0x7 RO KEY_S1_H Bits [255:128] of this register are always zero. Bits [127:0] contain bits [383:256] of share 1 of the 384b OTBN sideload key provided by the [Key Manager](../keymgr/README.md).
A `KEY_INVALID` software error is raised on read if the Key Manager has not provided a valid key.
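The relationship between the 256b MOD WSR and the eight 32b CSRs MOD0 to MOD7 is plain bit slicing: CSR MODn covers bits [32n+31:32n] of MOD. The sketch below shows that mapping on Python integers. The helper names `mod_csr_read` and `mod_csr_write` are hypothetical, and the modulus value is only an example.

```python
WORD_MASK = 0xFFFFFFFF  # one 32b CSR


def mod_csr_read(mod_wsr, index):
    """Return the 32b slice of the 256b MOD value seen through CSR MOD<index>."""
    if not 0 <= index <= 7:
        raise ValueError("MOD CSR index must be 0..7")
    return (mod_wsr >> (32 * index)) & WORD_MASK


def mod_csr_write(mod_wsr, index, value):
    """Return the 256b MOD value after writing `value` through CSR MOD<index>."""
    if not 0 <= index <= 7:
        raise ValueError("MOD CSR index must be 0..7")
    shift = 32 * index
    cleared = mod_wsr & ~(WORD_MASK << shift)
    return cleared | ((value & WORD_MASK) << shift)


modulus = 2**255 - 19                      # example 256b value
print(hex(mod_csr_read(modulus, 0)))       # lowest 32 bits
print(hex(mod_csr_read(modulus, 7)))       # highest 32 bits

# Building the same value word by word through the CSR view.
rebuilt = 0
for n in range(8):
    rebuilt = mod_csr_write(rebuilt, n, mod_csr_read(modulus, n))
print(rebuilt == modulus)
```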

Flags

In addition to the wide register file, OTBN maintains global state in two groups of flags for use by wide integer operations. The flag groups are named Flag Group 0 (FG0) and Flag Group 1 (FG1). Each group consists of four flags. Each flag is a single bit.

  • C (Carry flag). Set to 1 if an overflow occurred in the last arithmetic instruction.

  • M (MSb flag). The most significant bit of the result of the last arithmetic or shift instruction.

  • L (LSb flag). The least significant bit of the result of the last arithmetic or shift instruction.

  • Z (Zero flag). Set to 1 if the result of the last operation was zero; otherwise 0.

The M, L, and Z flags are determined based on the result of the operation as it is written back into the result register, without considering the overflow bit.
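As a worked example of these definitions, the sketch below computes the four flags for a 256b addition. It is a simplified illustration, not a model of any specific OTBN instruction: the function name `add_with_flags` is hypothetical, and both inputs are assumed to be in the range [0, 2^256).

```python
WLEN = 256
WDR_MASK = (1 << WLEN) - 1


def add_with_flags(a, b):
    """Add two WLEN-bit values and derive the C, M, L and Z flags."""
    full = a + b                  # may need WLEN + 1 bits
    result = full & WDR_MASK      # value written back to the result register
    flags = {
        "C": full >> WLEN,                # overflow bit of the addition
        "M": result >> (WLEN - 1),        # MSb of the written-back result
        "L": result & 1,                  # LSb of the written-back result
        "Z": int(result == 0),            # ignores the overflow bit
    }
    return result, flags


# (2^256 - 1) + 1 wraps to zero: C and Z are both set.
print(add_with_flags(WDR_MASK, 1))
```

The example shows why Z can be set together with C: the zero check looks only at the 256 bits written back, not at the overflow bit.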

Loop Stack

OTBN has two instructions for hardware-assisted loops: {{#otbn-insn-ref LOOP}} and {{#otbn-insn-ref LOOPI}}. Both use the same state for tracking control flow. This is a stack of tuples containing a loop count, start address and end address. The stack has a maximum depth of eight and the top of the stack is the current loop.

Security Features

OTBN is a security co-processor. It contains various security features and is hardened against side-channel analysis and fault injection attacks. The following sections describe the high-level security features of OTBN. Refer to the Design Details section for a more in-depth description.

Data Integrity Protection

OTBN's data integrity protection is designed to protect the data stored and processed within OTBN from modifications through physical attacks.

Data in OTBN travels along a data path which includes the data memory (DMEM), the load-store-unit (LSU), the register files (GPR and WDR), and the execution units. Whenever possible, data transmitted or stored within OTBN is protected with an integrity protection code which guarantees the detection of at least three modified bits per 32 bit word. Additionally, instructions and data stored in the instruction and data memory, respectively, are scrambled with a lightweight, non-cryptographically-secure cipher.

Refer to the Data Integrity Protection section for details of how the data integrity protections are implemented.

Secure Wipe

OTBN provides a mechanism to securely wipe all state it stores, including the instruction memory.

The full secure wipe mechanism is split into three parts:

  • Data memory secure wipe
  • Instruction memory secure wipe
  • Internal state secure wipe

A secure wipe is performed automatically in certain situations, or can be requested manually by the host software. The full secure wipe is automatically initiated as a local reaction to a fatal error. In addition, it can be triggered by the Life Cycle Controller before RMA entry using the lc_rma_req/ack interface. In both cases OTBN enters the locked state afterwards and needs to be reset. A secure wipe of only the internal state is performed after reset, whenever an OTBN operation is complete, and after a recoverable error. Finally, host software can manually trigger the data memory and instruction memory secure wipe operations by issuing an appropriate command.

Refer to the Secure Wipe section for implementation details.

Instruction Counter

In order to detect and mitigate fault injection attacks on the OTBN, the host CPU can read the number of executed instructions from INSN_CNT and verify whether it matches the expectation. The host CPU can clear the instruction counter when OTBN is not running. Writing any value to INSN_CNT clears this register to zero. Write attempts while OTBN is running are ignored.

Key Sideloading

OTBN software can make use of a single 384b wide key provided by the Key Manager, which is made available in two shares. The key is passed through a dedicated connection between the Key Manager and OTBN to avoid exposing it to other components. Software can access the first share of the key through the KEY_S0_L and KEY_S0_H WSRs, and the second share of the key through the KEY_S1_L and KEY_S1_H WSRs.

It is up to host software to configure the Key Manager so that it provides the right key to OTBN at the start of the operation, and to remove the key again once the operation on OTBN has completed. A KEY_INVALID software error is raised if OTBN software accesses any of the KEY_* WSRs when the Key Manager has not presented a key.
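The register layout of one key share can be sketched as follows: the low WSR holds bits [255:0] of the 384b share, and bits [127:0] of the high WSR hold bits [383:256]. The helper names `split_share` and `join_share` are hypothetical and the share value is made up. How the two shares combine into the key is not described in this section, so the sketch deliberately does not recombine them.

```python
LOW_MASK = (1 << 256) - 1    # KEY_Sx_L: bits [255:0] of the share
HIGH_MASK = (1 << 128) - 1   # KEY_Sx_H: bits [383:256] of the share in [127:0]


def split_share(share):
    """Split a 384b share into the values seen in KEY_Sx_L and KEY_Sx_H."""
    low = share & LOW_MASK
    high = (share >> 256) & HIGH_MASK   # bits [255:128] of KEY_Sx_H stay zero
    return low, high


def join_share(low, high):
    """Rebuild the 384b share from the two WSR values."""
    return ((high & HIGH_MASK) << 256) | (low & LOW_MASK)


share0 = (0xABCD << 256) | 0x1234       # made-up example value
low, high = split_share(share0)
print(hex(low), hex(high), join_share(low, high) == share0)
```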

Blanking

To reduce side-channel leakage, OTBN employs a blanking technique on certain control and data paths. When a path is blanked, it is forced to 0 (by ANDing the path with a blanking signal). This prevents sensitive data bits from producing a power signature via a path that isn't needed for the current instruction.

Blanking controls all come directly from flops to prevent glitches in decode logic from reducing the effectiveness of the blanking. These control signals are determined in the prefetch stage via pre-decode logic. Full decoding is still performed in the execution stage, with the full decode results checked against the pre-decode blanking control. If the full decode disagrees with the pre-decode, OTBN raises a BAD_INTERNAL_STATE fatal error.

Blanking is applied in the following locations:

  • Read path from the bignum, CSR and WSR register files. This is achieved with a one-hot mux with a two-level AND-OR structure.
  • Write data into the bignum, CSR and WSR register files. Blanking is done separately for each register (as opposed to once on incoming write data that fans out to each register).
  • All relevant data paths within the bignum ALU and MAC. Data paths not required for the instruction being executed are blanked.

Note that there is no blanking on the base side (save for the CSRs, as these provide access to WSRs such as ACC).
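The one-hot mux with a two-level AND-OR structure can be illustrated in software. In the sketch below, every register value is ANDed with a mask derived from its own select bit, and the results are ORed together. Registers that are not selected contribute only zeros, and with no select bit set the whole read path is blanked to 0. This is a behavioral illustration only; the function name `blanked_read` is hypothetical and the RTL is structured differently.

```python
WLEN = 256
WDR_MASK = (1 << WLEN) - 1


def blanked_read(registers, onehot_select):
    """Two-level AND-OR mux: AND each register with its select, then OR all."""
    output = 0
    for index, value in enumerate(registers):
        selected = (onehot_select >> index) & 1
        blanking_mask = WDR_MASK if selected else 0   # select bit replicated
        output |= value & blanking_mask               # first AND, then OR
    return output


regs = [0x11, 0x22, 0x33]
print(hex(blanked_read(regs, 0b010)))   # only register 1 reaches the output
print(hex(blanked_read(regs, 0b000)))   # no select bit set: path is blanked
```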
