From 60cbdeecb3cb61135dcd3335cfec3d3d642eda0f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=A9r=C3=B4me=20Qu=C3=A9vremont?= Date: Mon, 18 Dec 2023 15:40:06 +0100 Subject: [PATCH] Changing part number in user manual We accelerate this action to stabilize the embedded part number for upcoming commits. Consensus (not unanimity) on **CV32A60MX** for the embedded configuration. Indeed, the previous CV32E60X (or CV32E65X) part number does not reflect the CVA6 family name and is a real issue for Thales. CV32A60MX will change to CV32A65MX if the configuration is upgraded with dual-issue. --- AXI_Interface.rst | 652 ++++++++++++++++++++++++++++++++ CSR_Performance_Counters.rst | 102 +++++ CVX_Interface_Coprocessor.rst | 204 ++++++++++ Interfaces.rst | 79 ++++ Introduction.rst | 184 +++++++++ Programmer_View.rst | 206 ++++++++++ RISCV_Instructions_RV32A.rst | 179 +++++++++ RISCV_Instructions_RV32C.rst | 370 ++++++++++++++++++ RISCV_Instructions_RV32I.rst | 542 ++++++++++++++++++++++++++ RISCV_Instructions_RV32M.rst | 143 +++++++ RISCV_Instructions_RV32ZCb.rst | 171 +++++++++ RISCV_Instructions_RVZicond.rst | 62 +++ 12 files changed, 2894 insertions(+) create mode 100644 AXI_Interface.rst create mode 100644 CSR_Performance_Counters.rst create mode 100644 CVX_Interface_Coprocessor.rst create mode 100644 Interfaces.rst create mode 100644 Introduction.rst create mode 100644 Programmer_View.rst create mode 100644 RISCV_Instructions_RV32A.rst create mode 100644 RISCV_Instructions_RV32C.rst create mode 100644 RISCV_Instructions_RV32I.rst create mode 100644 RISCV_Instructions_RV32M.rst create mode 100644 RISCV_Instructions_RV32ZCb.rst create mode 100644 RISCV_Instructions_RVZicond.rst diff --git a/AXI_Interface.rst b/AXI_Interface.rst new file mode 100644 index 0000000000..9c9fb00eea --- /dev/null +++ b/AXI_Interface.rst @@ -0,0 +1,652 @@ +.. 
+ Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + + Original Author: Alae Eddine EZ ZEJJARI (alae-eddine.ez-zejjari@external.thalesgroup.com) + +.. _cva6_axi: + +AXI +=== + +Introduction +------------ +In this chapter, we describe in detail the restriction that apply to the supported features. + +In order to understand how the AXI memory interface behaves in CVA6, it is necessary to read the AMBA AXI and ACE Protocol Specification (https://developer.arm.com/documentation/ihi0022/hc) and this chapter. + +*Applicability of this chapter to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "AXI included" + "CV32A60MX", "AXI included" + +About the AXI4 protocol +~~~~~~~~~~~~~~~~~~~~~~~ + +The AMBA AXI protocol supports high-performance, high-frequency system designs for communication between Manager and Subordinate components. + +The AXI protocol features are: + +* It is suitable for high-bandwidth and low-latency designs. +* High-frequency operation is provided, without using complex bridges. +* The protocol meets the interface requirements of a wide range of components. +* It is suitable for memory controllers with high initial access latency. +* Flexibility in the implementation of interconnect architectures is provided. +* It is backward-compatible with AHB and APB interfaces. + +The key features of the AXI protocol are: + +* Separate address/control and data phases. +* Support for unaligned data transfers, using byte strobes. +* Uses burst-based transactions with only the start address issued. +* Separate read and write data channels, that can provide low-cost Direct Memory Access (DMA). +* Support for issuing multiple outstanding addresses. +* Support for out-of-order transaction completion. +* Permits easy addition of register stages to provide timing closure. 
+ +The present specification is based on: https://developer.arm.com/documentation/ihi0022/hc + + +AXI4 and CVA6 +~~~~~~~~~~~~~ + +The AXI bus protocol is used with the CVA6 processor as a memory interface. Since the processor is the one that initiates the connection with the memory, it will have a manager interface to send requests to the subordinate, which will be the memory. + +Features supported by CVA6 are the ones in the AMBA AXI4 specification and the Atomic Operation feature from AXI5, with restrictions that apply to some features. + +This doesn’t mean that the full set of signals available on an AXI interface is supported by the CVA6. Nevertheless, all required AXI signals are implemented. + +Supported AXI4 features are defined in AXI Protocol Specification sections: A3, A4, A5, A6 and A7. + +The supported AXI5 feature is defined in AXI Protocol Specification section: E1.1. + + +Signal Description (Section A2) +------------------------------- + +This section introduces the AXI memory interface signals of CVA6. Most of the signals are supported by CVA6; the tables summarizing the signals identify the exceptions. + +In the following tables, the *Src* column tells whether the signal is driven by the Manager or the Subordinate. + +The AXI required and optional signals, and the default signal values that apply when an optional signal is not implemented, are defined in AXI Protocol Specification section A9.3. + + +Global signals (Section A2.1) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Table 2.1 shows the global AXI memory interface signals. + + +.. list-table:: + :widths: 15 15 55 + :header-rows: 1 + + * - **Signal** + - **Src** + - **Description** + * - **ACLK** + - Clock source + - | Global clock signal. Synchronous signals are sampled on the + | rising edge of the global clock. + * - **ARESETn** + - Reset source + - | Global reset signal. This signal is active-LOW. 
+ + +Write address channel signals (Section A2.2) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Table 2.2 shows the AXI memory interface write address channel signals. Unless the description indicates otherwise, a signal can take any parameter if it is supported. + + +.. list-table:: + :widths: 15 15 15 40 + :header-rows: 1 + + * - **Signal** + - **Src** + - **Support** + - **Description** + * - **AWID** + - M + - | Yes + | (optional) + - | Identification tag for a write transaction. + | CVA6 gives the ID depending on the type of transaction. + | See :ref:`transaction_identifiers_label`. + * - **AWADDR** + - M + - Yes + - | The address of the first transfer in a write transaction. + * - **AWLEN** + - M + - | Yes + | (optional) + - | Length, the exact number of data transfers in a write + | transaction. This information determines the number of + | data transfers associated with the address. + | All write transactions performed by CVA6 are of length 1. + | (AWLEN = 0b00000000) + * - **AWSIZE** + - M + - | Yes + | (optional) + - | Size, the number of bytes in each data transfer in a write + | transaction. + | See :ref:`address_structure_label`. + * - **AWBURST** + - M + - | Yes + | (optional) + - | Burst type, indicates how address changes between each + | transfer in a write transaction. + | All write transactions performed by CVA6 are of burst type + | INCR. (AWBURST = 0b01) + * - **AWLOCK** + - M + - | Yes + | (optional) + - | Provides information about the atomic characteristics of a + | write transaction. + * - **AWCACHE** + - M + - | Yes + | (optional) + - | Indicates how a write transaction is required to progress + | through a system. + | The subordinate is always of type Normal Non-cacheable Non-bufferable. + | (AWCACHE = 0b0010) + * - **AWPROT** + - M + - Yes + - | Protection attributes of a write transaction: + | privilege, security level, and access type. + | The value of AWPROT is always 0b000. 
+ * - **AWQOS** + - M + - | No + | (optional) + - | Quality of Service identifier for a write transaction. + | AWQOS = 0b0000 + * - **AWREGION** + - M + - | No + | (optional) + - | Region indicator for a write transaction. + | AWREGION = 0b0000 + * - **AWUSER** + - M + - | No + | (optional) + - | User-defined extension for the write address channel. + | AWUSER = 0b00 + * - **AWATOP** + - M + - | Yes + | (optional) + - | AWATOP indicates the properties of the Atomic Operation + | used for a write transaction. + | See :ref:`atomic_transactions_label`. + * - **AWVALID** + - M + - Yes + - | Indicates that the write address channel signals are valid. + * - **AWREADY** + - S + - Yes + - | Indicates that a transfer on the write address channel + | can be accepted. + + +Write data channel signals (Section A2.3) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Table 2.3 shows the AXI write data channel signals. Unless the description indicates otherwise, a signal can take any parameter if it is supported. + +.. list-table:: + :widths: 15 15 15 40 + :header-rows: 1 + + * - **Signal** + - **Src** + - **Support** + - **Description** + * - **WID** + - M + - | Yes + | (optional) + - | The ID tag of the write data transfer. + | CVA6 gives the ID depending on the type of transaction. + | See :ref:`transaction_identifiers_label`. + * - **WDATA** + - M + - Yes + - | Write data. + * - **WSTRB** + - M + - | Yes + | (optional) + - | Write strobes, indicate which byte lanes hold valid data. + | See :ref:`data_read_and_write_structure_label`. + * - **WLAST** + - M + - Yes + - | Indicates whether this is the last data transfer in a write + | transaction. + * - **WUSER** + - M + - | Yes + | (optional) + - | User-defined extension for the write data channel. + * - **WVALID** + - M + - Yes + - | Indicates that the write data channel signals are valid. + * - **WREADY** + - S + - Yes + - | Indicates that a transfer on the write data channel can be + | accepted. 
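The WSTRB row of the write data channel table can be illustrated with a small model. This is a minimal sketch, not taken from the CVA6 sources: the function name ``wstrb``, the default 64-bit data bus, and the alignment check are assumptions made for illustration, following the usual AXI byte-lane convention (one strobe bit per byte of WDATA).

```python
def wstrb(addr: int, size_bytes: int, data_width_bits: int = 64) -> int:
    """Return a write-strobe mask for an aligned transfer (hypothetical helper).

    One strobe bit covers one byte lane of WDATA. The 64-bit default bus
    width is an assumption for this sketch, not a CVA6 parameter value.
    """
    lanes = data_width_bits // 8
    # Only power-of-two sizes up to the bus width are considered here.
    if size_bytes <= 0 or size_bytes & (size_bytes - 1) or size_bytes > lanes:
        raise ValueError("size must be a power of two no larger than the bus")
    # The sketch rejects unaligned addresses instead of modelling them.
    if addr % size_bytes:
        raise ValueError("unaligned transfers are not modelled")
    first_lane = addr % lanes
    return ((1 << size_bytes) - 1) << first_lane


print(hex(wstrb(0x1004, 4)))  # 4 bytes in the upper half of a 64-bit bus
```

A full-width access sets every strobe bit, while a narrower access sets 1, 2 or 4 contiguous bits at the lane selected by the low address bits.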
+ + + + +Write Response Channel signals (Section A2.4) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Table 2.4 shows the AXI write response channel signals. Unless the description indicates otherwise, a signal can take any parameter if it is supported. + + +.. list-table:: + :widths: 15 15 15 40 + :header-rows: 1 + + * - **Signal** + - **Src** + - **Support** + - **Description** + * - **BID** + - S + - | Yes + | (optional) + - | Identification tag for a write response. + | CVA6 gives the ID depending on the type of transaction. + | See :ref:`transaction_identifiers_label`. + * - **BRESP** + - S + - Yes + - | Write response, indicates the status of a write transaction. + | See :ref:`read_and_write_response_structure_label`. + * - **BUSER** + - S + - | No + | (optional) + - | User-defined extension for the write response channel. + | Not supported. + * - **BVALID** + - S + - Yes + - | Indicates that the write response channel signals are valid. + * - **BREADY** + - M + - Yes + - | Indicates that a transfer on the write response channel can be + | accepted. + + + + +Read address channel signals (Section A2.5) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Table 2.5 shows the AXI read address channel signals. Unless the description indicates otherwise, a signal can take any parameter if it is supported. + + +.. list-table:: + :widths: 15 15 15 40 + :header-rows: 1 + + * - **Signal** + - **Src** + - **Support** + - **Description** + * - **ARID** + - M + - | Yes + | (optional) + - | Identification tag for a read transaction. + | CVA6 gives the ID depending on the type of transaction. + | See :ref:`transaction_identifiers_label`. + * - **ARADDR** + - M + - | Yes + - | The address of the first transfer in a read transaction. + * - **ARLEN** + - M + - | Yes + | (optional) + - | Length, the exact number of data transfers in a read + | transaction. This information determines the number of data + | transfers associated with the address. 
+ | All read transactions performed by CVA6 have a length equal to 0, + | ICACHE_LINE_WIDTH/64 or DCACHE_LINE_WIDTH/64. + * - **ARSIZE** + - M + - | Yes + | (optional) + - | Size, the number of bytes in each data transfer in a read + | transaction. + | See :ref:`address_structure_label`. + * - **ARBURST** + - M + - | Yes + | (optional) + - | Burst type, indicates how address changes between each + | transfer in a read transaction. + | All read transactions performed by CVA6 are of burst type INCR. + | (ARBURST = 0b01) + * - **ARLOCK** + - M + - | Yes + | (optional) + - | Provides information about the atomic characteristics of + | a read transaction. + * - **ARCACHE** + - M + - | Yes + | (optional) + - | Indicates how a read transaction is required to progress + | through a system. + | The memory is always of type Normal Non-cacheable Non-bufferable. + | (ARCACHE = 0b0010) + * - **ARPROT** + - M + - | Yes + - | Protection attributes of a read transaction: + | privilege, security level, and access type. + | The value of ARPROT is always 0b000. + * - **ARQOS** + - M + - | No + | (optional) + - | Quality of Service identifier for a read transaction. + | ARQOS = 0b0000 + * - **ARREGION** + - M + - | No + | (optional) + - | Region indicator for a read transaction. + | ARREGION = 0b0000 + * - **ARUSER** + - M + - | No + | (optional) + - | User-defined extension for the read address channel. + | ARUSER = 0b00 + * - **ARVALID** + - M + - | Yes + | (optional) + - | Indicates that the read address channel signals are valid. + * - **ARREADY** + - S + - | Yes + | (optional) + - | Indicates that a transfer on the read address channel can be + | accepted. + + +Read data channel signals (Section A2.6) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Table 2.6 shows the AXI read data channel signals. Unless the description indicates otherwise, a signal can take any parameter if it is supported. + + +.. 
list-table:: + :widths: 15 15 15 40 + :header-rows: 1 + + * - **Signal** + - **Src** + - **Support** + - **Description** + * - **RID** + - S + - | Yes + | (optional) + - | The ID tag of the read data transfer. + | CVA6 gives the ID depending on the type of transaction. + | See :ref:`transaction_identifiers_label`. + * - **RDATA** + - S + - Yes + - | Read data. + * - **RLAST** + - S + - Yes + - | Indicates whether this is the last data transfer in a read + | transaction. + * - **RUSER** + - S + - | Yes + | (optional) + - | User-defined extension for the read data channel. + | Not supported. + * - **RVALID** + - S + - Yes + - | Indicates that the read data channel signals are valid. + * - **RREADY** + - M + - Yes + - | Indicates that a transfer on the read data channel can be accepted. + + + + +Single Interface Requirements: Transaction structure (Section A3.4) +------------------------------------------------------------------- + +This section describes the structure of transactions. The following sections define the address, data, and response +structures. + +.. _address_structure_label: + +Address structure (Section A3.4.1) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The AXI protocol is burst-based. The Manager begins each burst by driving control information and the address of the first byte in the transaction to the Subordinate. As the burst progresses, the Subordinate must calculate the addresses of subsequent transfers in the burst. + +**Burst length** + +The burst length is specified by: + +* ``ARLEN[7:0]``, for read transfers +* ``AWLEN[7:0]``, for write transfers + +The burst length for AXI4 is defined as: ``Burst_Length = AxLEN[7:0] + 1``. 
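The burst length relation can be checked with a short calculation. This is a minimal sketch under stated assumptions: the helper names ``burst_length`` and ``burst_bytes`` are hypothetical, ``AxLEN`` is treated as the 8-bit field declared above, and ``AxSIZE`` is taken as the 3-bit field from the signal tables that encodes the bytes per beat as a power of two.

```python
def burst_length(axlen: int) -> int:
    """Number of beats in a burst: AxLEN + 1, with AxLEN an 8-bit field."""
    if not 0 <= axlen <= 0xFF:
        raise ValueError("AxLEN must fit in 8 bits")
    return axlen + 1


def burst_bytes(axlen: int, axsize: int) -> int:
    """Upper bound on the bytes moved: beats times 2**AxSIZE bytes per beat."""
    if not 0 <= axsize <= 0b111:
        raise ValueError("AxSIZE must fit in 3 bits")
    return burst_length(axlen) * (1 << axsize)


# AxLEN = 0 encodes a single-beat transaction.
print(burst_length(0), burst_bytes(1, 3))
```

With this encoding, ``AWLEN = 0b00000000`` corresponds to the single-beat writes described in the write address channel table.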
+ +CVA6 has some limitations governing the use of bursts: + +* *All read transactions performed by CVA6 are of burst length equal to 0, ICACHE_LINE_WIDTH/64 or DCACHE_LINE_WIDTH/64.* +* *All write transactions performed by CVA6 are of burst length equal to 1.* + +**Burst size** + +The maximum number of bytes to transfer in each data transfer, or beat, in a burst, is specified by: + +* ``ARSIZE[2:0]``, for read transfers +* ``AWSIZE[2:0]``, for write transfers + +*The maximum value that AxSIZE can take is log2(AXI_DATA_WIDTH/8) (8 bytes per transfer).* + +**Burst type** + +The AXI protocol defines three burst types: + +* **FIXED** +* **INCR** +* **WRAP** + +The burst type is specified by: + +* ``ARBURST[1:0]``, for read transfers +* ``AWBURST[1:0]``, for write transfers + +*All transactions performed by CVA6 are of burst type INCR. (AxBURST = 0b01)* + + +.. _data_read_and_write_structure_label: + +Data read and write structure: (Section A3.4.4) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +**Write strobes** + +The ``WSTRB[n:0]`` signals, when HIGH, specify the byte lanes of the data bus that contain valid information. There is one write strobe +for each 8 bits of the write data bus, therefore ``WSTRB[n]`` corresponds to ``WDATA[(8n)+7: (8n)]``. + +*Write Strobe width is equal to (AXI_DATA_WIDTH/8) (n = (AXI_DATA_WIDTH/8)-1).* + +*The size of all transactions performed by CVA6 is equal to the number of byte lanes of the data bus containing valid information.* +*This means 1, 2, 4, ... or (AXI_DATA_WIDTH/8) byte lanes containing valid information.* + + +**Unaligned transfers** + +For any burst that is made up of data transfers wider than 1 byte, the first bytes accessed might be unaligned with the natural +address boundary. For example, a 32-bit data packet that starts at a byte address of 0x1002 is not aligned to the natural 32-bit +transfer size. + +*CVA6 does not perform unaligned transfers.* + + +.. 
_read_and_write_response_structure_label: + +Read and write response structure (Section A3.4.5) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The AXI protocol provides response signaling for both read and write transactions: + +* For read transactions, the response information from the Subordinate is signaled on the read data channel. +* For write transactions, the response information is signaled on the write response channel. + +CVA6 does not consider the responses sent by the memory, except in the case of an exclusive access (``XRESP[1:0]`` = 0b01). + +Transaction Attributes: Memory types (Section A4) +-------------------------------------------------- + +This section describes the attributes that determine how a transaction should be treated by the AXI subordinate that is connected to the CVA6. + +``AxCACHE`` always takes the value 0b0010. The subordinate should be of type Normal Non-cacheable Non-bufferable. + +The required behavior for Normal Non-cacheable Non-bufferable memory is: + +* The write response must be obtained from the final destination. +* Read data must be obtained from the final destination. +* Transactions are modifiable. +* Writes can be merged. + + +.. _transaction_identifiers_label: + +Transaction Identifiers (Section A5) +------------------------------------- + +The AXI protocol includes AXI ID transaction identifiers. A Manager can use these to identify separate transactions that must be returned in order. + +CVA6 identifies each type of transaction with a specific ID: + +* For read transactions, the ID can be 0 or 1. +* For write transactions, ID = 1. +* For Atomic operations, ID = 3. This ID must be sent in the write channels and also in the read channel if the transaction performed requires response data. 
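The ID assignment listed above can be captured as a lookup table, for instance in a testbench monitor. This is a minimal sketch: the ``Txn`` enumeration and the ``id_is_expected`` helper are hypothetical names introduced for illustration, and only the ID values themselves come from the list above.

```python
from enum import Enum


class Txn(Enum):
    """Transaction kinds distinguished by the ID list (hypothetical names)."""
    READ = "read"
    WRITE = "write"
    ATOMIC = "atomic"


# ID values taken from the list above: reads use 0 or 1, writes 1, atomics 3.
ALLOWED_IDS = {
    Txn.READ: {0, 1},
    Txn.WRITE: {1},
    Txn.ATOMIC: {3},
}


def id_is_expected(kind: Txn, axi_id: int) -> bool:
    """Check that an observed AXI ID matches the documented assignment."""
    return axi_id in ALLOWED_IDS[kind]


print(id_is_expected(Txn.ATOMIC, 3), id_is_expected(Txn.WRITE, 0))
```

A monitor built this way would flag, for example, a write issued with ID 0 as inconsistent with this chapter.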
+ +AXI Ordering Model (Section A6) +------------------------------- + +AXI ordering model overview (Section A6.1) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + +The AXI ordering model is based on the use of the transaction identifier, which is signaled on ``ARID`` or ``AWID``. + +Transaction requests on the same channel, with the same ID and destination, are guaranteed to remain in order. + +Transaction responses with the same ID are returned in the same order as the requests were issued. + +Write transaction requests with the same destination are guaranteed to remain in order, because all write transactions performed by CVA6 have the same ID. + +CVA6 can perform multiple outstanding write address transactions. + +CVA6 cannot perform a read transaction and a write transaction at the same time. Therefore there are no ordering problems between read and write transactions. + + +The ordering model does not give any ordering guarantees between: + +* Transactions from different Managers +* Read Transactions with different IDs +* Transactions to different Memory locations + +If the CVA6 requires ordering between transactions that have no ordering guarantee, the Manager must wait to receive a response to the first transaction before issuing the second transaction. + + +Memory locations and Peripheral regions (Section A6.2) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The address map in AMBA is made up of Memory locations and Peripheral regions, but the AXI interface is associated with the memory interface of CVA6. + +A Memory location has all of the following properties: + +* A read of a byte from a Memory location returns the last value that was written to that byte location. +* A write to a byte of a Memory location updates the value at that location to a new value that is obtained by a subsequent read of that location. +* Reading or writing to a Memory location has no side-effects on any other Memory location. +* Observation guarantees for Memory are given for each location. 
+* The size of a Memory location is equal to the single-copy atomicity size for that component. + + +Transactions and ordering (Section A6.3) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A transaction is a read or a write to one or more address locations. The locations are determined by ``AxADDR`` and any relevant qualifiers such as the Non-secure bit in ``AxPROT``. + +* Ordering guarantees are given only between accesses to the same Memory location or Peripheral region. +* A transaction to a Peripheral region must be entirely contained within that region. +* A transaction that spans multiple Memory locations has multiple ordering guarantees. + +Transactions performed by CVA6 are of type Normal, because ``AxCACHE[1]`` is asserted. + +Normal transactions are used to access Memory locations and are not expected to be used to access Peripheral regions. + +A Normal access to a Peripheral region must complete in a protocol-compliant manner, but the result is IMPLEMENTATION DEFINED. + +A write transaction performed by CVA6 is Non-bufferable (it is not possible to send an early response before the transaction reaches the final destination), because ``AxCACHE[0]`` is deasserted. + +Ordered write observation (Section A6.8) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +To improve compatibility with interface protocols that support a different ordering model, a Subordinate interface can give stronger ordering guarantees for write transactions. A stronger ordering guarantee is known as Ordered Write Observation. + +*The CVA6 AXI interface exhibits Ordered Write Observation, so the Ordered_Write_Observation property is True.* + +An interface that exhibits Ordered Write Observation gives guarantees for write transactions that are not dependent on the destination or address: + +* A write W1 is guaranteed to be observed by a write W2, where W2 is issued after W1, from the same Manager, with the same ID. + + +.. 
_atomic_transactions_label: + +Atomic transactions (Section E1.1) +----------------------------------- + +AMBA 5 introduces Atomic transactions, which perform more than just a single access and have an operation that is associated with the transaction. Atomic transactions enable sending the operation to the data, permitting the operation to be performed closer to where the data is located. Atomic transactions are suited to situations where the data is located a significant distance from the agent that must perform the operation. + +CVA6 supports only the AtomicLoad and AtomicSwap transactions, so ``AWATOP[5:4]`` can be 00, 10 or 11. + +CVA6 performs only little-endian operations, so ``AWATOP[3]`` = 0. + +For AtomicLoad, CVA6 supports all arithmetic operations encoded on the lower-order ``AWATOP[2:0]`` signals. diff --git a/CSR_Performance_Counters.rst b/CSR_Performance_Counters.rst new file mode 100644 index 0000000000..59d570005e --- /dev/null +++ b/CSR_Performance_Counters.rst @@ -0,0 +1,102 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales DIS design services SAS + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_csr_performance_counters: + +*Applicability of this chapter to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "Performance counters included" + "CV32A60MX", "No performance counters" + +CSR performance counters control +================================ +CVA6 implements performance counters according to the RISC-V Privileged Specification, version 1.11 (see Hardware Performance Monitor, Section 3.1.10). The performance counters are placed inside the Control and Status Registers (CSRs) and can be accessed with the ``CSRRW(I)`` and ``CSRRS/C(I)`` instructions. 
+ +CVA6 implements the standard 64-bit clock cycle counter ``mcycle``, the retired instruction counter ``minstret`` as well as the six generic 64-bit event counters ``mhpm_counter_3`` to ``mhpm_counter_8`` including their upper 32-bit counterparts ``mhpm_counter_3h`` to ``mhpm_counter_8h``. The corresponding event selectors ``mhpm_event_3`` to ``mhpm_event_8`` are implemented for the selection of the source of events. The unavailable counters (``mhpm_counter_9(h)`` to ``mhpm_counter_31(h)``) and event selectors (``mhpm_event_9`` to ``mhpm_event_31``) always read 0. + +The ``mcountinhibit`` CSR is used to individually inhibit the incrementing of the counters. The read-only shadows of the counters are also implemented as ``cycle``, ``instret`` and ``hpmcountern``. The ``mcycle`` and ``minstret`` counters are always available but the ``mhpmcounter`` counters are optional and can be configured through the parameter ``PERF_COUNTER_EN``. Supervisor and user access to the performance counters is allowed by enabling it in the ``mcounteren`` and ``scounteren`` CSRs. + +Event Selector +------------------------------- +The event selector CSRs ``mhpm_event_3`` to ``mhpm_event_8`` control which of the events are counted by the six generic event counters ``mhpm_counter_3`` to ``mhpm_counter_8`` respectively. + +The five least significant bits (LSBs) of the event selector CSRs are written to select the event that one needs to count from a particular generic event counter. Thus, we can count six different events at a time using the six generic counters. 
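The selector encoding can be modelled in a few lines. This is a minimal sketch, assuming only what the paragraph above states (the event ID lives in the five LSBs and there are six generic counters, numbered 3 to 8); the helper names ``selected_event`` and ``plan_counters`` are hypothetical and the sketch does not access any real CSR.

```python
EVENT_ID_MASK = 0x1F          # five least significant bits of a selector CSR
GENERIC_COUNTERS = range(3, 9)  # generic counters 3 to 8


def selected_event(selector_value: int) -> int:
    """Extract the event ID from a value read back from a selector CSR."""
    return selector_value & EVENT_ID_MASK


def plan_counters(event_ids):
    """Map up to six event IDs onto generic counters 3..8 (illustrative only)."""
    event_ids = list(event_ids)
    if len(event_ids) > len(GENERIC_COUNTERS):
        raise ValueError("only six generic counters are available")
    for event_id in event_ids:
        if not 0 <= event_id <= EVENT_ID_MASK:
            raise ValueError("event ID must fit in five bits")
    return dict(zip(GENERIC_COUNTERS, event_ids))


# Counter index -> value to write into the matching event selector.
print(plan_counters([1, 2, 5]))
```

The returned dictionary gives, for each counter index, the value that software would write to the corresponding event selector.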
+ +Each of the six generic performance counters is able to monitor events from one of these sources: + ++----------+-----------------------------+---------------------------------------------------------------+ +| Event ID | Event Name | Description | ++==========+=============================+===============================================================+ +| 1 | L1 I-Cache Misses | Number of misses in L1 I-Cache | ++----------+-----------------------------+---------------------------------------------------------------+ +| 2 | L1 D-Cache Misses | Number of misses in L1 D-Cache | ++----------+-----------------------------+---------------------------------------------------------------+ +| 3 | ITLB Misses | Number of misses in ITLB | ++----------+-----------------------------+---------------------------------------------------------------+ +| 4 | DTLB Misses | Number of misses in DTLB | ++----------+-----------------------------+---------------------------------------------------------------+ +| 5 | Load Accesses | Number of data memory loads | ++----------+-----------------------------+---------------------------------------------------------------+ +| 6 | Store Accesses | Number of data memory stores | ++----------+-----------------------------+---------------------------------------------------------------+ +| 7 | Exceptions | Valid Exceptions encountered | ++----------+-----------------------------+---------------------------------------------------------------+ +| 8 | Exception Handler Returns | Return from an exception | ++----------+-----------------------------+---------------------------------------------------------------+ +| 9 | Branch Instructions | Number of branch instructions encountered | ++----------+-----------------------------+---------------------------------------------------------------+ +| 10 | Branch Mispredicts | Number of branch mispredictions | 
++----------+-----------------------------+---------------------------------------------------------------+ +| 11 | Branch Exceptions | Number of valid branch exceptions | ++----------+-----------------------------+---------------------------------------------------------------+ +| 12 | Call | Number of call instructions | ++----------+-----------------------------+---------------------------------------------------------------+ +| 13 | Return | Number of return instructions | ++----------+-----------------------------+---------------------------------------------------------------+ +| 14 | MSB Full | Scoreboard is full | ++----------+-----------------------------+---------------------------------------------------------------+ +| 15 | Instruction Fetch Empty | Number of invalid instructions in IF | ++----------+-----------------------------+---------------------------------------------------------------+ +| 16 | L1 I-Cache Accesses | Number of accesses to Instruction Cache | ++----------+-----------------------------+---------------------------------------------------------------+ +| 17 | L1 D-Cache Accesses | Number of accesses to Data Cache | ++----------+-----------------------------+---------------------------------------------------------------+ +| 18 | L1 Cache Line Eviction | Number of Data Cache line eviction | ++----------+-----------------------------+---------------------------------------------------------------+ +| 19 | ITLB Flush | Number of ITLB Flushes | ++----------+-----------------------------+---------------------------------------------------------------+ +| 20 | Integer Instructions | Number of Integer instructions | ++----------+-----------------------------+---------------------------------------------------------------+ +| 21 | Floating Point Instructions | Number of Floating point instructions | ++----------+-----------------------------+---------------------------------------------------------------+ +| 22 | Pipeline Stall | Number of 
cycles the pipeline is stalled during read operands | ++----------+-----------------------------+---------------------------------------------------------------+ +| 23-31 | Reserved | Reserved | ++----------+-----------------------------+---------------------------------------------------------------+ + +Controlling the counters from software +--------------------------------------- +All performance counters are enabled after reset. The ``mcountinhibit`` CSR at address ``0x320`` controls which of the performance counters increment as described in the RISC-V Privileged Specification, version 1.11 (see Machine Counter-Inhibit CSR, Section 3.1.12). For instance, bit 0 is set to 0 for ``mcycle(h)`` to increment as usual, bit 2 for ``minstret(h)`` and bit X for event counter ``mhpmcounterX(h)``. + +The lower 32 bits of all counters can be accessed through the base register, whereas the upper 32 bits are accessed through the h-register. + diff --git a/CVX_Interface_Coprocessor.rst b/CVX_Interface_Coprocessor.rst new file mode 100644 index 0000000000..a0f9d1148e --- /dev/null +++ b/CVX_Interface_Coprocessor.rst @@ -0,0 +1,204 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_cvx_interface_coprocessor: + +CV-X-IF Interface and Coprocessor +================================= + +The CV-X-IF interface of CVA6 allows its supported instruction set to be extended +with external coprocessors. + +*Applicability of this chapter to configurations:* + +.. 
csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "CV-X-IF included" + "CV32A60MX", "CV-X-IF included" + + +CV-X-IF interface specification +------------------------------- + +Description +~~~~~~~~~~~ +This design specification presents global functionalities of +Core-V-eXtension-Interface (XIF, CVXIF, CV-X-IF, X-interface) in the CVA6 core. + +.. code-block:: text + + The CORE-V X-Interface is a RISC-V eXtension interface that provides a + generalized framework suitable to implement custom coprocessors and ISA + extensions for existing RISC-V processors. + + --core-v-xif Readme, https://github.com/openhwgroup/core-v-xif + +The specification of the CV-X-IF bus protocol can be found at [CV-X-IF]. + +CV-X-IF aims to: + +* Create interfaces to connect a coprocessor to the CVA6 to execute instructions. +* Offload CVA6 illegal instructions to the coprocessor to be executed. +* Get the results of offloaded instructions from the coprocessor so they are written back into the CVA6 register file. +* Add standard RISC-V instructions unsupported by CVA6 or custom instructions and implement them in a coprocessor. +* Kill offloaded instructions to allow speculative execution in the coprocessor. (Not yet supported in CVA6) +* Connect the coprocessor to memory via the CVA6 Load and Store Unit. (Not yet supported in CVA6) + +The coprocessor operates like another functional unit so it is connected to +the CVA6 in the execute stage. + +Only the 3 mandatory interfaces from the CV-X-IF specification (issue, commit and +result) have been implemented. +The compressed interface, memory interface and memory result interface are not yet +implemented in the CVA6. + +Supported Parameters +~~~~~~~~~~~~~~~~~~~~ +The following table presents CVXIF parameters supported by CVA6. 
+ +=============== =========================== =============================================== +Signal Value Description +=============== =========================== =============================================== +**X_NUM_RS** int: 2 or 3 (configurable) | Number of register file read ports that can + | be used by the eXtension interface +**X_ID_WIDTH** int: 3 | Identification width for the eXtension + | interface +**X_MEM_WIDTH** n/a (feature not supported) | Memory access width for loads/stores via the + | eXtension interface +**X_RFR_WIDTH** int: ``XLEN`` (32 or 64) | Register file read access width for the + | eXtension interface +**X_RFW_WIDTH** int: ``XLEN`` (32 or 64) | Register file write access width for the + | eXtension interface +**X_MISA** logic[31:0]: 0x0000_0000 | MISA extensions implemented on the eXtension + | interface +=============== =========================== =============================================== + +CV-X-IF Enabling +~~~~~~~~~~~~~~~~ +CV-X-IF can be enabled or disabled via the ``CVA6ConfigCvxifEn`` parameter in the SystemVerilog source code. + +Illegal instruction decoding +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The CVA6 decoder module detects illegal instructions for the CVA6, prepares exception field +with relevant information (exception code "ILLEGAL INSTRUCTION", instruction value). + +The exception valid flag is raised in CVA6 decoder when CV-X-IF is disabled. Otherwise +it is not raised at this stage because the decision belongs to the coprocessor +after the offload process. + +RS3 support +~~~~~~~~~~~ +The number of source registers used by the CV-X-IF coprocessor is configurable with 2 or +3 source registers. + +If CV-X-IF is enabled and configured with 3 source registers, +a third read port is added to the CVA6 general purpose register file. 
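
The two rules above (the decoder defers the illegal-instruction decision when CV-X-IF is enabled, and a third register file read port is added only when CV-X-IF is enabled with 3 source registers) can be summarized in a small behavioral model. This is only an illustrative sketch, not the CVA6 RTL: the names ``DecodeOutcome``, ``decode_unsupported`` and ``gpr_read_ports`` are hypothetical, the exception code value 2 is the standard RISC-V "illegal instruction" cause, and the baseline of 2 read ports is an assumption inferred from "a third read port is added".

```python
from dataclasses import dataclass

# Standard RISC-V exception cause for "illegal instruction".
ILLEGAL_INSTRUCTION = 2


@dataclass
class DecodeOutcome:
    """Hypothetical summary of what the decoder hands to later stages."""
    exception_code: int    # always prepared, even if not raised yet
    instruction: int       # instruction value stored in the exception field
    exception_valid: bool  # raised at decode only when CV-X-IF is disabled
    offloaded: bool        # decision deferred to the coprocessor


def decode_unsupported(instruction: int, cvxif_en: bool) -> DecodeOutcome:
    """Model the decoder's reaction to an instruction CVA6 cannot execute."""
    return DecodeOutcome(
        exception_code=ILLEGAL_INSTRUCTION,
        instruction=instruction,
        exception_valid=not cvxif_en,
        offloaded=cvxif_en,
    )


def gpr_read_ports(cvxif_en: bool, x_num_rs: int) -> int:
    """Number of register file read ports (assumed baseline: 2)."""
    if x_num_rs not in (2, 3):
        raise ValueError("X_NUM_RS must be 2 or 3")
    return 3 if (cvxif_en and x_num_rs == 3) else 2


if __name__ == "__main__":
    # 0x0000007B is an arbitrary encoding used only for illustration.
    print(decode_unsupported(0x0000007B, cvxif_en=True))
    print(decode_unsupported(0x0000007B, cvxif_en=False))
    print(gpr_read_ports(cvxif_en=True, x_num_rs=3))
```

In this model, the final accept/reject decision for an offloaded instruction is left to the coprocessor's issue response, as described in the interface connection section that follows.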
 + +Description of interface connections between CVA6 and Coprocessor +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In the CVA6 execute stage, there is a new functional unit dedicated to driving the CV-X-IF interfaces. +Here is *how* and *to what* the CV-X-IF interfaces are connected in the CVA6. + +* Issue interface + - Request + + | Operands are connected to ``issue_req.rs`` signals + + | Scoreboard transaction id is connected to ``issue_req.id`` signal. + | Therefore scoreboard ids and offloaded instruction ids are linked + | together (equal in this implementation). It allows the CVA6 to do out + | of order execution with the coprocessor in the same way as other + | functional units. + + | Undecoded instruction is connected to ``issue_req.instruction`` + + | Valid signal for CVXIF functional unit is connected to + | ``issue_req.valid`` + + | All ``issue_req.rs_valid`` signals are set to 1. The validity of source + | registers is ensured by the valid signal sent from the issue stage. + - Response + + | If ``issue_resp.accept`` is set during a transaction (i.e. issue valid + | and ready are set), the offloaded instruction is accepted by the coprocessor + | and a result transaction will happen. + + | If ``issue_resp.accept`` is not set during a transaction, the offloaded + | instruction is illegal and an illegal instruction exception will be + | raised as soon as no result transaction is being written on the writeback bus. + +* Commit interface + - | Valid signal of commit interface is connected to the valid signal of + | issue interface. + - | Id signal of commit interface is connected to issue interface id signal + | (i.e. scoreboard id). + - | Killing of offloaded instructions is never set. (Unsupported feature) + - | Therefore all accepted offloaded instructions are committed to their + | execution and no killing of instruction is possible in this implementation. 
+ +* Result interface + - Request + + | Ready signal of result interface is always set as CVA6 is always ready + | to take a result from coprocessor for an accepted offloaded instruction. + - Response + + | Result response is directly connected to writeback bus of the CV-X-IF + | functional unit. + + | Valid signal of result interface is connected to valid signal of + | writeback bus. + + | Id signal of result interface is connected to scoreboard id of + | writeback bus. + + | Write enable signal of result interface is connected to a dedicated CV-X-IF WE + | signal in CVA6 which signals scoreboard if a writeback should happen + | or not to the CVA6 register file. + + | ``exccode`` and ``exc`` signals of result interface are connected to exception + | signals of writeback bus. Exception from coprocessor does not write + | the ``tval`` field in exception signal of writeback bus. + + | Three registers are added to hold illegal instruction information in + | case a result transaction and a non-accepted issue transaction happen + | in the same cycle. Result transactions will be written to the writeback + | bus in this case having priority over the non-accepted instruction due + | to being linked to an older offloaded instruction. Once the writeback + | bus is free, an illegal instruction exception will be raised thanks to + | information held in these three registers. + +Coprocessor recommendations for use with CVA6's CV-X-IF +------------------------------------------------------- + +CVA6 supports all coprocessors supporting the CV-X-IF specification with the exception of: + +* Coprocessors requiring the Memory interface and Memory result interface (not implemented in CVA6 yet). + - All memory transactions should happen via the Issue interface, i.e. Load into CVA6 register file + then initialize an issue transaction. +* Coprocessors requiring the Compressed interface (not implemented in CVA6 yet). + - RISC-V Compressed extension (RVC) is already implemented in CVA6. 
User Space for custom compressed instructions + is not big enough to have RVC and a custom compressed extension. +* Stateful coprocessors. + - CVA6 will commit on the Commit interface all its issue transactions. Speculation + information is only kept in the CVA6 and the speculation process is only done in CVA6. + The coprocessor shall be stateless; otherwise it will not be able to revert its state if CVA6 kills an + in-flight instruction (in case of mispredict or flush). + +How to use CVA6 without CV-X-IF interface +----------------------------------------- +Select a configuration with the ``CVA6ConfigCvxifEn`` parameter disabled or change it for your configuration. + +Never leave the CV-X-IF interface unconnected with the ``CVA6ConfigCvxifEn`` parameter enabled. + +How to design a coprocessor for the CV-X-IF interface +----------------------------------------------------- +*The team is looking for a contributor to write this section.* + +How to program a CV-X-IF coprocessor +------------------------------------ +*The team is looking for a contributor to write this section.* diff --git a/Interfaces.rst b/Interfaces.rst new file mode 100644 index 0000000000..f17b5fdd75 --- /dev/null +++ b/Interfaces.rst @@ -0,0 +1,79 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_interfaces: + +Interfaces +========== + +AXI Interface +------------- +The AXI interface is described in a separate chapter. + +*Applicability to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "AXI implemented" + "CV32A60MX", "AXI implemented" + +Debug Interface +--------------- + The team is looking for a contributor for this section. + We can likely reuse an E4 DVplan. 
+ Remember: the debug module (DTM) is not in the scope, so we focus on the debug interrupt. 
+ How to use the interface (HW/SW). We can refer to RISC-V specifications. + If the section is too heavy, promote it to a separate chapter. + +*Applicability to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "Debug interface implemented" + "CV32A60MX", "No debug interface" + +Interrupt Interface +------------------- + The team is looking for a contributor for this section. + We can likely reuse an E4 DVplan. + How to use the interface (HW/SW). We can refer to RISC-V specifications. + If the section is too heavy, promote it to a separate chapter. + +*The interrupt interface is applicable to all configurations.* + +TRI Interface +------------- +The TRI interface is exclusive of the AXI interface. + +For more information, refer to OpenPiton documents. + +*Applicability to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "No TRI interface" + "CV32A60MX", "No TRI interface" diff --git a/Introduction.rst b/Introduction.rst new file mode 100644 index 0000000000..9e4bae1168 --- /dev/null +++ b/Introduction.rst @@ -0,0 +1,184 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_user_guide_introduction: + +Introduction +============ + +License +------- +Copyright 2023 OpenHW Group and Thales + +SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +Licensed under the Solderpad Hardware License v 2.1 (the “License”); you may not use this file except in compliance with the License, or, at your option, the Apache License version 2.0. +You may obtain a copy of the License at https://solderpad.org/licenses/SHL-2.1/. 
+Unless required by applicable law or agreed to in writing, any work distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and limitations under the License. + +Work In Progress +---------------- +This document is a work in progress and the team currently drafting it focuses on its use to verify several configurations of CVA6. + +The current limitation of documentation on CVA6 is well understood. +Rather than regretting this, the reader is encouraged to contribute to it to make CVA6 an even better core. +To contribute to the project, refer to the Contributing_ guidelines and get in touch with the team. + +.. _Contributing: https://github.com/jquevremont/cva6/blob/master/CONTRIBUTING.md + +Target Audience +--------------- +The CVA6 user manual targets: + +* SW programmers +* HW designers who integrate CVA6 into a SoC/ASIC/FPGA +* Architects who design a coprocessor for the CV-X-IF interface and who need to create SW to use it +* HW designers who synthesize/place&route/verify a design that embeds CVA6 +* Verification engineers involved in the OpenHW Group’s CVA6 project who use this manual as a reference. + +The user guide does not target people who dig into CVA6 design. No internal mechanisms are described here, +except if the user has some sort of control on it. A separate design document digs into the core microarchitecture. + +CVA6 Overview +-------------- +**CVA6** is a RISC-V compatible application processor core that can be configured +as a 32- or 64-bit core: **CV32A6** and **CV64A6**. + +CVA6 can be configured to the users' and application needs thanks to several +parameters and optional features (MMU, PMP, FPU, cache organization and size...). +It targets **FPGA** and **ASIC** technologies. + +CVA6, as an application core, can run many operating systems. 
It has already been +demonstrated with embedded **Linux** distributions (built with **BuildRoot** and +**Yocto**), **FreeRTOS** and **Zephyr**. + +CVA6 features the **CV-X-IF** coprocessor interface to extend the set of instructions it can execute. + +The goal of CVA6 is to be **fully compliant** with RISC-V specifications and feature no or extremely +few custom extensions (except through extensions on CV-X-IF interface). + +CV32A6 and CV64A6 share the same **SystemVerilog** source code, available in this GitHub_ repository. + +.. _GitHub: https://github.com/openhwgroup/cva6/ + +CV64A6 is an industrial evolution of ARIANE created by ETH Zürich and the +University of Bologna. CV32A6 is a later addition by Thales. CVA6 is now +curated at the OpenHW Group by its members. + +Configurations +-------------- + +CVA6 is actually a family of cores, as CVA6 can be configured to the users' needs with more than 50 parameters. +A configuration is defined as a given set of parameters. + +A few configurations undergo a complete verification process to bring them to **TRL-5**, +the maturity level where they can be integrated in production ASICs. + +This manual includes generic descriptions of CVA6 capabilities, as well as their applicability to +the verified configurations. + +As of today, two configurations are being verified and addressed in this document: + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Short description", "Target", "Privilege levels", "Supported RISC-V ISA", "CV-X-IF" + + "**CV32A60X**", "32-bit **application** core", "ASIC", "Machine, Supervisor, User", "RV32IMACZicsr_Zifencei_Zicount_Zba_Zbb_Zbc_Zbs_Zcb_Zicond", "Included" + "**CV32A60MX**", "32-bit **embedded** core", "ASIC", "Machine only", "RV32IMCZicsr_Zifencei_Zba_Zbb_Zbc_Zbs_Zcb", "Included" + +CV32A60MX is an interim part number until the team can decide if this configuration is single- or dual-issue. 
+If the dual-issue architecture is selected, the part number will become CV32A65MX to denote the extra performance. + +In the future, dedicated user manuals for each configuration could be generated. The team is looking for a contributor to implement this through *templating*. + +Scope of the IP +--------------- + +The **scope of the IP** refers to the subsystem that is documented here. + +.. image:: ../02_cva6_requirements/images/cva6_scope.png + +As displayed in the picture above, the IP comprises: + +- The CVA6 core; +- L1 write-through cache; +- Optional FPU; +- Optional MMU; +- Optional PMP; +- CSR; +- Performance counters; +- AXI interface; +- Interface with the P-Mesh coherence system of OpenPiton; +- CV-X-IF coprocessor interface (not shown). + +These are not part of the IP (several solutions can be used): + +- CLINT or PLIC Interrupt modules; +- Debug module (such as DTM); +- Support of L1 write-back cache. + +Specifications and References +----------------------------- + +Applicable Specifications +~~~~~~~~~~~~~~~~~~~~~~~~~ + +CVA6 strives to comply with the following specifications. When the +specifications allow variations (parameters, optional features...), +this users' guide will detail them. + +.. [RVunpriv] “The RISC-V Instruction Set Manual, Volume I: User-Level ISA, + Document Version 20191213”, Editors Andrew Waterman and Krste Asanović, + RISC-V Foundation, December 13, 2019. + +.. [RVpriv] “The RISC-V Instruction Set Manual, Volume II: Privileged + Architecture, Document Version 20211203”, Editors Andrew Waterman, Krste + Asanović and John Hauser, RISC-V Foundation, December 4, 2021. + +.. [RVdbg] “RISC-V External Debug Support, Document Version 0.13.2”, + Editors Tim Newsome and Megan Wachs, RISC-V Foundation, March 22, 2019. + +.. [RVcompat] “RISC-V Architectural Compatibility Test Framework”, + https://github.com/riscv-non-isa/riscv-arch-test. + +.. [AXI] AXI Specification, + https://developer.arm.com/documentation/ihi0022/hc. + +.. 
[CV-X-IF] CV-X-IF coprocessor interface currently + prepared at OpenHW Group; current version in + https://docs.openhwgroup.org/projects/openhw-group-core-v-xif/. + +.. [OpenPiton] “OpenPiton Microarchitecture Specification”, Princeton + University, + https://parallel.princeton.edu/openpiton/docs/micro_arch.pdf. + +Reference Documents +~~~~~~~~~~~~~~~~~~~ + +These are additional reference cited in this guide: + +.. [CLINT] Core-Local Interruptor (CLINT), “SiFive E31 Core Complex + Manual v2p0”, chapter 6, + https://static.dev.sifive.com/SiFive-E31-Manual-v2p0.pdf + + + + + diff --git a/Programmer_View.rst b/Programmer_View.rst new file mode 100644 index 0000000000..03473d06d7 --- /dev/null +++ b/Programmer_View.rst @@ -0,0 +1,206 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales DIS design services SAS + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_programmers_view: + +Programmer’s View +================= +RISC-V specifications allow many variations. This chapter provides more details about RISC-V variants available for the programmer. +A global view of the CVA6 family is provided, as well as details for each verified configuration. + +RISC-V Extensions +----------------- + +CVA6 family +~~~~~~~~~~~ + +The following extensions are available for the CVA6 family. +Some of them are optional and are enabled through parameters in the SystemVerilog design. + +.. 
csv-table:: + :widths: auto + :align: left + :header: "Extension", "Optional", "RV32 (in CV32A6)", "RV64 (in CV64A6)", "Note" + + "I - Base Integer Instruction Set", "No", "✓", "✓", "Note 1" + "A - Atomic Instructions", "Yes", "✓", "✓", "Note 1" + "Zb* - Bit-Manipulation", "Yes", "✓", "✓", "Note 1" + "C - Compressed Instructions ", "Yes", "✓", "✓", "Note 1" + "Zcb - Code Size Reduction", "Yes", "✓", "✓", "Note 1" + "D - Double precision floating-point", "Yes", "", "✓", "Note 1" + "F - Single precision floating-point", "Yes", "✓", "✓", "Note 1" + "M - Integer Multiply/Divide", "No", "✓", "✓", "Note 1" + "Zicount - Performance Counters", "Yes", "✓", "✓", "Note 2" + "Zicsr - Control and Status Register Instructions", "No", "✓", "✓", "Note 2" + "Zifencei - Instruction-Fetch Fence", "No", "✓", "✓", "Note 2" + "Zicond - Integer Conditional Operations (Ratification pending)", "Yes", "✓", "✓", "Note 2" + +Notes: + +* Note 1: These extensions have a slightly different definition between RV32 and RV64. They are therefore denoted with digits (e.g. RV\ **32**\ M). +* Note 2: These extensions do not differ between RV32 and RV64. They are therefore denoted without digits below (e.g. RVZifencei). + +*The following tables detail the availability of extensions for the various CVA6 configurations:* + +CV32A60X extensions +~~~~~~~~~~~~~~~~~~~ + +These extensions are available in CV32A60X: + +.. 
csv-table:: + :widths: auto + :align: left + :header: "Extension", "Available in CV32A60X" + + "RV32I - Base Integer Instruction Set", "✓" + "RV32A - Atomic Instructions", "✓" + "RV32Zb* - Bit-Manipulation (Zba, Zbb, Zbc, Zbs)", "✓" + "RV32C - Compressed Instructions ", "✓" + "RV32Zcb - Code Size Reduction", "✓" + "RV32D - Double precision floating-point", "" + "RV32F - Single precision floating-point", "" + "RV32M - Integer Multiply/Divide", "✓" + "RVZicount - Performance Counters", "✓" + "RVZicsr - Control and Status Register Instructions", "✓" + "RVZifencei - Instruction-Fetch Fence", "✓" + "RVZicond - Integer Conditional Operations (Ratification pending)", "✓" + +CV32A60MX extensions +~~~~~~~~~~~~~~~~~~~~ + +These extensions are available in CV32A60MX: + +.. csv-table:: + :widths: auto + :align: left + :header: "Extension", "Available in CV32A60MX" + + "RV32I - Base Integer Instruction Set", "✓" + "RV32A - Atomic Instructions", "" + "RV32Zb* - Bit-Manipulation (Zba, Zbb, Zbc, Zbs)", "✓" + "RV32C - Compressed Instructions ", "✓" + "RV32Zcb - Code Size Reduction", "✓" + "RV32D - Double precision floating-point", "" + "RV32F - Single precision floating-point", "" + "RV32M - Integer Multiply/Divide", "✓" + "RVZicount - Performance Counters", "" + "RVZicsr - Control and Status Register Instructions", "✓" + "RVZifencei - Instruction-Fetch Fence", "✓" + "RVZicond - Integer Conditional Operations (Ratification pending)", "" + + +RISC-V Privileges +----------------- + +CVA6 family +~~~~~~~~~~~ + +CVA6 supports these privilege modes: + +.. csv-table:: + :widths: auto + :align: left + :header: "Mode" + + "M - Machine" + "S - Supervisor" + "U - User" + +Note: The addition of the H Extension is in progress. After that, HS, VS, and VU modes will also be available. 
+ +*The following tables detail the availability of privilege modes for the various CVA6 configurations:* + +CV32A60X privilege modes +~~~~~~~~~~~~~~~~~~~~~~~~ + +These privilege modes are available in CV32A60X: + +.. csv-table:: + :widths: auto + :align: left + :header: "Privileges", "Available in CV32A60X" + + "M - Machine", "✓" + "S - Supervisor", "✓" + "U - User", "✓" + +CV32A60MX privilege modes +~~~~~~~~~~~~~~~~~~~~~~~~~ + +These privilege modes are available in CV32A60MX: + +.. csv-table:: + :widths: auto + :align: left + :header: "Privileges", "Available in CV32A60MX" + + "M - Machine", "✓" + "S - Supervisor", "" + "U - User", "" + + +RISC-V Virtual Memory +--------------------- + +CVA6 family +~~~~~~~~~~~ + +CV32A6 supports the RISC-V **Sv32** virtual memory when the ``MMUEn`` parameter is set to 1 (and ``Xlen`` is set to 32). + +CV64A6 supports the RISC-V **Sv39** virtual memory when the ``MMUEn`` parameter is set to 1 (and ``Xlen`` is set to 64). + +By default, CV32A6 and CV64A6 are in RISC-V **Bare** mode. **Sv32** or **Sv39** are enabled by writing the required configuration to the MODE field of the ``satp`` register. + +When the ``MMUEn`` parameter is set to 0, CV32A6 and CV64A6 are always in RISC-V **Bare** mode; the ``satp`` MODE field remains at 0 and writes to this register are ignored. + +Notes for the integrator: + +* The virtual memory is implemented by a memory management unit (MMU) that accelerates the translation from virtual memory addresses (as handled by the core) to physical memory addresses. The MMU integrates translation lookaside buffers (TLB) and a hardware page table walker (PTW). The numbers of instruction and data TLB entries are configured with ``InstrTlbEntries`` and ``DataTlbEntries``. + +* The MMU will integrate a microarchitectural optimization featuring two levels of TLB: level 1 TLB (sized by ``InstrTlbEntries`` and ``DataTlbEntries``) and a shared level 2 TLB. The optimization has no consequences on the programmer's view. 
+ +* The addition of the hypervisor support will come with **Sv39x4** virtual memory that is not yet documented here. + +*These are the addressing modes supported by the various CVA6 configurations:* + +CV32A60X virtual memory +~~~~~~~~~~~~~~~~~~~~~~~ + +CV32A60X integrates an MMU and supports both the **Bare** and **Sv32** addressing modes. + + +CV32A60MX virtual memory +~~~~~~~~~~~~~~~~~~~~~~~~ + +CV32A60MX integrates no MMU and only supports the **Bare** addressing mode. + + +Memory Alignment +---------------- +CVA6 **does not support non-aligned** memory accesses. + +*This is applicable to all configurations.* + +Harts +----- +CVA6 features a **single hart**, i.e. a single hardware thread. + +Therefore the words *hart* and *core* have the same meaning in this guide. + +*This is applicable to all configurations.* + diff --git a/RISCV_Instructions_RV32A.rst b/RISCV_Instructions_RV32A.rst new file mode 100644 index 0000000000..a2c404088c --- /dev/null +++ b/RISCV_Instructions_RV32A.rst @@ -0,0 +1,179 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_riscv_instructions_RV32A: + +*Applicability of this chapter to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "Implemented extension" + "CV32A60MX", "Not implemented extension" + +**Note**: This chapter is specific to CV32A6 configurations. CV64A6 configurations implement as an option RV64A, that includes additional instructions. + + +RV32A Atomic Instructions +-------------------------------- + +The standard atomic instruction extension is denoted by instruction subset name “A”, and contains instructions that atomically read-modify-write memory to support synchronization between +multiple RISC-V harts running in the same memory space. 
The two forms of atomic instruction +provided are load-reserved/store-conditional instructions and atomic fetch-and-op memory instructions. Both types of atomic instruction support various memory consistency orderings including +unordered, acquire, release, and sequentially consistent semantics. + +Load-Reserved/Store-Conditional Instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **LR.W**: Load-Reserved Word + + **Format**: lr.w rd, (rs1) + + **Description**: LR loads a word from the address in rs1, places the sign-extended value in rd, and registers a reservation on the memory address. + + **Pseudocode**: x[rd] = LoadReserved32(M[x[rs1]]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a Load/AMO address misaligned exception will be generated. + +- **SC.W**: Store-Conditional Word + + **Format**: sc.w rd, rs2, (rs1) + + **Description**: SC writes a word in rs2 to the address in rs1, provided a valid reservation still exists on that address. SC writes zero to rd on success or a nonzero code on failure. + + **Pseudocode**: x[rd] = StoreConditional32(M[x[rs1]], x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a Store/AMO address misaligned exception will be generated. + +Atomic Memory Operations +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **AMOADD.W**: Atomic Memory Operation: Add Word + + **Format**: amoadd.w rd, rs2, (rs1) + + **Description**: AMOADD.W atomically loads a data value from the address in rs1, places the value into register rd, then adds the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] + x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. 
+ +- **AMOAND.W**: Atomic Memory Operation: And Word + + **Format**: amoand.w rd, rs2, (rs1) + + **Description**: AMOAND.W atomically loads a data value from the address in rs1, places the value into register rd, then performs an AND between the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] & x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. + +- **AMOOR.W**: Atomic Memory Operation: Or Word + + **Format**: amoor.w rd, rs2, (rs1) + + **Description**: AMOOR.W atomically loads a data value from the address in rs1, places the value into register rd, then performs an OR between the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] | x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. + +- **AMOXOR.W**: Atomic Memory Operation: Xor Word + + **Format**: amoxor.w rd, rs2, (rs1) + + **Description**: AMOXOR.W atomically loads a data value from the address in rs1, places the value into register rd, then performs a XOR between the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] ^ x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. 
+ +- **AMOSWAP.W**: Atomic Memory Operation: Swap Word + + **Format**: amoswap.w rd, rs2, (rs1) + + **Description**: AMOSWAP.W atomically loads a data value from the address in rs1, places the value into register rd, then performs a SWAP between the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] SWAP x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. + +- **AMOMIN.W**: Atomic Memory Operation: Minimum Word + + **Format**: amomin.w rd, rs2, (rs1) + + **Description**: AMOMIN.W atomically loads a data value from the address in rs1, places the value into register rd, then chooses the minimum between the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] MIN x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. + +- **AMOMINU.W**: Atomic Memory Operation: Minimum Word, Unsigned + + **Format**: amominu.w rd, rs2, (rs1) + + **Description**: AMOMINU.W atomically loads a data value from the address in rs1, places the value into register rd, then chooses the minimum (the values treated as unsigned) between the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] MINU x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. 
+ +- **AMOMAX.W**: Atomic Memory Operation: Maximum Word + + **Format**: amomax.w rd, rs2, (rs1) + + **Description**: AMOMAX.W atomically loads a data value from the address in rs1, places the value into register rd, then chooses the maximum between the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] MAX x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. + +- **AMOMAXU.W**: Atomic Memory Operation: Maximum Word, Unsigned + + **Format**: amomaxu.w rd, rs2, (rs1) + + **Description**: AMOMAXU.W atomically loads a data value from the address in rs1, places the value into register rd, then chooses the maximum (the values treated as unsigned) between the loaded value and the original value in rs2, then stores the result back to the address in rs1. + + **Pseudocode**: x[rd] = AMO32(M[x[rs1]] MAXU x[rs2]) + + **Invalid values**: NONE + + **Exception raised**: If the address is not naturally aligned (4-byte boundary), a misaligned address exception will be generated. + diff --git a/RISCV_Instructions_RV32C.rst b/RISCV_Instructions_RV32C.rst new file mode 100644 index 0000000000..8b9bf0beab --- /dev/null +++ b/RISCV_Instructions_RV32C.rst @@ -0,0 +1,370 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_riscv_instructions_RV32C: + +*Applicability of this chapter to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "Implemented extension" + "CV32A60MX", "Implemented extension" + +**Note**: This chapter is specific to CV32A6 configurations. 
CV64A6 configurations implement as an option RV64C, that includes a different list of instructions. + + +RV32C Compressed Instructions +----------------------------- + +RVC uses a simple compression scheme that offers shorter 16-bit versions of common 32-bit RISC-V +instructions when: + + • the immediate or address offset is small; + • one of the registers is the zero register (x0), the ABI link register (x1), or the ABI stack pointer (x2); + • the destination register and the first source register are identical; + • the registers used are the 8 most popular ones. + +The C extension is compatible with all other standard instruction extensions. The C extension +allows 16-bit instructions to be freely intermixed with 32-bit instructions, with the latter now able +to start on any 16-bit boundary. With the addition of the C extension, JAL and JALR instructions +will no longer raise an instruction misaligned exception. + +Integer Computational Instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **C.LI**: Compressed Load Immediate + + **Format**: c.li rd, imm[5:0] + + **Description**: loads the sign-extended 6-bit immediate, imm, into register rd. + + **Pseudocode**: x[rd] = sext(imm[5:0]) + + **Invalid values**: rd = x0 + + **Exception raised**: NONE + +- **C.LUI**: Compressed Load Upper Immediate + + **Format**: c.lui rd, nzimm[17:12] + + **Description**: loads the non-zero 6-bit immediate field into bits 17–12 of the destination register, clears the bottom 12 bits, and sign-extends bit 17 into all higher bits of the destination. + + **Pseudocode**: x[rd] = sext(nzimm[17:12] << 12) + + **Invalid values**: rd = x0 & rd = x2 & nzimm = 0 + + **Exception raised**: NONE + +- **C.ADDI**: Compressed Addition Immediate + + **Format**: c.addi rd, nzimm[5:0] + + **Description**: adds the non-zero sign-extended 6-bit immediate to the value in register rd then writes the result to rd. 
+ + **Pseudocode**: x[rd] = x[rd] + sext(nzimm[5:0]) + + **Invalid values**: rd = x0 & nzimm = 0 + + **Exception raised**: NONE + +- **C.ADDI16SP**: Addition Immediate Scaled by 16, to Stack Pointer + + **Format**: c.addi16sp nzimm[9:4] + + **Description**: adds the non-zero sign-extended 6-bit immediate to the value in the stack pointer (sp=x2), where the immediate is scaled to represent multiples of 16 in the range (-512,496). C.ADDI16SP is used to adjust the stack pointer in procedure prologues and epilogues. C.ADDI16SP shares the opcode with C.LUI, but has a destination field of x2. + + **Pseudocode**: x[2] = x[2] + sext(nzimm[9:4]) + + **Invalid values**: rd != x2 & nzimm = 0 + + **Exception raised**: NONE + +- **C.ADDI4SPN**: Addition Immediate Scaled by 4, to Stack Pointer + + **Format**: c.addi4spn rd', nzimm[9:2] + + **Description**: adds a zero-extended non-zero immediate, scaled by 4, to the stack pointer, x2, and writes the result to rd'. This instruction is used to generate pointers to stack-allocated variables. + + **Pseudocode**: x[8 + rd'] = x[2] + zext(nzimm[9:2]) + + **Invalid values**: nzimm = 0 + + **Exception raised**: NONE + +- **C.SLLI**: Compressed Shift Left Logic Immediate + + **Format**: c.slli rd, uimm[5:0] + + **Description**: performs a logical left shift (zeros are shifted into the lower bits). + + **Pseudocode**: x[rd] = x[rd] << uimm[5:0] + + **Invalid values**: rd = x0 & uimm[5] = 0 + + **Exception raised**: NONE + +- **C.SRLI**: Compressed Shift Right Logic Immediate + + **Format**: c.srli rd', uimm[5:0] + + **Description**: performs a logical right shift (zeros are shifted into the upper bits). + + **Pseudocode**: x[8 + rd'] = x[8 + rd'] >> uimm[5:0] + + **Invalid values**: uimm[5] = 0 + + **Exception raised**: NONE + +- **C.SRAI**: Compressed Shift Right Arithmetic Immediate + + **Format**: c.srai rd', uimm[5:0] + + **Description**: performs an arithmetic right shift (sign bits are shifted into the upper bits). 
+ + **Pseudocode**: x[8 + rd'] = x[8 + rd'] >>s uimm[5:0] + + **Invalid values**: uimm[5] = 0 + + **Exception raised**: NONE + +- **C.ANDI**: Compressed AND Immediate + + **Format**: c.andi rd', imm[5:0] + + **Description**: computes the bitwise AND of the value in register rd' and the sign-extended 6-bit immediate, then writes the result to rd'. + + **Pseudocode**: x[8 + rd'] = x[8 + rd'] & sext(imm[5:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.ADD**: Compressed Addition + + **Format**: c.add rd, rs2 + + **Description**: adds the values in registers rd and rs2 and writes the result to register rd. + + **Pseudocode**: x[rd] = x[rd] + x[rs2] + + **Invalid values**: rd = x0 & rs2 = x0 + + **Exception raised**: NONE + +- **C.MV**: Move + + **Format**: c.mv rd, rs2 + + **Description**: copies the value in register rs2 into register rd. + + **Pseudocode**: x[rd] = x[rs2] + + **Invalid values**: rd = x0 & rs2 = x0 + + **Exception raised**: NONE + +- **C.AND**: Compressed AND + + **Format**: c.and rd', rs2' + + **Description**: computes the bitwise AND of the values in registers rd' and rs2', then writes the result to rd'. + + **Pseudocode**: x[8 + rd'] = x[8 + rd'] & x[8 + rs2'] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.OR**: Compressed OR + + **Format**: c.or rd', rs2' + + **Description**: computes the bitwise OR of the values in registers rd' and rs2', then writes the result to rd'. + + **Pseudocode**: x[8 + rd'] = x[8 + rd'] | x[8 + rs2'] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.XOR**: Compressed XOR + + **Format**: c.xor rd', rs2' + + **Description**: computes the bitwise XOR of the values in registers rd' and rs2', then writes the result to rd'.
+ + **Pseudocode**: x[8 + rd'] = x[8 + rd'] ^ x[8 + rs2'] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.SUB**: Compressed Subtraction + + **Format**: c.sub rd', rs2' + + **Description**: subtracts the value in registers rs2' from value in rd' and writes the result to register rd'. + + **Pseudocode**: x[8 + rd'] = x[8 + rd'] - x[8 + rs2'] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.EBREAK**: Compressed Ebreak + + **Format**: c.ebreak + + **Description**: cause control to be transferred back to the debugging environment. + + **Pseudocode**: RaiseException(Breakpoint) + + **Invalid values**: NONE + + **Exception raised**: Raise a Breakpoint exception. + +Control Transfer Instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **C.J**: Compressed Jump + + **Format**: c.j imm[11:1] + + **Description**: performs an unconditional control transfer. The offset is sign-extended and added to the pc to form the jump target address. + + **Pseudocode**: pc += sext(imm[11:1]) + + **Invalid values**: NONE + + **Exception raised**: jumps to an unaligned address (4-byte or 2-byte boundary) will usually raise an exception. + +- **C.JAL**: Compressed Jump and Link + + **Format**: c.jal imm[11:1] + + **Description**: performs the same operation as C.J, but additionally writes the address of the instruction following the jump (pc+2) to the link register, x1. + + **Pseudocode**: x[1] = pc+2; pc += sext(imm[11:1]) + + **Invalid values**: NONE + + **Exception raised**: jumps to an unaligned address (4-byte or 2-byte boundary) will usually raise an exception. + +- **C.JR**: Compressed Jump Register + + **Format**: c.jr rs1 + + **Description**: performs an unconditional control transfer to the address in register rs1. + + **Pseudocode**: pc = x[rs1] + + **Invalid values**: rs1 = x0 + + **Exception raised**: jumps to an unaligned address (4-byte or 2-byte boundary) will usually raise an exception. 
+ +- **C.JALR**: Compressed Jump and Link Register + + **Format**: c.jalr rs1 + + **Description**: performs the same operation as C.JR, but additionally writes the address of the instruction following the jump (pc+2) to the link register, x1. + + **Pseudocode**: t = pc+2; pc = x[rs1]; x[1] = t + + **Invalid values**: rs1 = x0 + + **Exception raised**: jumps to an unaligned address (4-byte or 2-byte boundary) will usually raise an exception. + +- **C.BEQZ**: Branch if Equal Zero + + **Format**: c.beqz rs1', imm[8:1] + + **Description**: performs conditional control transfers. The offset is sign-extended and added to the pc to form the branch target address. C.BEQZ takes the branch if the value in register rs1' is zero. + + **Pseudocode**: if (x[8+rs1'] == 0) pc += sext(imm[8:1]) + + **Invalid values**: NONE + + **Exception raised**: no instruction fetch misaligned exception is generated for a conditional branch that is not taken. An Instruction address misaligned exception is raised if the target address is not aligned on 4-byte or 2-byte boundary, because the core supports compressed instructions. + +- **C.BNEZ**: Branch if Not Equal Zero + + **Format**: c.bnez rs1', imm[8:1] + + **Description**: performs conditional control transfers. The offset is sign-extended and added to the pc to form the branch target address. C.BNEZ takes the branch if the value in register rs1' is not zero. + + **Pseudocode**: if (x[8+rs1'] != 0) pc += sext(imm[8:1]) + + **Invalid values**: NONE + + **Exception raised**: no instruction fetch misaligned exception is generated for a conditional branch that is not taken. An Instruction address misaligned exception is raised if the target address is not aligned on 4-byte or 2-byte boundary, because the core supports compressed instructions.
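The compressed branch pseudocode above can be modelled in a few lines of Python. This is only an illustrative sketch and not CVA6 code: the helper names sext, c_beqz and c_bnez, the list-based register file, and the choice to pass the offset as an already assembled 9-bit byte offset (imm[8:1] with bit 0 equal to zero) are assumptions made for the example. The fall-through value pc+2 follows from the 16-bit size of a compressed instruction.

```python
MASK32 = 0xFFFFFFFF


def sext(value, bits):
    """Sign-extend the low `bits` bits of `value` to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def c_beqz(pc, x, rs1p, offset):
    """Next pc for C.BEQZ; rs1p is the 3-bit rs1' field (register x[8 + rs1p])."""
    if x[8 + rs1p] == 0:
        return (pc + sext(offset, 9)) & MASK32
    return (pc + 2) & MASK32  # not taken: continue with the next 16-bit slot


def c_bnez(pc, x, rs1p, offset):
    """Next pc for C.BNEZ; same encoding as C.BEQZ, opposite condition."""
    if x[8 + rs1p] != 0:
        return (pc + sext(offset, 9)) & MASK32
    return (pc + 2) & MASK32


# Hypothetical register state: x8 = 0 and x10 = 5 (rs1' = 0 and rs1' = 2).
regs = [0] * 32
regs[10] = 5
next_pc_taken = c_bnez(0x100, regs, 2, 0x1FE)      # offset 0x1FE encodes -2
next_pc_not_taken = c_beqz(0x100, regs, 2, 0x1FE)  # x10 != 0, so no branch
```

The sketch keeps the rs1' to x[8 + rs1'] mapping explicit, since that mapping is what restricts these instructions to registers x8 to x15.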
+ +Load and Store Instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **C.LWSP**: Load Word Stack-Pointer + + **Format**: c.lwsp rd, uimm(x2) + + **Description**: loads a 32-bit value from memory into register rd. It computes an effective address by adding the zero-extended offset, scaled by 4, to the stack pointer, x2. + + **Pseudocode**: x[rd] = M[x[2] + zext(uimm[7:2])][31:0] + + **Invalid values**: rd = x0 + + **Exception raised**: loads with a destination of x0 must still raise any exceptions; an exception is also raised if the memory address isn't aligned (4-byte boundary). + +- **C.SWSP**: Store Word Stack-Pointer + + **Format**: c.swsp rs2, uimm(x2) + + **Description**: stores a 32-bit value in register rs2 to memory. It computes an effective address by adding the zero-extended offset, scaled by 4, to the stack pointer, x2. + + **Pseudocode**: M[x[2] + zext(uimm[7:2])][31:0] = x[rs2] + + **Invalid values**: NONE + + **Exception raised**: an exception is raised if the memory address isn't aligned (4-byte boundary). + +- **C.LW**: Compressed Load Word + + **Format**: c.lw rd', uimm(rs1') + + **Description**: loads a 32-bit value from memory into register rd'. It computes an effective address by adding the zero-extended offset, scaled by 4, to the base address in register rs1'. + + **Pseudocode**: x[8+rd'] = M[x[8+rs1'] + zext(uimm[6:2])][31:0] + + **Invalid values**: NONE + + **Exception raised**: an exception is raised if the memory address isn't aligned (4-byte boundary). + +- **C.SW**: Compressed Store Word + + **Format**: c.sw rs2', uimm(rs1') + + **Description**: stores a 32-bit value in register rs2' to memory. It computes an effective address by adding the zero-extended offset, scaled by 4, to the base address in register rs1'. + + **Pseudocode**: M[x[8+rs1'] + zext(uimm[6:2])][31:0] = x[8+rs2'] + + **Invalid values**: NONE + + **Exception raised**: an exception is raised if the memory address isn't aligned (4-byte boundary).
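As an illustration of the effective address computation shared by C.LW and C.SW, the following Python sketch adds the zero-extended offset to the base register and checks the 4-byte alignment condition. It is a minimal model under stated assumptions: the names c_lw_address and is_word_aligned are hypothetical, the register file is a plain list, and the offset is passed as a byte offset whose only non-zero bits are uimm[6:2].

```python
def c_lw_address(x, rs1p, uimm):
    """Effective address for C.LW / C.SW: x[8 + rs1'] + zext(uimm[6:2])."""
    # uimm[6:2] can only encode multiples of 4 between 0 and 124.
    if uimm % 4 != 0 or not 0 <= uimm <= 124:
        raise ValueError("uimm must be a multiple of 4 in the range 0..124")
    return (x[8 + rs1p] + uimm) & 0xFFFFFFFF


def is_word_aligned(address):
    """True when the address sits on a 4-byte boundary."""
    return address % 4 == 0


# Hypothetical register state: x9 (rs1' = 1) holds the base address.
regs = [0] * 32
regs[9] = 0x80000010
addr = c_lw_address(regs, 1, 8)
```

Because the offset is always a multiple of 4, the access is misaligned only when the base register itself is not word aligned.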
+ diff --git a/RISCV_Instructions_RV32I.rst b/RISCV_Instructions_RV32I.rst new file mode 100644 index 0000000000..62df7d0167 --- /dev/null +++ b/RISCV_Instructions_RV32I.rst @@ -0,0 +1,542 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_riscv_instructions_RV32I: + +*Applicability of this chapter to configurations:* + +This chapter is applicable to all CV32A6 configurations. + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "Implemented extension" + "CV32A60MX", "Implemented extension" + +**Note**: CV64A6 implements RV64I that includes additional instructions. + + +RV32I Base Integer Instruction Set +----------------------------------- + +This section describes the RV32I base integer instruction set. + +Integer Register-Immediate Instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **ADDI**: Add Immediate + + **Format**: addi rd, rs1, imm[11:0] + + **Description**: add sign-extended 12-bit immediate to register rs1, and store the result in register rd. + + **Pseudocode**: x[rd] = x[rs1] + sext(imm[11:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **ANDI**: AND Immediate + + **Format**: andi rd, rs1, imm[11:0] + + **Description**: perform bitwise AND on register rs1 and the sign-extended 12-bit immediate and place the result in rd. + + **Pseudocode**: x[rd] = x[rs1] & sext(imm[11:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **ORI**: OR Immediate + + **Format**: ori rd, rs1, imm[11:0] + + **Description**: perform bitwise OR on register rs1 and the sign-extended 12-bit immediate and place the result in rd. 
+ + **Pseudocode**: x[rd] = x[rs1] | sext(imm[11:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **XORI**: XOR Immediate + + **Format**: xori rd, rs1, imm[11:0] + + **Description**: perform bitwise XOR on register rs1 and the sign-extended 12-bit immediate and place the result in rd. + + **Pseudocode**: x[rd] = x[rs1] ^ sext(imm[11:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SLTI**: Set Less Than Immediate + + **Format**: slti rd, rs1, imm[11:0] + + **Description**: set register rd to 1 if register rs1 is less than the sign-extended immediate when both are treated as signed numbers, else 0 is written to rd. + + **Pseudocode**: if (x[rs1] < sext(imm[11:0])) x[rd] = 1 else x[rd] = 0 + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SLTIU**: Set Less Than Immediate Unsigned + + **Format**: sltiu rd, rs1, imm[11:0] + + **Description**: set register rd to 1 if register rs1 is less than the sign-extended immediate when both are treated as unsigned numbers, else 0 is written to rd. + + **Pseudocode**: if (x[rs1] <u sext(imm[11:0])) x[rd] = 1 else x[rd] = 0 + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SLLI**: Shift Left Logical Immediate + + **Format**: slli rd, rs1, imm[4:0] + + **Description**: logical left shift (zeros are shifted into the lower bits). + + **Pseudocode**: x[rd] = x[rs1] << imm[4:0] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SRLI**: Shift Right Logical Immediate + + **Format**: srli rd, rs1, imm[4:0] + + **Description**: logical right shift (zeros are shifted into the upper bits). + + **Pseudocode**: x[rd] = x[rs1] >> imm[4:0] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SRAI**: Shift Right Arithmetic Immediate + + **Format**: srai rd, rs1, imm[4:0] + + **Description**: arithmetic right shift (the original sign bit is copied into the vacated upper bits). + + **Pseudocode**: x[rd] = x[rs1] >>s imm[4:0] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **LUI**: Load Upper Immediate + + **Format**: lui rd, imm[19:0] + + **Description**: place the immediate value in the top 20 bits of the destination register rd, filling in the lowest 12 bits with zeros.
+ + **Pseudocode**: x[rd] = sext(imm[31:12] << 12) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **AUIPC**: Add Upper Immediate to PC + + **Format**: auipc rd, imm[19:0] + + **Description**: form a 32-bit offset from the 20-bit immediate, filling in the lowest 12 bits with zeros, adds this offset to the pc, then place the result in register rd. + + **Pseudocode**: x[rd] = pc + sext(immediate[31:12] << 12) + + **Invalid values**: NONE + + **Exception raised**: NONE + +Integer Register-Register Instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **ADD**: Addition + + **Format**: add rd, rs1, rs2 + + **Description**: add rs2 to register rs1, and store the result in register rd. + + **Pseudocode**: x[rd] = x[rs1] + x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SUB**: Subtraction + + **Format**: sub rd, rs1, rs2 + + **Description**: subtract rs2 from register rs1, and store the result in register rd. + + **Pseudocode**: x[rd] = x[rs1] - x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **AND**: AND logical operator + + **Format**: and rd, rs1, rs2 + + **Description**: perform bitwise AND on register rs1 and rs2 and place the result in rd. + + **Pseudocode**: x[rd] = x[rs1] & x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **OR**: OR logical operator + + **Format**: or rd, rs1, rs2 + + **Description**: perform bitwise OR on register rs1 and rs2 and place the result in rd. + + **Pseudocode**: x[rd] = x[rs1] | x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **XOR**: XOR logical operator + + **Format**: xor rd, rs1, rs2 + + **Description**: perform bitwise XOR on register rs1 and rs2 and place the result in rd. 
+ + **Pseudocode**: x[rd] = x[rs1] ^ x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SLT**: Set Less Than + + **Format**: slt rd, rs1, rs2 + + **Description**: set register rd to 1 if register rs1 is less than rs2 when both are treated as signed numbers, else 0 is written to rd. + + **Pseudocode**: if (x[rs1] < x[rs2]) x[rd] = 1 else x[rd] = 0 + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SLTU**: Set Less Than Unsigned + + **Format**: sltu rd, rs1, rs2 + + **Description**: set register rd to 1 if register rs1 is less than rs2 when both are treated as unsigned numbers, else 0 is written to rd. + + **Pseudocode**: if (x[rs1] <u x[rs2]) x[rd] = 1 else x[rd] = 0 + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SLL**: Shift Left Logical + + **Format**: sll rd, rs1, rs2 + + **Description**: logical left shift (zeros are shifted into the lower bits). + + **Pseudocode**: x[rd] = x[rs1] << x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SRL**: Shift Right Logical + + **Format**: srl rd, rs1, rs2 + + **Description**: logical right shift (zeros are shifted into the upper bits). + + **Pseudocode**: x[rd] = x[rs1] >> x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SRA**: Shift Right Arithmetic + + **Format**: sra rd, rs1, rs2 + + **Description**: arithmetic right shift (the original sign bit is copied into the vacated upper bits). + + **Pseudocode**: x[rd] = x[rs1] >>s x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +Control Transfer Instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Unconditional Jumps** + +- **JAL**: Jump and Link + + **Format**: jal rd, imm[20:1] + + **Description**: offset is sign-extended and added to the pc to form the jump target address (pc is calculated using signed arithmetic), then setting the least-significant bit of the result to zero, and store the address of instruction following the jump (pc+4) into register rd. + + **Pseudocode**: x[rd] = pc+4; pc += sext(imm[20:1]) + + **Invalid values**: NONE + + **Exception raised**: jumps to an unaligned address (4-byte or 2-byte boundary) will usually raise an exception.
+ +- **JALR**: Jump and Link Register + + **Format**: jalr rd, rs1, imm[11:0] + + **Description**: target address is obtained by adding the 12-bit signed immediate to the register rs1 (pc is calculated using signed arithmetic), then setting the least-significant bit of the result to zero, and store the address of instruction following the jump (pc+4) into register rd. + + **Pseudocode**: t = pc+4; pc = (x[rs1]+sext(imm[11:0]))&∼1 ; x[rd] = t + + **Invalid values**: NONE + + **Exception raised**: jumps to an unaligned address (4-byte or 2-byte boundary) will usually raise an exception. + +**Conditional Branches** + +- **BEQ**: Branch Equal + + **Format**: beq rs1, rs2, imm[12:1] + + **Description**: takes the branch (pc is calculated using signed arithmetic) if registers rs1 and rs2 are equal. + + **Pseudocode**: if (x[rs1] == x[rs2]) pc += sext({imm[12:1], 1’b0}) else pc += 4 + + **Invalid values**: NONE + + **Exception raised**: no instruction fetch misaligned exception is generated for a conditional branch that is not taken. An Instruction address misaligned exception is raised if the target address is not aligned on 4-byte or 2-byte boundary, because the core supports compressed instructions. + +- **BNE**: Branch Not Equal + + **Format**: bne rs1, rs2, imm[12:1] + + **Description**: takes the branch (pc is calculated using signed arithmetic) if registers rs1 and rs2 are not equal. + + **Pseudocode**: if (x[rs1] != x[rs2]) pc += sext({imm[12:1], 1’b0}) else pc += 4 + + **Invalid values**: NONE + + **Exception raised**: no instruction fetch misaligned exception is generated for a conditional branch that is not taken. An Instruction address misaligned exception is raised if the target address is not aligned on 4-byte or 2-byte boundary, because the core supports compressed instructions. 
+ +- **BLT**: Branch Less Than + + **Format**: blt rs1, rs2, imm[12:1] + + **Description**: takes the branch (pc is calculated using signed arithmetic) if register rs1 is less than rs2 (using signed comparison). + + **Pseudocode**: if (x[rs1] < x[rs2]) pc += sext({imm[12:1], 1’b0}) else pc += 4 + + **Invalid values**: NONE + + **Exception raised**: no instruction fetch misaligned exception is generated for a conditional branch that is not taken. An Instruction address misaligned exception is raised if the target address is not aligned on 4-byte or 2-byte boundary, because the core supports compressed instructions. + +- **BLTU**: Branch Less Than Unsigned + + **Format**: bltu rs1, rs2, imm[12:1] + + **Description**: takes the branch (pc is calculated using signed arithmetic) if register rs1 is less than rs2 (using unsigned comparison). + + **Pseudocode**: if (x[rs1] <u x[rs2]) pc += sext({imm[12:1], 1’b0}) else pc += 4 + + **Invalid values**: NONE + + **Exception raised**: no instruction fetch misaligned exception is generated for a conditional branch that is not taken. An Instruction address misaligned exception is raised if the target address is not aligned on 4-byte or 2-byte boundary, because the core supports compressed instructions. + +- **BGE**: Branch Greater or Equal + + **Format**: bge rs1, rs2, imm[12:1] + + **Description**: takes the branch (pc is calculated using signed arithmetic) if register rs1 is greater than or equal to rs2 (using signed comparison). + + **Pseudocode**: if (x[rs1] >= x[rs2]) pc += sext({imm[12:1], 1’b0}) else pc += 4 + + **Invalid values**: NONE + + **Exception raised**: no instruction fetch misaligned exception is generated for a conditional branch that is not taken. An Instruction address misaligned exception is raised if the target address is not aligned on 4-byte or 2-byte boundary, because the core supports compressed instructions. + +- **BGEU**: Branch Greater or Equal Unsigned + + **Format**: bgeu rs1, rs2, imm[12:1] + + **Description**: takes the branch (pc is calculated using signed arithmetic) if register rs1 is greater than or equal to rs2 (using unsigned comparison). + + **Pseudocode**: if (x[rs1] >=u x[rs2]) pc += sext({imm[12:1], 1’b0}) else pc += 4 + + **Invalid values**: NONE + + **Exception raised**: no instruction fetch misaligned exception is generated for a conditional branch that is not taken. An Instruction address misaligned exception is raised if the target address is not aligned on 4-byte or 2-byte boundary, because the core supports compressed instructions.
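The difference between the signed and unsigned branch comparisons is easiest to see on a value whose top bit is set. The Python sketch below is a hedged illustration only: the helper names to_signed32, blt_taken, bltu_taken and bgeu_taken are invented for the example, and register contents are modelled as plain integers masked to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret a 32-bit register value as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def blt_taken(rs1_val, rs2_val):
    """BLT condition: signed comparison of the two register values."""
    return to_signed32(rs1_val) < to_signed32(rs2_val)


def bltu_taken(rs1_val, rs2_val):
    """BLTU condition: unsigned comparison of the two register values."""
    return (rs1_val & MASK32) < (rs2_val & MASK32)


def bgeu_taken(rs1_val, rs2_val):
    """BGEU condition: the logical complement of BLTU."""
    return not bltu_taken(rs1_val, rs2_val)


# 0xFFFFFFFF is -1 when signed and the largest value when unsigned.
a, b = 0xFFFFFFFF, 1
signed_result = blt_taken(a, b)      # -1 < 1
unsigned_result = bltu_taken(a, b)   # 4294967295 < 1
```

With these inputs the signed branch is taken while the unsigned one is not, which is the case the two separate mnemonics exist to distinguish.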
+ +Load and Store Instructions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **LB**: Load Byte + + **Format**: lb rd, imm(rs1) + + **Description**: loads a 8-bit value from memory, then sign-extends to 32-bit before storing in rd (rd is calculated using signed arithmetic), the effective address is obtained by adding register rs1 to the sign-extended 12-bit offset. + + **Pseudocode**: x[rd] = sext(M[x[rs1] + sext(imm[11:0])][7:0]) + + **Invalid values**: NONE + + **Exception raised**: loads with a destination of x0 must still raise any exceptions and action any other side effects even though the load value is discarded. + +- **LH**: Load Halfword + + **Format**: lh rd, imm(rs1) + + **Description**: loads a 16-bit value from memory, then sign-extends to 32-bit before storing in rd (rd is calculated using signed arithmetic), the effective address is obtained by adding register rs1 to the sign-extended 12-bit offset. + + **Pseudocode**: x[rd] = sext(M[x[rs1] + sext(imm[11:0])][15:0]) + + **Invalid values**: NONE + + **Exception raised**: loads with a destination of x0 must still raise any exceptions and action any other side effects even though the load value is discarded, also an exception is raised if the memory address isn't aligned (2-byte boundary). + +- **LW**: Load Word + + **Format**: lw rd, imm(rs1) + + **Description**: loads a 32-bit value from memory, then storing in rd (rd is calculated using signed arithmetic). The effective address is obtained by adding register rs1 to the sign-extended 12-bit offset. + + **Pseudocode**: x[rd] = sext(M[x[rs1] + sext(imm[11:0])][31:0]) + + **Invalid values**: NONE + + **Exception raised**: loads with a destination of x0 must still raise any exceptions and action any other side effects even though the load value is discarded, also an exception is raised if the memory address isn't aligned (4-byte boundary). 
+ +- **LBU**: Load Byte Unsigned + + **Format**: lbu rd, imm(rs1) + + **Description**: loads a 8-bit value from memory, then zero-extends to 32-bit before storing in rd (rd is calculated using unsigned arithmetic), the effective address is obtained by adding register rs1 to the sign-extended 12-bit offset. + + **Pseudocode**: x[rd] = zext(M[x[rs1] + sext(imm[11:0])][7:0]) + + **Invalid values**: NONE + + **Exception raised**: loads with a destination of x0 must still raise any exceptions and action any other side effects even though the load value is discarded. + +- **LHU**: Load Halfword Unsigned + + **Format**: lhu rd, imm(rs1) + + **Description**: loads a 16-bit value from memory, then zero-extends to 32-bit before storing in rd (rd is calculated using unsigned arithmetic), the effective address is obtained by adding register rs1 to the sign-extended 12-bit offset. + + **Pseudocode**: x[rd] = zext(M[x[rs1] + sext(imm[11:0])][15:0]) + + **Invalid values**: NONE + + **Exception raised**: loads with a destination of x0 must still raise any exceptions and action any other side effects even though the load value is discarded, also an exception is raised if the memory address isn't aligned (2-byte boundary). + +- **SB**: Store Byte + + **Format**: sb rs2, imm(rs1) + + **Description**: stores a 8-bit value from the low bits of register rs2 to memory, the effective address is obtained by adding register rs1 to the sign-extended 12-bit offset. + + **Pseudocode**: M[x[rs1] + sext(imm[11:0])][7:0] = x[rs2][7:0] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **SH**: Store Halfword + + **Format**: sh rs2, imm(rs1) + + **Description**: stores a 16-bit value from the low bits of register rs2 to memory, the effective address is obtained by adding register rs1 to the sign-extended 12-bit offset. 
+ + **Pseudocode**: M[x[rs1] + sext(imm[11:0])][15:0] = x[rs2][15:0] + + **Invalid values**: NONE + + **Exception raised**: an exception is raised if the memory address isn't aligned (2-byte boundary). + +- **SW**: Store Word + + **Format**: sw rs2, imm(rs1) + + **Description**: stores a 32-bit value from register rs2 to memory, the effective address is obtained by adding register rs1 to the sign-extended 12-bit offset. + + **Pseudocode**: M[x[rs1] + sext(imm[11:0])][31:0] = x[rs2][31:0] + + **Invalid values**: NONE + + **Exception raised**: an exception is raised if the memory address isn't aligned (4-byte boundary). + +Memory Ordering +^^^^^^^^^^^^^^^^^^ + +- **FENCE**: Fence Instruction + + **Format**: fence pre, succ + + **Description**: order device I/O and memory accesses as viewed by other RISC-V harts and external devices or coprocessors. Any combination of device input (I), device output (O), memory reads (R), and memory writes (W) may be ordered with respect to any combination of the same. Informally, no other RISC-V hart or external device can observe any operation in the successor set following a FENCE before any operation in the predecessor set preceding the FENCE. As the core supports a single hart, the FENCE instruction has no effect and can be considered a nop instruction. + + **Pseudocode**: No operation (nop) + + **Invalid values**: NONE + + **Exception raised**: NONE + +Environment Call and Breakpoints +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **ECALL**: Environment Call + + **Format**: ecall + + **Description**: make a request to the supporting execution environment, which is usually an operating system. The ABI for the system will define how parameters for the environment request are passed, but usually these will be in defined locations in the integer register file. + + **Pseudocode**: RaiseException(EnvironmentCall) + + **Invalid values**: NONE + + **Exception raised**: Raise an Environment Call exception.
+ +- **EBREAK**:Environment Break + + **Format**: ebreak + + **Description**: cause control to be transferred back to a debugging environment. + + **Pseudocode**: RaiseException(Breakpoint) + + **Invalid values**: NONE + + **Exception raised**: Raise a Breakpoint exception. + diff --git a/RISCV_Instructions_RV32M.rst b/RISCV_Instructions_RV32M.rst new file mode 100644 index 0000000000..771934de93 --- /dev/null +++ b/RISCV_Instructions_RV32M.rst @@ -0,0 +1,143 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_riscv_instructions_RV32M: + +*Applicability of this chapter to configurations:* + +This chapter is applicable to all CV32A6 configurations. + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "Implemented extension" + "CV32A60MX", "Implemented extension" + +**Note**: CV64A6 implements RV64M that includes additional instructions. + + +RV32M Multiplication and Division Instructions +------------------------------------------------------ + +This chapter describes the standard integer multiplication and division instruction extension, which +is named “M” and contains instructions that multiply or divide values held in two integer registers. + +Multiplication Operations +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **MUL**: Multiplication + + **Format**: mul rd, rs1, rs2 + + **Description**: performs a 32-bit × 32-bit multiplication and places the lower 32 bits in the destination register (Both rs1 and rs2 treated as signed numbers). 
+ + **Pseudocode**: x[rd] = x[rs1] * x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **MULH**: Multiplication Higher + + **Format**: mulh rd, rs1, rs2 + + **Description**: performs a 32-bit × 32-bit multiplication and places the upper 32 bits in the destination register of the 64-bit product (Both rs1 and rs2 treated as signed numbers). + + **Pseudocode**: x[rd] = (x[rs1] s*s x[rs2]) >>s 32 + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **MULHU**: Multiplication Higher Unsigned + + **Format**: mulhu rd, rs1, rs2 + + **Description**: performs a 32-bit × 32-bit multiplication and places the upper 32 bits in the destination register of the 64-bit product (Both rs1 and rs2 treated as unsigned numbers). + + **Pseudocode**: x[rd] = (x[rs1] u*u x[rs2]) >>u 32 + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **MULHSU**: Multiplication Higher Signed Unsigned + + **Format**: mulhsu rd, rs1, rs2 + + **Description**: performs a 32-bit × 32-bit multiplication and places the upper 32 bits in the destination register of the 64-bit product (rs1 treated as signed number, rs2 treated as unsigned number). + + **Pseudocode**: x[rd] = (x[rs1] s*u x[rs2]) >>s 32 + + **Invalid values**: NONE + + **Exception raised**: NONE + +Division Operations +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- **DIV**: Division + + **Format**: div rd, rs1, rs2 + + **Description**: perform signed integer division of 32 bits by 32 bits (rounding towards zero). + + **Pseudocode**: x[rd] = x[rs1] /s x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **DIVU**: Division Unsigned + + **Format**: divu rd, rs1, rs2 + + **Description**: perform unsigned integer division of 32 bits by 32 bits (rounding towards zero). 
+ + **Pseudocode**: x[rd] = x[rs1] /u x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **REM**: Remainder + + **Format**: rem rd, rs1, rs2 + + **Description**: provide the remainder of the corresponding division operation DIV (the sign of rd equals the sign of rs1). + + **Pseudocode**: x[rd] = x[rs1] %s x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **REMU**: Remainder Unsigned + + **Format**: remu rd, rs1, rs2 + + **Description**: provide the remainder of the corresponding division operation DIVU. + + **Pseudocode**: x[rd] = x[rs1] %u x[rs2] + + **Invalid values**: NONE + + **Exception raised**: NONE + diff --git a/RISCV_Instructions_RV32ZCb.rst b/RISCV_Instructions_RV32ZCb.rst new file mode 100644 index 0000000000..4b9789adbc --- /dev/null +++ b/RISCV_Instructions_RV32ZCb.rst @@ -0,0 +1,171 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_riscv_instructions_RV32Zcb: + +*Applicability of this chapter to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "Implemented extension" + "CV32A60MX", "Implemented extension" + +**Note**: This chapter is specific to CV32A6 configurations. CV64A6 configurations implement as an option RV64Zcb, that includes one additional instruction. + + +RV32Zcb Code Size Reduction Instructions +----------------------------------------- + +Zcb belongs to a group of extensions called RISC-V Code Size Reduction Extension (Zc*). Zc* has become the superset of the Standard C extension, adding more 16-bit instructions to the ISA. Zcb includes 16-bit versions of additional Integer (I), Multiply (M) and Bit-Manipulation (Zbb) Instructions.
+All the Zcb instructions require at least standard C extension support as pre-requisite, along with M and Zbb extensions for 16-bit version of the respective instructions. + +- **C.ZEXT.B**: Compressed Zero Extend Byte + + **Format**: c.zext.b rd' + + **Description**: This instruction takes a single source/destination operand. It zero-extends the least-significant byte of the operand by inserting zeros into all of the bits more significant than 7. + + **Pseudocode**: x[8 + rd'] = zext(x[8 + rd'][7:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.SEXT.B**: Compressed Sign Extend Byte + + **Format**: c.sext.b rd' + + **Description**: This instruction takes a single source/destination operand. It sign-extends the least-significant byte in the operand by copying the most-significant bit in the byte (i.e., bit 7) to all of the more-significant bits. It also requires Bit-Manipulation (Zbb) extension support. + + **Pseudocode**: x[8 + rd'] = sext(x[8 + rd'][7:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.ZEXT.H**: Compressed Zero Extend Halfword + + **Format**: c.zext.h rd' + + **Description**: This instruction takes a single source/destination operand. It zero-extends the least-significant halfword of the operand by inserting zeros into all of the bits more significant than 15. It also requires Bit-Manipulation (Zbb) extension support. + + **Pseudocode**: x[8 + rd'] = zext(x[8 + rd'][15:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.SEXT.H**: Compressed Sign Extend Halfword + + **Format**: c.sext.h rd' + + **Description**: This instruction takes a single source/destination operand. It sign-extends the least-significant halfword in the operand by copying the most-significant bit in the halfword (i.e., bit 15) to all of the more-significant bits. It also requires Bit-Manipulation (Zbb) extension support. 
+ + **Pseudocode**: x[8 + rd'] = sext(x[8 + rd'][15:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.NOT**: Compressed Bitwise NOT + + **Format**: c.not rd' + + **Description**: This instruction takes the one’s complement of rd'/rs1' and writes the result to the same register. + + **Pseudocode**: x[8 + rd'] = x[8 + rd'] ^ -1 + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.MUL**: Compressed Multiply + + **Format**: c.mul rd', rs2' + + **Description**: This instruction performs a 32-bit × 32-bit multiplication and places the lower 32 bits in the destination register (both rd' and rs2' are treated as signed numbers). It also requires M extension support. + + **Pseudocode**: x[8 + rd'] = (x[8 + rd'] * x[8 + rs2'])[31:0] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.LHU**: Compressed Load Halfword Unsigned + + **Format**: c.lhu rd', uimm(rs1') + + **Description**: This instruction loads a halfword from the memory address formed by adding rs1' to the zero extended immediate uimm. The resulting halfword is zero extended and is written to rd'. + + **Pseudocode**: x[8+rd'] = zext(M[x[8+rs1'] + zext(uimm[1])][15:0]) + + **Invalid values**: NONE + + **Exception raised**: an exception is raised if the memory address isn't aligned (2-byte boundary). + +- **C.LH**: Compressed Load Halfword + + **Format**: c.lh rd', uimm(rs1') + + **Description**: This instruction loads a halfword from the memory address formed by adding rs1' to the zero extended immediate uimm. The resulting halfword is sign extended and is written to rd'. + + **Pseudocode**: x[8+rd'] = sext(M[x[8+rs1'] + zext(uimm[1])][15:0]) + + **Invalid values**: NONE + + **Exception raised**: an exception is raised if the memory address isn't aligned (2-byte boundary). + +- **C.LBU**: Compressed Load Byte Unsigned + + **Format**: c.lbu rd', uimm(rs1') + + **Description**: This instruction loads a byte from the memory address formed by adding rs1' to the zero extended immediate uimm.
The resulting byte is zero extended and is written to rd'. + + **Pseudocode**: x[8+rd'] = zext(M[x[8+rs1'] + zext(uimm[1:0])][7:0]) + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **C.SH**: Compressed Store Halfword + + **Format**: c.sh rs2', uimm(rs1') + + **Description**: This instruction stores the least significant halfword of rs2' to the memory address formed by adding rs1' to the zero extended immediate uimm. + + **Pseudocode**: M[x[8+rs1'] + zext(uimm[1])][15:0] = x[8+rs2'] + + **Invalid values**: NONE + + **Exception raised**: an exception is raised if the memory address isn't aligned (2-byte boundary). + +- **C.SB**: Compressed Store Byte + + **Format**: c.sb rs2', uimm(rs1') + + **Description**: This instruction stores the least significant byte of rs2' to the memory address formed by adding rs1' to the zero extended immediate uimm. + + **Pseudocode**: M[x[8+rs1'] + zext(uimm[1:0])][7:0] = x[8+rs2'] + + **Invalid values**: NONE + + **Exception raised**: NONE + \ No newline at end of file diff --git a/RISCV_Instructions_RVZicond.rst b/RISCV_Instructions_RVZicond.rst new file mode 100644 index 0000000000..d4f6ec39ad --- /dev/null +++ b/RISCV_Instructions_RVZicond.rst @@ -0,0 +1,62 @@ +.. + Copyright (c) 2023 OpenHW Group + Copyright (c) 2023 Thales + + SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1 + +.. Level 1 + ======= + + Level 2 + ------- + + Level 3 + ~~~~~~~ + + Level 4 + ^^^^^^^ + +.. _cva6_riscv_instructions_RVZicond: + +*Applicability of this chapter to configurations:* + +.. csv-table:: + :widths: auto + :align: left + :header: "Configuration", "Implementation" + + "CV32A60X", "Implemented extension" + "CV32A60MX", "Not implemented extension" + +**Note**: RV32Zicond and RV64Zicond are identical. + + +RVZicond Integer Conditional operations +------------------------------------------- + +The instructions follow the format for R-type instructions with 3 operands (i.e., 2 source operands and 1 destination operand).
Using these instructions, branchless sequences can be implemented (typically in two-instruction sequences) without the need for instruction fusion, special provisions during the decoding of architectural instructions, or other microarchitectural provisions. + +- **CZERO.EQZ**: Conditional zero, if condition is equal to zero + + **Format**: czero.eqz rd, rs1, rs2 + + **Description**: This instruction behaves as if there is a conditional branch dependent on rs2 being equal to zero, wherein it branches to code that writes a 0 into rd when the equivalence is true, and otherwise falls through to code that moves rs1 into rd. + + **Pseudocode**: if (x[rs2] == 0) x[rd] = 0 else x[rd] = x[rs1] + + **Invalid values**: NONE + + **Exception raised**: NONE + +- **CZERO.NEZ**: Conditional zero, if condition is nonzero + + **Format**: czero.nez rd, rs1, rs2 + + **Description**: This instruction behaves as if there is a conditional branch dependent on rs2 being not equal to zero, wherein it branches to code that writes a 0 into rd when the condition is true, and otherwise falls through to code that moves rs1 into rd. + + **Pseudocode**: if (x[rs2] != 0) x[rd] = 0 else x[rd] = x[rs1] + + **Invalid values**: NONE + + **Exception raised**: NONE +
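The CZERO.EQZ and CZERO.NEZ semantics above can be sketched as a small C reference model. This is only an illustrative sketch, not part of the CVA6 sources: the function names ``czero_eqz``, ``czero_nez`` and ``select_u32`` are hypothetical, and 32-bit registers are assumed. It also shows how a conditional select could be composed from the two operations and a bitwise OR.

```c
#include <stdint.h>

/* Model of CZERO.EQZ (hypothetical name): rd = (rs2 == 0) ? 0 : rs1. */
static inline uint32_t czero_eqz(uint32_t rs1, uint32_t rs2)
{
    return (rs2 == 0) ? 0u : rs1;
}

/* Model of CZERO.NEZ (hypothetical name): rd = (rs2 != 0) ? 0 : rs1. */
static inline uint32_t czero_nez(uint32_t rs1, uint32_t rs2)
{
    return (rs2 != 0) ? 0u : rs1;
}

/* Hypothetical helper: computes (cond != 0) ? a : b.
 * Exactly one of the two terms is zeroed, so OR merges them. */
static inline uint32_t select_u32(uint32_t cond, uint32_t a, uint32_t b)
{
    return czero_eqz(a, cond) | czero_nez(b, cond);
}
```

With this model, ``select_u32(1, 10, 20)`` yields 10 and ``select_u32(0, 10, 20)`` yields 20, because the operand on the non-selected side is always forced to zero before the OR.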