From f5d1182a2e2919dc456f62959a72b28707564199 Mon Sep 17 00:00:00 2001
From: Matteo Perotti
Date: Tue, 22 Oct 2024 15:11:26 +0200
Subject: [PATCH] docs: Add Ara to docs

---
 docs/um/arch.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/docs/um/arch.md b/docs/um/arch.md
index cc5ea9390..87a67240d 100644
--- a/docs/um/arch.md
+++ b/docs/um/arch.md
@@ -135,6 +135,19 @@ Cheshire defaults on using CVA6 with hypervisor and CLIC support enabled; RV32 c
 
 Each CVA6 core is a standalone AXI4 manager at the crossbar. Coherence is maintained through a self-invalidation scheme and RISC-V atomics are handled through a custom, user-channel-based AXI4 extension. For the latter, we wrap the cores and other managers to give each a default user channel assignment and, for atomics-capable managers, a unique ID on a slice of user bits.
 
+### Ara Vector Accelerator
+
+[Ara](https://github.com/pulp-platform/ara) is a RISC-V V vector coprocessor tightly coupled to CVA6. It can be instantiated in Cheshire to enable RISC-V V support and exposes the following parameters:
+
+| Parameter                | Type / Range | Description                                                     |
+| ------------------------ | ------------ | --------------------------------------------------------------- |
+| `Ara`                    | `bit`        | Enable the Ara Vector Accelerator                               |
+| `AraNrLanes`             | `byte_bt`    | Number of parallel vector lanes in Ara                          |
+| `AraVlen`                | `word_bt`    | RISC-V V VLEN parameter (vector register length in bits)        |
+| `AraParMemReq`           | `byte_bt`    | Maximum number of outstanding memory requests from Ara          |
+
+Ara has a private AXI memory port that is resized to 64 bit to match the current L2 memory bandwidth. So far, we have tested 2-lane Ara instances with `VLEN = 2048`, which incur no performance loss. Higher lane counts instantiate a memory port with a bandwidth above 64 bit/cycle, which is then bottlenecked by the current memory bandwidth (64 bit/cycle).
+
 ### Interconnect
 
 The interconnect is composed of a main [AXI4](https://github.com/pulp-platform/axi) crossbar with AXI5 atomic operations (ATOPs) support and an auxiliary [Regbus](https://github.com/pulp-platform/register_interface) demultiplexer providing access to numerous peripherals and configuration interfaces. The Regbus has a static data width of 32 bit.
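
Note on the Ara memory-port paragraph added by this patch: the bottleneck it describes can be illustrated with a small back-of-the-envelope sketch. The 64 bit/cycle L2 bandwidth comes from the patch text. The figure of 32 bit of native port width per lane is an assumption made here for illustration; it is consistent with the statement that 2-lane instances see no performance loss, but the patch does not state it. All function and constant names below are hypothetical.

```python
# Rough model of Ara's memory-port bandwidth versus the L2 limit.
# ASSUMPTION (not stated in the patch): Ara's native AXI port width
# scales as 32 bit per lane. All names here are hypothetical.
ASSUMED_PORT_BITS_PER_LANE = 32

# From the patch text: the current L2 memory bandwidth is 64 bit/cycle.
L2_BITS_PER_CYCLE = 64


def ara_port_bits(nr_lanes: int) -> int:
    """Native Ara memory-port width in bits, under the assumption above."""
    if nr_lanes < 1:
        raise ValueError("nr_lanes must be at least 1")
    return ASSUMED_PORT_BITS_PER_LANE * nr_lanes


def effective_bits_per_cycle(nr_lanes: int) -> int:
    """Bandwidth Ara can actually use once the port is resized to the L2 width."""
    return min(ara_port_bits(nr_lanes), L2_BITS_PER_CYCLE)


def port_utilisation(nr_lanes: int) -> float:
    """Fraction of the native port bandwidth that the L2 can serve."""
    return effective_bits_per_cycle(nr_lanes) / ara_port_bits(nr_lanes)


if __name__ == "__main__":
    for lanes in (2, 4, 8, 16):
        print(
            f"{lanes:2d} lanes: native {ara_port_bits(lanes):3d} bit/cycle, "
            f"effective {effective_bits_per_cycle(lanes):2d} bit/cycle, "
            f"utilisation {port_utilisation(lanes):.2f}"
        )
```

Under this assumption, a 2-lane instance is exactly matched to the L2, while each doubling of the lane count halves the fraction of the native port bandwidth that memory can serve.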