Snitch Cluster Offloading #13

Merged
merged 15 commits into from
Dec 13, 2024
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
*~
.ninja*
**/build/*

.vscode/settings.json
10 changes: 10 additions & 0 deletions .vscode/c_cpp_properties.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
{
"configurations": [
{
"name": "cMake",
"configurationProvider": "ms-vscode.cmake-tools",
"compileCommands": "${config:cmake.buildDirectory}/compile_commands.json"
}
],
"version": 4
}
34 changes: 29 additions & 5 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -4,6 +4,7 @@
#
# Moritz Scherer <[email protected]>
# Viviane Potocnik <[email protected]>
# Philip Wiese <[email protected]>

cmake_minimum_required(VERSION 3.13)

@@ -27,16 +28,39 @@ set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)

project(chimera-sdk LANGUAGES C ASM)

# WIESEP: It is important to set the ISA and ABI for the host and the cluster snitch
set(ABI ilp32)
set(ISA_CLUSTER_SNITCH rv32im)
set(ISA_HOST rv32imc)

message(STATUS "[CHIMERA-SDK] ABI : ${ABI}")
message(STATUS "[CHIMERA-SDK] ISA_HOST : ${ISA_HOST}")
message(STATUS "[CHIMERA-SDK] ISA_CLUSTER_SNITCH : ${ISA_CLUSTER_SNITCH}")
if (${DISASSEMBLE_LIBRARIES})
message(STATUS "[CHIMERA-SDK] DISASSEMBLE_LIBRARIES : ON")
else()
message(STATUS "[CHIMERA-SDK] DISASSEMBLE_LIBRARIES : OFF")
endif()

include(${CMAKE_CURRENT_LIST_DIR}/cmake/Utils.cmake)

################################################################################
# Add subdirectories #
################################################################################
# WIESEP: Targets have to be included before the other folders to make them available
# Depending on the target, the following static libraries have to be added by the targets:
# - runtime_host
# - runtime_cluster_snitch
add_subdirectory(targets)
add_subdirectory(hal)

add_library(chimera-sdk INTERFACE)
target_link_libraries(chimera-sdk INTERFACE hal)
target_link_libraries(chimera-sdk INTERFACE runtime)
target_sources(chimera-sdk INTERFACE $<TARGET_OBJECTS:runtime>)
# Include other subdirectories
add_subdirectory(hal)
add_subdirectory(devices)
add_subdirectory(drivers)

################################################################################
# Testing #
################################################################################
enable_testing()

add_subdirectory(tests)
42 changes: 35 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
@@ -53,17 +53,45 @@ The resulting binaries will be stored in `build/bin`, and can be used within the

To format all source files, run
```
python scripts/run_clang_format.py -ir hal/
python scripts/run_clang_format.py -ir targets/
python scripts/run_clang_format.py -ir tests/
python scripts/run_clang_format.py -ir hal/ targets/ tests/ drivers/
```

Our CI uses llvm-12 for clang-format, so on IIS machines you may run
```
python scripts/run_clang_format.py -ir hal/ --clang-format-executable=/usr/pack/riscv-1.0-kgf/pulp-llvm-0.12.0/bin/clang-format
python scripts/run_clang_format.py -ir tests/ hal/ targets/ drivers/ --clang-format-executable=/usr/pack/riscv-1.0-kgf/pulp-llvm-0.12.0/bin/clang-format

python scripts/run_clang_format.py -ir targets/ --clang-format-executable=/usr/pack/riscv-1.0-kgf/pulp-llvm-0.12.0/bin/clang-format

python scripts/run_clang_format.py -ir tests/ --clang-format-executable=/usr/pack/riscv-1.0-kgf/pulp-llvm-0.12.0/bin/clang-format
```

## Visual Studio Code Integration

To enable automatic configuration of the C/C++ extension and support for the integrated CMake build flow on the IIS workstations, add the following content to `.vscode/settings.json`:
```json
{
"cmake.configureSettings": {
"TOOLCHAIN_DIR": "/usr/pack/riscv-1.0-kgf/pulp-llvm-0.12.0/bin",
"TARGET_PLATFORM": "chimera-convolve",
},
"cmake.environment": {
"PATH": "/usr/pack/riscv-1.0-kgf/default/bin:${env:PATH}",
"LD_LIBRARY_PATH": "/usr/pack/riscv-1.0-kgf/lib64:/usr/pack/riscv-1.0-kgf/lib64",
}
}
```
If you are not on an IIS system, you need to adjust the paths according to your local installation.

## Technical Details

### Mixed ISA Compilation
The current approach compiles all code for both the host and cluster cores into a single library. This requires careful handling to ensure compatibility between the different instruction set architectures (ISAs) and application binary interfaces (ABIs), and in particular to avoid invalid instructions caused by mismatched ISAs between the host and cluster cores.
Hence, we define three CMake variables, `ABI`, `ISA_HOST`, and `ISA_CLUSTER_SNITCH`, to specify the ABI and the appropriate ISA for each core type. The ABI has to be identical for both core types to ensure correct function calls.
Furthermore, the tests are split into `src_host` and `src_cluster` directories to clearly separate code executed on the host and cluster cores.
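
As a rough illustration of how an ISA mismatch could be caught at compile time, the sketch below relies on the `__riscv` and `__riscv_compressed` macros that GCC and Clang predefine for RISC-V targets. The `CHIMERA_CLUSTER_CODE` macro and both helper functions are hypothetical names introduced only for this example; they are not part of the SDK.

```c
#include <assert.h>

/* CHIMERA_CLUSTER_CODE is a hypothetical macro that a cluster source file
 * would define (or receive via -D). It is an assumption of this sketch. */
#if defined(CHIMERA_CLUSTER_CODE) && defined(__riscv_compressed)
#error "Cluster code was compiled with the C extension; expected rv32im"
#endif

/* Report whether this translation unit was built with compressed instructions. */
static int isa_has_compressed(void) {
#if defined(__riscv_compressed)
  return 1;
#else
  return 0;
#endif
}

/* Report whether this translation unit targets RISC-V at all. */
static int isa_is_riscv(void) {
#if defined(__riscv)
  return 1;
#else
  return 0;
#endif
}
```

A guard like this only complements, and does not replace, inspecting the generated assembly, because it checks the flags of one translation unit rather than the instructions that end up in the final binary.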

### CMake Build Flow
All runtime functions executed by the host core are compiled into a dedicated `runtime_host` static library and the cluster code into `runtime_cluster_<type>` (e.g. `runtime_cluster_snitch`). Additionally, the HAL layer is compiled into the `hal_host` static library.
The final binary is separated into two object libraries, one for the host and one for the cluster core. The host object library is linked with the `runtime_host` and `hal_host` libraries, while the cluster object library is linked with the `runtime_cluster_<type>` library. The final binary is then linked from the two object libraries.

### Warning
Special attention is required for functions that execute before the cluster core is fully initialized, such as the trampoline function and interrupt handlers. At this stage, critical resources like the stack, global pointer, and thread pointer are not yet configured. Consequently, the compiler must not generate code that allocates stack frames. To address this, such functions are implemented as naked functions, which prevent the compiler from adding prologues or epilogues that rely on stack operations.

**It is recommended to always check the generated assembly code to ensure that the correct instructions are generated for the target core!**

49 changes: 49 additions & 0 deletions cmake/Utils.cmake
Original file line number Diff line number Diff line change
@@ -29,3 +29,52 @@ macro(add_target_source name)
message(WARNING "Path ${CMAKE_CURRENT_LIST_DIR}/${name} does not exist")
endif()
endmacro()

# Define a reusable macro for handling folder mappings
macro(add_chimera_subdirectories target_platform category mappings)
# Initialize included folders
set(included_folders "")

# Process mappings
foreach(mapping IN LISTS ${mappings})
string(FIND "${mapping}" ":" delim_pos)
if(delim_pos EQUAL -1)
message(WARNING "[CHIMERA-SDK] Invalid mapping entry: '${mapping}'. Skipping.")
continue()
endif()


# Extract key and value
string(SUBSTRING "${mapping}" 0 ${delim_pos} key)
math(EXPR value_start "${delim_pos} + 1")
string(SUBSTRING "${mapping}" ${value_start} -1 value)

if(key STREQUAL "${target_platform}")
list(APPEND included_folders ${value})
break()
endif()
endforeach()

string(REPLACE "," ";" included_folders "${included_folders}")

# Align output with padding
string(LENGTH "[CHIMERA-SDK] Enabled ${category}s" category_prefix_length)
math(EXPR padding_length "36 - ${category_prefix_length}")
if(padding_length GREATER 0)
string(REPEAT " " ${padding_length} padding)
else()
set(padding "")
endif()

# Debug: Print the folders being included
message(STATUS "[CHIMERA-SDK] Enabled ${category}s${padding}: ${included_folders}")

# Add subdirectories, checking for a valid CMakeLists.txt
foreach(folder IN LISTS included_folders)
if(EXISTS ${CMAKE_CURRENT_LIST_DIR}/${folder}/CMakeLists.txt)
add_subdirectory(${folder})
else()
message(WARNING "[CHIMERA-SDK] ${category} folder '${folder}' does not contain a valid CMakeLists.txt. Skipping.")
endif()
endforeach()
endmacro()
41 changes: 41 additions & 0 deletions devices/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# Copyright 2024 ETH Zurich and University of Bologna.
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
#
# Philip Wiese <[email protected]>

# Define mappings directly using lists for each target
# set(CHIMERA_DEVICE_CONVOLVE_FOLDERS snitch_cluster)
# set(CHIMERA_DEVICE_OPEN_FOLDERS snitch_cluster)
# set(CHIMERA_DEVICE_HOST_FOLDERS)

# # Determine which folders to include based on TARGET_PLATFORM
# if(TARGET_PLATFORM STREQUAL "chimera-convolve")
# set(INCLUDED_FOLDERS ${CHIMERA_DEVICE_CONVOLVE_FOLDERS})
# elseif(TARGET_PLATFORM STREQUAL "chimera-open")
# set(INCLUDED_FOLDERS ${CHIMERA_DEVICE_OPEN_FOLDERS})
# elseif(TARGET_PLATFORM STREQUAL "chimera-host")
# set(INCLUDED_FOLDERS ${CHIMERA_DEVICE_HOST_FOLDERS})
# endif()

# # WIESEP: Print the folders being included
# message(STATUS "[CHIMERA-SDK] Enabled Devices : ${INCLUDED_FOLDERS}")

# # Add subdirectories, checking for a valid CMakeLists.txt in each folder
# foreach(folder IN LISTS INCLUDED_FOLDERS)
# if(EXISTS ${CMAKE_CURRENT_LIST_DIR}/${folder}/CMakeLists.txt)
# add_subdirectory(${folder})
# else()
# message(WARNING "[CHIMERA-SDK] Device folder '${folder}' does not contain a valid CMakeLists.txt. Skipping.")
# endif()
# endforeach()

# Define mappings for devices
set(DEVICE_MAPPINGS
chimera-convolve:snitch_cluster
chimera-open:snitch_cluster
chimera-host:
)

# Call the macro
add_chimera_subdirectories(${TARGET_PLATFORM} "Device" DEVICE_MAPPINGS)
15 changes: 15 additions & 0 deletions devices/snitch_cluster/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# Copyright 2024 ETH Zurich and University of Bologna.
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
#
# Philip Wiese <[email protected]>


################################################################################
# Snitch Cluster Runtime Library #
################################################################################
file(GLOB_RECURSE C_SOURCES_SNITCH
"trampoline_snitchCluster.c"
)

target_sources(runtime_cluster_snitch PRIVATE ${C_SOURCES_SNITCH})
58 changes: 58 additions & 0 deletions devices/snitch_cluster/trampoline_snitchCluster.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
// Copyright 2024 ETH Zurich and University of Bologna.
// Licensed under the Apache License, Version 2.0, see LICENSE for details.
// SPDX-License-Identifier: Apache-2.0
//
// Philip Wiese <[email protected]>

#include <stdint.h>

// Persistent trampoline function pointer for each core
extern void (*_trampoline_function)(void *);

// Persistent argument storage for the trampoline function
extern void *_trampoline_args;

// Persistent stack pointer storage for each core
extern void *_trampoline_stack;

/**
* @brief Trampoline function for the cluster core.
* This function will set up the stack pointer and call the function.
*
* @warning Make sure that this function is compiled with ISA for the Snitch cores (RV32IM)
*
*/
// WIESEP: Make sure the compiler does not allocate a stack frame
void __attribute__((naked)) _trampoline() {
asm volatile(
// Get hart ID (hardware thread ID)
"csrr t1, mhartid\n" // Load mhartid into t1

// Load global pointer
".option push\n"
".option norelax\n" // Disable relaxation to ensure `la` behaves as expected
"la gp, __global_pointer$\n" // Load address of global pointer
".option pop\n"

// Set thread pointer (tp) to zero
"mv tp, zero\n"

// Set up stack pointer
"la a0, _trampoline_stack\n" // Load address of _trampoline_stack
"slli t1, t1, 2\n" // Multiply hart ID by 4 (size of pointer)
"add a0, a0, t1\n" // Compute the address of _trampoline_stack[hartId]
"lw sp, 0(a0)\n" // Load stack pointer from the computed address

// Load function pointer and arguments
"la a0, _trampoline_function\n" // Load address of _trampoline_function
"add a0, a0, t1\n" // Compute address of _trampoline_function[hartId]
"lw a1, 0(a0)\n" // Load function pointer into a1

"la a0, _trampoline_args\n" // Load address of _trampoline_args
"add a0, a0, t1\n" // Compute address of _trampoline_args[hartId]
"lw a0, 0(a0)\n" // Load argument pointer into a0

// Call the offloaded function
"jr a1\n" // Jump (without linking) to the function pointer in a1
);
}
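
The per-hart lookup performed by the assembly above can be modeled in plain C for reasoning or host-side testing. This is only a sketch: it assumes the three trampoline symbols are arrays with one pointer-sized entry per hart, as the `slli`/`add` indexing implies, and every name in it (`NUM_HARTS`, `model_setup`, `model_trampoline`, `increment`) is hypothetical rather than part of the SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_HARTS 4 /* hypothetical core count, for illustration only */

typedef void (*offload_fn_t)(void *);

/* Host-side model of the three per-hart tables the trampoline indexes. */
static offload_fn_t model_function[NUM_HARTS];
static void *model_args[NUM_HARTS];
static void *model_stack[NUM_HARTS];

/* Register work for one hart, as an offload routine might do before
 * waking the core. */
static void model_setup(uint32_t hart, offload_fn_t fn, void *args,
                        void *stack) {
  assert(hart < NUM_HARTS);
  model_function[hart] = fn;
  model_args[hart] = args;
  model_stack[hart] = stack;
}

/* Mirror of the assembly: index each table by hart ID, then call fn(args).
 * The real trampoline also installs sp, gp, and tp, which plain C cannot
 * express, so the stack entry is stored but not used here. */
static void model_trampoline(uint32_t hart) {
  assert(hart < NUM_HARTS);
  offload_fn_t fn = model_function[hart];
  void *args = model_args[hart];
  fn(args);
}

/* Example payload: increments the integer it is handed. */
static void increment(void *arg) { *(int *)arg += 1; }
```

Calling `model_setup(2, increment, &counter, NULL)` followed by `model_trampoline(2)` runs `increment` on `counter`, which corresponds to hart 2 reading entry 2 of each table in the real trampoline.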
19 changes: 19 additions & 0 deletions drivers/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# Copyright 2024 ETH Zurich and University of Bologna.
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
#
# Philip Wiese <[email protected]>

# Define mappings for drivers
set(DRIVER_MAPPINGS
chimera-convolve:cluster
chimera-open:cluster
chimera-host:
)

# Call the macro
add_chimera_subdirectories(${TARGET_PLATFORM} "Driver" DRIVER_MAPPINGS)


# WIESEP: Export this directory as root include directory for the drivers
target_include_directories(runtime_host PUBLIC ${CMAKE_CURRENT_LIST_DIR})
15 changes: 15 additions & 0 deletions drivers/cluster/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# Copyright 2024 ETH Zurich and University of Bologna.
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0
#
# Philip Wiese <[email protected]>


################################################################################
# Host Runtime Library #
################################################################################
file(GLOB_RECURSE C_SOURCES
"offload_snitchCluster.c"
)

target_sources(runtime_host PRIVATE ${C_SOURCES})