Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snitch Cluster Offloading #13

Merged
merged 15 commits into from
Dec 13, 2024
Merged

Conversation

Xeratec
Copy link
Collaborator

@Xeratec Xeratec commented Nov 29, 2024

Changelog

This PR introduces significant improvements and additions to the project, focusing on enhanced test infrastructure, driver functionality, and target support. Key updates include better separation of host and cluster code, support for mixed ISA compilation, and the addition of a new cluster target for the CONVOLVE tapeout. The changes also address critical fixes related to interrupt handling and early-execution functions, ensuring robust and reliable system behavior.

Mixed ISA Compilation

The current approach compiles all code for both the host and cluster cores into a single library. This requires precise handling to ensure compatibility between the different instruction set architectures (ISAs) and application binary interfaces (ABIs).
This requires careful handling to avoid invalid instructions caused by mismatched ISAs between the host and cluster cores. Hence, we define four CMake variables,ABI, ISA_HOST, and ISA_CLUSTER_SNITCH, to specify the appropriate ISA for each core type. The ABI has to be identical to ensure correct function calls.
Furthermore, the tests are split into src_host and src_cluster directories to clearly separate code executed on the host and cluster cores.

cMake Build Flow

All runtime functions executed by the host core are compiled into a dedicated runtime_host static library and the cluster code into runtime_cluster_<type> (e.g. runtime_cluster_snitch). Additionally, the HAL layer is compiled into the hal_host static libary.
The final binary is seperated into two object libaries, one for the host and one for the cluster core. The host object library is linked with the runtime_host and hal_host libraries, while the cluster object library is linked with the runtime_cluster_<type> library. The final binary is then linked from the two object libraries.

Devices

Additionally, this PR adds a newdevices folder where we in the future pull in the device SDKs (such as PULP or Snitch runtime) as external libraries. The folder structure looks like this:

chimera-sdk
├─ drivers
│ ├─ uart
│ └─ cluster
├─ devices (NEW)
│ ├─ snitch_cluster
│ │ ├─ trampoline_snitchCluster.c
│ │ └─ snitch_runtime (external repo)
│ ├─ pulp_cluster
│ │ ├─ trampoline_pulpCluster.c
│ │ └─ pulp_runtime (external repo)
│ └─ ITA? (external repo)
├─ hal
└─ targets

(Some folders are omitted for simplicity)

Warning

Special attention is required for functions that execute before the cluster core is fully initialized, such as the trampoline function and interrupt handlers. At this stage, critical resources like the stack, global pointer, and thread pointer are not yet configured. Consequently, the compiler must not generate code that allocates stack frames. To address this, such functions are implemented as naked functions, which prevent the compiler from adding prologues or epilogues that rely on stack operations.

Added

  • Driver Functionality:
    • Introduced a cluster driver as part of the runtime_host that currently includes support for the snitch cluster.
  • Device Functionality:
    • Introduced the devices folder to add device specific code and add external runtimes.
  • CONVOLVE Cluster Target:
    • Added a new cluster target named CONVOLVE.
  • VSCode Configuration
    • Add VSCode configuration to enable automatic configuration of the C/C++ extension and support for the integrated CMake build flow on the IIS workstations.

Changed

  • Refactored Test Infrastructure:
    • Split into generic and target-specific tests.
    • Organized subtests by execution environment (host, snitchCluster).
    • Clear distinction made between host code and cluster code. This allows to specify the ISA and ABI for the cluster and host code separately
  • Refactored CMake Flow:
    • Refactored CMake build flow to collect all source into static runtime_host and runtime_cluster_<type> library and keep ISA flags private.
  • Improve README:
    • Added explanation to use the C/C++ extension for VSCode on the IIS workstations.
    • Simplified formatting commands

Fixed

  • ISA Compatibility and Naked Functions:
    • Addressed invalid instructions by ensuring that the trampoline function and interrupt handlers are correctly compiled with the appropriate ISA and ABI configurations.
    • Fixed issues with improperly generated stack frames for functions executed before cluster initialization by enforcing the use of naked functions.
  • Compiler Warnings
    • Avoid warning: trap_vector changed binding to STB_WEAK

- Split into generic and target specific tests
- Split subtests into where they are executed (host, snitchCluster)
- Refactor cmake build flow
- Add cluster driver
- Fix potential problem with cluster interrupt handler
- Make clear what is host code and what is cluster code
@Xeratec Xeratec requested a review from viv-eth November 29, 2024 16:17
@Xeratec Xeratec self-assigned this Nov 29, 2024
@Xeratec Xeratec requested a review from Scheremo November 29, 2024 16:19
@Xeratec Xeratec force-pushed the pr/snitchOffloading branch from 5fb4ab3 to 73c67c4 Compare November 29, 2024 16:32
@Xeratec Xeratec force-pushed the pr/snitchOffloading branch from 3e9dffe to 0ac8969 Compare November 29, 2024 16:38
@Xeratec Xeratec mentioned this pull request Nov 29, 2024
4 tasks
Copy link
Contributor

@viv-eth viv-eth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check if we are compatible with #11 , other than that only minor comments. Regarding the whole offloading process, please await feedback from @Scheremo

driver/cluster/CMakeLists.txt Outdated Show resolved Hide resolved
targets/chimera-convolve/CMakeLists.txt Outdated Show resolved Hide resolved
tests/chimera-convolve/.gitkeep Outdated Show resolved Hide resolved
tests/chimera-host/.gitkeep Outdated Show resolved Hide resolved
tests/chimera-open/.gitkeep Outdated Show resolved Hide resolved
Copy link
Collaborator

@Scheremo Scheremo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments. Functionally the changes look reasonable to me - I think we can improve implementation on the C side a bit, but that doesn't seem so urgent. CMake structure seems suboptimal to me, I think we should invest a bit of design effort into clearly disambiguating different ISAs and not push setting compile flags to the user.

CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
drivers/cluster/CMakeLists.txt Outdated Show resolved Hide resolved
drivers/cluster/CMakeLists.txt Outdated Show resolved Hide resolved
drivers/cluster/offload_snitchCluster.c Show resolved Hide resolved
tests/CMakeLists.txt Outdated Show resolved Hide resolved
tests/generic/CMakeLists.txt Outdated Show resolved Hide resolved
tests/generic/host/returnZero/CMakeLists.txt Outdated Show resolved Hide resolved
@Xeratec Xeratec force-pushed the pr/snitchOffloading branch from dd5eda1 to 5dc3247 Compare December 10, 2024 16:48
@Xeratec
Copy link
Collaborator Author

Xeratec commented Dec 10, 2024

@Scheremo I implemented the feedback provided here and from our discussion.

Changed

  • Use same ABI for all objects
  • Improved some printing information during CMake configuration
  • Compile separately a runtime_host and runtime_cluster
  • Support adding multiple test folders for one target
  • Move generic tests chimera-open and chimera-convolve folders
  • Renamed hal library to hal_host

@Xeratec Xeratec requested a review from Scheremo December 10, 2024 16:49
@Xeratec Xeratec mentioned this pull request Dec 10, 2024
3 tasks
@Xeratec Xeratec requested a review from viv-eth December 10, 2024 17:13
Copy link
Collaborator

@Scheremo Scheremo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you almost perfectly nailed it this time around - remainder of my comments are on minor cmake structuring and code org. Those changes are (more or less) optional, the big picture looks very good to me!

CMakeLists.txt Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
drivers/cluster/offload_snitchCluster.c Show resolved Hide resolved
drivers/cluster/trampoline_snitchCluster.c Outdated Show resolved Hide resolved
@Xeratec Xeratec force-pushed the pr/snitchOffloading branch from a991b97 to 8b2418c Compare December 11, 2024 11:12
@Xeratec
Copy link
Collaborator Author

Xeratec commented Dec 11, 2024

@Scheremo I implemented the suggested changes and added a devices folder with the trampoline function for the snitch_cluster as suggested above. Additionally, I slightly refactored the CMake structure to avoid duplication. With this it is possible to include certain tests, drivers and devices for specific targets using a mapping and a CMake macro.

For example

set(DEVICE_MAPPINGS
    chimera-convolve:snitch_cluster
    chimera-open:snitch_cluster
    chimera-host:
)

# Call the macro
add_chimera_subdirectories(${TARGET_PLATFORM} "Device" DEVICE_MAPPINGS)

Additionally, I add the libraries in the targets. This has the advantage that we do not create runtime libraries for devices that are not required for this target. The downside is that the targets subfolder has to be added before the others in the top level CMakeLists.txt. I added a comment to clarify this.

If you are ok with the changes, please resolve the conversation and re-approve the PR. Thanks for all the super valuable feedback!

@Xeratec Xeratec requested a review from Scheremo December 11, 2024 11:13
@Xeratec Xeratec added the enhancement New feature or request label Dec 12, 2024
Copy link
Collaborator

@Scheremo Scheremo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of the procedural macro, but I'll leave final judgement up to you. Otherwise looks good to me.

cmake/Utils.cmake Outdated Show resolved Hide resolved
- This way I can ensure that the macro does not unintentionally modify the parent scope.
@Xeratec Xeratec merged commit 948131b into pulp-platform:devel Dec 13, 2024
@Xeratec Xeratec deleted the pr/snitchOffloading branch December 13, 2024 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants