sw: Restructuring of BLAS and DNN kernels #137

Merged 53 commits on Jul 4, 2024
a4ca2de
[sw] GEMM: enhance test coverage, optimize data handling, and align c…
Mar 19, 2024
f60e421
sw: Uniformize `DataGen` to dnn scripts writing to file not stdout
colluca Mar 7, 2024
0252090
dnn: Enable multiple configuration testing
colluca Mar 7, 2024
67a43f8
sw: Add transpose layer
fischeti Nov 7, 2023
40f85ce
flashattention_2: Add multiple config tests
colluca Mar 7, 2024
36bed99
sw: Align GEMM-dependent kernels to GEMM changes
colluca Apr 15, 2024
6af7070
flashattention_2: Add low-precision implementations
colluca Apr 12, 2024
f8fac92
gemm: Extend datagen assertions
colluca May 7, 2024
2b1360c
sw: Definitively remove custom math library
colluca Mar 12, 2024
247760d
FA-2: add transpose func declarations
May 28, 2024
7bf7123
flashattention_2: Calculate heap size in datagen
colluca Apr 18, 2024
e757eb3
flashattention_2: Add auto-regressive inference
colluca Apr 23, 2024
e8e9b49
flashattention_2: Take measurements on second Tc tile
colluca Apr 25, 2024
95f30a5
sw: Validate GEMM, Layernorm and FA-2 tile footprints in TCDM
colluca May 7, 2024
2e6c7c9
util/sim: Add TCDM validation function to `data_utils.py`
colluca May 7, 2024
a19a5a8
sw: fix build issues
May 28, 2024
66a9afc
FA-2: align with minifloat-attention
May 29, 2024
4742d96
gemm: add remainder comp for M dim
May 31, 2024
de9e48f
[sw] layernorm: code cleanup
May 31, 2024
0f5cdcd
[sw] FA-2: Update TCASAI runs
May 31, 2024
7ef21e3
[sw] gemm: add BETA to FP16/8 baseline
Jun 18, 2024
98bbbbb
[sw] FA2 tests: adjust run script
Jun 18, 2024
c6e12aa
[make] Add configurable LOGS_DIR
Jun 18, 2024
80c8185
[sw] Fix FA2 MiniFloat verification
Jun 18, 2024
01df07f
[tests] Align configs with current setup
Jun 18, 2024
840bd25
[tests] Add FA-2 tests to GitLab CI
Jun 19, 2024
5292643
[sw] dot: Align dot product with new structure after rebase
Jun 19, 2024
b5a7ea0
[util] Account for scalar results
Jun 19, 2024
9e41b0e
[docker] Change back to main
Jun 19, 2024
185f3cf
[sw] FA-2: pass gemm_args down to kernels
Jun 28, 2024
2e92a59
[sw] FCL: explicitely declare struct fields
Jun 28, 2024
35d97ae
[sw] DNN: fix header include order in DNN header
Jun 28, 2024
8234d6d
[tests] FA-2: fix indents
Jun 28, 2024
10bd1bd
[CI] Move FA tests to FDIV suite
Jun 28, 2024
8bb889d
Lint dnn.h
colluca Jul 2, 2024
c4a341c
flashattention_2: Remove failing FP8 tests
colluca Jul 2, 2024
f27adcd
treewide: Miscellaneous cleanup
colluca Jul 2, 2024
bfb1f1d
sw: Extend dnn .gitignore to all apps
colluca Jul 2, 2024
a91853f
ci: Add extensive transpose kernel tests
colluca Jul 2, 2024
20a4160
sw: Definitively remove custom math library
colluca Mar 12, 2024
c02319c
sw: fix build issues
May 28, 2024
d576aa8
[sw] DNN: fix header include order in DNN header
Jun 28, 2024
9884e25
[TCASAI] Move TCASAI framework to separate fork
Jul 2, 2024
b94eb27
[sw] FCL: remove deprecated baseline flag
Jul 2, 2024
02fe77b
[common] Fix LOGS_DIR to avoid breaking changes
Jul 2, 2024
bcd7319
[sw] GEMM: remove redundant prec field
Jul 2, 2024
3e68f65
[FA-2] tests: remove redundant gitignore file
Jul 2, 2024
49a9cd7
[WIP] Make tests uniform across apps
Jul 3, 2024
7462d50
Extend on Vivi's refactoring of the `run.py` scripts
colluca Jul 3, 2024
133e073
Correct linting
colluca Jul 3, 2024
77848bc
Move flashattention_2 tests to FDIV hardware
colluca Jul 4, 2024
68283d2
treewide: remove global Python declaration and use venv
Jul 4, 2024
a6a3ce5
final cleanup
colluca Jul 4, 2024
3 changes: 1 addition & 2 deletions .clang-format-ignore
@@ -2,5 +2,4 @@
 # Licensed under the Apache License, Version 2.0, see LICENSE for details.
 # SPDX-License-Identifier: Apache-2.0

-# Ignore vendored third-party code
-./sw/math/*
+./sw/saris
8 changes: 0 additions & 8 deletions .github/workflows/lint.yml
@@ -78,13 +78,6 @@ jobs:
 match_regex: true
 exclude_paths: |
 sw/snRuntime/src/omp/interface.h
-sw/math/arch/generic/*
-sw/math/arch/riscv64/bits/*
-sw/math/include/*
-sw/math/src/include/*
-sw/math/src/internal/*
-sw/math/src/math/*
-sw/math/Makefile

 ##################
 # Lint YML Files #
@@ -129,7 +122,6 @@ jobs:
 - uses: actions/checkout@v3
 - uses: DoozyX/[email protected]
 with:
-exclude: './sw/saris'
 clangFormatVersion: 10

 ######################
8 changes: 4 additions & 4 deletions .gitlab-ci.yml
@@ -34,7 +34,6 @@ snitch-cluster-sw:
 - make sw
 artifacts:
 paths:
-- sw/math/include/bits/alltypes.h
 - target/snitch_cluster/sw/**/build/*.elf
 expire_in: 1 day

@@ -44,7 +43,6 @@ snitch-cluster-sw-banshee:
 - make SELECT_RUNTIME=banshee sw
 artifacts:
 paths:
-- sw/math/include/bits/alltypes.h
 - target/snitch_cluster/sw/**/build/*.elf
 expire_in: 1 day

@@ -103,8 +101,8 @@ snitch-cluster-vsim:
 # Test trace annotation
 - make SIM_DIR=./runs/vsim/simple annotate -j
 # Run additional, more extensive tests
-- cd sw/apps/blas/gemm/test
-- ./run.py runs.yaml --cfg $PWD/cfg/* --simulator vsim -j
+- cd sw/apps/blas/gemm/test && ./test.sh && cd -
+- cd sw/apps/dnn/transpose/test && ./test.sh && cd -

 # Banshee
 snitch-cluster-banshee:
@@ -129,6 +127,8 @@ snitch-cluster-fdiv-vsim:
 - make CFG_OVERRIDE=cfg/fdiv.hjson sw
 - make bin/snitch_cluster.vsim
 - ./util/run.py sw/fdiv.yaml --simulator vsim -j --run-dir runs/vsim
+# Run additional, more extensive tests
+- cd sw/apps/dnn/flashattention_2/test && ./test.sh && cd -

 # Test OmegaNet TCDM interconnect
 snitch-cluster-omega-vsim:
1 change: 1 addition & 0 deletions docs/rm/snitch_target_utils/build.md
@@ -0,0 +1 @@
+::: build
1 change: 1 addition & 0 deletions docs/rm/snitch_target_utils/run.md
@@ -0,0 +1 @@
+::: run
3 changes: 1 addition & 2 deletions iis-setup.sh
@@ -4,7 +4,6 @@
 # SPDX-License-Identifier: Apache-2.0

 # Define environment variables
-export PYTHON=/usr/local/anaconda3-2022.05/bin/python3
 export BENDER=bender-0.27.1
 export CC=gcc-9.2.0
 export CXX=g++-9.2.0
@@ -14,7 +13,7 @@ export QUESTA_SEPP=questa-2022.3
 export LLVM_BINROOT=/usr/pack/riscv-1.0-kgf/pulp-llvm-0.12.0/bin

 # Create Python virtual environment with required packages
-$PYTHON -m venv .venv
+/usr/local/anaconda3-2022.05/bin/python3 -m venv .venv
 source .venv/bin/activate
 # Unpack packages in a local temporary directory which can be safely cleaned
# after installation. Also protects against "No space left on device" errors
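
Note on the hunk above: with the exported `PYTHON` variable removed, Python tooling presumably relies on the interpreter that the activated `.venv` provides. A small sketch of how a script could check that it is running inside a virtual environment follows. `in_virtualenv` and `interpreter_summary` are hypothetical helpers written for illustration, not functions from this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment.

    Inside an environment created with `python -m venv`, sys.prefix points at
    the environment while sys.base_prefix points at the base installation.
    """
    return sys.prefix != sys.base_prefix


def interpreter_summary() -> str:
    """Describe which interpreter is running and whether it is a venv (illustrative)."""
    kind = "virtual environment" if in_virtualenv() else "global installation"
    return f"{sys.executable} ({kind})"


print(interpreter_summary())
```

Calling a check like this at the top of a run script is one way to fail early when the environment was not activated; whether the repository's scripts do so is not shown in this diff.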
7 changes: 6 additions & 1 deletion mkdocs.yml
@@ -25,7 +25,9 @@ plugins:
 - mkdocstrings:
 handlers:
 python:
-paths: [util/sim]
+paths:
+- util/sim
+- target/snitch_cluster/util
 - macros:
 on_error_fail: true
 use_directory_urls: false
@@ -62,6 +64,9 @@ nav:
 - rm/sim/Simulation.md
 - rm/sim/Simulator.md
 - rm/sim/Elf.md
+- Snitch Target Utilities:
+- run.py: rm/snitch_target_utils/run.md
+- build.py: rm/snitch_target_utils/build.md
 - Snitch Runtime:
 - Pages: runtime/Pages/index.md
 - Files: runtime/Files/index.md
13 changes: 0 additions & 13 deletions sw/README.md
@@ -11,19 +11,6 @@ This subdirectory contains the various bits and pieces of software for the Snitc
 - `snRuntime`: The fundamental, bare-metal runtime for Snitch systems. Exposes a minimal API to manage execution of code across the available cores and clusters, query information about a thread's context, and to coordinate and exchange data with other threads. Hardware configuration dependent implementations of the `snRuntime` can be found, e.g., under `target/snitch_cluster/sw/snRuntime`.
 - `snBLAS`: A minimal reference implementation of the basic linear algebra subprograms that demonstrates the use of Snitch and its extensions.

-#### math
-
-The math sources are taken from the musl library, patched with our own modifications. The bender vendor snippet in `Bender.yml` was used to copy in the original sources. Patches were generated using the following command (from the root of the repo):
-```
-git format-patch --relative -o sw/deps/patches/musl/ HEAD^1
-```
-And can be applied by running (in the root of the repo):
-```
-git apply sw/deps/patches/musl/0001-musl-Patch-to-build-math-library-for-Snitch.patch
-```
-
-The `all` target in `sw/math/Makefile` should be run to generate some files.
-
 ### Tests

 - `benchmark`: Benchmarking executables that evaluate the performance characteristics of a system.
2 changes: 1 addition & 1 deletion sw/apps/atax/Makefile
@@ -20,7 +20,7 @@ DATAGEN_PY = $(MK_DIR)/scripts/datagen.py
 DATA_H = $(DATA_DIR)/data.h

 $(DATA_H): $(DATAGEN_PY) $(DATA_CFG)
-	$< -c $(DATA_CFG) --section="$(SECTION)" > $@
+	$< -c $(DATA_CFG) --section="$(SECTION)" $@

.PHONY: clean-data clean
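
Note on the recipe change above: the output path is now handed to `datagen.py` as a positional argument instead of being captured from stdout with `>`. A minimal Python sketch of a script with that command-line shape follows. It is not the repository's `datagen.py`: the `-c` and `--section` options mirror the recipe, while `render_header`, the `data_header` argument name and the generated contents are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def render_header(section: str) -> str:
    """Return placeholder header text (the contents are purely illustrative)."""
    attr = f' __attribute__((section("{section}")))' if section else ""
    return f"// Generated file, do not edit\nuint32_t n{attr} = 16;\n"


def main(argv=None) -> Path:
    parser = argparse.ArgumentParser(description="Hypothetical data generator")
    parser.add_argument("-c", "--cfg", type=Path, required=True,
                        help="configuration file (not parsed in this sketch)")
    parser.add_argument("--section", default="",
                        help="linker section to place the data in")
    parser.add_argument("data_header", type=Path,
                        help="output header file, e.g. data/data.h")
    args = parser.parse_args(argv)

    # Write the file directly instead of printing it, so stray debug
    # prints on stdout can never end up inside the generated header.
    args.data_header.parent.mkdir(parents=True, exist_ok=True)
    args.data_header.write_text(render_header(args.section))
    return args.data_header
```

Under these assumptions, `main(["-c", "params.json", "--section", ".data", "data/data.h"])` corresponds to the new recipe line. One practical benefit of this shape is that a failing generator no longer leaves behind a truncated file created by the shell redirection.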

2 changes: 1 addition & 1 deletion sw/apps/correlation/Makefile
@@ -20,7 +20,7 @@ DATAGEN_PY = $(MK_DIR)/scripts/datagen.py
 DATA_H = $(DATA_DIR)/data.h

 $(DATA_H): $(DATAGEN_PY) $(DATA_CFG)
-	$< -c $(DATA_CFG) --section="$(SECTION)" > $@
+	$< -c $(DATA_CFG) --section="$(SECTION)" $@

 .PHONY: clean-data clean
2 changes: 1 addition & 1 deletion sw/apps/covariance/Makefile
@@ -20,7 +20,7 @@ DATAGEN_PY = $(MK_DIR)/scripts/datagen.py
 DATA_H = $(DATA_DIR)/data.h

 $(DATA_H): $(DATAGEN_PY) $(DATA_CFG)
-	$< -c $(DATA_CFG) --section="$(SECTION)" > $@
+	$< -c $(DATA_CFG) --section="$(SECTION)" $@

 .PHONY: clean-data clean
2 changes: 1 addition & 1 deletion sw/blas/axpy/Makefile
@@ -23,7 +23,7 @@ $(dir $(DATA_H)):
 	mkdir -p $@

 $(DATA_H): $(DATAGEN_PY) $(DATA_CFG) | $(dir $(DATA_H))
-	$< -c $(DATA_CFG) --section="$(SECTION)" > $@
+	$< -c $(DATA_CFG) --section="$(SECTION)" $@

 .PHONY: clean-data clean
2 changes: 1 addition & 1 deletion sw/blas/dot/Makefile
@@ -21,7 +21,7 @@ $(dir $(DATA_H)):
 	mkdir -p $@

 $(DATA_H): $(DATAGEN_PY) $(DATA_CFG) | $(dir $(DATA_H))
-	$< -c $(DATA_CFG) --section="$(SECTION)" > $@
+	$< -c $(DATA_CFG) --section="$(SECTION)" $@

 .PHONY: clean-data clean
2 changes: 1 addition & 1 deletion sw/blas/gemm/Makefile
@@ -23,7 +23,7 @@ $(dir $(DATA_H)):
 	mkdir -p $@

 $(DATA_H): $(DATAGEN_PY) $(DATA_CFG) | $(dir $(DATA_H))
-	$< -c $(DATA_CFG) --section="$(SECTION)" > $@
+	$< -c $(DATA_CFG) --section="$(SECTION)" $@

 .PHONY: clean-data clean
9 changes: 9 additions & 0 deletions sw/blas/gemm/__init__.py
@@ -0,0 +1,9 @@
+# Copyright 2024 ETH Zurich and University of Bologna.
+# Licensed under the Apache License, Version 2.0, see LICENSE for details.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Luca Colagrande <[email protected]>
+
+from .scripts import datagen
+
+__all__ = ['datagen']
33 changes: 17 additions & 16 deletions sw/blas/gemm/data/params.json
@@ -1,22 +1,23 @@
-// Copyright 2023 ETH Zurich and University of Bologna.
-// Solderpad Hardware License, Version 0.51, see LICENSE for details.
-// SPDX-License-Identifier: SHL-0.51
-
-// Parameters for a GEMM
+// Copyright 2024 ETH Zurich and University of Bologna.
+// Licensed under the Apache License, Version 2.0, see LICENSE for details.
+// SPDX-License-Identifier: Apache-2.0

 {
-M: 192,
+setup_ssr: 1,
+parallelize_m: 0,
+parallelize_k: 0,
+m_tiles: 2, // number of tiles in M dimension
+n_tiles: 1, // number of tiles in N dimension
+k_tiles: 1, // number of tiles in K dimension
+load_a: 1,
+load_b: 1,
+load_c: 1,
+transa: false,
+transb: true, // must be true for SIMD
+M: 16,
 N: 16,
 K: 16,
+alpha: 1,
 beta: 0,
-ta: false,
-tb: true, // must be true for SIMD
-prec: "FP64",
-expand: 0,
-m_tiles: 2, // number of tiles in M dimension
-k_tiles: 1, // number of tiles in K dimension
-n_tiles: 1, // number of tiles in N dimension
-parallelize_k: 0,
-parallelize_m: 0,
-baseline: false
+gemm_fp: "gemm_fp32_opt"
}
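
For orientation, the fields in `params.json` line up with the usual BLAS GEMM operation, C = alpha * op(A) * op(B) + beta * C, with `m_tiles` splitting the M dimension into equal row blocks. The pure-Python sketch below assumes the BLAS meaning of the transpose flags and row-major nested lists. `transpose`, `gemm` and `gemm_m_tiled` are hypothetical helper names, not functions from this repository, and the sketch requires M to be divisible by `m_tiles`, whereas one commit in this PR adds remainder handling for the M dimension to the kernels.

```python
def transpose(x):
    """Transpose a row-major matrix stored as a list of rows."""
    return [list(col) for col in zip(*x)]


def gemm(alpha, a, b, beta, c, transa=False, transb=False):
    """Return alpha * op(A) @ op(B) + beta * C (assumed BLAS semantics)."""
    a = transpose(a) if transa else a
    b = transpose(b) if transb else b
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
             for j in range(n)]
            for i in range(m)]


def gemm_m_tiled(alpha, a, b, beta, c, m_tiles, transb=False):
    """Compute the same result one M tile at a time (A not transposed)."""
    m = len(a)
    assert m % m_tiles == 0, "this sketch has no remainder handling"
    tile = m // m_tiles
    out = []
    for t in range(m_tiles):
        rows = slice(t * tile, (t + 1) * tile)
        # Each tile needs only its rows of A and C, plus all of B.
        out.extend(gemm(alpha, a[rows], b, beta, c[rows], transb=transb))
    return out
```

Because each M tile touches a disjoint set of rows of A and C, the tiled result matches the untiled one exactly, which is the kind of property a golden model for the extended tests could check.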
viv-eth marked this conversation as resolved.