
Add simple matmul on gemm accelerator #39

Merged: 16 commits merged into main on Dec 18, 2023

Conversation

@JosseVanDelm (Contributor) commented on Dec 5, 2023:

Update: This PR mainly puts everything in place (data-layout- and data-movement-wise) to execute work on the gemm accelerator.

Right now it mostly bypasses the MLIR flow, because the following is still missing:

  • support for more than 1D malloc
  • support for matmul operation offloading
  • Support

WIP on adding a simple quantized matmul to this repository

func.func public @simple_matmul(%A: memref<64x64xi8>,
                                %B: memref<64x64xi8>,
                                %C: memref<64x64xi32>) -> () {
  %c_0 = arith.constant 0 : i32
  linalg.quantized_matmul ins(%A, %B, %c_0, %c_0: memref<64x64xi8>, memref<64x64xi8>, i32, i32)
  outs(%C: memref<64x64xi32>)
  return
}
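
Since both zero points are the constant 0, this reduces to a plain i8-by-i8 matmul with i32 accumulation. For reference, a C golden model for this kernel could look like the sketch below (the function name, dimension names, and row-major layout are assumptions for illustration; the PR itself generates its golden data in Python with np.matmul):

#include <stdint.h>

// Reference C = A * B for int8 inputs with int32 accumulation,
// matching linalg.quantized_matmul when both zero points are 0.
// Names and the row-major layout are illustrative assumptions.
void golden_matmul(int32_t M, int32_t K, int32_t N,
                   const int8_t *A, const int8_t *B, int32_t *C) {
  for (int32_t i = 0; i < M; i++) {
    for (int32_t j = 0; j < N; j++) {
      int32_t acc = 0;
      for (int32_t k = 0; k < K; k++) {
        acc += (int32_t)A[i * K + k] * (int32_t)B[k * N + j];
      }
      C[i * N + j] = acc;
    }
  }
}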

Needs work:

  • Figure out which parameters of set_batch_gemm can be extracted from the operation/memref itself and which ones we can assume to be hardcoded for now. Most are still hardcoded because we are not working on 4D memrefs yet.
  • Figure out a way so that the N, M, K defines don't clash with the names used by the library (maybe?).
  • Add an MLIR example, although that might require help from @jorendumoulin.
  • Add support for, or remove entirely, the call to set up the CSRs for the accelerator (the accelerator is not considered in this PR).
  • Add a golden model for data generation.
  • Fix automatic data generation instead of using hardcoded values.
  • Add tests to CI.

@JosseVanDelm self-assigned this on Dec 5, 2023
@JosseVanDelm changed the title from "Josse/add simple matmul" to "Add simple matmul on gemm accelerator" on Dec 5, 2023
@JosseVanDelm marked this pull request as ready for review on December 14, 2023

@jorendumoulin (Contributor) left a comment:

Very cool! Some minor comments.
I mostly don't like how we use custom variables and libraries from another repo without clearly referencing their location, but I don't know of a better solution.

Comment on lines 55 to 58
C_golden = np.matmul(A.astype(np.dtype("int32")), B.astype(np.dtype("int32")))
C = np.zeros(C_golden.shape, np.dtype("int32"))

assert A.shape[1] == B.shape[0]

Contributor:

The assert must come before the np.matmul computation. If the shapes do not correspond, the matmul would fail anyway, making the assert useless


# C = A.B
A = np.random.randint(low_bound, high_bound, size=A_size, dtype=np.dtype("int8"))
# A = np.ones(A_size, dtype=np.dtype("int8"))

Contributor:

remove comment

A = np.random.randint(low_bound, high_bound, size=A_size, dtype=np.dtype("int8"))
# A = np.ones(A_size, dtype=np.dtype("int8"))
B = np.random.randint(low_bound, high_bound, size=B_size, dtype=np.dtype("int8"))
# B = np.ones(B_size, dtype=np.dtype("int8"))

Contributor:

remove comment

Contributor:

The MLIR docs show quite a nice way to have only one memref struct specification for all types, in C++

template<typename T, size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  intptr_t offset;
  intptr_t sizes[N];
  intptr_t strides[N];
};

do you know if something similar is possible in C? if not, I guess this is fine for now

@JosseVanDelm (Contributor Author) commented on Dec 18, 2023:

This is possible, but it looks a bit complicated (https://isocpp.org/wiki/faq/mixing-c-and-cpp); let's take this on in a future PR?

Contributor:

Yes, good!
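
For reference, one C-only approximation of the templated descriptor above is a macro that stamps out one struct per element type and rank. This is a minimal sketch with hypothetical names, not code from this PR:

#include <stdint.h>

// C has no templates, but a macro can generate one descriptor struct
// per element type and rank, mirroring the C++ MemRefDescriptor above.
// The macro and type names here are illustrative only.
#define DEFINE_MEMREF_DESCRIPTOR(T, RANK, NAME)                          \
  typedef struct {                                                       \
    T *allocated;           /* start of the allocated buffer */          \
    T *aligned;             /* aligned pointer used for element access */\
    intptr_t offset;        /* element offset from the aligned pointer */\
    intptr_t sizes[RANK];   /* extent of each dimension */               \
    intptr_t strides[RANK]; /* stride of each dimension, in elements */  \
  } NAME;

DEFINE_MEMREF_DESCRIPTOR(int8_t, 2, MemRefI8Rank2)
DEFINE_MEMREF_DESCRIPTOR(int32_t, 2, MemRefI32Rank2)

The trade-off is one macro invocation per type/rank combination instead of the template's single definition, which may or may not be worth it here.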

Comment on lines 3 to 4
#include "snax-gemm-lib.h"
#include "snax-gemm-params.h"

Contributor:

To me it feels kind of weird to include header files from another repository without specifying where they come from 😕. I guess it also doesn't make sense to duplicate them... Is there a way to share these in a better way?

Contributor:

if not, maybe make it clear with a comment where these files come from

Contributor Author:

Agreed! Clear like this?
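
The resolved change itself is not visible in this thread; a sketch of what such an attribution comment could look like (the wording and the upstream location named in it are assumptions) is:

// NOTE: these headers are not part of this repository; they come from
// the external snax-gemm accelerator library (upstream location assumed,
// see that repository for the originals).
#include "snax-gemm-lib.h"
#include "snax-gemm-params.h"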

Comment on lines 81 to 82
memrefA.stride[0] = sizeof(int8_t);
memrefA.stride[1] = sizeof(int8_t);

Contributor:

If we were to use a standard layout, this would also not be correct, but rather:

Suggested change:

- memrefA.stride[0] = sizeof(int8_t);
- memrefA.stride[1] = sizeof(int8_t);
+ memrefA.stride[0] = sizeof(int8_t);
+ memrefA.stride[1] = sizeof(int8_t) * M_size;

Maybe just set them to 0 to make it very clear we are not using them.

memrefA.aligned_data = memrefA.data;
memrefA.shape[0] = M_size;
memrefA.shape[1] = K_size;
// These are not considered correctly right now

Contributor:

Comment not very clear.

Suggested change:

- // These are not considered correctly right now
+ // Strides are not used due to the tiled-block layout.
+ // Instead we use the variables strideInnermostA, ldA and strideA

Three further review threads (two on kernels/simple_matmul/main.c, one on kernels/simple_mult/main.c) are outdated and resolved.

@jorendumoulin (Contributor) left a comment:

Awesome!!

@@ -55,6 +55,7 @@ int main() {

  int nerr = 0;
  for (int i = 0; i < N; i++) {
+   printf("result: %d golden: %d\n", memrefD.aligned_data[i], G[i]);

Contributor:

To delete then?

Contributor Author:

Will make a separate PR for this.

Contributor Author:

wait no

Contributor Author:

it's gone now haha

@JosseVanDelm merged commit f44b93a into main on Dec 18, 2023 (3 checks passed).
@JosseVanDelm deleted the Josse/add-simple-matmul branch on February 12, 2024.