-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PUSCH receiver kernels #108
base: main
Are you sure you want to change the base?
Conversation
3fae5dd
to
d7c809d
Compare
d7c809d
to
b9ff155
Compare
21bbf18
to
4ba934c
Compare
26dacc3
to
bf0e68d
Compare
d1605f1
to
a1bc514
Compare
c7c4ea6
to
8d5ad46
Compare
e11e157
to
0f9df05
Compare
5010e69
to
916a29a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Marco, this is very good merge request. I reviewed files and there almost nothing really need to futher changes, only very few suggestions. And also, something maybe we can do together with this merge:
- For the MatMul kernels. I saw the "conflict optimization" kernel (which I did during the PUSCH DATE) was included in matmul_f32, but it is not included the integar kernels, could you also help to copy-paste into the integar kernel file?
- Dont forget to update the "change log" file.
Thats all! Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to add a "#define local_parallel" instead of commenting the others out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same at this kernel, if we keep all of the kernels in this "main.c", we should add #define. Otherwise, we only leave one kernel instead of commenting others out. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
[Lint] Fix format error
82c5e0b
to
c2db140
Compare
c2db140
to
0f1de6f
Compare
software/apps/baremetal/Makefile
Outdated
ALL_GCC := $(filter-out matmul_f16 matmul_f32, $(ALL)) | ||
ALL_LLVM := $(filter-out synth_i32 chest_q16 cfft_radix2_q16 cfft_radix4_q16, $(ALL)) | ||
FP_APPS := axpy_f16 axpy_f32 | ||
FP_APPS += cfft_radix4_f16 chest_f16 cholesky_f16 | ||
FP_APPS += cmatmul_f16 matmul_f16 matmul_f32 | ||
FP_APPS += dotp_f16 dotp_f32 | ||
FP_APPS += mimo_mmse_f32 mimo_mmse_f16 mimo_mmse_f8 ofdm_f16 | ||
|
||
I_APPS := synth_i32 | ||
I_APPS += cfft_radix2_q16 cfft_radix4_q16 chest_q16 cholesky_q16 cholesky_q32 | ||
I_APPS += cmatmul_q16 mimo_mmse_q16 | ||
|
||
ALL_GCC := $(filter-out $(FP_APPS), $(ALL)) | ||
ALL_LLVM := $(filter-out $(I_APPS), $(ALL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have the convention of adding the i32/f16/... suffix, we could easily automatically find all FP and I apps with a wildcard, right?
// Check the result | ||
if (core_id == 0) { | ||
for (uint32_t i = 0; i < 2 * N_TX * N_ITR; i++) { | ||
uint32_t x = (*(uint32_t *)&l1_x[i]) & 0x0000FFFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we do this in Banshee?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for noticing, this comes from the work on end-to-end MIMO decoding. This is how I extracted the results for Monte Carlo simulation with Banshee.
|
||
#else | ||
#define N_ROUNDS (1) | ||
#define DMA_TRANSFER1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, there are a lot of defines and parameters like this that are not documented. What happens if we don't have this define?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove unuseful code and document the defines.
#ifndef BANSHEE | ||
uint32_t num_cores = mempool_get_core_count(); | ||
mempool_barrier_init(core_id); // Initialize barrier and synchronize | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That we have so much special treatment of Banshee is a bit bad. What is missing in Banshee to make this work? I see that the DMA is missing, but that we could also 'hide' in the DMA runtime functions instead of like it's done here. And why are the barriers treated differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for noticing, this can be removed here. The special treatment of barriers comes from the work on the end-to-end MIMO to run single-core Monte Carlo simulations with Banshee.
software/runtime/runtime.mk
Outdated
@@ -26,7 +26,8 @@ DATA_DIR ?= $(abspath $(ROOT_DIR)/../data) | |||
|
|||
COMPILER ?= gcc | |||
XPULPIMG ?= $(xpulpimg) | |||
ZFINX ?= $(zfinx) | |||
ZFINX ?= $(zfinx) | |||
XDIVSQRT ?= $(xDivSqrt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just found an issue this morning. As we are using LLVM to compile the floating point kernel, the ASM syntax is slightly difference from integer ASM in GCC. For example, starting from line 460 (Sorry that I cannot comment on it inline), the branch-jump is defined like "init_comp/inner_loop/store", however, LLVM doesn't support the use of "text" to define the jump location, so we should change it to "10f->10, 11f->11, 12f->12" (pay attention: a. the variable "1" was already used; b. The branch asm need to add "f" following the jump "number").
Thanks for fixing this in advance!
191563b
to
9594bd6
Compare
9594bd6
to
264879e
Compare
Changelog
Added
Checklist
Please check our contributing guidelines before opening a Pull Request.