Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PUSCH receiver kernels #108

Open
wants to merge 40 commits into
base: main
Choose a base branch
from
Open

PUSCH receiver kernels #108

wants to merge 40 commits into from

Conversation

mbertuletti
Copy link
Collaborator

Changelog

Added

  • FFT f16
  • Complex matmul f16, complex matmul fixed-point 16
  • Cholesky decomposition f16, Cholesky decomposition fixed-point 16
  • Linear system solution f16, Linear system solution fixed-point 16
  • MMSE application f16, MMSE application fixed-point 16
  • OFDM application f16

Checklist

  • Automated tests pass
  • Changelog updated
  • Code style guideline is observed

Please check our contributing guidelines before opening a Pull Request.

@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch 6 times, most recently from 3fae5dd to d7c809d Compare April 25, 2024 13:55
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch from d7c809d to b9ff155 Compare July 5, 2024 14:58
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch 5 times, most recently from 21bbf18 to 4ba934c Compare August 27, 2024 14:51
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch 6 times, most recently from 26dacc3 to bf0e68d Compare September 5, 2024 12:22
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch 4 times, most recently from d1605f1 to a1bc514 Compare September 13, 2024 06:59
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch 2 times, most recently from c7c4ea6 to 8d5ad46 Compare October 16, 2024 09:20
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch 3 times, most recently from e11e157 to 0f9df05 Compare October 28, 2024 13:19
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch from 5010e69 to 916a29a Compare December 6, 2024 10:57
Copy link
Collaborator

@yichao-zh yichao-zh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Marco, this is very good merge request. I reviewed files and there almost nothing really need to futher changes, only very few suggestions. And also, something maybe we can do together with this merge:

  1. For the MatMul kernels. I saw the "conflict optimization" kernel (which I did during the PUSCH DATE) was included in matmul_f32, but it is not included the integar kernels, could you also help to copy-paste into the integar kernel file?
  2. Dont forget to update the "change log" file.

Thats all! Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to add a "#define local_parallel" instead of commenting the others out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same at this kernel, if we keep all of the kernels in this "main.c", we should add #define. Otherwise, we only leave one kernel instead of commenting others out. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch 4 times, most recently from 82c5e0b to c2db140 Compare December 10, 2024 14:01
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch from c2db140 to 0f1de6f Compare December 10, 2024 16:00
Comment on lines 23 to 34
ALL_GCC := $(filter-out matmul_f16 matmul_f32, $(ALL))
ALL_LLVM := $(filter-out synth_i32 chest_q16 cfft_radix2_q16 cfft_radix4_q16, $(ALL))
FP_APPS := axpy_f16 axpy_f32
FP_APPS += cfft_radix4_f16 chest_f16 cholesky_f16
FP_APPS += cmatmul_f16 matmul_f16 matmul_f32
FP_APPS += dotp_f16 dotp_f32
FP_APPS += mimo_mmse_f32 mimo_mmse_f16 mimo_mmse_f8 ofdm_f16

I_APPS := synth_i32
I_APPS += cfft_radix2_q16 cfft_radix4_q16 chest_q16 cholesky_q16 cholesky_q32
I_APPS += cmatmul_q16 mimo_mmse_q16

ALL_GCC := $(filter-out $(FP_APPS), $(ALL))
ALL_LLVM := $(filter-out $(I_APPS), $(ALL))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have the convention of adding the i32/f16/... suffix, we could easily automatically find all FP and I apps with a wildcard, right?

// Check the result
if (core_id == 0) {
for (uint32_t i = 0; i < 2 * N_TX * N_ITR; i++) {
uint32_t x = (*(uint32_t *)&l1_x[i]) & 0x0000FFFF;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we do this in Banshee?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing, this comes from the work on end-to-end MIMO decoding. This is how I extracted the results for Monte Carlo simulation with Banshee.


#else
#define N_ROUNDS (1)
#define DMA_TRANSFER1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, there are a lot of defines and parameters like this that are not documented. What happens if we don't have this define?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove unuseful code and document the defines.

Comment on lines 58 to 61
#ifndef BANSHEE
uint32_t num_cores = mempool_get_core_count();
mempool_barrier_init(core_id); // Initialize barrier and synchronize
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That we have so much special treatment of Banshee is a bit bad. What is missing in Banshee to make this work? I see that the DMA is missing, but that we could also 'hide' in the DMA runtime functions instead of like it's done here. And why are the barriers treated differently?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for noticing, this can be removed here. The special treatment of barriers comes from the work on the end-to-end MIMO to run single-core Monte Carlo simulations with Banshee.

@@ -26,7 +26,8 @@ DATA_DIR ?= $(abspath $(ROOT_DIR)/../data)

COMPILER ?= gcc
XPULPIMG ?= $(xpulpimg)
ZFINX ?= $(zfinx)
ZFINX ?= $(zfinx)
XDIVSQRT ?= $(xDivSqrt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just found an issue this morning. As we are using LLVM to compile the floating point kernel, the ASM syntax is slightly difference from integer ASM in GCC. For example, starting from line 460 (Sorry that I cannot comment on it inline), the branch-jump is defined like "init_comp/inner_loop/store", however, LLVM doesn't support the use of "text" to define the jump location, so we should change it to "10f->10, 11f->11, 12f->12" (pay attention: a. the variable "1" was already used; b. The branch asm need to add "f" following the jump "number").

Thanks for fixing this in advance!

@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch 5 times, most recently from 191563b to 9594bd6 Compare December 19, 2024 13:44
@mbertuletti mbertuletti force-pushed the mbertuletti/mimo_receiver branch from 9594bd6 to 264879e Compare December 19, 2024 13:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants