Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

samples: bluetooth: Increase nRF54H20 stack for hci_ipc #82255

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Vge0rge
Copy link
Collaborator

@Vge0rge Vge0rge commented Nov 28, 2024

Increase the stack size for hci_ipc on nRF54H20 since the IPC communication increases stack usage.

Increase the stack size for hci_ipc on nRF54H20 since the IPC
communication increases stack usage.

Signed-off-by: Georgios Vasilakis <[email protected]>
@jhedberg
Copy link
Member

Wouldn't IPC usage increase stack usage for any board, i.e. why not put this into prj.conf instead since this app is all about IPC?

Copy link
Member

@jhedberg jhedberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't IPC usage increase stack usage for any board, i.e. why not put this into prj.conf instead since this app is all about IPC?

Actually, 1024 is the default main stack size, so why do you even need this change to begin with? Does the board override the default somewhere and then you need to override the override? Doesn't sound like a very nice solution.

zephyr/kernel/Kconfig

Lines 156 to 160 in 2f583a8

config MAIN_STACK_SIZE
int "Size of stack for initialization and main thread"
default 2048 if COVERAGE_GCOV
default 512 if ZTEST && !(RISCV || X86 || ARM || ARC || NIOS2)
default 1024

Requesting changes until we get some clarity on the above

@Vge0rge
Copy link
Collaborator Author

Vge0rge commented Nov 28, 2024

Wouldn't IPC usage increase stack usage for any board, i.e. why not put this into prj.conf instead since this app is all about IPC?

Actually, 1024 is the default main stack size, so why do you even need this change to begin with? Does the board override the default somewhere and then you need to override the override? Doesn't sound like a very nice solution.

zephyr/kernel/Kconfig

Lines 156 to 160 in 2f583a8

config MAIN_STACK_SIZE
int "Size of stack for initialization and main thread"
default 2048 if COVERAGE_GCOV
default 512 if ZTEST && !(RISCV || X86 || ARM || ARC || NIOS2)
default 1024

Requesting changes until we get some clarity on the above

In the prj.conf this sample sets the main stack to 512:

CONFIG_MAIN_STACK_SIZE=512

I see no reason to affect the nrf5340 for this since it seems to work with a reduced stack.

@jhedberg
Copy link
Member

jhedberg commented Nov 28, 2024

I see no reason to affect the nrf5340 for this since it seems to work with a reduced stack.

"Seems to work" could mean that it's very close to the limit or even slightly over it and you just get lucky that it doesn't trigger a stack overflow/panic. I'd advise to at least do some profiling of exactly how much is consumed there. If there's any chance that other platforms (even non-Nordic ones) using hci_ipc might be close to the limits, I think it's safer to update prj.conf than to do this only for one single board.

@Thalley
Copy link
Collaborator

Thalley commented Nov 28, 2024

I see no reason to affect the nrf5340 for this since it seems to work with a reduced stack.

"Seems to work" could mean that it's very close to the limit or even slightly over it and you just get lucky that it doesn't trigger a stack overflow/panic. I'd advise to at least do some profiling of exactly how much is consumed there. If there's any chance that other platforms (even non-Nordic ones) using hci_ipc might be close to the limits, I think it's safer to update prj.conf than to do this only for one single board.

Agreed. It's unclear why this specific board would need additional stack

@Vge0rge
Copy link
Collaborator Author

Vge0rge commented Nov 29, 2024

I see no reason to affect the nrf5340 for this since it seems to work with a reduced stack.

"Seems to work" could mean that it's very close to the limit or even slightly over it and you just get lucky that it doesn't trigger a stack overflow/panic. I'd advise to at least do some profiling of exactly how much is consumed there. If there's any chance that other platforms (even non-Nordic ones) using hci_ipc might be close to the limits, I think it's safer to update prj.conf than to do this only for one single board.

With "seems to work" I meant that there are CI plans which run this sample and pass. I don't work closely with bluetooth, maybe the original author of this stack size can give us a comment here. @kapi-no can you give your opinion about increasing the stack size globally here?

The reason that I made this change is that I enabled in our SDK the IPC communication for this device and I realized that this image cannot run anymore because of stack overflow. So I am afraid that I don't have a very detailed description why this stack increase is needed. @Thalley could you be more specific on what type of explanation do you need for increasing the stack?

@jhedberg
Copy link
Member

Is there some reason why we can't just drop all references to MAIN_STACK_SIZE in this app? Then the option would just get the default value from kernel/Kconfig which is 1024 for the boards that you care about.

@Vge0rge
Copy link
Collaborator Author

Vge0rge commented Nov 29, 2024

Is there some reason why we can't just drop all references to MAIN_STACK_SIZE in this app? Then the option would just get the default value from kernel/Kconfig which is 1024 for the boards that you care about.

Maybe there is but I don't know why they reduced the stack size in the first place. I mentioned in my previous message the author of the line which reduces the stack size, hopefully we can get some more information about this.

@kapi-no
Copy link
Collaborator

kapi-no commented Nov 29, 2024

I see no reason to affect the nrf5340 for this since it seems to work with a reduced stack.

"Seems to work" could mean that it's very close to the limit or even slightly over it and you just get lucky that it doesn't trigger a stack overflow/panic. I'd advise to at least do some profiling of exactly how much is consumed there. If there's any chance that other platforms (even non-Nordic ones) using hci_ipc might be close to the limits, I think it's safer to update prj.conf than to do this only for one single board.

With "seems to work" I meant that there are CI plans which run this sample and pass. I don't work closely with bluetooth, maybe the original author of this stack size can give us a comment here. @kapi-no can you give your opinion about increasing the stack size globally here?

The reason that I made this change is that I enabled in our SDK the IPC communication for this device and I realized that this image cannot run anymore because of stack overflow. So I am afraid that I don't have a very detailed description why this stack increase is needed. @Thalley could you be more specific on what type of explanation do you need for increasing the stack?

I think this stack setting was selected to optimize RAM usage as it was a bit hard to fit the default configuration in the nrf5340dk/nrf5340/cpunet target with the maximum Bluetooth connection setting (CONFIG_BT_MAX_CONN=16) and the maximum size configuration of Bluetooth buffers. Also, the nrf5340dk/nrf5340/cpunet target was the only supported target back then. Please note that I may not recall all the details and nuances as I added the sample and its default configuration (prj.conf) in 2019.

@Thalley
Copy link
Collaborator

Thalley commented Nov 29, 2024

could you be more specific on what type of explanation do you need for increasing the stack?

Basically, why does the nRF54H20 need more stack than e.g. the nRF5340?
And why 1024? Why not e.g. 768?

@kapi-no
Copy link
Collaborator

kapi-no commented Nov 29, 2024

I am okay with the stack size increase as long as the Zephyr CI and NCS CI do not expose any RAM overflow issues.

@Thalley
Copy link
Collaborator

Thalley commented Nov 29, 2024

I am okay with the stack size increase as long as the Zephyr CI and NCS CI do not expose any RAM overflow issues.

CI may not catch all RAM overflows: #81044

@Vge0rge
Copy link
Collaborator Author

Vge0rge commented Nov 29, 2024

could you be more specific on what type of explanation do you need for increasing the stack?

Basically, why does the nRF54H20 need more stack than e.g. the nRF5340? And why 1024? Why not e.g. 768?

Unfortunately I cannot provide these details myself. I am not aware of stack profiling reports for the IPC mechanisms and it is not my domain as well. I am a user of IPC, I am not developing/testing it. So I chose 1024 because it the default in Zephyr and I saw in practice that it passes CI tests.

The reason that this got noticed now is that in our sdk its the first time that IPC is enabled by default and thus some tests failed because of this.

@Thalley
Copy link
Collaborator

Thalley commented Nov 29, 2024

The reason that this got noticed now is that in our sdk its the first time that IPC is enabled by default and thus some tests failed because of this.

But it isn't failing on the nRF5340?

I think it's worth figuring out why, and if there's something that requires additional stack space on the nRF54H20 that shouldn't do that, or if the nRF5340 is also at or near the limit

@Vge0rge
Copy link
Collaborator Author

Vge0rge commented Dec 2, 2024

The reason that this got noticed now is that in our sdk its the first time that IPC is enabled by default and thus some tests failed because of this.

But it isn't failing on the nRF5340?

I think it's worth figuring out why, and if there's something that requires additional stack space on the nRF54H20 that shouldn't do that, or if the nRF5340 is also at or near the limit

Since there is CI testing this for nRF5340 I (hope that I) can safely assume that indeed it doesn't fail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Bluetooth HCI Bluetooth HCI Driver area: Bluetooth area: Samples Samples size: XS A PR changing only a single line of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants