NET_TC_THREAD_PREEMPTIVE=y results in race conditions in LwM2M engine #78989

Closed
dsitelew-gcx opened this issue Sep 25, 2024 · 4 comments · Fixed by #79847
Labels: area: LWM2M, area: Networking, bug, priority: low


@dsitelew-gcx
Contributor

Describe the bug

Enabling the preemptive TX/RX threads introduces race conditions in the LwM2M engine.

Looking at the socket_loop implementation, there is almost no synchronisation. For example, scheduling a message with lwm2m_send_cb results in a crash due to a race condition:

[01:00:52.833,679] <dbg> net_lwm2m_message_handling: reply 0x20019d50 handled and removed
[01:00:52.833,831] <err> os: ***** SECURE FAULT *****
[01:00:52.833,831] <err> os:   Address: 0x10
[01:00:52.833,862] <err> os:   Attribution unit violation
[01:00:52.833,862] <err> os: r0/a1:  0x00000000  r1/a2:  0x00000000  r2/a3:  0x2001d764
[01:00:52.833,892] <err> os: r3/a4:  0x00000000 r12/ip:  0x00004000 r14/lr:  0x00015e35
[01:00:52.833,892] <err> os:  xpsr:  0x21000000
[01:00:52.833,923] <err> os: Faulting instruction address (r15/pc): 0x00044c98
[01:00:52.833,953] <err> os: >>> ZEPHYR FATAL ERROR 41: Unknown error on CPU 0
[01:00:52.833,984] <err> os: Current thread: 0x20013448 (lwm2m-sock-recv)
[01:00:52.845,520] <err> fatal_error: Resetting system

In this case it's a NULL-pointer dereference (fault address 0x10, with r0 = 0x00000000 in the dump above).

To Reproduce

Sorry, no minimal reproducible code example.
I think setting NET_TC_THREAD_PREEMPTIVE=y and sending a lot of messages should suffice.

Expected behavior

  • LwM2M-engine should be thread-safe.
  • Alternatively it should not allow turning the preemptive multithreading on.

Sorry if this is the wrong place to post this, I just wanted to warn others of a potential problem.

The configuration flag is of course marked as experimental, so there is no guarantee that everything works with it. But since it is clear that LwM2M does not work with this flag, I think turning it on should result in a build error until the LwM2M engine is made thread-safe.
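
The build-error idea can be sketched as a compile-time check. This is only an illustration, not what Zephyr does: CONFIG_LWM2M and CONFIG_NET_TC_THREAD_PREEMPTIVE are the real Kconfig symbols as they would appear in the generated configuration header, while the helper macros and the function name are hypothetical. In-tree, a Kconfig dependency would probably be the more natural place for such a guard.

```c
#include <assert.h> /* static_assert (C11) */

/*
 * Sketch only: refuse to build when the LwM2M engine and preemptive
 * network traffic-class threads are both enabled.
 * The LWM2M_* / NET_TC_* helper macros below are hypothetical names.
 */
#ifdef CONFIG_LWM2M
#define LWM2M_ENGINE_ENABLED 1
#else
#define LWM2M_ENGINE_ENABLED 0
#endif

#ifdef CONFIG_NET_TC_THREAD_PREEMPTIVE
#define NET_TC_PREEMPTIVE_ENABLED 1
#else
#define NET_TC_PREEMPTIVE_ENABLED 0
#endif

/* 1 when the unsupported combination is selected, 0 otherwise. */
#define LWM2M_PREEMPTIVE_CONFLICT \
	(LWM2M_ENGINE_ENABLED && NET_TC_PREEMPTIVE_ENABLED)

/* Fails the build instead of leaving the race to be found at run time. */
static_assert(!LWM2M_PREEMPTIVE_CONFLICT,
	      "LwM2M engine is not thread-safe with NET_TC_THREAD_PREEMPTIVE=y");

/* Hypothetical helper so the result can also be inspected at run time. */
int lwm2m_preemptive_conflict(void)
{
	return LWM2M_PREEMPTIVE_CONFLICT;
}
```

Compiling this file with both symbols defined (for example with -DCONFIG_LWM2M=1 -DCONFIG_NET_TC_THREAD_PREEMPTIVE=1) stops at the static_assert; with either one missing it compiles cleanly.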

@dsitelew-gcx dsitelew-gcx added the bug The issue is a bug, or the PR is fixing a bug label Sep 25, 2024
@dkalowsk
Contributor

dkalowsk commented Oct 1, 2024

@dsitelew-gcx which platform is this on?

@dkalowsk dkalowsk added the priority: low Low impact/importance bug label Oct 1, 2024
@dsitelew-gcx
Contributor Author

@dkalowsk sorry, forgot to mention it, it's an nRF9160 (Arm Cortex-M33).

Honestly, looking at the code, I thought it didn't matter what platform it was.

@rlubos
Contributor

rlubos commented Oct 15, 2024

I've reproduced the crash. The culprit turned out to be a preempted memset() during message deallocation. It should be fixed with #79847: with those fixes in place I was able to flood the server with LwM2M send messages without hitting the crash again (initially it crashed after a few seconds).
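
To illustrate the kind of problem described above, here is a minimal host-side sketch, not the code from #79847. It assumes a small static message pool whose entries are cleared with memset() on release. If the clearing thread can be preempted partway through, another thread may see a half-cleared entry, so the sketch takes a lock around both allocation and release. All names (struct msg, msg_alloc, msg_free, msg_count_in_use) are hypothetical, and POSIX mutexes stand in for Zephyr kernel primitives such as k_mutex.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4

/* Hypothetical message type; the real LwM2M message struct differs. */
struct msg {
	bool in_use;
	void (*reply_cb)(struct msg *m);
	char payload[32];
};

static struct msg pool[POOL_SIZE];
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return a free, zeroed message, or NULL when the pool is exhausted. */
struct msg *msg_alloc(void)
{
	struct msg *m = NULL;

	pthread_mutex_lock(&pool_lock);
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			memset(&pool[i], 0, sizeof(pool[i]));
			pool[i].in_use = true;
			m = &pool[i];
			break;
		}
	}
	pthread_mutex_unlock(&pool_lock);

	return m;
}

/*
 * Release a message. The memset() clears every field, including in_use,
 * and runs under the lock so no other thread that takes the same lock
 * can observe the entry while it is only partly cleared.
 */
void msg_free(struct msg *m)
{
	if (m == NULL) {
		return;
	}

	pthread_mutex_lock(&pool_lock);
	memset(m, 0, sizeof(*m));
	pthread_mutex_unlock(&pool_lock);
}

/* Count entries currently marked as in use. */
size_t msg_count_in_use(void)
{
	size_t n = 0;

	pthread_mutex_lock(&pool_lock);
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (pool[i].in_use) {
			n++;
		}
	}
	pthread_mutex_unlock(&pool_lock);

	return n;
}
```

The lock only helps if every code path that reads or writes a pool entry takes it, including the receive thread that dispatches reply callbacks.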
