Ipc4 multicore issue fixes #8143
Very good catch @RanderWang, the SECONDARY_CORE checking indeed does not seem. One minor code style proposal inline, but otherwise looks good.
In the multicore case, an IPC message is dispatched from the primary core to a secondary core, which sends the reply message to the host. The primary core does nothing if IPC_TASK_SECONDARY_CORE is set. But in rare cases, the secondary core finishes the reply message and clears this flag before the IPC thread on the primary core checks it, so the primary core sends the reply message again. This results in the reply message being inserted twice into the IPC message list and in an infinite loop when the list is traversed. This patch checks the msg_reply state and does nothing if the reply is already prepared. We don't need to init the reply message since it is initialized after being deleted from the IPC list. Signed-off-by: Rander Wang <[email protected]>
Use list_is_empty to check whether the message is queued. The notify message is initialized to empty after being deleted from the IPC msg list. The same idea is used in ipc_msg_send. Signed-off-by: Rander Wang <[email protected]>
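The idea in this commit message can be sketched as follows. This is a minimal, self-contained model, not the SOF implementation: the list helpers are simplified stand-ins written for this example, and `msg_enqueue_once` and `demo_guarded_enqueue` are hypothetical names. It relies on the property stated above, that an item is re-initialized to an empty (self-linked) state when it is deleted from the list, so "empty" can double as "not queued".

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified intrusive circular list, written for this sketch only. */
struct list_item {
	struct list_item *next;
	struct list_item *prev;
};

static void list_init(struct list_item *item)
{
	item->next = item;
	item->prev = item;
}

/* A self-linked item is "empty": for a message, that means not queued. */
static bool list_is_empty(const struct list_item *item)
{
	return item->next == item;
}

static void list_item_append(struct list_item *item, struct list_item *list)
{
	struct list_item *tail = list->prev;

	tail->next = item;
	item->next = list;
	item->prev = tail;
	list->prev = item;
}

/* Unlink and re-initialize, so the item reads as "not queued" again. */
static void list_item_del(struct list_item *item)
{
	item->next->prev = item->prev;
	item->prev->next = item->next;
	list_init(item);
}

/* Hypothetical helper: queue the message only if it is not queued yet. */
static bool msg_enqueue_once(struct list_item *msg, struct list_item *queue)
{
	if (!list_is_empty(msg))
		return false;	/* already on a list, do nothing */

	list_item_append(msg, queue);
	return true;
}

/* Returns how many of the three enqueue attempts were accepted. */
static int demo_guarded_enqueue(void)
{
	struct list_item queue, notify;
	int accepted = 0;

	list_init(&queue);
	list_init(&notify);

	accepted += msg_enqueue_once(&notify, &queue);	/* accepted */
	accepted += msg_enqueue_once(&notify, &queue);	/* rejected: queued */
	list_item_del(&notify);				/* sent and removed */
	accepted += msg_enqueue_once(&notify, &queue);	/* accepted again */

	assert(!list_is_empty(&queue));
	return accepted;
}
```

In this model the second attempt is rejected while the message is still queued, and the message becomes eligible again only after it has been deleted from the list.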
Validated this PR on 3 different MTL RVPs; all passed a 3-hour multicore stress test.
Good catch!
Code is good, but there's an issue with the long inline comment.
Let's get this tested in the daily test tomorrow. @mwasko fyi.
BUG: Various IPC timed out with multi-core run simultaneously #7774.
This issue is very random and can only be reproduced after many test cycles. The direct cause is a self loop in the IPC message list, like: ipc msg -> M1 -> M2 -> ... -> Mn -> Mn. This results in an infinite loop in the logging function while the IPC spin lock is held. This is one major reason why the issue can't be reproduced with logging disabled.
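One way such a self loop can arise is sketched below. This is an illustrative model under stated assumptions, not the SOF code: the list helpers are simplified stand-ins written for this example, and `bounded_walk` and `demo_double_append` are hypothetical names. The sketch appends the same item twice, deletes it once, and then walks the list with a step limit so the demonstration itself terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified intrusive circular list, written for this sketch only. */
struct list_item {
	struct list_item *next;
	struct list_item *prev;
};

static void list_init(struct list_item *item)
{
	item->next = item;
	item->prev = item;
}

static void list_item_append(struct list_item *item, struct list_item *list)
{
	struct list_item *tail = list->prev;

	tail->next = item;
	item->next = list;
	item->prev = tail;
	list->prev = item;
}

static void list_item_del(struct list_item *item)
{
	item->next->prev = item->prev;
	item->prev->next = item->next;
	list_init(item);
}

/*
 * Walk forward from the head. Returns the number of items, or -1 if the
 * head is not reached again within `limit` steps (the list has a loop
 * that bypasses the head).
 */
static int bounded_walk(const struct list_item *head, int limit)
{
	const struct list_item *p = head->next;
	int n = 0;

	while (p != head) {
		if (++n > limit)
			return -1;
		p = p->next;
	}
	return n;
}

static int demo_double_append(void)
{
	struct list_item head, reply;

	list_init(&head);
	list_init(&reply);

	list_item_append(&reply, &head);	/* first insertion */
	list_item_append(&reply, &head);	/* same item inserted again */

	/* The second append corrupted the back link: reply points at itself. */
	assert(reply.prev == &reply);

	/* Deleting once leaves head.next on an item that is now self-linked. */
	list_item_del(&reply);

	return bounded_walk(&head, 16);
}
```

With these helpers the walk never returns to the head, which is the head -> Mn -> Mn shape described above. An unbounded traversal of such a list would spin forever.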
Why is a single IPC message added twice? In the multicore case, an IPC message is dispatched from the primary core to a secondary core, which sends the reply message to the host. The primary core does nothing if IPC_TASK_SECONDARY_CORE is set. But in rare cases, the secondary core finishes the reply message and clears this flag before the IPC thread on the primary core checks it, because the primary core was busy with the logging thread and checked the flag only after a while. The primary core then sends the reply message again. This results in the reply message being inserted twice into the IPC message list and, after a while, in an infinite loop when the list is traversed for logging.
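The interleaving described above can be replayed deterministically in a small single-threaded model. Everything here is an assumption made for illustration: `struct ipc_model`, its fields, the helper names, and the placeholder value of `IPC_TASK_SECONDARY_CORE` are not taken from the SOF sources. The model only counts how many times a reply would be queued when the primary core looks at the flag late.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit value, assumed for this sketch only. */
#define IPC_TASK_SECONDARY_CORE 0x4u

/* Hypothetical model of the shared IPC state. */
struct ipc_model {
	unsigned int task_mask;	/* pending-work flags */
	bool reply_prepared;	/* a reply has already been prepared */
	int replies_queued;	/* how many times a reply was queued */
};

static void queue_reply(struct ipc_model *ipc)
{
	ipc->reply_prepared = true;
	ipc->replies_queued++;
}

/* Primary core hands the message to a secondary core. */
static void primary_dispatch(struct ipc_model *ipc)
{
	ipc->task_mask |= IPC_TASK_SECONDARY_CORE;
}

/* Secondary core replies to the host, then clears the flag. */
static void secondary_complete(struct ipc_model *ipc)
{
	queue_reply(ipc);
	ipc->task_mask &= ~IPC_TASK_SECONDARY_CORE;
}

/* Primary core finishes the IPC; with the extra check it skips a
 * reply that has already been prepared.
 */
static void primary_finish(struct ipc_model *ipc, bool check_reply_state)
{
	if (ipc->task_mask & IPC_TASK_SECONDARY_CORE)
		return;	/* secondary core still owns the reply */

	if (check_reply_state && ipc->reply_prepared)
		return;	/* reply already done by the secondary core */

	queue_reply(ipc);
}

/* Replay the rare ordering: the primary core checks the flag only
 * after the secondary core has replied and cleared it.
 */
static int run_late_primary(bool check_reply_state)
{
	struct ipc_model ipc = { 0, false, 0 };

	primary_dispatch(&ipc);
	secondary_complete(&ipc);
	assert(ipc.reply_prepared);
	primary_finish(&ipc, check_reply_state);

	return ipc.replies_queued;
}
```

Checking only the flag yields two queued replies in this ordering. Checking the reply state as well yields one.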
We can find some hints in the kernel log before the IPC timeout.
For the case in bug:#7774
This PR checks whether the message has already been handled by the secondary core and, if so, does nothing on the primary core. It also simplifies the check for logging.