-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mailbox: use uncached address to write sw regs #8398
Conversation
please check the following log. core1 and core2 allocate the same slot offset and have the different content of mailbox (the bold lines should be same). The time stamp show that there is no competition risk. [ 0.449035] ipc: dai_init_llp_info: comp:10 0xe0004 +++ alloc slot 77c, now 19 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcinszkudlinski @lyakh please take a look, we'd have to understand why this PR has an impact.
offset), bytes); | ||
ptr_c = (uint32_t __sparse_cache *)(MAILBOX_SW_REG_BASE + offset); | ||
ptr = cache_to_uncache(ptr_c); | ||
memcpy_s(ptr, MAILBOX_SW_REG_SIZE - offset, src, bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite worrying finding, how come this has impact? Is the register read via cached address by other cores? This still doesn't explain why this PR would have any impact. We should reallly understand why dcache_writeback_region() is not working, we rely on this in many places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure if this is true - but I think that data inside the sw_mailbox may be modified by the Linux driver. dcache writeback may destroy changes if this is the case.
Cache access to the mailbox doesn't make sense anyway.
and
mailbox_sw_regs_write is used by dai to update llp node. In multi-core test, the updated llp node is invisable to other cores although dcache_writeback_region is used
dcache writeback is not enough, dcache_invalidate is needed on the reader core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I am also very surprised. The reading method is mailbox_sw_reg_read, so it should be ok. This change use the same method for mailbox_sw_reg_read.
And there is a PR to switch to uncached memory by who ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linux driver doesn't access llp in mailbox now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dcache writeback is not enough, dcache_invalidate is needed on the reader core.
This would be a logical explanation, yes. Can we add such an invalidation on the reader side? But in principle - yes, a cached write plus a writeback shouldn't be faster than just a plain simple uncached write. We'd only benefit from cached access there if the same core were accessing that data multiple times without dropping caches - reading or writing (without writing back) data there. But I doubt that we ever do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reg reads seem to use uncached addresses, so these seem correct. Nevertheless, i don't think the writeback makes given the range of addresses may not be cache aligned, so best to change the register write routines to use uncached addresses.
@kv2019i what kind of registers are those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lyakh these are "sw regs", so I guess it can be whatever sw/fw agree upon. Do you think the writeback is safe here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lyakh these are "sw regs", so I guess it can be whatever sw/fw agree upon. Do you think the writeback is safe here?
@kv2019i my understanding of "software registers" is that they're just generic locations in RAM used purely by the software without any direct hardware effects? If that's correct, then at the physical level they should also be handled like ordinary RAM, i.e. all cache management applies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughs here are
- cache ops here are not being aligned or forced on cache line boundaries or sizes.
- No obvious locking, so can race (and corrupt with 1 above).
- The uncache alias is probably fixing this issue as the HW IO arbitration is probably enforcing serialised IO on the memory for all cores (i.e. the HW bus arbitration is a HW spinlock).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3. The
My thoughs here are
1. cache ops here are not being aligned or forced on cache line boundaries or sizes.
sw_regs_write may be no cache aligned since the llp slot size is 20 bytes
2. No obvious locking, so can race (and corrupt with 1 above).
we have lock to protect sw_regs_write
3. The uncache alias is probably fixing this issue as the HW IO arbitration is probably enforcing serialised IO on the memory for all cores (i.e. the HW bus arbitration is a HW spinlock).
llp code in dai. Two cores find the same slot is unused, and they check the slot status at different time (time interval is about 20ms, so no competition risk) key = k_spin_lock(&sof_get()->fw_reg_lock);
/* find unused llp slot offset with node_id of zero */
for (i = 0; i < max_slot; i++, offset += sizeof(slot)) {
uint32_t node_id;
node_id = mailbox_sw_reg_read(offset);
if (!node_id)
break;
}
if (i >= max_slot) {
comp_err(dev, "can't find free slot");
k_spin_unlock(&sof_get()->fw_reg_lock, key);
return -EINVAL;
}
memset_s(&slot, sizeof(slot), 0, sizeof(slot));
slot.node_id = node->dw & IPC4_NODE_ID_MASK;
mailbox_sw_regs_write(offset, &slot, sizeof(slot));
k_spin_unlock(&sof_get()->fw_reg_lock, key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RanderWang can you remove the volatile and use the uncache alias only - does this help ?
offset), bytes); | ||
ptr_c = (uint32_t __sparse_cache *)(MAILBOX_SW_REG_BASE + offset); | ||
ptr = cache_to_uncache(ptr_c); | ||
memcpy_s(ptr, MAILBOX_SW_REG_SIZE - offset, src, bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughs here are
- cache ops here are not being aligned or forced on cache line boundaries or sizes.
- No obvious locking, so can race (and corrupt with 1 above).
- The uncache alias is probably fixing this issue as the HW IO arbitration is probably enforcing serialised IO on the memory for all cores (i.e. the HW bus arbitration is a HW spinlock).
mailbox_sw_regs_write is used by dai to update llp node. In multi-core test, the updated llp node is invisable to other cores although dcache_writeback_region is used on both cavs & ace platforms. Now use uncached address to write memory directly which is aligned with other mailbox access functions. Validated on cavs & ace platforms. Signed-off-by: Rander Wang <[email protected]>
2a93022
to
99ebb76
Compare
removed volatile , thanks! Tested on TGL & MTL |
It may be could improve thesofproject/linux#4832 since this PR also fixes dai module initialization failure. |
mailbox_sw_regs_write is used by dai to update llp node. In multi-core test, the updated llp node is invisable to other cores although dcache_writeback_region is used on both cavs & ace platforms. Now use uncached address to write memory directly which is aligned with other mailbox access functions.
Validated on cavs & ace platforms.
multi-core topology: #8240