Cross core fixes for mtl-006-drop-stable #8194

serhiy-katsyuba-intel · 2023-09-13T14:00:41Z

No description provided.

src/ipc/ipc4/helper.c

marcinszkudlinski · 2023-09-14T09:18:33Z

src/include/sof/audio/audio_stream.h

+	 * For cross-core connection such problems result in weird bugs: corrupted
+	 * memory due to wrong invalidate size.
+	 *
+	 * Note, this workaround will not work for 8 bit audio!


WA is good for now - soon comp_buffers won't be used in cross core connections at all

marcinszkudlinski · 2023-09-14T09:19:52Z

src/audio/mixin_mixout/mixin_mixout.c

+		/* Just ignore full sinks and copy data to other connected mixouts */
+		full_sinks[i] = (frames == 0);
+		if (frames == 0) {
+///if (sink_c->sink && sink_c->sink->state == COMP_STATE_ACTIVE)


style, pls fix even in 006

lyakh

I think as it stands we have a few locations, e.g. in pipeline scheduling core, that aren't cross-core safe. You can find multiple "FIXME" and other comments around the code... I wouldn't be surprised if attempts to run cross-core with these changes result in crashes.

lyakh · 2023-09-14T07:50:36Z

src/audio/mixin_mixout/mixin_mixout.c

@@ -46,24 +46,6 @@ DECLARE_TR_CTX(mixout_tr, SOF_UUID(mixout_uuid), LOG_LEVEL_INFO);
 #define MIXIN_MAX_SINKS IPC4_MIXIN_MODULE_MAX_OUTPUT_QUEUES
 #define MIXOUT_MAX_SOURCES IPC4_MIXOUT_MODULE_MAX_INPUT_QUEUES



to the commit description: does this mean, that we now always would have a buffer between a mixin and a mixout? Only to support those probably rather rare cases where we have to connect cross-cores? I think it would be a pity to lose that optimisation. No way to have both? Maybe even a separate component, but hopefully that isn't needed. But I'd make a clear separation between the two cases - maybe add a new file for that.

lyakh · 2023-09-14T07:57:51Z

src/audio/mixin_mixout/mixin_mixout.c

 	 */
+	avail_frames = input_buffers[0].size;	/// audio_stream_get_avail_frames(input_buffers[0].data);


comment style, and that's probably not a comment, you wanted to have upstream?

lyakh · 2023-09-14T08:01:17Z

src/audio/mixin_mixout/mixin_mixout.c

+		full_sinks[i] = (frames == 0);
+		if (frames == 0) {
+///if (sink_c->sink && sink_c->sink->state == COMP_STATE_ACTIVE)
+///	return 0;


lyakh · 2023-09-14T10:49:07Z

src/ipc/ipc4/helper.c

@@ -433,13 +429,12 @@ int ipc_comp_connect(struct ipc *ipc, ipc_pipe_comp_connect *_connect)
 		source->direction_set = true;
 	}

-	irq_local_enable(flags);
-


you wrote that it cannot be preempted - how about the scheduler? Cannot a timer interrupt preempt it and then run the LL-scheduler and access these buffers in an inconsistent state? Doesn't seem safe to me

agree with @lyakh
@lyakh problem is that we cannot proceed IDC when interrupts are locked, but the bigger problem is that we DO need to have connected LL pipelines this week on branch 006, there's no time to do it better in the way we discussed before

@serhiy-katsyuba-intel maybe we can try to locate all modules structures in shared memory and execute bind on primary core? Just a suggestion without deep analysis

you wrote that it cannot be preempted - how about the scheduler? Cannot a timer interrupt preempt it and then run the LL-scheduler and access these buffers in an inconsistent state? Doesn't seem safe to me

I assume driver should send bind IPC only when pipelines (with binding components) are not running. If this assumption is true -- bind operation can only be preempted to run some other pipelines but not the pipelines containing binding components.

Second assumption, is that IPC has no bugs (hope this does not sound too naive :) ). So while bind IPC is processing, IPC register has BUSY flag set indicating to driver FW is not ready to receive next IPC. After bind IPC processing finished, FW sends IPC reply that clears BUSY flag and only after that driver could send another IPC, like start the pipelines.

If these two assumptions are correct, when we a safe.

The problem with irq_local_disable() (actually, the name is bad -- it disables interrupts globally on all cores) is that after interrupts are disabled IDC stops working, so thing like process_on_core(..., BLOCKING) or comp_bind_remote() will dead lock.

@serhiy-katsyuba-intel maybe we can try to locate all modules structures in shared memory and execute bind on primary core? Just a suggestion without deep analysis.

No, I guess component expects that all its ops are called from same core. It might allocate some data in cached memory in .create() and expects it can always access it without any multi-core tricks.

@serhiy-katsyuba-intel so you suggest that the fix:

prevent the IPC task getting preempted which could result in buffers being only half connected when a pipeline task gets executed.

may be actually caused by driver or IPC driver problem - races that lead to binding modules on running pipeline?
the issue in question: #6653

I checked that the pipeline state is not verified before bind operation, if the above is true it is worth to reject bind when the pipeline is in incorrect state

I checked that the pipeline state is not verified before bind operation, if the above is true it is worth to reject bind when the pipeline is in incorrect state

Yes, that might be the good option.

Another option is to block preempting without blocking interrupts. Zephyr might have some API for this, if, not, than it could be done on LL scheduler level -- there are zephyr_ll_lock()/unlock(), however, they are implemented now just by calling irq_local_disable()/enable() :(

I assume driver should send bind IPC only when pipelines (with binding components) are not running.

@serhiy-katsyuba-intel @marcinszkudlinski the problem is with forked streams: when a stream is running and then you bind or unbind another stream to or from it.

@lyakh, added k_sched_lock() in a separate PR #8205 to guard bind/unbind.

However, if bind allowed to be called on running pipeline then we have more problems: .params() and .prepare() will not be called after bind for component on a running pipeline. Buffer stream params (like format, channel number) will be not initialized if capture pipeline is binded to some other running pipeline -- comp_verify_params() only initializes source buffers on capture pipeline.
Skipped .prepare() call may also create problems, e.g. mixin and mixout use .prepare() to select proper gain and mix functions depending of format.
So we need to address those problems if we allow bind to be called for running pipeline.

Heavily reimplemented version of mixin and mixout to support mixin on one core to be connected to mixout(s) of another core. This version adds buffer between mixin and mixout and so is slightly slower then previous version. Signed-off-by: Serhiy Katsyuba <[email protected]>

There are multiple problems with audio_stream_avail_frames_aligned() implementation: (1) default alignment initialization in audio_stream_init() does not work as frame format is set later, (2) if alignment was not explicitly initialized, audio_stream_avail_frames_aligned() returns wrong (higher) number of frames, (3) audio_stream_avail_frames_aligned() returns wrong result for formats with frame size not equal to power of 2, e.g. for streams with 3, 5 and 7 channels. For cross-core connection such problems result in weird bugs: corrupted memory due to wrong invalidate size. Signed-off-by: Serhiy Katsyuba <[email protected]>

Bind for components located on different cores is processed on core 0, so buffer is created on core 0, not on its source component core. Signed-off-by: Serhiy Katsyuba <[email protected]>

comp_bind() may call comp_bind_remote() which uses IDC and so interrupts should be enabled, otherwise dead lock will happen. Same for comp_unbind(). Those irq_local_disable() and irq_local_enable() are actually global. irq_local_disable()/irq_local_enable() are not really needed as bind/unbind cannot be preempted anyway because of IPC synchronisation: only after bind/unbind IPC reply next IPC (e.g. pipeline start) could come. Signed-off-by: Serhiy Katsyuba <[email protected]>

serhiy-katsyuba-intel requested review from lgirdwood, plbossart, mmaka1, lbetlej and dbaluta as code owners September 13, 2023 14:00

serhiy-katsyuba-intel requested review from abonislawski and marcinszkudlinski September 13, 2023 14:02

marcinszkudlinski reviewed Sep 14, 2023

View reviewed changes

lyakh reviewed Sep 14, 2023

View reviewed changes

serhiy-katsyuba-intel force-pushed the cross-core-fixes-006 branch 3 times, most recently from ee9feab to 6a9018e Compare September 14, 2023 12:50

serhiy-katsyuba-intel added 5 commits September 14, 2023 15:50

ipc4: Fix for cross-core buffer creation

320eb6b

Bind for components located on different cores is processed on core 0, so buffer is created on core 0, not on its source component core. Signed-off-by: Serhiy Katsyuba <[email protected]>

serhiy-katsyuba-intel force-pushed the cross-core-fixes-006 branch from 6a9018e to bdfec1d Compare September 14, 2023 13:52

marcinszkudlinski merged commit 413ceaa into thesofproject:mtl-006-drop-stable Sep 14, 2023
25 of 41 checks passed

keqiaozhang mentioned this pull request Sep 19, 2023

[BUG][mtl-006] I/O error when running aplay on speaker pipeline #8224

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cross core fixes for mtl-006-drop-stable #8194

Cross core fixes for mtl-006-drop-stable #8194

serhiy-katsyuba-intel commented Sep 13, 2023

marcinszkudlinski Sep 14, 2023

marcinszkudlinski Sep 14, 2023

lyakh left a comment

lyakh Sep 14, 2023

lyakh Sep 14, 2023

lyakh Sep 14, 2023

lyakh Sep 14, 2023

marcinszkudlinski Sep 14, 2023 •

edited

Loading

serhiy-katsyuba-intel Sep 14, 2023

marcinszkudlinski Sep 14, 2023

serhiy-katsyuba-intel Sep 14, 2023

lyakh Sep 15, 2023

serhiy-katsyuba-intel Sep 15, 2023

		@@ -46,24 +46,6 @@ DECLARE_TR_CTX(mixout_tr, SOF_UUID(mixout_uuid), LOG_LEVEL_INFO);
		#define MIXIN_MAX_SINKS IPC4_MIXIN_MODULE_MAX_OUTPUT_QUEUES
		#define MIXOUT_MAX_SOURCES IPC4_MIXOUT_MODULE_MAX_INPUT_QUEUES

		*/
		avail_frames = input_buffers[0].size; /// audio_stream_get_avail_frames(input_buffers[0].data);

Cross core fixes for mtl-006-drop-stable #8194

Cross core fixes for mtl-006-drop-stable #8194

Conversation

serhiy-katsyuba-intel commented Sep 13, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lyakh left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

marcinszkudlinski Sep 14, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

marcinszkudlinski Sep 14, 2023 •

edited

Loading