-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ASoC: SOF: debug: Introduce debug option to test DSP ops #4726
Conversation
Introduce a new debug option to allow testing DSP operations such a boot firware, set power state etc. Add new new kconfig to allow this option for developers. For now only 2 ops are supported for booting a new firmware and setting the DSP power state to D3. To allow passing a new firmware file, modify the signature of the load_firmware op to take the firmware file name as an argument. When the kconfig for DSP op testing is enabled, runtime PM is disabled to allow the DSP state to be solely managed from the userspace and no audio card is registered. Signed-off-by: Ranjani Sridharan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not following the general direction if I am honest.
help | ||
This option is used for testing DSP operations such as load/unload FW, | ||
set power state, enable/disable core etc. from userspace. | ||
Say Y if you want to retain DSP context for FW exceptions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that line looks like a copy-paste from above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would mention that this will disable the audio support completely
@@ -472,6 +472,7 @@ static int sof_probe_continue(struct snd_sof_dev *sdev) | |||
/* hereafter all FW boot flows are for PM reasons */ | |||
sdev->first_boot = false; | |||
|
|||
#if !IS_ENABLED(CONFIG_SND_SOC_SOF_DEBUG_DSP_OPS_TEST) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use if (!IS_ENABLED) instead of #if
|
||
/* | ||
* test firmware boot by passing the firmware file as the argument: | ||
* ex: echo boot_firmware,intel/avs/tgl/community/dsp_basefw.bin > dsp_test_op |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why combine the two parts?
echo intel/avs/tgl/community/dsp_basefw.bin > filename
echo 1 > boot
return ret; | ||
|
||
/* resume DMA trace */ | ||
return sof_fw_trace_resume(sdev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like code duplication, I don't follow what it's supposed to do
Do we really want to do this or run something like the python stuff on top of a simplified SOF driver.
We have to be really careful not to open a pandora box of new stuff that we'll have to maintain forever twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart the idea is to be able to run the python scripts on Linux with minimal effort by using the debugfs option. This is in essence a simplified SOF driver isnt it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is this basically an alternative to https://github.com/thesofproject/sof-diagnostic-driver that don't have to load/unload?
Isn't the sof-diagnostic-driver approach more generic/flexible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the question is really "what is the partitioning between scripts and driver"? Do we want to reuse snd_sof_run_firmware and hard-code it in a sequence used by a script, or do we want the scripts to do something that runs the firmware?
If you really want to use scripts, then the driver part should be rather minimal and probably a pass-through to access registers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but @plbossart if we want to write scripts to do all the register reads/writes, it would essentially be rewriting the SOF driver. Isn't this a saner approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but @plbossart if we want to write scripts to do all the register reads/writes, it would essentially be rewriting the SOF driver. Isn't this a saner approach?
the question is whether
a) you want to reuse the existing Python scripts used by FW validation or
b) you want to develop NEW python scripts based on a TBD API
It's not clear from this PR what the direction is.
a) is complicated because I don't know what the interface requirements are, but
b) doesn't seem very appealing to me since it'd be a 3rd way of interacting with the DSP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-hb this isnt a replacement for the diagnostic driver. This is a way to enable us to run the python scripts for FW testing on Linux with minimal effort.
What other purpose does the sof-diagnostic-driver have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranj063, it depends which python script we are talking about... One set I have seen is implementing IPC message crafting, sending (yep, with register poking and then polling for completion).
It might be hard to get it to download the firmware because of the DMA configuring, that is true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart @ujfalusi @andyross @marc-hb this PR is NOT about IPC ABI validation but a feature to support debug and compliance/stress testing of kernel + SOF + Zephyr functional IP programming flows with userspace scripting (not userspace IP programming).
We need to test the kernel functional IP code alongside the associated Zephyr IP code together (with the same timing and sequencing) which is not possible using Python to flip bits via mmap(). We already have a Python mmap() CI tool today that verifies IPC ABI compliance only. i.e. not useful to certify IP programming compliance.
Yes we can do a lot of test with aplay/amixer, but these tools are end user applications not designed to strongly validate/stress the kernel + SOF + Zephyr combination.
e.g. we need to be able to stress (yes, low hanging fruit will come first)
- D3 -> D0 and vice versa
- DSP core off/on
- S0ix entry/exit
- FW boot
- IMR save/restore
- topology load/unload
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will assert that you can do all of the above with sof-test. Just run thousands of cycles of suspend-resume, module load/unload, etc.
int snd_sof_load_firmware_raw(struct snd_sof_dev *sdev); | ||
int snd_sof_load_firmware_memcpy(struct snd_sof_dev *sdev); | ||
int snd_sof_load_firmware_raw(struct snd_sof_dev *sdev, const char *fw_filename); | ||
int snd_sof_load_firmware_memcpy(struct snd_sof_dev *sdev, const char *fw_filename); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this go to a separate patch?
https://github.com/zephyrproject-rtos/zephyr/ already has a pure-Python driver from @andyross for testing purposes: zephyrproject-rtos/zephyr@9073db4
This was the first, very old commit, newer version is now here: |
@marc-hb To be fair: cavstool isn't PM-aware, so you generally have to unload the kernel driver in order to use it. But it's been very successful as a paradigm. I even banged out a broad equivalent for bringing up Zephyr on the mt8195 DSP (which has a MUCH simpler host control interface, again to be fair). I don't personally see any reason not to have this in the kernel, it seems like an obvious thing for the driver to provide. I will say though that I'd prefer to see it start from the perspective of "loading arbitrary firmware to the device and getting output" instead of just assuming everything is an audio blob that handles ALSA streams. The SOF unit testing story for DSP code has always been kinda weak, and Zephyr has had to pick up a lot of that, historically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranj063, imho if this test mode is activated then we should not boot the 'default' firmware on module load but wait for the firmware path to be loaded and the instruction to boot the DSP with it.
If this mode is selected we should be printing a warning that only firmware testing is allowed via some tbd interface and the kconfig should be more 'alarming' to deter users from selecting it. Currently it is 'why not, it should not do much harm' category.
Note: We sort of have an interface to do simple things via the message injector with the default firmware, we need to prevent rpm to keep the DSP up, but we can send arbitrary messages all while we have the sound card..
help | ||
This option is used for testing DSP operations such as load/unload FW, | ||
set power state, enable/disable core etc. from userspace. | ||
Say Y if you want to retain DSP context for FW exceptions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would mention that this will disable the audio support completely
return ret; | ||
|
||
/* resume DMA trace */ | ||
return sof_fw_trace_resume(sdev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranj063, it depends which python script we are talking about... One set I have seen is implementing IPC message crafting, sending (yep, with register poking and then polling for completion).
It might be hard to get it to download the firmware because of the DMA configuring, that is true.
} | ||
|
||
/* ops are executed as "op_name,argument1,argument2...". For example, to set the DSP power state | ||
* to D3: echo "load_firmware,<PATH>/sof-tgl.ri" > dsp_test_op" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not separate files for the firmware path, set_power_state, etc?
How one would guess the valid op_name
s and what they are set at the moment?
dfse->type = SOF_DFSENTRY_TYPE_BUF; | ||
dfse->sdev = sdev; | ||
|
||
debugfs_create_file("dsp_test_op", 0222, sdev->debugfs_root, dfse, &sof_dsp_ops_tester_fops); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dsp_test
or rather separate files under dsp_test/
directory?
plat_data->fw_filename); | ||
if (!fw_filename) | ||
return -ENOMEM; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, freeing up the passed fw_filename
is OK?
@@ -157,7 +159,7 @@ static size_t sof_ipc4_fw_parse_basefw_ext_man(struct snd_sof_dev *sdev) | |||
if (ret) | |||
return ret; | |||
} | |||
|
|||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we skipping this part?
@@ -14,22 +14,23 @@ | |||
#include "sof-priv.h" | |||
#include "ops.h" | |||
|
|||
int snd_sof_load_firmware_raw(struct snd_sof_dev *sdev) | |||
int snd_sof_load_firmware_raw(struct snd_sof_dev *sdev, const char *fw_filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should not neglect the snd_sof_load_firmware_memcpy()
@@ -266,6 +266,14 @@ config SND_SOC_SOF_DEBUG_RETAIN_DSP_CONTEXT | |||
Say Y if you want to retain DSP context for FW exceptions. | |||
If unsure, select "N". | |||
|
|||
config SND_SOC_SOF_DEBUG_DSP_OPS_TEST | |||
bool "SOF enable DSP ops testing" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does constitute a "DSP ops"? Isn't this something like "DSP test mode"? Should we have a sof_debug
flag for this to be enabled/disabled instead of re-building the kernel?
@@ -488,6 +489,7 @@ static int sof_probe_continue(struct snd_sof_dev *sdev) | |||
"error: failed to register machine driver %d\n", ret); | |||
goto fw_trace_err; | |||
} | |||
#endif | |||
|
|||
ret = sof_register_clients(sdev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we loading clients in DSP test mode? They might interfere with the testing, no?
fw_trace_err: | ||
#endif | ||
sof_fw_trace_free(sdev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also enable ftrace? That will send messages, again, might interfere with testing?
The ability to make the DSP panic after some time parameter would also be useful to test recovery features like these: |
The --skip-to-first-trace option was added as a precaution to exclude log pollution from before the start of a test but how much pollution it excludes was never assessed; we just don't know. Hopefully not much? Replace, see below. mtrace timestamps used to return to zero on every DSP boot, which --skip-to-first-trace depends on. After SOF commit 76b0979c6453 ("sync time between host and fw for logging"), mtrace timestamps still return to zero (briefly) but only on "cold" DSP boots, not on when booting from IMR. Booting from IMR has now been enabled by default in most configurations. That combination breaks --skip-to-first-trace. The only practical short-term solution is to turn off --skip-to-first-trace and take the risk of old logs polluting the test results, which is what this commit does. If results pollution proves to be a problem, we could in the longer term try to disable IMR boot just before running this test using sof_debug as discussed in thesofproject#1153 This would also measure from a possibly cleaner DSP state. New, DSP "debug ops" being proposed in thesofproject/linux#4726 might also help. Fixes: thesofproject#1153 Signed-off-by: Baofeng Tian <[email protected]> Signed-off-by: Chao Song <[email protected]>
The --skip-to-first-trace option was added as a precaution to exclude log pollution from before the start of a test, normally the residue traces are related to trigger stop IPC, dma put and component reset (as you can see). There could be 10 ~ 20 lines. mtrace timestamps used to return to zero on every DSP boot, which --skip-to-first-trace depends on. After SOF commit 76b0979c6453 ("sync time between host and fw for logging"), mtrace timestamps still return to zero (briefly) but only on "cold" DSP boots, not on when booting from IMR. Booting from IMR has now been enabled by default in most configurations. That combination breaks --skip-to-first-trace. The only practical short-term solution is to turn off --skip-to-first-trace and take the risk of old logs polluting the test results, which is what this commit does. If results pollution proves to be a problem, we could in the longer term try to disable IMR boot just before running this test using sof_debug as discussed in thesofproject#1153 This would also measure from a possibly cleaner DSP state. New, DSP "debug ops" being proposed in thesofproject/linux#4726 might also help. Fixes: thesofproject#1153 Signed-off-by: Baofeng Tian <[email protected]> Signed-off-by: Chao Song <[email protected]>
It is not a scope of this PR. |
Understood but would it be a useful stepping stone? Looks like one. |
To be honest I would rather not do such a thing. |
The --skip-to-first-trace option was added as a precaution to exclude log pollution from before the start of a test, normally the residue traces are related to trigger stop IPC, dma put and component reset (as you can see). There could be 10 ~ 20 lines. mtrace timestamps used to return to zero on every DSP boot, which --skip-to-first-trace depends on. After SOF commit 76b0979c6453 ("sync time between host and fw for logging"), mtrace timestamps still return to zero (briefly) but only on "cold" DSP boots, not on when booting from IMR. Booting from IMR has now been enabled by default in most configurations. That combination breaks --skip-to-first-trace. The only practical short-term solution is to turn off --skip-to-first-trace and take the risk of old logs polluting the test results, which is what this commit does. If results pollution proves to be a problem, we could in the longer term try to disable IMR boot just before running this test using sof_debug as discussed in #1153 This would also measure from a possibly cleaner DSP state. New, DSP "debug ops" being proposed in thesofproject/linux#4726 might also help. Fixes: #1153 Signed-off-by: Baofeng Tian <[email protected]> Signed-off-by: Chao Song <[email protected]>
I'll close this for now since it's not an active development. We can always re-open this later. |
Introduce a new debug option to allow testing DSP operations such a boot firware, set power state etc. Add new new kconfig to allow this option for developers. For now only 2 ops are supported for booting a new firmware and setting the DSP power state to D3.
To allow passing a new firmware file, modify the signature of the load_firmware op to take the firmware file name as an argument. When the kconfig for DSP op testing is enabled, runtime PM is disabled to allow the DSP state to be solely managed from the userspace and no audio card is registered.