[FEATURE] driver should limit FW crash reports to single reports #2890

Closed
lgirdwood opened this issue May 5, 2021 · 6 comments
Labels: enhancement (New feature or request)

Comments

@lgirdwood
Member

Currently, if the FW crashes, we get multiple copies of the crash dump in dmesg (one dump per attempted IPC). The kernel should limit this to one dump until the next FW boot. Today dmesg has a poor SNR because it can overflow with the same message:

[   26.116905] sof-audio-pci 0000:00:0e.0: error: assertion failed
[   26.116911] sof-audio-pci 0000:00:0e.0: error: trace point 00004000
[   26.116917] sof-audio-pci 0000:00:0e.0: error: panic at :0
[   26.116923] sof-audio-pci 0000:00:0e.0: error: DSP Firmware Oops
[   26.116928] sof-audio-pci 0000:00:0e.0: error: Exception Cause: IllegalInstructionCause, Illegal instruction
[   26.116936] sof-audio-pci 0000:00:0e.0: EXCCAUSE 0x00000000 EXCVADDR 0x00000000 PS       0x00000000 SAR     0x00000000
[   26.116942] sof-audio-pci 0000:00:0e.0: EPC1     0x00000000 EPC2     0x00000000 EPC3     0x00000000 EPC4    0x00000000
[   26.116948] sof-audio-pci 0000:00:0e.0: EPC5     0x00000000 EPC6     0x00000000 EPC7     0x00000000 DEPC    0x00000000
[   26.116955] sof-audio-pci 0000:00:0e.0: EPS2     0x00000000 EPS3     0x00000000 EPS4     0x00000000 EPS5    0x00000000
[   26.116961] sof-audio-pci 0000:00:0e.0: EPS6     0x00000000 EPS7     0x00000000 INTENABL 0x00000000 INTERRU 0x00000000
[   26.116982] sof-audio-pci 0000:00:0e.0: stack dump from 0x00000000
[   26.117001] sof-audio-pci 0000:00:0e.0: 0x00000000: 00000000 00000000 00000000 00000000
[   26.117016] sof-audio-pci 0000:00:0e.0: 0x00000004: 00000000 00000000 00000000 00000000
[   26.117031] sof-audio-pci 0000:00:0e.0: 0x00000008: e4a6e200 2c20b772 2f118818 ffff9ce4
[   26.117062] sof-audio-pci 0000:00:0e.0: 0x0000000c: c0bb825d ffffffff c226f8e8 00000000
[   26.117070] sof-audio-pci 0000:00:0e.0: 0x00000010: c226f8d0 ffffbc6a e4a6e200 2c20b772
[   26.117077] sof-audio-pci 0000:00:0e.0: 0x00000014: c0baf1b7 ffffffff c226f9ec ffffbc6a
[   26.117085] sof-audio-pci 0000:00:0e.0: 0x00000018: 80010000 00000000 294f5e00 ffff9ce4
[   26.117092] sof-audio-pci 0000:00:0e.0: 0x0000001c: e4a6e200 2c20b772 2f118bd0 ffff9ce4
[   26.117103] sof-audio-pci 0000:00:0e.0: error: hda irq intsts 0x00000000 intlctl 0xc0000000 rirb 00
[   26.117109] sof-audio-pci 0000:00:0e.0: error: dsp irq ppsts 0x00000000 adspis 0x00000000
[   26.117118] sof-audio-pci 0000:00:0e.0: error: host status 0x00000000 dsp status 0x00000000 mask 0x00000003
[   26.117127] sof-audio-pci 0000:00:0e.0: error: failed to set dai config for iDisp3
[   26.117136] sof-audio-pci 0000:00:0e.0: ASoC: error at snd_soc_dai_hw_params on iDisp3 Pin: -110
[   26.117153]  iDisp3: ASoC: hw_params BE failed -110
[   26.117159]  HDMI3: ASoC: hw_params BE failed -110
[   26.628803] sof-audio-pci 0000:00:0e.0: error: ipc timed out for 0x80010000 size 216
[   26.628822] sof-audio-pci 0000:00:0e.0: status: fw entered - code 00000005
[   26.628870] sof-audio-pci 0000:00:0e.0: error: assertion failed
[   26.628876] sof-audio-pci 0000:00:0e.0: error: trace point 00004000
[   26.628882] sof-audio-pci 0000:00:0e.0: error: panic at :0
[   26.628888] sof-audio-pci 0000:00:0e.0: error: DSP Firmware Oops
[   26.628893] sof-audio-pci 0000:00:0e.0: error: Exception Cause: IllegalInstructionCause, Illegal instruction
[   26.628901] sof-audio-pci 0000:00:0e.0: EXCCAUSE 0x00000000 EXCVADDR 0x00000000 PS       0x00000000 SAR     0x00000000
[   26.628907] sof-audio-pci 0000:00:0e.0: EPC1     0x00000000 EPC2     0x00000000 EPC3     0x00000000 EPC4    0x00000000
[   26.628913] sof-audio-pci 0000:00:0e.0: EPC5     0x00000000 EPC6     0x00000000 EPC7     0x00000000 DEPC    0x00000000
[   26.628920] sof-audio-pci 0000:00:0e.0: EPS2     0x00000000 EPS3     0x00000000 EPS4     0x00000000 EPS5    0x00000000
[   26.628926] sof-audio-pci 0000:00:0e.0: EPS6     0x00000000 EPS7     0x00000000 INTENABL 0x00000000 INTERRU 0x00000000
[   26.628931] sof-audio-pci 0000:00:0e.0: stack dump from 0x00000000
[   26.628941] sof-audio-pci 0000:00:0e.0: 0x00000000: 00000000 00000000 00000000 00000000
[   26.628949] sof-audio-pci 0000:00:0e.0: 0x00000004: 00000000 00000000 00000000 00000000
[   26.628957] sof-audio-pci 0000:00:0e.0: 0x00000008: e4a6e200 2c20b772 2f118818 ffff9ce4
[   26.628964] sof-audio-pci 0000:00:0e.0: 0x0000000c: c0bb825d ffffffff 9cb10c72 00000000
[   26.628972] sof-audio-pci 0000:00:0e.0: 0x00000010: c226f8d0 ffffbc6a e4a6e200 2c20b772
[   26.628979] sof-audio-pci 0000:00:0e.0: 0x00000014: c0baf1b7 ffffffff c226f9ec ffffbc6a
[   26.628987] sof-audio-pci 0000:00:0e.0: 0x00000018: 80010000 00000000 294f5e00 ffff9ce4
[   26.628994] sof-audio-pci 0000:00:0e.0: 0x0000001c: e4a6e200 2c20b772 2f118bd0 ffff9ce4
[   26.629005] sof-audio-pci 0000:00:0e.0: error: hda irq intsts 0x00000000 intlctl 0xc0000000 rirb 00
[   26.629011] sof-audio-pci 0000:00:0e.0: error: dsp irq ppsts 0x00000000 adspis 0x00000000
[   26.629020] sof-audio-pci 0000:00:0e.0: error: host status 0x00000000 dsp status 0x00000000 mask 0x00000003
[   26.629029] sof-audio-pci 0000:00:0e.0: error: failed to set dai config for iDisp3
[   26.629038] sof-audio-pci 0000:00:0e.0: ASoC: error at snd_soc_dai_hw_params on iDisp3 Pin: -110
[   26.629056]  iDisp3: ASoC: hw_params BE failed -110
@plbossart
Member

The SNR isn't going to be fundamentally better if we remove the crash information: the ASoC core will continue doing a lot of error handling and throwing repeated IPC errors. You would still get, e.g., this as a result:

[   26.628803] sof-audio-pci 0000:00:0e.0: error: ipc timed out for 0x80010000 size 216
[   26.629029] sof-audio-pci 0000:00:0e.0: error: failed to set dai config for iDisp3
[   26.629038] sof-audio-pci 0000:00:0e.0: ASoC: error at snd_soc_dai_hw_params on iDisp3 Pin: -110
[   26.629056]  iDisp3: ASoC: hw_params BE failed -110

@keyonjie added the enhancement (New feature or request) label May 10, 2021
@lgirdwood
Member Author

> The SNR isn't going to be fundamentally better if we remove the crash information: the ASoC core will continue doing a lot of error handling and throwing repeated IPC errors. You would still get, e.g., this as a result:
>
> [   26.628803] sof-audio-pci 0000:00:0e.0: error: ipc timed out for 0x80010000 size 216
> [   26.629029] sof-audio-pci 0000:00:0e.0: error: failed to set dai config for iDisp3
> [   26.629038] sof-audio-pci 0000:00:0e.0: ASoC: error at snd_soc_dai_hw_params on iDisp3 Pin: -110
> [   26.629056]  iDisp3: ASoC: hw_params BE failed -110

Agreed, ALSA/ASoC will keep sending new IPCs, but a simple flag is enough to say "FW is dead, let's not keep dumping the same IPC registers and FW stack trace to dmesg".
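
A minimal sketch of the flag idea, assuming a per-device state struct. This is plain userspace C for illustration only: `fw_dbg_state`, `fw_boot_complete()` and `fw_report_crash()` are hypothetical names, not the SOF driver's actual API. The short per-IPC error line is still logged every time, but the full register and stack dump is printed only once until the firmware boots again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-device debug state; not the real SOF driver structure. */
struct fw_dbg_state {
	bool dump_printed;  /* true once a full dump was printed since last boot */
	unsigned int dumps; /* number of full dumps actually emitted */
};

/* Call after a successful firmware boot: re-arm the one-shot dump. */
static void fw_boot_complete(struct fw_dbg_state *st)
{
	assert(st);
	st->dump_printed = false;
}

/*
 * Call on every failed IPC. The short error line is always logged;
 * the verbose dump is emitted only for the first failure after boot.
 * Returns true if the full dump was printed by this call.
 */
static bool fw_report_crash(struct fw_dbg_state *st, const char *reason)
{
	assert(st);
	fprintf(stderr, "error: %s\n", reason);

	if (st->dump_printed)
		return false; /* FW already known dead, skip the repeat */

	/* Placeholder for the register, stack and IPC status dump. */
	fprintf(stderr, "DSP Firmware Oops (full dump printed once)\n");
	st->dump_printed = true;
	st->dumps++;
	return true;
}
```

With this shape, two consecutive `fw_report_crash()` calls produce one full dump and two short error lines; calling `fw_boot_complete()` in between re-enables the dump for the next crash.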

TBH, this is where we should be trying to recover FW and doing a D0 -> D3 -> D0 cycle.

@plbossart
Member

> TBH, this is where we should be trying to recover FW and doing a D0 -> D3 -> D0 cycle.

some day we'll finally work on #1675

@lgirdwood
Member Author

@plbossart fwiw, I've asked @ujfalusi to look at this log fix, and also at the FW #1675, while he's waiting on his aux bus review comments.

@plbossart
Member

@ujfalusi I think this is already implemented? Can we close this?

@ujfalusi
Collaborator

@plbossart, I think this can be closed; we keep the dumps at bay now.
