CM4: Kernel panic with PI7C9X2G404EV PCIe switch #5352
Transferred to Linux since the firmware is not involved with PCIe.
Afaik @6by9 has successfully used the Pericom/Diodes switch with the CM4. Do you remember if there were any hoops you had to jump through? Did your card have the EEPROM attached to the switch?
I've a PI7C9X2G304 1:2 PCIe switch built into a Hauppauge WinTV QuadHD DVB-T2/C tuner card. I've also used an ASMedia ASM1184e 1:4 PCIe switch, which had no issues, although I've not tested it recently. My NAS is a CM4IO running with a Pericom PCIe switch (12d8:2404), a VL805 USB3 controller and a Marvell 9215 SATA controller. It's still running a self-built 5.10.74-v8 kernel on Buster as I haven't rebuilt it for a while. No issues there.
Any chance you could share the kernel (config or files) and your firmware/EEPROM version, so we can check with the Diodes/Pericom eval board? The ASM1184e works in the same setup without a problem, so we're wondering what the problem is with the Pericom. We have tested several boards with the Pericom (a custom board design) as well as the CM4IO with the official Pericom eval board. All fail with the same weird behavior as explained above.
Kernel config was the base bcm2711_defconfig + CONFIG_SATA_AHCI=m. Looking at the board, the chip is labelled P17C9X2G 404SLBFDE 1712GT. It is on a CM4Lite, so I'll see if I get time to make up a fresh SD card and try booting with it. |
Tested on Bullseye 64bit Lite with kernel/firmware as per the Sept image, apt dist-upgrade'd, and rpi-update. Rebooted half a dozen times with each and had no issues. About the only thing I didn't update was my bootloader, which is still on
Not much more I can add - for me it has been just plug and play.
Thanks a lot! We will test with the firmware version you're using and also check if we can get our hands on one of the above-mentioned cards.
The eBay link above is nearly identical to my board, although I can't guarantee it.
Hmm, your dmesg output shows that your card is only running at Gen1 (2.5 GT/s). Have you pinned this to Gen1 in the devicetree or anything like that?
No, this was literally a vanilla Bullseye Lite 64bit image which I updated. No changes to config.txt or device tree at all.
OK, so there is something fishy here, as it should get a Gen2 link. But it seems to fall back to Gen1 while doing link training.
Seeing as there is nothing sensitive in this kernel log, I'm attaching the whole thing: dmesg.txt
Checking on my Buster 5.10.74 kernel, the more complete lspci output for the host and Pericom upstream port:
Checking /proc/device-tree/scb/pcie@7d500000/max-link-speed, it is set to 1.
6.1.12 is running as Gen2 on all active links, and no kernel panics seen.
(Port 02:01.0 had an accident where I dropped the switch with a card plugged in and it pulled the plastic shell of the connector off, rendering it useless, so it's not connected to anything. That is likely why it is reported as degraded to 2.5 GT/s.)
One proposed patch on the linux-pci list: I think the author is correct in that accesses while the link is retraining result in fatal errors, but I also think the race condition is unavoidable. Bad signal integrity could cause link-down at any time, and there's no notification that the state has changed. It's also half the story - there is an actual bug in how the probe code enables CRS software visibility, which results in fatal errors if an endpoint does have an active link but has not yet completed internal initialisation and responds with Retry to a config read. The call to pci_enable_crs(), which suppresses that particular error response, is made before the link is established, while PERST# is asserted and the root complex configuration registers are held in reset. I'm waiting on hardware that should let me replicate the failure case that the mailing list alludes to.
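For context, the probe-path helper in question is small; a simplified sketch of it (paraphrased from drivers/pci/probe.c - exact code varies by kernel version), with the ordering problem noted after it:

```c
/*
 * CRS Software Visibility: with CRSSVE set, a config read of the Vendor
 * ID that completes with Retry status returns the special value 0x0001
 * instead of escalating to an error, so enumeration code can poll.
 */
static void pci_enable_crs(struct pci_dev *pdev)
{
	u16 root_cap = 0;

	/* Enable CRS Software Visibility if the root port supports it */
	pcie_capability_read_word(pdev, PCI_EXP_RTCAP, &root_cap);
	if (root_cap & PCI_EXP_RTCAP_CRSVIS)
		pcie_capability_set_word(pdev, PCI_EXP_RTCTL,
					 PCI_EXP_RTCTL_CRSSVE);
}
```

The bug described above is one of ordering: if this runs while the root complex registers are still held in reset, the RTCTL write doesn't stick, and a later Retry completion escalates to a fatal error instead of being made visible to software.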
Ugh, even CRSV=1 doesn't work in all cases. I have an Etron EJ188 xhci controller which, if plugged into the switch's downstream port, occasionally responds to the first config write with a CRS response, whereas the first set of reads works OK. Writes with a Retry status always result in a fatal bus exception. This device arguably deviates from the PCIe spec, as Retry status after a reset should be a global read-write lockout and not just a write lockout, but it exists and predates the CM4.
@P33M The patch you've linked to makes enumeration work reliably with the ASMedia ASM1184e, but timeouts (and thus kernel panics) persist with the Pericom PI7C9X2G404SL (occasionally) and with the Pericom PI7C9X2G404SV (very often). With those chips, particularly the latter, enumeration sometimes initially works but timeouts subsequently occur out of the blue when accessing devices downstream of the switch. Also, sometimes the switch itself is enumerated but downstream devices are not. Since the issue is easy to reproduce with the PI7C9X2G404SV, I recommend testing with that one. The chip is interesting because it consumes less power than the others and conforms to industrial requirements (temperature and otherwise).

Unfortunately, eval boards with the PI7C9X2G404SV are hard to come by and expensive. @iluminat23 has a single one, but it's loaned, so he's very hesitant to give it away. That eval board absolutely does not work reliably in the CM4 IO board, but shows no issues at all in a Sitara AM64 EVM or in an x86 PC. So there must be something fishy going on with the CM4 PCIe controller.

By accessing the switch's registers after a kernel panic on enumeration (over SMBus from a different machine) we discovered that the Common Clock Configuration bit in the Link Control Register of the switch upstream port is zero. In the working case, i.e. without a kernel panic on enumeration, that bit is set. So it seems that the switch has difficulty recovering the reference clock presented by the BCM2711 PCIe controller.

Analysis of that clock with a Keysight high-speed oscilloscope has indeed exposed issues: for one, the clock changes multiple times after power-on, oscillating around the expected frequency. Second, there's a "capacitor effect" visible whereby the clock takes a while to come down to the expected voltage, presumably because the AC coupling needs time to charge. We tried to work around the unclean clock signal by delaying release of PERST# of the switch. That didn't help. We also tried not showing the clock signal to the switch until it had stabilized (i.e. keeping the clock output to the switch in High-Z for a while). Didn't help either. We tried shifting the clock voltage up a little to open the "eye" of the signal. Didn't help.

Analysis of the communication between the CM4 and the PI7C9X2G404SV with a Teledyne LeCroy protocol analyzer initially didn't work at all. After desoldering the AC coupling on the CM4, the analyzer started to see communication, but there were no TLPs visible that would point to an error. It seems the issue is at the physical layer, not the TLP layer.

@CyrilBrulebois has commented on #5455 and on the mailing list that updating the CM4 EEPROM helps. This was tried as well. It didn't help with the PI7C9X2G404SV. Again, no issues if the switch eval board is plugged into an AM64 EVM or x86 PC. It's only the CM4 that exhibits these issues.
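For anyone who wants to check the same bit from Linux rather than over SMBus, a minimal sketch (assuming kernel context with a struct pci_dev for the switch upstream port; the helper name is hypothetical, only the PCI_EXP_LNKCTL_CCC mask comes from pci_regs.h):

```c
#include <linux/pci.h>

/*
 * Hypothetical helper: report whether the Common Clock Configuration
 * bit is set in the Link Control register of a given port, e.g. the
 * switch upstream port discussed above.
 */
static bool port_has_common_clock(struct pci_dev *pdev)
{
	u16 lnkctl = 0;

	pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnkctl);
	return lnkctl & PCI_EXP_LNKCTL_CCC;
}
```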
Thanks for the summary of the status quo and especially the things we tried so far, @l1k!
Reference clock exposed by the BCM2711 PCIe controller, showing a "capacitor effect" (green signal, top-left corner) and showing that the clock oscillates between 96 MHz and 106 MHz before settling at 100 MHz after about 391 usec (yellow signal at the bottom). Note that 391 usec is well after PERST# has been released, so the unclean clock is visible to the PCIe device attached to the controller. However, working around these issues still did not make the PI7C9X2G404SV work reliably.
I forgot to mention that forcing de-emphasis to -3.5 dB (instead of the default -6 dB) in the Link Control 2 register of the BCM2711 Root Port seemed to slightly improve reliability with the PI7C9X2G404SV. (The kernel would predictably crash when flashing firmware onto a Renesas USB controller attached below the switch with -6 dB, but not with de-emphasis at -3.5 dB. Nevertheless, crashes would still occur later on.) This seems to further suggest that there's a signal-integrity issue at play.
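For reference, a minimal sketch of forcing that setting (assuming the Selectable De-emphasis bit is bit 6 of Link Control 2 as defined in the PCIe base spec, where 1 selects -3.5 dB; the mask is spelled out because pci_regs.h may not name it, and the helper itself is hypothetical, not the actual change that was tested):

```c
#include <linux/pci.h>

/* Bit 6 of Link Control 2 is "Selectable De-emphasis" per the PCIe
 * base spec: 0 selects -6 dB, 1 selects -3.5 dB at 5.0 GT/s. */
#define LNKCTL2_SELECTABLE_DEEMPHASIS	0x0040

/*
 * Hypothetical helper: request -3.5 dB de-emphasis on a downstream
 * port (here, the BCM2711 root port). Takes effect on the next link
 * (re)training.
 */
static void force_deemphasis_3p5db(struct pci_dev *root_port)
{
	pcie_capability_set_word(root_port, PCI_EXP_LNKCTL2,
				 LNKCTL2_SELECTABLE_DEEMPHASIS);
}
```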
Hi, I've probed the CM4 PERST# and PCIe REFCLK, and it seems that REFCLK is enabled about 500us before PERST# is de-asserted, see the scope capture below. I've noticed that in the PI7C9X2G404EV datasheet the REFCLK should already have been stable for about 100ms when PERST# is de-asserted. The issue is similar to the one here: https://community.toradex.com/t/verdin-imx8mm-clock-reset-sequencing/15851/11 I want to add the 100ms of REFCLK stabilization before de-asserting PERST#. I'm not familiar with the Broadcom PCIe kernel driver - can somebody point me in the right direction?
@florincosta |
No, but there's some terminology in this thread that needs clearing up - Referring to the CEM specification v4 rev 1.0, figure 8, the critical time periods are 1 through 5.
The delay referred to in the patch is inaccurately described/placed. After the previous line, the link will have autonomously activated and in general will be in L0 on the first or second iteration of the loop checking for link-up. Referring to the base specification v4.0 section 6.6.1 (I don't have v5 to hand, but in any case this part of the spec is stable across gen 3 thru 5), the context describes a time period after the link is active where software must not send any configuration requests to the downstream device unless a Readiness Notification is received. I doubt the RC can generate an interrupt for an RN, therefore the 100ms delay is required - but this should be after L0. In practice it doesn't matter: there is an ultimate timeout of 1 second after the first attempt to issue a configuration request for the device to return a successful completion. This delay is catered for in the various enumeration mechanisms in the upper Linux PCI layer.
If the switch really needs 1000x the minimum required Tperst-clk, then there's an issue - there's no easy way to extend this time.
Ta.
See 6f634d7. |
Yeah, it seems the PI7C9X2G404EV switch needs a Tperst-clk of 100ms, although the CEM spec only specifies Tperst-clk as a minimum of 100us with no maximum. So the PI7C9X2G404EV requires 100ms of REFCLK stabilization before the PERST# signal is de-asserted. @P33M thanks for the info regarding the 100us Tperst-clk and where it is handled. I did still try @danclive's suggestion with 6.2.16-v8+ but got the same results. So I guess now there is only one way to fix this: add the 100ms delay the hard way :)
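The straightforward version of that "hard way" would be an unconditional sleep in the link bring-up path of drivers/pci/controller/pcie-brcmstb.c; a rough sketch (assuming the driver's brcm_pcie_perst_set() helper and surrounding sequence, which may differ between kernel versions - this is the idea, not a tested patch):

```c
/*
 * Rough sketch against pcie-brcmstb.c: once the reference clock is
 * running but PERST# is still asserted, wait out the PI7C9X2G404EV's
 * Tperst-clk requirement before releasing reset.
 */
static void brcm_pcie_deassert_perst_slow(struct brcm_pcie *pcie)
{
	/* REFCLK is up by now; give it 100 ms to stabilise (Tperst-clk) */
	msleep(100);

	/* Release PERST#; the RC starts link training shortly after */
	brcm_pcie_perst_set(pcie, 0);
}
```

The obvious downside, as noted below, is that every boot pays the 100 ms even when no such switch is attached.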
I think I may have found a way to optionally delay PERST# deassertion with a magic register bit in PCIE_MISC_HARD_DEBUG - bit 3 directly controls the output pad. If I set this bit prior to the call to
It may be the case that the RC is trying to initiate training during this delay, which would violate constraint 4 above, but that is probably tolerable.
Nope, delaying the deassert in this manner doesn't affect when the RC starts link training (~300us after). It's a pretty tall ask to make the driver do this for every endpoint (thus delaying boot), so another devicetree knob is needed.
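A sketch of what that knob could look like (assuming, per the comment above, that bit 3 of PCIE_MISC_HARD_DEBUG overrides the PERST# output pad; the register offset matches the upstream driver, but the bit's behaviour, the tperst_clk_ms field and the helper are all assumptions for illustration):

```c
#define PCIE_MISC_HARD_PCIE_HARD_DEBUG		0x4204
/* Assumption from the comment above: bit 3 forces the PERST# pad */
#define  HARD_DEBUG_PERST_PAD_OVERRIDE		BIT(3)

/*
 * Illustrative sketch: hold PERST# low via the pad-override bit for a
 * devicetree-configurable time (pcie->tperst_clk_ms, a hypothetical
 * field filled from something like a pcie_tperst_clk_ms dtparam), so
 * only boards that need the long Tperst-clk pay the boot-time cost.
 */
static void brcm_pcie_extend_tperst_clk(struct brcm_pcie *pcie)
{
	u32 tmp = readl(pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);

	if (!pcie->tperst_clk_ms)
		return;

	writel(tmp | HARD_DEBUG_PERST_PAD_OVERRIDE,
	       pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
	msleep(pcie->tperst_clk_ms);	/* REFCLK runs, PERST# held low */
	writel(tmp & ~HARD_DEBUG_PERST_PAD_OVERRIDE,
	       pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
}
```

This matches the dtparam=pcie_tperst_clk_ms=... knob mentioned later in the thread.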
On a CM4 try testing |
Hi, I've tested
We tested this with ~3800 cold boots.
We also tested this with ~80 reboots:
We had no time to investigate the LAN7430 issue any further, but I wanted to report back our test results.
Can you re-do |
I merged the changes that were present in #5609 as of 24-Oct-2023 into 5.15.92 and have been beating the snot out of it for 6 days with no further issues. I've sent 5+ terabytes across the switch each way between USB3-attached disks and SATA-attached disks. Previously I was getting crashes at least every few hours/few hundred gigs when attempting this. I'm running the following on a CM4 + official I/O board:
So - thanks! |
@P33M This didn't change anything for the reboot issue. cold boot:
reboot:
Is the behaviour the same if you don't use a soft reboot, and instead forcibly reset the CM4 via the RUN pin?
My test setup failed me and I'm not able to do any tests for now. I'm also not in the office for the next few weeks and will not be able to fix the setup.
Regarding the disappearing LAN7430, I've found that the default CM4 boot order includes the NVMe probe, which will wake up the link without the requisite Tperst_clk time. Please confirm whether or not 0x4 or 0x6 appears anywhere in your BOOT_ORDER eeprom config. If a switch is attached there's no point having either of these, as the bootloader won't enumerate devices behind a switch.
Yes, on most of my CMs I have not touched the boot order, and thus NVMe boot is enabled.
I'm experiencing similar behavior on a custom carrier board using the Pericom Semiconductor PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch. There are 2 main issues we are seeing that seem related to the OP.
Setting dtparam=pcie_tperst_clk_ms=250 seems to help, as the crash happens less often, but it is still reproducible. Last night I tested the following patch, which did change the behavior in that the system booted successfully 100% of the time; however, 1 out of 10 tests results in no PCIe at all. Anything else we can try?
Is this the right place for my bug report?
I guess
Describe the bug
We use a PI7C9X2G404EV PCIe switch with the CM4 IO Board.
When we boot, the kernel panics most of the time when accessing the ID register of the PCIe switch.
To reproduce
Use a PI7C9X2G404EV PCIe switch card with the CM4 IO Board. No additional H/W needs to be attached to the switch to trigger this behavior.
Expected behaviour
The system should boot up and the PCIe switch should work (e.g. show up in lspci).
Actual behaviour
The kernel often/most of the time stops with a kernel panic (depending on time of day, phase of the moon and other unknown conditions). The kernel log shows a Gen1 link (2.5 GT/s).
If the system boots, the switch works and a Gen2 link (5.0 GT/s) is shown in the log. In this case, additional H/W connected to the switch works as expected. Tested with a LAN7430 GBit Ethernet adapter and a Renesas USB3 controller.
System
raspinfo from the running system: http://paste.debian.net/1271060/
Which model of Raspberry Pi?
Which OS and version (cat /etc/rpi-issue)?
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 005a8c73b05a2cab394073150208bf4f069e861a, stage2
Which firmware version (vcgencmd version)?
Copyright (c) 2012 Broadcom
version 8ba17717fbcedd4c3b6d4bce7e50c7af4155cba9 (clean) (release) (start)
Which kernel version (uname -a)?
Linux raspberrypi 5.15.84-v8+ #1613 SMP PREEMPT Thu Jan 5 12:03:08 GMT 2023 aarch64 GNU/Linux
Logs
panic
good case
Additional context
The issue seems similar to raspberrypi/firmware#1766
Adding dtparam=pcie=off to boot/config.txt had no effect. We also tested the latest 6.2 from https://github.com/raspberrypi/linux/tree/rpi-6.2.y.
Other things we tested:
- PCIEASPM_PERFORMANCE
- max-link-speed = <1>;
- dtoverlay=pcie-32bit-dma in the config.txt
- pci=nomsi on the cmdline
Is there anything else we could try? Did we miss something?