CONFIG_NAND_TIMING_MODE not working on sam9x60 custom board #174

Open
LeSpocky opened this issue Feb 19, 2024 · 10 comments

@LeSpocky
Contributor

After successfully evaluating the sam9x60-curiosity board with at91bootstrap v4.0.6 and booting from NAND flash, we based our own design on the D5M variant of the sam9x60 SiP. The raw NAND flash chip we are using is a Spansion® SLC NAND Flash Memory S34ML02G1, which has a different page size and spare area size than the MX30LF4G28AD used on the curiosity board. The at91bootstrap configuration (see .config) is based on sam9x60_curiositynf_uboot_defconfig, and I get this on boot if CONFIG_NAND_TIMING_MODE is enabled:

AT91Bootstrap 4.0.8 (2020-08-01 00:00:00)

NAND: ONFI flash detected
NAND: Manufacturer ID: 0x1 Chip ID: 0xda
NAND: Page Bytes: 2048, Spare Bytes: 64
NAND: ECC Correctability Bits: 1, ECC Sector Bytes: 512
NAND: Switch to timing mode 3
NAND: Disable On-Die ECC
PMECC: version is: 0x102
PMECC: page_size: 2048, oob_size: 64, pmecc_cap: 8, sector_size: 512
NAND: Initialize PMECC params, cap: 8, sector: 512
NAND: Image: Copy 0xc0000 bytes from 0x40000 to 0x21f00000
PMECC: sector bits = 15, bit 1 means corrupted sector, Now correcting...
Correct error bit in OOB @[#Byte 6,Bit# 5] 164 -> 132
Correct error bit @[#Byte 498,Bit# 5] 160 -> 128
Correct error bit @[#Byte 402,Bit# 5] 160 -> 128
Correct error bit @[#Byte 306,Bit# 5] 160 -> 128
Correct error bit @[#Byte 210,Bit# 5] 160 -> 128
Correct error bit @[#Byte 137,Bit# 5] 32 -> 0
Correct error bit @[#Byte 114,Bit# 5] 160 -> 128
PMECC: failed to correct corrupted bits!
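
A minimal sketch for reading the log above, assuming the `sector bits` value is a bitmask with one bit per 512-byte ECC sector (the log itself says "bit 1 means corrupted sector"). The helper names are made up for illustration and are not at91bootstrap functions. With a 2048-byte page there are four sectors, so a value of 15 flags all of them. Every reported correction also toggles bit 5 of the byte, which hints at a systematic read problem more than at random bit errors.

```c
#include <stdint.h>

/* Count how many ECC sectors are flagged in a PMECC sector bitmask.
 * Hypothetical helper, not part of at91bootstrap. */
unsigned int corrupted_sectors(uint32_t sector_bits, unsigned int num_sectors)
{
    unsigned int count = 0;

    for (unsigned int i = 0; i < num_sectors; i++) {
        if (sector_bits & (1u << i))
            count++;
    }
    return count;
}

/* Toggle a single bit of a byte, as a one-bit ECC correction would. */
uint8_t flip_bit(uint8_t value, unsigned int bit)
{
    return (uint8_t)(value ^ (1u << bit));
}
```

For example, `corrupted_sectors(15, 2048 / 512)` counts four flagged sectors, and `flip_bit(160, 5)` reproduces the `160 -> 128` correction from the log.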

If I disable CONFIG_NAND_TIMING_MODE loading the U-Boot image is successful like this:

AT91Bootstrap 4.0.8 (2020-08-01 00:00:00)

NAND: ONFI flash detected
NAND: Manufacturer ID: 0x1 Chip ID: 0xda
NAND: Page Bytes: 2048, Spare Bytes: 64
NAND: ECC Correctability Bits: 1, ECC Sector Bytes: 512
NAND: Disable On-Die ECC
PMECC: version is: 0x102
PMECC: page_size: 2048, oob_size: 64, pmecc_cap: 8, sector_size: 512
NAND: Initialize PMECC params, cap: 8, sector: 512
NAND: Image: Copy 0xc0000 bytes from 0x40000 to 0x21f00000
NAND: Done to load image

Notice the additional line "NAND: Switch to timing mode 3". The option was enabled with 8427813 for sam9x60 boards before release v4.0.6, and it actually does work here on the sam9x60 curiosity (also timing mode 3). I did not check the timings for the chosen timing mode 3 in detail.

@LeSpocky
Contributor Author

LeSpocky commented Feb 21, 2024

According to the datasheet the S34ML02G1 does not support the CMD_GET_FEATURE (EEh) command sent in function nand_get_feature_timing_mode(). At least that command is not listed in the "command set" section of the datasheet. Quote:

Open NAND Flash Interface (ONFI) 1.0 compliant

All ONFI spec versions from 1.0 to 5.1 I looked at list the command 0xEE as optional.
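
Since 0xEE is optional, a loader could check the ONFI parameter page before issuing it. The following is only a sketch: the offset (bytes 8 and 9, "optional commands supported") and the bit position (bit 2 for Get/Set Features) are taken from my reading of the ONFI specification, and the function name is hypothetical, not something at91bootstrap provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the 256-byte ONFI parameter page:
 * bytes 8..9 hold the "optional commands supported" field (little endian),
 * bit 2 of that field advertises Get Features / Set Features. */
#define ONFI_OPT_CMD_OFFSET           8u
#define ONFI_OPT_CMD_GET_SET_FEATURES (1u << 2)

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t le16_at(const uint8_t *buf, unsigned int offset)
{
    return (uint16_t)(buf[offset] | ((uint16_t)buf[offset + 1] << 8));
}

/* Return true if the chip advertises the optional EEh/EFh commands.
 * Hypothetical helper for illustration only. */
bool onfi_has_get_set_features(const uint8_t *param_page)
{
    uint16_t opt_cmd = le16_at(param_page, ONFI_OPT_CMD_OFFSET);

    return (opt_cmd & ONFI_OPT_CMD_GET_SET_FEATURES) != 0;
}
```

A caller would skip the timing mode switch, or at least the read-back, when this returns false.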

@LiBinSHA
Collaborator

Yes, I confirm this issue.
I will add timing mode support for this kind of NAND flash in the next release.

@LeSpocky
Contributor Author

After investigating timing issues more deeply in U-Boot and having another look at at91bootstrap, I came to the conclusion at91bootstrap does nothing wrong here.

It seems to be a problem with the specific NAND flash and/or our board layout in combination with SAM9X60. The slower timing modes 0 to 2 work fine in U-Boot, but mode 3 fails there as well. The symptom is always ECC errors.

(The same flash chip works fine when used with SAMA5D2 or SAM9G20 on other boards, which have a lower rate for MCK, and thus use slightly different timings in the end.)

U-Boot binary is currently 676772 bytes here. In mode 0 we can read with 5.0 MiB/s which would take us ~130 ms to read U-Boot binary. In mode 3 with roughly 9.7 MiB/s it would take ~67 ms to load U-Boot. I'm not optimizing for 70 ms boot time here, so I will leave CONFIG_NAND_TIMING_MODE disabled for now.
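
The arithmetic behind these estimates can be sketched as follows; the function name is made up for illustration, and the rates are the measured values quoted above.

```c
#include <stdint.h>

/* Estimate the time in milliseconds to read size_bytes at a sustained
 * rate given in MiB/s. Illustrative helper only. */
double load_time_ms(uint32_t size_bytes, double rate_mib_per_s)
{
    const double bytes_per_mib = 1024.0 * 1024.0;

    return (double)size_bytes / (rate_mib_per_s * bytes_per_mib) * 1000.0;
}
```

`load_time_ms(676772, 5.0)` comes out just under 130 ms and `load_time_ms(676772, 9.7)` just under 67 ms, so mode 3 would save a little over 60 ms per boot.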

If you still want to look into this, maybe some idea how to support other modes would be nice, for example if I could override the mode from Kconfig?
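
One way such an override could look, purely as a sketch: pick the fastest timing mode the chip reports, but never go above a configured ceiling. `CONFIG_NAND_TIMING_MODE_MAX` is a hypothetical Kconfig symbol that does not exist in at91bootstrap, and the supported-modes bitmask (bit n set means mode n is supported) is assumed to come from the ONFI parameter page.

```c
#include <stdint.h>

/* Hypothetical Kconfig symbol: highest timing mode the board may use. */
#ifndef CONFIG_NAND_TIMING_MODE_MAX
#define CONFIG_NAND_TIMING_MODE_MAX 5
#endif

/* Return the fastest supported timing mode that does not exceed max_mode.
 * supported_modes is a bitmask where bit n means "mode n is supported".
 * Mode 0 is the fallback because every ONFI chip has to support it. */
int select_timing_mode(uint16_t supported_modes, int max_mode)
{
    if (max_mode > 5)   /* ONFI defines SDR timing modes 0 to 5 */
        max_mode = 5;

    for (int mode = max_mode; mode > 0; mode--) {
        if (supported_modes & (1u << mode))
            return mode;
    }
    return 0;
}
```

With a chip reporting modes 0 to 3 (mask `0x0f`, an illustrative value), a ceiling of 2 would select mode 2 and leave the faster mode 3 unused.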

@LiBinSHA
Collaborator

Some NAND flash (S34ML01G2 and W29N02KVxxAF) do not work properly in Timing
Mode 3, since their maximum tREA time is 4ns longer than normal NAND flash.
The workaround is to extend the SMC NRD pulse to meet tREA timing.
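
The idea of that workaround can be sketched as below. This is not the attached patch, whose contents are not shown in this thread; the struct and function names are hypothetical. The numbers used in the example come from elsewhere in this issue: an MCK period of 5000 ps and an NRD pulse of 3 cycles from the debug output, and a tREA of 20 ns as quoted from the datasheets. The sketch also assumes the SMC cycle length is the sum of setup, pulse and hold, which matches the logged values.

```c
#include <stdint.h>

/* One SMC access phase (NRD or NWE), expressed in MCK cycles. */
struct smc_phase {
    uint32_t setup;
    uint32_t pulse;
    uint32_t hold;
};

/* Convert nanoseconds to MCK cycles, rounding up.
 * mck_ps is the MCK period in picoseconds; small inputs are assumed,
 * so the 32-bit multiplication cannot overflow. */
uint32_t ns_to_cycles(uint32_t ns, uint32_t mck_ps)
{
    return (ns * 1000u + mck_ps - 1u) / mck_ps;
}

/* Total cycle length, assumed to be setup + pulse + hold. */
uint32_t smc_cycle(const struct smc_phase *p)
{
    return p->setup + p->pulse + p->hold;
}

/* Stretch the NRD pulse so that it lasts at least tREA. */
void extend_nrd_pulse(struct smc_phase *nrd, uint32_t trea_max_ns,
                      uint32_t mck_ps)
{
    uint32_t min_pulse = ns_to_cycles(trea_max_ns, mck_ps);

    if (nrd->pulse < min_pulse)
        nrd->pulse = min_pulse;
}
```

Starting from setup 0, pulse 3, hold 3 at a 5 ns MCK period, a 20 ns tREA stretches the pulse to 4 cycles and the total read cycle to 7 cycles (35 ns).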

@LiBinSHA
Collaborator

Please try this patch.
0001-driver-nandflash-update-nand-smc-timing.patch

@LeSpocky
Contributor Author

> Some NAND flash (S34ML01G2 and W29N02KVxxAF) do not work properly in Timing Mode 3, since their maximum tREA time is 4ns longer than normal NAND flash. The workaround is to extend the SMC NRD pulse to meet tREA timing.

The S34ML01G2 you mentioned is not the same chip as the S34ML02G1 we use. I studied the datasheets of both of them as well as the ONFI Spec Revision 4.2 again. The timings of both chips are the same and also mostly match ONFI spec mode 3; in particular, tREA is listed as 20 ns in all three documents. So I'm not sure what you mean by "4ns longer" and "normal NAND flash"?

> Please try this patch. 0001-driver-nandflash-update-nand-smc-timing.patch

With that patch it works, see my debug output:

--- at91bootstrap-nand-before.log       2024-02-29 14:39:48.270341137 +0100
+++ at91bootstrap-nand-after.log        2024-02-29 14:42:50.914346614 +0100
@@ -7,12 +7,11 @@
 NAND: ECC Correctability Bits: 1, ECC Sector Bytes: 512
 NAND: mode: 3, cs: 3, mck_ps: 5000 (5 ns), tdf: 15 (75 ns)
 NAND: NWE: setup: 2 (10 ns), pulse: 3 (15 ns), hold: 2 (10 ns), cycle: 7 (35 ns)
-NAND: NRD: setup: 0 (0 ns), pulse: 3 (15 ns), hold: 3 (15 ns), cycle: 6 (30 ns)
+NAND: NRD: setup: 0 (0 ns), pulse: 4 (20 ns), hold: 3 (15 ns), cycle: 7 (35 ns)
 NAND: Switch to timing mode 3
 NAND: Disable On-Die ECC
 PMECC: version is: 0x102
 PMECC: page_size: 2048, oob_size: 64, pmecc_cap: 8, sector_size: 512
 NAND: Initialize PMECC params, cap: 8, sector: 512
 NAND: Image: Copy 0xc0000 bytes from 0x40000 to 0x21f00000
-PMECC: sector bits = 15, bit 1 means corrupted sector, Now correcting...
-PMECC: failed to correct corrupted bits!
+NAND: Done to load image

So you might add this to your patch:

Tested-by: Alexander Dahl <[email protected]>

I used another patch to create that debug output, but GitHub does not allow me to attach it here. Should I make a PR for that?
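
For readers who want similar output: the debug patch itself is not attached, so the following is only a guess at how such a line could be produced. The function names are hypothetical, and `snprintf` stands in for whatever print routine the bootloader provides. It converts cycle counts to nanoseconds using the MCK period in picoseconds, as shown in the `mck_ps: 5000 (5 ns)` line.

```c
#include <stdio.h>

/* Convert a number of MCK cycles to nanoseconds (mck_ps in picoseconds). */
unsigned int cycles_to_ns(unsigned int cycles, unsigned int mck_ps)
{
    return cycles * mck_ps / 1000u;
}

/* Format one SMC timing line in the style of the log above.
 * The cycle length is assumed to be setup + pulse + hold.
 * Returns the number of characters written, like snprintf(). */
int format_smc_timing(char *buf, size_t len, const char *name,
                      unsigned int setup, unsigned int pulse,
                      unsigned int hold, unsigned int mck_ps)
{
    unsigned int cycle = setup + pulse + hold;

    return snprintf(buf, len,
                    "NAND: %s: setup: %u (%u ns), pulse: %u (%u ns), "
                    "hold: %u (%u ns), cycle: %u (%u ns)",
                    name,
                    setup, cycles_to_ns(setup, mck_ps),
                    pulse, cycles_to_ns(pulse, mck_ps),
                    hold, cycles_to_ns(hold, mck_ps),
                    cycle, cycles_to_ns(cycle, mck_ps));
}
```

Calling it with `"NRD", 0, 4, 3, 5000` yields the NRD line from the "after" log.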

Bonus question: I made the same change proposed by your patch in U-Boot and now I can successfully read from the flash in U-Boot, too. With the change applied, the transfer rate is somewhat lower than on the sam9x60 curiosity board, which uses 3 pulse cycles instead of 4.

If I were to propose the change to the U-Boot developers, should that workaround go into the generic atmel raw nand driver, or should it be made a quirk based on the nand chip instead of the nand controller? (The same question would apply to Linux, but I did not test the fix in Linux yet.)

@LeSpocky
Contributor Author

One further thing: the SAM9X60-Curiosity User's Guide mentions this for the NAND Flash:

Matched Net Lengths [Tolerance = 0.5mm]

On our prototype board these lengths are not matched. Can that be another reason for the flash not working with the previous timings?

@LiBinSHA
Collaborator

LiBinSHA commented Mar 5, 2024

The patch I provided is based on bitbucket, not github, sorry about that. The timing mode 3 issue can also be reproduced on linux. I will apply this patch to bootstrap.

LeSpocky added a commit to LeSpocky/linux that referenced this issue Mar 13, 2024
From reading the S34ML02G1 and the SAM9X60 datasheets again, it seems
like we have to wait tREA after rising RE# before sampling the data.
Thus pulse must be at least tREA.  The fix works in bootstrap, u-boot,
and Linux, but the explanation might be improved.  It probably worked on
sam9g20 and sama5d2 before, because those have a slower mck clock rate
and thus the resolution of the timings setup is not as tight as with
sam9x60.

Without the fix we got PMECC errors when reading after switching to ONFI
timing mode 3.

Link: linux4sam/at91bootstrap#174
Cc: Li Bin <[email protected]>
@LeSpocky
Contributor Author

LeSpocky commented Apr 5, 2024

> Please try this patch. 0001-driver-nandflash-update-nand-smc-timing.patch

Changeset e2dfd81 hit master. It extends the patch proposed here with another check, so that the change only applies to TIMING_MODE_3.

Meanwhile I could verify the same approach works on U-Boot and Linux.

Did not test v4.0.9-rc1 on real hardware yet though, will do that later.

@LeSpocky
Contributor Author

LeSpocky commented Apr 9, 2024

> Did not test v4.0.9-rc1 on real hardware yet though, will do that later.

Works for me:

AT91Bootstrap 4.0.9-rc1 (2020-08-01 00:00:00)

NAND: ONFI flash detected
NAND: Manufacturer ID: 0x1 Chip ID: 0xda
NAND: Page Bytes: 2048, Spare Bytes: 64
NAND: ECC Correctability Bits: 1, ECC Sector Bytes: 512
NAND: Switch to timing mode 3
NAND: Disable On-Die ECC
NAND: Initialize PMECC params, cap: 8, sector: 512
NAND: Image: Copy 0xc0000 bytes from 0x40000 to 0x21f00000
NAND: Done to load image

LeSpocky added a commit to LeSpocky/linux that referenced this issue Apr 11, 2024
LeSpocky added a commit to LeSpocky/u-boot that referenced this issue Apr 15, 2024
From reading the S34ML02G1 and the SAM9X60 datasheets again, it seems
like we have to wait tREA after rising RE# before sampling the data.
Thus pulse time must be at least tREA.

Without this fix we got PMECC errors when reading, after switching to
ONFI timing mode 3 on SAM9X60 SoC with S34ML02G1 raw NAND flash chip.

The approach to set timings used before worked on sam9g20 and sama5d2
with the same flash (S34ML02G1), probably because those have a slower
mck clock rate and thus the resolution of the timings setup is not as
tight as with sam9x60.

The approach to fix the issue was carried over from at91bootstrap, and
has been successfully tested in at91bootstrap, U-Boot and Linux.

Link: linux4sam/at91bootstrap#174
Cc: Li Bin <[email protected]>
Signed-off-by: Alexander Dahl <[email protected]>