NFFS issue after many writes by btsettings #9748

Closed
bixivs opened this issue Aug 31, 2018 · 14 comments
Labels
area: File System bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@bixivs
Contributor

bixivs commented Aug 31, 2018

Hello

I have encountered an issue using the Bluetooth settings storage on an NFFS partition in the flash of an nRF52832.
After many writes of the settings file, the device hangs at boot while trying to mount the NFFS partition, with the following message:


***** BUS FAULT *****
  Precise data bus error
  BFAR Address: 0xf41ccc
***** Hardware exception *****
Current thread ID = 0x20002f04
Faulting instruction address = 0x52ee
Fatal fault in essential thread! Spinning...

The crash seems to happen at the instruction:

rc = fs->mount(mp);

at line 462 of fs.c.
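
As a quick sanity check on the fault log above, the following sketch classifies the three addresses it reports. The memory ranges are an assumption (an nRF52832 variant with 512 KiB of flash and 64 KiB of RAM), and `classify_addr` and the enum names are hypothetical helpers, not part of Zephyr.

```c
#include <assert.h>
#include <stdint.h>

/* Coarse memory regions. The flash and RAM sizes are assumptions for an
 * nRF52832 with 512 KiB flash and 64 KiB RAM. FICR/UICR and the private
 * peripheral bus are ignored for brevity and fall into REGION_OTHER. */
enum mem_region { REGION_FLASH, REGION_RAM, REGION_PERIPH, REGION_OTHER };

enum mem_region classify_addr(uint32_t addr)
{
    if (addr < 0x00080000u) {
        return REGION_FLASH;
    }
    if (addr >= 0x20000000u && addr < 0x20010000u) {
        return REGION_RAM;
    }
    if (addr >= 0x40000000u && addr < 0x60000000u) {
        return REGION_PERIPH;
    }
    return REGION_OTHER;
}

/* Checks the values copied from the fault log in this report. */
void check_log_addresses(void)
{
    assert(classify_addr(0x20002f04u) == REGION_RAM);   /* thread ID */
    assert(classify_addr(0x000052eeu) == REGION_FLASH); /* faulting instruction */
    assert(classify_addr(0x00f41cccu) == REGION_OTHER); /* BFAR */
}
```

With these assumed ranges, the thread ID lands in RAM and the faulting instruction in flash, while the BFAR value 0xf41ccc is in neither. That would be consistent with a bad pointer or offset being dereferenced during mount, although the log alone does not prove it.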

I replicated the same issue with a slightly modified mesh example you can find here:
http://dropbox.cblelectronics.com/6a0d2aa14337
and another board that acts as a switch (configured as a generic onoff client).
The switch publishes the onoff status of the client to the same group address that the onoff servers are subscribed to.
After many onoff messages, if I reboot the server node, it gives the error.
I have never verified how many messages are needed to trigger the issue but 50 or more are enough.
The server node is configured to update the saved SEQ and RPL at every message.

Thanks
Daniele

@nashif nashif added the bug The issue is a bug, or the PR is fixing a bug label Aug 31, 2018
@carlescufi
Member

@bixivs thanks for the report. I am copying @jhedberg here in case there is something he can think of.

@jhedberg
Member

Nothing comes to mind, unfortunately. It is worth pointing out however that pretty much all testing I've done with settings has been with the FCB backend and not NFFS.

@nvlsianpu
Collaborator

Thanks for the report. I'm off work until 10 September. This might be a manifestation of a bug either in NFFS (not confirmed, #9749) or in the compression of the settings destination file.

@nashif nashif added the priority: low Low impact/importance bug label Sep 4, 2018
@carlescufi
Member

@bixivs is this with current Zephyr master?

@nvlsianpu
Collaborator

@bixivs Can you verify this with the newest master? We just fixed #9749, which might have caused this issue.

@bixivs
Contributor Author

bixivs commented Sep 14, 2018

@carlescufi @nvlsianpu
Sorry for the late reply!
Just tested with the current master. I get the same error.
I'm able to reproduce the issue in this way:
This is a mesh sample modified to use NFFS: https://github.com/bixivs/zephyr_mesh_sample_nffs_bug.git
I use an aconno acn52832 (http://aconno.de/acn52832/) as the test device; the board files are on the cbl_branch of my Zephyr fork: https://github.com/bixivs/zephyr.git
After a complete flash erase, I program the device and provision it with the BlueZ meshctl tool, issuing the following commands:

power on
security 0
discover-unprovisioned on
provision dddd0000000000000000000000000000
menu config
target 0100
appkey-add 1
bind 0 1 1000
back

Then I issue some onoff messages:

menu onoff
target 0100
onoff 0
onoff 0
onoff 0
onoff 0

I reset the device and everything works fine: the settings are correctly reloaded and I can reconnect to it using meshctl.
Now, if I send about 100 onoff messages and reset the node again, I get:

***** Booting Zephyr OS zephyr-v1.13.0-122-g077df9e13 *****
Initializing...
Mounting NFFS***** BUS FAULT *****
  Precise data bus error
  BFAR Address: 0xf41ccc
***** Hardware exception *****
Current thread ID = 0x20002f0c
Faulting instruction address = 0x537a
Fatal fault in essential thread! Spinning...

@nvlsianpu
Collaborator

@bixivs Could you provide the content of the NFFS partition (flash addresses <0x0007a000 0x00006000>) that causes the failure, as a raw binary file or Intel HEX?
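
For reference, the two values in that request read as a (start offset, size) pair. A minimal sketch of the arithmetic, assuming the 4 KiB flash page size of the nRF52832; `flash_area_desc` and the helper names are made up for illustration and are not Zephyr APIs:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a flash partition: start offset and size. */
struct flash_area_desc {
    uint32_t offset;
    uint32_t size;
};

/* First address past the end of the partition. */
uint32_t area_end(const struct flash_area_desc *fa)
{
    return fa->offset + fa->size;
}

/* Number of erase pages covered by the partition. */
uint32_t area_page_count(const struct flash_area_desc *fa, uint32_t page_size)
{
    /* The partition is expected to be a whole number of pages. */
    assert(page_size != 0 && fa->size % page_size == 0);
    return fa->size / page_size;
}
```

For offset 0x0007a000 and size 0x00006000 this gives an end address of 0x00080000 and six 4 KiB pages (24 KiB), so a dump of that range covers the last pages of a 512 KiB flash.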

@bixivs
Contributor Author

bixivs commented Sep 19, 2018

@nvlsianpu Sure, in the attached zip you can find two files. "stillworking" is the content of the storage partition before the fault, and "failing" is the content after the fault.

nffs_issue.zip

@nvlsianpu
Collaborator

@bixivs I'm able to reproduce the failure you encountered. So far it fails only with your particular NFFS configuration (with the default one it does not fail). I will experiment with the configuration a little and let you know.

@nvlsianpu
Collaborator

I have some data that points to what happened. For the failing NFFS partition you sent, NFFS attempts to allocate 108 data blocks at peak during initialization (it ends up with 14).
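
One way to read these numbers, sketched under assumptions: if block descriptors come from a fixed-size pool, a restore whose peak demand exceeds the pool capacity fails even though the final state would fit. The pool type, the capacities used below, and `simulate_restore` are hypothetical and are not the actual NFFS code; only the figures 108 and 14 come from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity pool that only tracks counts. */
struct block_pool {
    size_t in_use;   /* blocks currently allocated */
    size_t peak;     /* highest in_use seen so far */
    size_t capacity; /* configured pool size (assumed value) */
};

/* Takes one block; returns 0 on success, -1 if the pool is exhausted. */
int pool_alloc(struct block_pool *p)
{
    if (p->in_use == p->capacity) {
        return -1;
    }
    p->in_use++;
    if (p->in_use > p->peak) {
        p->peak = p->in_use;
    }
    return 0;
}

/* Returns one block to the pool. */
void pool_free(struct block_pool *p)
{
    assert(p->in_use > 0);
    p->in_use--;
}

/* Models a restore that needs peak_need blocks at once and keeps only
 * final_need of them afterwards (final_need <= peak_need).
 * Returns 0 on success, -1 if the peak does not fit in the pool. */
int simulate_restore(struct block_pool *p, size_t peak_need, size_t final_need)
{
    for (size_t i = 0; i < peak_need; i++) {
        if (pool_alloc(p) != 0) {
            return -1;
        }
    }
    while (p->in_use > final_need) {
        pool_free(p);
    }
    return 0;
}
```

With an assumed capacity of 64, `simulate_restore(&pool, 108, 14)` fails at the 65th allocation; with an assumed capacity of 128 it succeeds and leaves 14 blocks in use. The point of the sketch is that sizing the pool for the steady state (14) says nothing about whether the restore peak (108) fits.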

@nvlsianpu
Collaborator

nvlsianpu commented Sep 24, 2018

apache/mynewt-nffs#10
NFFS can store a valid file-system instance which is unrecoverable after reset.

@nashif
Member

nashif commented Feb 19, 2019

Any update on this?

@nvlsianpu
Collaborator

Jullabs have said they will work on fixing this in the near future (probably within a month). This bug is a very tough one, so even if they start today it will take a few weeks to fix.

@nvlsianpu
Collaborator

NFFS support was removed in #21793.
You can use LittleFS as the file system back-end instead.
