
NFFS can store a valid file-system instance which is unrecoverable after reset. #10

Open
nvlsianpu opened this issue Sep 24, 2018 · 18 comments

Comments

@nvlsianpu
Contributor

nvlsianpu commented Sep 24, 2018

It is possible that during regular filesystem operation the filesystem footprint gets shaped such that, during a subsequent NFFS restore operation, the maximum number of blocks is exceeded at peak. This is caused by a difference between the two algorithms, which should both lead to the same (expected) NFFS runtime-variable state.

My proposed resolution is to rework the NFFS restore so that it is a more sequential process than it is now:

  • 1st, restore inodes to RAM
  • 2nd, restore valid blocks to RAM; since the inode state is then known, the maximum block number shouldn't be exceeded.

It also looks like both scans should be performed by traversing areas from oldest to newest. A descending traversal would be even more efficient, but it is not supported by the NFFS objects on flash. (A sketch of the two-pass approach follows below.)
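For illustration, a minimal sketch of that two-pass restore, assuming a hypothetical per-area iteration helper nffs_restore_next_object(); the structure and constant names are only approximations of the real NFFS code:

```c
/* Sketch only: restore inodes first, then data blocks, scanning areas
 * from oldest to newest in both passes. */
static int
nffs_restore_two_pass(void)
{
    struct nffs_disk_object disk_object;
    int area_idx;
    int rc;

    /* Pass 1: restore inodes only, so the full inode state (including
     * deletions and supersessions) is known before any block is kept. */
    for (area_idx = 0; area_idx < nffs_num_areas; area_idx++) {
        while (nffs_restore_next_object(area_idx, &disk_object) == 0) {
            if (disk_object.ndo_type == NFFS_OBJECT_TYPE_INODE) {
                rc = nffs_restore_inode(&disk_object.ndo_disk_inode);
                if (rc != 0) {
                    return rc;
                }
            }
        }
    }

    /* Pass 2: restore data blocks; blocks belonging to deleted or
     * superseded inodes can be discarded immediately, so the block
     * pool is not exceeded at peak. */
    for (area_idx = 0; area_idx < nffs_num_areas; area_idx++) {
        while (nffs_restore_next_object(area_idx, &disk_object) == 0) {
            if (disk_object.ndo_type == NFFS_OBJECT_TYPE_BLOCK) {
                rc = nffs_restore_block(&disk_object.ndo_disk_block);
                if (rc != 0) {
                    return rc;
                }
            }
        }
    }

    return 0;
}
```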

@nvlsianpu nvlsianpu changed the title NFFS can store valid a file-system instance which is unrecoverable after reset. NFFS can store a valid file-system instance which is unrecoverable after reset. Sep 24, 2018
@nvlsianpu
Contributor Author

@carlescufi @andrzej-kaczmarek Any comments on that? I will try to resolve the issue soon. Is there anyone worth adding to the loop for discussion?

@nvlsianpu
Contributor Author

As a consequence, at some point during execution the call rc = nffs_restore_object(&disk_object); returns the FS_ENOMEM error code, which is ignored.
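For reference, a minimal sketch of what propagating that error might look like at the call site (the surrounding restore loop is omitted; only the rc handling is the point):

```c
rc = nffs_restore_object(&disk_object);
if (rc != 0) {
    /* Do not ignore FS_ENOMEM (or any other failure); abort the restore
     * and report the error to the caller instead of silently dropping
     * the object. */
    return rc;
}
```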

@carlescufi
Contributor

@andrzej-kaczmarek @sjanc can you please help us make this known upstream in Mynewt to see if someone there can help @nvlsianpu with the analysis and fix?

@carlescufi
Contributor

@ccollins476ad could you take a look at this?

@ccollins476ad

ccollins476ad commented Sep 27, 2018

Thanks for reporting this, @nvlsianpu. I agree with the analysis, and I think the proposed solution is a good one.

EDIT: I keep convincing myself that there is still the possibility of failure, even if inodes are processed before data blocks. I will check back after I refamiliarize myself with NFFS!

@ccollins476ad

Just to follow up on my last comment - I do think a problem remains, even if we fully process inodes before data blocks. Specifically, the inode pool can become exhausted during file system restoration.

The below is not the greatest explanation in the world. Please don't hesitate to ask if something is not clear.

Unfortunately, when we discover an inode deletion record, we cannot immediately discard the deleted inode and its children. Instead, we need to remember that these items are deleted. If we were to discard deleted inodes during the restore process, that would introduce an ambiguity for subsequently discovered children: is the parent deleted, or has it just not been discovered yet? Currently, we are allowed to assume that the parent is not discovered yet, because we maintain records of deleted inodes.

I have an idea for how to solve this. I think the core issue here is that the contents of an NFFS disk are not in chronological order. For example, say a disk contains 1) a parent directory P, 2) a child file C, and 3) a deletion record for P. NFFS allows these objects to be located on disk in any order relative to one another. If we had some guarantee that children always appear after their parent, and that deletion records are after the inodes they delete, then the restore process would not need to remember which inodes were deleted. Instead, NFFS could assume that an orphan's parent has been deleted, and could safely discard the orphan.

Currently, NFFS writes an object to any disk area that can accommodate it. This is what introduces the uncertainty in the ordering of objects. Instead, NFFS could write to areas in a sequential fashion. If the object being written does not fit in the current area, NFFS would select the next area as its new current area. Nothing new would get written to the first area until it is garbage collected, even if it could accommodate a subsequent write.
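A rough sketch of such a sequential write policy, assuming hypothetical helpers nffs_area_free_space() and nffs_next_area() and a nffs_current_area_idx global; this is not the existing area-selection code:

```c
/* Sketch only: always append to the current area; never go back to an
 * earlier area until it has been garbage collected, so on-disk order
 * matches chronological order. */
static int
nffs_select_write_area(uint16_t object_size, uint8_t *out_area_idx)
{
    while (nffs_area_free_space(nffs_current_area_idx) < object_size) {
        /* Object does not fit: advance to the next non-scratch area and
         * make it the new current area. */
        nffs_current_area_idx = nffs_next_area(nffs_current_area_idx);
        if (nffs_current_area_idx == NFFS_AREA_ID_NONE) {
            /* No area can hold the object; garbage collection is needed. */
            return FS_EFULL;
        }
    }

    *out_area_idx = nffs_current_area_idx;
    return 0;
}
```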

@nvlsianpu
Contributor Author

Thanks for the elaboration - you are right in your analysis.

What you propose will increase flash wear, especially if objects are big in comparison to the area size.

We could consider using some additional RAM temporarily, just while initializing:

inode size on disk: 20 B + file_name/directory_name
data_block size on disk: 24 B + data
nffs_area header on disk: 3 B

File deletion:
the worst case is the smallest possible file created and immediately deleted:
[data_block with 1 B of data] + [inode with 1 B file_name] + [deletion inode] = 25 + 21 + 20 = 66 B

For each deleted inode only the ID and sequence number must be known during file-system image parsing -> 8 B,

so each such worst-case deleted file requires 8 B of RAM to store the deletion info.

The worst case is that the whole partition is filled with deleted files, no directories at all, and regular area sizes:

deleted_inode_info_max = ((partition_size - area_size - areas_count * 3) / 66) * 8, so the ratio is around 1/8 [RAM/flash].

For an NFFS that contains 2 areas of size 4096 B, 62 deleted-inode info items are required -> 496 B of RAM.
I think this is much too much, and it will be even worse if we consider a directory create-and-delete scenario.
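For reference, the same worst-case arithmetic as a small self-contained sketch (the 66 B per deleted file and 8 B per RAM entry come from the sizes above; the 2 x 4096 B layout is the example just given):

```c
#include <stdint.h>
#include <stdio.h>

/* Worst-case RAM needed to remember deleted inodes during restore:
 * the flash is packed with minimal create-then-delete files
 * (66 B each on disk, 8 B of RAM each). */
int
main(void)
{
    const uint32_t area_size = 4096;
    const uint32_t areas_count = 2;
    const uint32_t partition_size = areas_count * area_size;
    const uint32_t disk_per_deleted_file = 25 + 21 + 20;    /* 66 B */
    const uint32_t ram_per_deleted_inode = 8;               /* ID + seq */

    /* One area is reserved as scratch; each area has a 3 B header. */
    uint32_t usable = partition_size - area_size - areas_count * 3;
    /* Round up to stay on the safe side of the estimate. */
    uint32_t entries = (usable + disk_per_deleted_file - 1) /
                       disk_per_deleted_file;
    uint32_t ram = entries * ram_per_deleted_inode;

    /* Prints: entries=62 ram=496 B */
    printf("entries=%u ram=%u B\n", (unsigned)entries, (unsigned)ram);
    return 0;
}
```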

But I also have an idea for changing the management of IDs:

@nvlsianpu
Contributor Author

Let's assume that NFFS will try to avoid making holes in the inode ID range - this means it will use only a small range of IDs, bounded by the maximum number of each kind of inode. This means we need to remember the (ID, sequence number) pair of each deleted inode.

struct nffs_deleted_inode {
    uint32_t ndi_id;            /* deleted object ID. */
    uint32_t ndi_seq;           /* Sequence number. */
};

Whenever NFFS attempts to create a new inode, it will take a spare ID from the deletion list (using it with an incremented sequence number, obviously), and only if the list is empty will it allocate a brand new ID.
For the current NFFS we need to create a list of deleted inode ID-sequence pairs with a capacity of FS_NFFS_NUM_INODES * 2 (as there are FILE and DIRECTORY inode kinds).

Such a patch will ensure that we always fit within the memory constraints when restoring NFFS, but RAM consumption per inode will increase.
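A minimal sketch of that allocation rule for file inodes, assuming a hypothetical deleted-ID list and next-ID counter, and using FS_NFFS_NUM_INODES from above as a placeholder capacity macro (directory inodes would need a second list and their own ID range):

```c
#include <stdint.h>

struct nffs_deleted_inode {
    uint32_t ndi_id;    /* Deleted object ID. */
    uint32_t ndi_seq;   /* Sequence number it was deleted with. */
};

static struct nffs_deleted_inode nffs_deleted_file_inodes[FS_NFFS_NUM_INODES];
static int nffs_num_deleted_file_inodes;
static uint32_t nffs_next_file_id;

static void
nffs_file_inode_id_alloc(uint32_t *out_id, uint32_t *out_seq)
{
    if (nffs_num_deleted_file_inodes > 0) {
        /* Reuse the ID of a previously deleted inode; the new inode must
         * carry a higher sequence number than the deletion record. */
        struct nffs_deleted_inode *ndi =
            &nffs_deleted_file_inodes[--nffs_num_deleted_file_inodes];
        *out_id = ndi->ndi_id;
        *out_seq = ndi->ndi_seq + 1;
    } else {
        /* No spare IDs on the deletion list: hand out a brand new one. */
        *out_id = nffs_next_file_id++;
        *out_seq = 0;
    }
}
```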

@ccollins476ad

@nvlsianpu, I think I understand your proposed solution, and I believe it would indeed solve this problem. However, I think there may be another issue that the solution does not address (this issue exists in the current implementation as well).

Key

Symbol      Meaning
DEL x       Deletion record for inode x
fx(p=y)     File inode with ID x; parent directory has ID y
bx(p=y)     Data block with ID x; parent file has ID y

Say we have the following NFFS layout:

Area 0      Area 1      Scratch   Area 3
f100(p=0)   b3(p=100)   ...       b5(p=200)
b1(p=100)   b4(p=100)   ...       b6(p=200)
b2(p=100)   f200(p=0)   ...       DEL 100

In other words, we have two files, 100 and 200. Each file contains several data blocks. Finally, file 100 has recently been deleted.

When garbage collection occurs, the disk will now look like this:

Area 0      Area 1      Area 2      Scratch
f100(p=0)   b3(p=100)   b5(p=200)   ...
b1(p=100)   b4(p=100)   b6(p=200)   ...
b2(p=100)   f200(p=0)               ...

That is, the deletion record gets erased.

If the system reboots, file 100 now looks valid again; its deletion record was lost.

I might not be thinking hard enough, but I still believe we need some chronological ordering of records within the disk.

@nvlsianpu
Contributor Author

So deleted-inode records are not garbage-collected properly, as I understand it. I think that if we apply my proposal of managing ID ranges, then we will be able to keep a RAM representation of the deletion inode entries. It might be an extension of the previously proposed struct nffs_deleted_inode.

@ccollins476ad

@nvlsianpu , perhaps I just don't understand your proposed solution. During garbage collection, how would NFFS know to preserve the deletion record?

@nvlsianpu
Contributor Author

So struct nffs_deleted_inode needs to know where the overridden inode is - we need to add a reference to it. This reference should be null when the referenced inode is deleted.
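For example, the extension might look roughly like this; the ndi_overridden field and its type are assumptions for illustration, not existing NFFS code:

```c
struct nffs_deleted_inode {
    uint32_t ndi_id;                         /* Deleted object ID. */
    uint32_t ndi_seq;                        /* Sequence number. */
    struct nffs_hash_entry *ndi_overridden;  /* Where the overridden inode
                                              * lives; NULL once the
                                              * referenced inode is itself
                                              * deleted. */
};
```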

@nvlsianpu
Contributor Author

@ccollins476ad - I just looked at the nffs_gc() code: https://github.com/apache/mynewt-nffs/blob/master/src/nffs_gc.c#L498. It looks like blocks which don't have a parent inode in the from_area won't be copied. Am I right, or did I miss something?

@ccollins476ad

@nvlsianpu, I don't believe that is the case. The code applies nffs_gc_inode_blocks() to every inode in the system, regardless of which area they reside in.

@nvlsianpu
Contributor Author

@ccollins476ad You are right - I missed the iteration range. Sorry for the false alarm.
What do you think about:

So struct nffs_deleted_inode needs to know where the overridden inode is - we need to add a reference to it. This reference should be null when the referenced inode is deleted.

?

@MitterdorferMathias

Hello,
we are currently using NFFS in our project. We have also seen problems during the file-system init phase when the same file was written multiple times.
To resolve this issue, do you think changes to the metadata stored in flash are required?
(This concerns migration of NFFS across a firmware update.)

@nvlsianpu
Contributor Author

@MitterdorferMathias As you can read, there are actually 2 possible bug-fix approaches. In both of them the metadata of data and inode objects is kept the same; it is possible that the metadata of the area object will change.
But with both solutions NFFS will be unable to parse the previous FS flash representation.

@nvlsianpu
Contributor Author

@ccollins476ad @utzig Any update?

nvlsianpu added a commit to nvlsianpu/zephyr that referenced this issue Jan 17, 2020
NFFS is removed as it has serious bugs (by design) which haven't
been resolved for an extended period of time.
One of the most serious bunches of issues was described here:
apache/mynewt-nffs#10

Given the lack of upstream support for NFFS, it doesn't make sense
to keep it in Zephyr.

Signed-off-by: Andrzej Puzdrowski <[email protected]>
carlescufi pushed a commit to zephyrproject-rtos/zephyr that referenced this issue Jan 21, 2020
NFFS is removed as it has serious bugs (by design) which haven't
been resolved for an extended period of time.
One of the most serious bunches of issues was described here:
apache/mynewt-nffs#10

Given the lack of upstream support for NFFS, it doesn't make sense
to keep it in Zephyr.

Signed-off-by: Andrzej Puzdrowski <[email protected]>