
Optimization potential for df on FAT? #21

Open

Mellvik opened this issue Aug 30, 2023 · 1 comment
Labels: enhancement

Mellvik (Owner) commented Aug 30, 2023

While wrapping up the work on issue #20, I noticed that doing a df on a mounted FAT volume generates an enormous number of unmaps/remaps of the same block. Enormous = 487.

Even on a small (17M) FAT volume, df is notoriously time consuming, even if significantly better today than a couple of years ago (on ELKS). I have been assuming that the reason was lots of disk reads to gather the required data, but 17 consecutive blocks isn't much (the metric seems to be 1 FAT block per MB for FAT16: 41 consecutive blocks on a 41M volume).
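
(Back-of-the-envelope, without having checked the fs code: a FAT16 entry is 2 bytes, so a 1K buffer block holds 512 entries; assuming 2K clusters, that covers 512 x 2K = 1M of volume per FAT block, which would explain the ~1 block per MB metric.)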

Watching the 487 maps/unmaps per block, and (admittedly) not having looked at the code yet, it seems there must be some optimization potential here: best case at the FS level, worst case in the df code itself.

Mellvik added the enhancement label Aug 30, 2023
ghaerr commented Aug 30, 2023

Are you using the newer list_buffer_status code that shows maps/remaps/unmaps? If so, it is important to differentiate between a remap and the others. A remap means the desired buffer was already present in L1 (not necessarily with b_mapcount > 0) and was reused without an L2<->L1 copy. The map and unmap counts show copies into and out of L1 respectively, and those take lots of time. It should be showing lots and lots of remaps and relatively few maps/unmaps.
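
Schematically, the accounting works something like this (a deliberately simplified sketch, not the actual buffer code; the names and the single-slot L1 "pool" are invented for illustration):

```c
/* Simplified sketch of L1/L2 buffer accounting -- illustrative names,
 * not the real TLVC identifiers. */
#include <stdio.h>

#define BLOCK_SIZE 1024

struct buffer {
    char *b_l1data;   /* L1 (near heap) copy, NULL when only in L2 */
    int   b_mapcount; /* active users of the L1 mapping */
};

static long maps, remaps, unmaps;
static char l1_slot[BLOCK_SIZE];  /* pretend L1 pool of one slot */

/* Make a buffer addressable: reuse the L1 copy if it is still there. */
char *buf_map(struct buffer *bh)
{
    if (bh->b_l1data) {
        remaps++;             /* cheap: no L2<->L1 copy */
    } else {
        bh->b_l1data = l1_slot;
        maps++;               /* slow: copy block in from L2 (XMS/far) */
    }
    bh->b_mapcount++;
    return bh->b_l1data;
}

/* Dropping the last user does NOT evict: the L1 copy stays around, so a
 * later buf_map() can be a remap even with b_mapcount back at 0. */
void buf_release(struct buffer *bh)
{
    if (bh->b_mapcount > 0)
        bh->b_mapcount--;
}

/* Eviction is what actually costs an unmap: copy L1 back out to L2. */
void buf_evict(struct buffer *bh)
{
    if (bh->b_l1data) {
        bh->b_l1data = NULL;  /* real code copies dirty data to L2 here */
        unmaps++;
    }
}

int main(void)
{
    struct buffer bh = {0};

    buf_map(&bh);  buf_release(&bh);   /* first touch: a map   */
    buf_map(&bh);  buf_release(&bh);   /* still in L1: a remap */
    buf_evict(&bh);                    /* pushed out: an unmap */
    printf("maps=%ld remaps=%ld unmaps=%ld\n", maps, remaps, unmaps);
    return 0;
}
```

The point being that a remap is nearly free, while every map and unmap moves a whole block between L2 and L1, and that is where the time goes.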

As you know, calculating disk free space on FAT filesystems got so bad that FAT32 introduced a new mechanism for it (the FSInfo sector), which isn't supported here. We use the original method, which has to scan the entire FAT table, thus mapping every single FAT block, likely many times, and that is probably what you're seeing.
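
In outline, the original method looks something like this (a simplified sketch of the classic FAT16 scan, not the actual fs code; `read_fat_block` stands in for mapping a FAT block through the buffer cache):

```c
/* Illustrative sketch of the classic FAT16 free-space scan -- not the
 * TLVC fs source. Every 2-byte FAT entry must be inspected, so every
 * FAT block has to be brought into L1 at least once. */
#include <stdio.h>
#include <stdint.h>

#define BLOCK_SIZE        1024
#define ENTRIES_PER_BLOCK (BLOCK_SIZE / 2)   /* FAT16: 2 bytes per entry */

/* Stand-in for mapping one FAT block through the buffer cache. */
static const uint16_t *read_fat_block(const uint16_t *fat, unsigned blk)
{
    return fat + (unsigned long)blk * ENTRIES_PER_BLOCK;
}

/* Count free clusters: a FAT16 entry of 0x0000 means "free".
 * Clusters 0 and 1 are reserved, so data clusters start at 2. */
static unsigned long fat16_count_free(const uint16_t *fat, unsigned long nclusters)
{
    unsigned long nfree = 0, cl = 2;

    while (cl < nclusters + 2) {
        unsigned blk = (unsigned)(cl / ENTRIES_PER_BLOCK);
        const uint16_t *entries = read_fat_block(fat, blk);

        /* walk every entry that lives in this one block */
        do {
            if (entries[cl % ENTRIES_PER_BLOCK] == 0)
                nfree++;
            cl++;
        } while (cl < nclusters + 2 && cl / ENTRIES_PER_BLOCK == blk);
        /* real code would release the buffer mapping here */
    }
    return nfree;
}

int main(void)
{
    /* a 17-block FAT, roughly the 17M volume case; all entries free */
    static uint16_t fat[17 * ENTRIES_PER_BLOCK];

    printf("free clusters: %lu\n",
           fat16_count_free(fat, 17UL * ENTRIES_PER_BLOCK - 2));
    return 0;
}
```

The pass over the FAT blocks is cheap while they stay resident in L1; it's when they get pushed out and revisited that the map/unmap copies pile up.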

> it seems there must be some optimization potential here: best case at the FS level, worst case in the df code itself.

The df code uses the new ustatfs system call, which in turn uses the same mechanism as reported by the mount command. IIRC there's an option as to whether to calculate FAT free space, since it's well known to be a very slow algorithm on larger hard drives. There is no way it can be sped up without moving to the new FAT32 method, which essentially runs the calculation in the background on DOS/Windows and then writes a snapshot block somewhere.
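
For reference, that FAT32 mechanism is the FSInfo sector: the filesystem keeps a running free-cluster count (plus a next-free hint) in a reserved sector, so computing disk free becomes a single sector read. A sketch of reading that snapshot, with offsets per the published spec (illustrative, not code from this project):

```c
/* Sketch of reading the FAT32 FSInfo sector's cached free-cluster count.
 * Offsets and signatures follow the published FAT32 spec; everything
 * else here is illustrative. */
#include <stdio.h>
#include <stdint.h>

#define FSI_LEADSIG  0x41615252UL   /* "RRaA" at offset 0   */
#define FSI_STRUCSIG 0x61417272UL   /* "rrAa" at offset 484 */

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 1 and stores the cached count in *out if the sector is valid;
 * returns 0 on bad signatures or the 0xFFFFFFFF "unknown" marker, which
 * forces the slow full-FAT scan. */
static int fsinfo_free_count(const uint8_t sector[512], uint32_t *out)
{
    if (le32(sector) != FSI_LEADSIG || le32(sector + 484) != FSI_STRUCSIG)
        return 0;
    *out = le32(sector + 488);       /* FSI_Free_Count */
    return *out != 0xFFFFFFFFUL;
}

int main(void)
{
    uint8_t sec[512] = {0};
    uint32_t nfree;

    /* forge a minimal valid FSInfo sector claiming 12345 free clusters */
    sec[0] = 0x52; sec[1] = 0x52; sec[2] = 0x61; sec[3] = 0x41;
    sec[484] = 0x72; sec[485] = 0x72; sec[486] = 0x41; sec[487] = 0x61;
    sec[488] = 0x39; sec[489] = 0x30;            /* 12345 = 0x3039 */

    if (fsinfo_free_count(sec, &nfree))
        printf("free clusters (cached): %lu\n", (unsigned long)nfree);
    return 0;
}
```

The cost shifts to keeping the count current on every cluster allocation and free, which is what the background maintenance amounts to.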
