Store metadata in redb #1954
Closed · wants to merge 41 commits

Conversation

@dignifiedquire (Contributor) commented Jan 15, 2024

This stores metadata in redb instead of in the file system. Part of the data can still be inferred from the file system (e.g. which hashes are partial and which are complete), while another part (tags) lives exclusively in the redb database.

The upside is that it reduces the startup delay caused by scanning the file system. Manipulating tags should also be much faster now, since it no longer involves file IO.

The downside is that the file system can now become inconsistent with the metadata, since the information is partly redundant.

Closes #1942 (Slow boot time with a large repo size)
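
For orientation, a minimal sketch of what a tags table in redb can look like (the table name, key/value types, and file name here are assumptions for the example, not the actual schema used in this PR):

```rust
use redb::{Database, ReadableTable, TableDefinition};

// Hypothetical table mapping tag name -> 32-byte blob hash.
const TAGS: TableDefinition<&str, &[u8]> = TableDefinition::new("tags");

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let db = Database::create("meta.redb")?;

    // Creating a tag is a small transactional insert into the database
    // rather than a per-tag file system operation.
    let txn = db.begin_write()?;
    {
        let mut tags = txn.open_table(TAGS)?;
        tags.insert("my-collection", [0u8; 32].as_slice())?;
    }
    txn.commit()?;

    // Reading it back.
    let txn = db.begin_read()?;
    let tags = txn.open_table(TAGS)?;
    if let Some(hash) = tags.get("my-collection")? {
        println!("tag points at hash {:02x?}", hash.value());
    }
    Ok(())
}
```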

@rklaehn self-assigned this Jan 16, 2024
@rklaehn marked this pull request as ready for review January 19, 2024 15:07
@rklaehn changed the title from "[WIP] improve startup time" to "Store metadata in redb" Jan 19, 2024
- Migration code now uses a new directory, blobs-v2
- Integration tests have to sync metadata after changing the file system (and maybe later for use from the iroh CLI)
@rklaehn force-pushed the perf-startup branch 10 times, most recently from ba335df to d2d816d on January 29, 2024 12:45
Version is now tracked within the database.
@rklaehn (Contributor) commented Jan 29, 2024

Update on this whole trainwreck: there wasn't a bug. There is a test helper fn step that waits for a certain number of GC runs before proceeding. But it did not drain the event queue first, so sometimes it would not wait at all because old GcCompleted events were still sitting in the queue. This is a normal flume queue, not a broadcast queue that forgets messages.
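
A minimal sketch of the corrected helper, assuming a flume Receiver of some event enum with a GcCompleted variant (everything besides the names step and GcCompleted is made up for the example):

```rust
use flume::Receiver;

// Hypothetical event type; the real test listens to whatever the store emits.
enum Event {
    GcCompleted,
    Other,
}

/// Wait until `n` garbage collections have completed.
fn step(events: &Receiver<Event>, n: usize) {
    // Drain anything already queued first. flume keeps every message for its
    // receivers, so stale GcCompleted events from earlier runs would
    // otherwise satisfy the wait immediately.
    while events.try_recv().is_ok() {}

    let mut completed = 0;
    while completed < n {
        match events.recv() {
            Ok(Event::GcCompleted) => completed += 1,
            Ok(Event::Other) => {}
            Err(_) => break, // all senders dropped
        }
    }
}
```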

@dignifiedquire (Contributor, Author)

I guess the bug was in the test helper 😅

@rklaehn (Contributor) commented Jan 30, 2024

OK, I have second thoughts about merging this as is. It completely removes the outboard cache, thereby completely changing the performance characteristics of the store.

The former version of the flat store had a consistent concept. It might not have been perfect, but it worked for a large set of use cases. This is some weird hybrid between the former concept and something else.

@dignifiedquire (Contributor, Author)

This outboard cache was a pretty big footgun, given that it wasn't limited in size as far as I can tell, so if we bring it back it should at a minimum be size limited.

@rklaehn (Contributor) commented Jan 30, 2024

> This outboard cache was a pretty big footgun, given that it wasn't limited in size as far as I can tell, so if we bring it back it should at a minimum be size limited.

I don't think so. Outboards are 1/256 of the data size, so for a 1 TiB disk, even if you had all outboards in memory, it would be just 4 GiB. OK, it could become a problem if you have a Raspberry Pi 3 with a giant external hard drive...

But for any real world app even on mobile it would be fine.

Part of the reason for the whole bao-tree crate and the chosen chunk group size of 16 chunks was that you would be able to hold the outboards in memory even on a small device. With the original bao crate the ratio would have been 1/16, which would have been way too much.
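
For reference, the arithmetic behind the 1/256 and 1/16 figures, assuming 1 KiB chunks and 64 bytes (two 32-byte hashes) stored per tree node:

```rust
fn main() {
    const CHUNK: u64 = 1024; // bytes per chunk
    const NODE: u64 = 64;    // two 32-byte hashes per stored node

    // bao-tree with chunk groups of 16 chunks: one node per 16 KiB of data.
    println!("1/{}", (16 * CHUNK) / NODE); // 1/256

    // original bao, one node per 1 KiB chunk: much larger overhead.
    println!("1/{}", CHUNK / NODE); // 1/16

    // outboards for a completely full 1 TiB disk at 1/256 overhead:
    let one_tib: u64 = 1 << 40;
    println!("{} GiB", (one_tib / 256) >> 30); // 4 GiB
}
```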

@ppodolsky (Contributor) commented Jan 31, 2024

> I don't think so. Outboards are 1/256 of the data size, so for a 1 TiB disk, even if you had all outboards in memory, it would be just 4 GiB. OK, it could become a problem if you have a Raspberry Pi 3 with a giant external hard drive...

Ha-ha, that's exactly me: an 8 GB RAM Pi with a 200 TB HDD attached :D Seems I'm quite an unlucky customer.
Might it be worth using some sort of LRU for that cache?
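
For illustration, a rough sketch of such a size-limited LRU, bounded by total bytes held rather than entry count (not code from this PR; all names and the eviction policy are made up):

```rust
use std::collections::{HashMap, VecDeque};

type Hash = [u8; 32];

/// Caches outboards up to a total byte budget, evicting least recently used.
struct OutboardCache {
    max_bytes: usize,
    total_bytes: usize,
    entries: HashMap<Hash, Vec<u8>>,
    order: VecDeque<Hash>, // front = least recently used
}

impl OutboardCache {
    fn new(max_bytes: usize) -> Self {
        Self { max_bytes, total_bytes: 0, entries: HashMap::new(), order: VecDeque::new() }
    }

    fn get(&mut self, hash: &Hash) -> Option<&[u8]> {
        if self.entries.contains_key(hash) {
            // mark as most recently used
            self.order.retain(|h| h != hash);
            self.order.push_back(*hash);
        }
        self.entries.get(hash).map(|v| v.as_slice())
    }

    fn insert(&mut self, hash: Hash, outboard: Vec<u8>) {
        self.total_bytes += outboard.len();
        if let Some(old) = self.entries.insert(hash, outboard) {
            self.total_bytes -= old.len();
            self.order.retain(|h| h != &hash);
        }
        self.order.push_back(hash);
        // evict least recently used entries until we are back under budget
        while self.total_bytes > self.max_bytes {
            let Some(evict) = self.order.pop_front() else { break };
            if let Some(buf) = self.entries.remove(&evict) {
                self.total_bytes -= buf.len();
            }
        }
    }
}
```

Something like OutboardCache::new(64 << 20) would then cap cached outboards at 64 MiB regardless of how large the store is.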

@rklaehn (Contributor) commented Jan 31, 2024

> I don't think so. Outboards are 1/256 of the data size, so for a 1 TiB disk, even if you had all outboards in memory, it would be just 4 GiB. OK, it could become a problem if you have a Raspberry Pi 3 with a giant external hard drive...

> Ha-ha, that's exactly me: an 8 GB RAM Pi with a 200 TB HDD attached :D Seems I'm quite an unlucky customer. Might it be worth using some sort of LRU for that cache?

OK, seriously, thank you for being such a demanding customer.

I am working on a larger refactoring. Basically, I want to always store the data for small files inline in redb, and to also store small outboards in redb once they are complete. This should reduce both the number of files and the number of file operations for many use cases.

Having the data for small files and the outboards for small-to-medium files in redb would provide a kind of size-limited cache for those outboards, namely the redb cache itself. For larger files I would then rely on the operating system's file system cache.
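
A rough sketch of the kind of size-threshold logic described above (the constants and names are placeholders, not what the follow-up PR actually uses):

```rust
/// Where the bytes for a blob (or its outboard) live.
enum Location {
    /// Small enough to be stored inline in the redb metadata database.
    Inline,
    /// Stored as a file on disk; repeated reads go through the OS page cache.
    File,
}

// Placeholder thresholds for the example.
const MAX_INLINE_DATA: u64 = 16 * 1024;
const MAX_INLINE_OUTBOARD: u64 = 16 * 1024;

fn data_location(data_size: u64) -> Location {
    if data_size <= MAX_INLINE_DATA { Location::Inline } else { Location::File }
}

fn outboard_location(data_size: u64) -> Location {
    // With 16-chunk groups the outboard is roughly data_size / 256.
    let outboard_size = data_size / 256;
    if outboard_size <= MAX_INLINE_OUTBOARD { Location::Inline } else { Location::File }
}
```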

Could you do me a favour and compute some sort of histogram of your file sizes, or just do something like

stat -f "%N %z" * # (Mac)
stat -c "%n %s" * # (Linux)

and send me the result?

That would help a lot.

Here is the new PR, building on this one: #1985

@ppodolsky (Contributor) commented Jan 31, 2024

I have done this: find . -type f -print0 | xargs -0 ls -l | awk '{size[int(log($5)/log(2))]++}END{for (i in size) printf("%10d %3d\n", 2^i, size[i])}' | sort -n

         0  11
        32   2
       256   2
       512   1
      2048  78
      4096 983
      8192 2370
     16384 24614
     32768 31862
     65536 61111
    131072 111704
    262144 148639
    524288 160883
   1048576 132321
   2097152 87118
   4194304 51952
   8388608 24728
  16777216 9846
  33554432 4255
  67108864 1883
 134217728 561
 268435456 106
 536870912  45
1073741824  12

@dignifiedquire (Contributor, Author)

Closing in favor of the ongoing PRs that refactor the store in a more step-by-step fashion.

@rklaehn deleted the perf-startup branch April 10, 2024 15:00