Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up fat32 impl #255

Merged
merged 2 commits into from
Sep 11, 2024
Merged

Speed up fat32 impl #255

merged 2 commits into from
Sep 11, 2024

Conversation

Frostman
Copy link
Contributor

@Frostman Frostman commented Sep 10, 2024

Hello,

We're looking to use go-diskfs to build an air-gapped installer image for our product. It includes a bunch of files (like grub tree), OCI images, binaries and etc.

The speed up is 20+ times on bigger file systems - from 540 seconds to 23 seconds for a 1G image with two partitions of ~500MB each.

The test setup is:

Intel(R) Core(TM) Ultra 7 155H, 32GiB RAM, SN740 NVMe WD 512GB
go 1.23.0
6.10.7-200.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC

Here are three profiling reports:

@deitch
Copy link
Collaborator

deitch commented Sep 10, 2024

20x! That is great. I will kick off CI here, and take a closer look tomorrow.

@deitch
Copy link
Collaborator

deitch commented Sep 11, 2024

CI is green, so happy to merge this in, and thank you.

Do you mind linking to the product where you use it? I always like to learn more about how things are used in the wild.

I took a look at what you did. Instead of storing the clusters as a map, you replaced it with a slice. Whereas for a map, the indication that a cluster is not used is the fact that it does not exist in the map, for the slice it is that the value at that position in the slice is 0. A slice is not inherently faster or slowed than a map, and indexed slice lookups vs hashed map lookups should be about the same speed. So where does the speedup come from? Is it the allocateSpace(), where it needs to loop through the clusters and create a slice of keys and then sort them?

@deitch deitch merged commit 971d4ea into diskfs:master Sep 11, 2024
20 checks passed
@Frostman
Copy link
Contributor Author

Frostman commented Sep 11, 2024

@deitch, thx for merging!

Removing unused keys sort gave "only" 3x speed up, and that's b/c sort is not too slow, I think building list of keys before sort was the slower part. If you'd look at the profile, you can see that mapaccess2_fast32 called from allocateSpace is the biggest time consumer and that is just implementation of the get from the map, source https://go.dev/src/runtime/map_fast32.go. The problem is that read from the map is way more expensive then read from slice by index (on big scale). When you're reading from map you need to calculate hash and go through hash collisions/overflow. When you're reading from slice it's simply memory pointer offset.

@Frostman
Copy link
Contributor Author

@deitch the project I've mentioned is an installer for the https://hedgehog.cloud/. We're a startup building a product for DCs.

@deitch
Copy link
Collaborator

deitch commented Sep 11, 2024

project I've mentioned is an installer for the https://hedgehog.cloud/. We're a startup building a product for DCs

That is pretty cool. I will reach out offline to ask about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants