Optimization of zim::Cache #385
Great issue to explore. It is definitely the biggest performance degradation factor in libzim. I'll leave some follow-up questions:
@veloman-yunkan I'm not sure if this ticket is a question or a kind of plan to change something... What I believe is that this is really a topic worth a discussion, and I would like an agreement on a better solution before making a PR.
Honestly, I have never had a look at the cache code.
We could:
The
The
Would it solve our issue?
Do you have numbers for this?
What strategy are you thinking about? In the case of zimdump/zimcheck, we iterate over the entries in cluster order. We don't need to keep a cluster in the cache: as soon as we need another cluster, we know we have finished with the old one (modulo the number of threads). We could simply limit the cache size to the number of threads to reduce the cache size (and speed up the lookup). But I'm not sure it is worth it if we reimplement the cache correctly.
@mgautierfr @kelson42 Before optimizing (rather, reimplementing) our own cache, maybe it makes sense to use a 3rd-party cache?
My own plan was to implement something like https://github.com/lamerman/cpp-lru-cache, with the only difference of using an
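For reference, the classic way to get O(1) `get()`/`put()`, which is essentially what cpp-lru-cache does, is to pair a doubly-linked list (kept in usage order) with a hash map from key to list iterator. A minimal sketch under that assumption; the class and member names here are illustrative, not libzim's actual API:

```cpp
#include <cstddef>
#include <list>
#include <stdexcept>
#include <unordered_map>
#include <utility>

// Minimal LRU cache sketch: a doubly-linked list keeps entries in
// most-recently-used order, while a hash map gives O(1) lookup of the
// list node for each key. Eviction pops the list tail in O(1),
// avoiding any linear "find the oldest entry" scan.
template <typename Key, typename Value>
class LruCache {
public:
  explicit LruCache(size_t maxSize) : maxSize_(maxSize) {}

  void put(const Key& key, const Value& value) {
    auto it = map_.find(key);
    if (it != map_.end()) {
      // Key already cached: update the value and mark it most recent.
      it->second->second = value;
      items_.splice(items_.begin(), items_, it->second);
      return;
    }
    items_.emplace_front(key, value);
    map_[key] = items_.begin();
    if (map_.size() > maxSize_) {
      // Evict the least recently used entry (list tail) in O(1).
      map_.erase(items_.back().first);
      items_.pop_back();
    }
  }

  const Value& get(const Key& key) {
    auto it = map_.find(key);
    if (it == map_.end())
      throw std::range_error("key not in cache");
    // splice() moves the accessed node to the front without
    // invalidating any iterators stored in the map.
    items_.splice(items_.begin(), items_, it->second);
    return it->second->second;
  }

  bool exists(const Key& key) const { return map_.count(key) != 0; }
  size_t size() const { return map_.size(); }

private:
  std::list<std::pair<Key, Value>> items_;  // MRU at front, LRU at back
  std::unordered_map<Key,
      typename std::list<std::pair<Key, Value>>::iterator> map_;
  size_t maxSize_;
};
```

The key design point is that `std::list::splice` relinks nodes in place, so the iterators stored in the map stay valid across reordering, which is what keeps both operations O(1).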
I had a look at the implementation and it seems good. It should be pretty easy to test (at least) cache implementation/efficiency and take a decision based on results.
Benchmarking cpp-lru-cache against the current implementation of
Now how do we proceed from here? cpp-lru-cache needs a couple of minor modifications. On the other hand it is small & simple to an extent that if I implemented it on my own (instead of googling) we might spend overall less time than if we decide to integrate cpp-lru-cache in libzim. |
The speed-up is good. Promising. What kind of minor modifications? But indeed the implementation is pretty simple. Maybe it is simpler to take the two methods and replace our put/get methods with the ones of
Two remarks:
One more datapoint. In the context of a different PR, I performed a similar benchmark on a different machine using an earlier version of the "same" ZIM file, and then observed about 2x shorter runtimes. The difference was unlikely to be attributable to the difference in hardware, so I also ran that earlier ZIM file on my desktop.
The newer file contains only slightly more data than the old one, but is twice as costly to process. Any ideas what might have caused this?
Thanks for the ping. It doesn't look like any distro has packaged cpp-lru-cache. It also seems inactive with no commits since 2017. It's only about 70 lines, so given that we also need to modify it, I'd suggest just including that header file in the libzim code (with proper attribution/licensing of course), treating it as one of "our" files rather than trying to use it as a library.
I don't know; the zim structures seem relatively identical.
@veloman-yunkan Have you been able to explain the oddity you reported earlier?
@kelson42 No, I haven't looked into it yet. One hypothesis is cache overflow: if the new and slightly larger data requires a larger resident set size, this may lead to a significantly higher number of cache misses, provided that the RSS (in terms of utilized cache entries) of the earlier/smaller data was close to the cache size limit.
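The overflow hypothesis is easy to reproduce in isolation: with strict LRU and a cyclic access pattern, a working set even one entry larger than the cache makes every access a miss. A toy simulation of that cliff (not libzim code; a deliberately naive LRU model just for counting misses):

```cpp
#include <algorithm>
#include <cstddef>
#include <list>

// Count misses for a cyclic access pattern over `workingSet` distinct
// keys, run against a strict-LRU cache holding `capacity` entries.
// This naive O(N)-per-access model is fine for a toy experiment.
size_t countMisses(size_t capacity, size_t workingSet, size_t accesses) {
  std::list<size_t> lru;  // front = most recently used
  size_t misses = 0;
  for (size_t i = 0; i < accesses; ++i) {
    const size_t key = i % workingSet;
    auto it = std::find(lru.begin(), lru.end(), key);
    if (it == lru.end()) {
      ++misses;
      lru.push_front(key);
      if (lru.size() > capacity)
        lru.pop_back();  // evict the least recently used key
    } else {
      lru.splice(lru.begin(), lru, it);  // mark as most recent
    }
  }
  return misses;
}
```

With `countMisses(16, 16, 1600)` the working set fits and only the 16 cold misses occur; with `countMisses(16, 17, 1700)` LRU thrashes and all 1700 accesses miss. That discontinuity is the kind of behavior the hypothesis above predicts.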
@kelson42 But the actual reason shouldn't prevent us from switching to the better cache. I will debug the noticed oddity later.
@mgautierfr I am not going to spend time on this for the same reason. It is obvious that cpp-lru-cache is significantly faster and we should switch to it in any case.
You can see them in #405. But I am going to improve the new cache a little more.
I repeated the slow runs twice in a row and the run-time figures stayed the same.
This hypothesis is likely wrong. Separately varying the dirent cache size from 512 (default) to 1024, and the cluster cache size from 16 (default) to 24, has a very weak (under 3%) effect on the run time of
I agree with @veloman-yunkan: zimcheck is slower with the new file, whether we use the new or the old cache system.
Current implementation of the LRU-cache in `src/cache.h` can be unacceptably slow for a large number of elements and certain access patterns, since its `get()` and `put()` operations contain branches with O(N) time complexity. Given the current usage of the cache in `src/fileimpl.cpp`, the linear complexity is due to the `Cache::_getOldest()` member function.
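To illustrate the kind of O(N) eviction pattern described above, consider a hypothetical cache (not libzim's actual code) where each entry carries a last-access counter: finding an eviction victim then requires scanning every entry for the minimum, so each evicting `put()` costs O(N).

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical illustration of an O(N) eviction pattern: each cached
// entry stores a counter recording when it was last accessed, and
// picking a victim means scanning the whole map for the smallest one.
template <typename Key, typename Value>
struct TimestampedCache {
  struct Entry {
    Value value;
    uint64_t lastAccess;  // monotonically increasing access counter
  };
  std::unordered_map<Key, Entry> data;

  // O(N): every eviction walks all entries to find the oldest one.
  // Assumes the cache is non-empty.
  Key getOldest() const {
    auto oldest = data.begin();
    for (auto it = data.begin(); it != data.end(); ++it)
      if (it->second.lastAccess < oldest->second.lastAccess)
        oldest = it;
    return oldest->first;
  }
};
```

Replacing this scan with a doubly-linked usage list, as cpp-lru-cache does, turns victim selection into an O(1) look-at-the-tail operation.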