Items in an in-memory cluster as separate objects #395
Comments
This would potentially save a bit of cache memory, but would not necessarily improve speed. First, the cluster cache is only used for compressed clusters. If we decompress a cluster to get one article, we have two options:
Yet again, we need measurements before changing this.
@veloman-yunkan Now that most of the work has been done in the cache, should we do these measurements now?
I support this.
@mgautierfr @veloman-yunkan What is the status here? I believe this is not implemented, but do we still need it? Would that still bring an improvement?
I don't know if we need this. Caching each item's data individually would allow the cache to drop unused items. We would save memory, but at the same time we might have to decompress the cluster data several times.
Then let's not chase it.
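To make the trade-off above concrete, here is a minimal sketch of an item-level LRU cache sitting in front of a stand-in for a compressed cluster. FakeCluster, ItemCache and the decompression counter are hypothetical names invented for this illustration and are not libzim types; the sketch only assumes that reaching one item on a cache miss requires decompressing the whole cluster.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compressed cluster (not a libzim type):
// "decompressing" returns all items and bumps a counter.
struct FakeCluster {
    std::vector<std::string> items;
    mutable int decompressions = 0;

    std::vector<std::string> decompress() const {
        ++decompressions;
        return items;
    }
};

// Hypothetical item-level LRU cache holding at most `capacity` items.
class ItemCache {
public:
    ItemCache(const FakeCluster& cluster, std::size_t capacity)
        : cluster_(cluster), capacity_(capacity) {}

    std::string get(std::size_t index) {
        auto it = pos_.find(index);
        if (it != pos_.end()) {
            // Hit: move the entry to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, it->second);
            return it->second->second;
        }
        // Miss: the whole cluster is decompressed to reach a single item.
        std::vector<std::string> all = cluster_.decompress();
        std::string value = all.at(index);
        lru_.emplace_front(index, value);
        pos_[index] = lru_.begin();
        if (lru_.size() > capacity_) {
            // Evict the least recently used item to stay within capacity.
            pos_.erase(lru_.back().first);
            lru_.pop_back();
        }
        return value;
    }

private:
    using Entry = std::pair<std::size_t, std::string>;
    const FakeCluster& cluster_;
    std::size_t capacity_;
    std::list<Entry> lru_;
    std::unordered_map<std::size_t, std::list<Entry>::iterator> pos_;
};
```

With a capacity of one item, the access pattern 0, 2, 0 decompresses the cluster three times; with room for all items, the same pattern decompresses it twice. That is the memory-versus-repeated-decompression cost discussed above, and it is why measurements on real access patterns would be needed before deciding.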
Currently the internal representation of an in-memory cluster (zim::Cluster, http://github.com/openzim/libzim/tree/6.2.0/src/cluster.h) is a single buffer (behind a zim::Reader interface) where different ranges correspond to different articles/items. A better representation is one with a separate buffer/blob for every article/item, which would allow more granular caching at the article/item level. The problem with coarse caching of clusters is that a cluster may need to be kept in the cache because of a single small item, while its other items are never accessed.