Performance questions #16

dpc · 2021-02-10T06:45:38Z

Hi,

I'm reading through the design docs, and I have a couple of questions.

First, I wonder what the ballpark speedup of hammersbald is relative to RocksDB.

Second, I'm not very familiar with linear hashing (I probably need to do my homework), but I wonder: wouldn't an LSM be better for maintaining the key-to-offset mappings? It seems like linear hashing involves a mutable data structure that (I'm guessing) would need some synchronization here and there, while an LSM is kind of append-only (just like the rest of the data in hammersbald) and parallelizes great, and both CPUs and modern fast storage seem to be evolving in a heavily parallel direction. See https://itnext.io/modern-storage-is-plenty-fast-it-is-the-apis-that-are-bad-6a68319fbc1a and https://crates.io/crates/glommio. Also, with an LSM maybe one could completely avoid having to deal with a recovery log when coupled to blockchain indexing. The way I understand it, an LSM can't really corrupt itself (since it's append-only); it can only lose data through truncation. That is the same data that is available in the blockchain anyway, and in case of a crash it can just be re-inserted.
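For context on the "mutable data structure" point, here is a minimal sketch of textbook linear hashing addressing. It is not taken from hammersbald's code; the type `LinearHashTable` and its fields are hypothetical names for illustration. The state that changes over time is just a level counter and a split pointer, and those are what a concurrent implementation would presumably have to synchronize.

```rust
/// Textbook linear hashing addressing (illustrative only, NOT hammersbald's code).
/// All names here are hypothetical.
struct LinearHashTable {
    initial_buckets: u64, // N: bucket count at level 0
    level: u32,           // number of completed doubling rounds
    split: u64,           // next bucket to be split in the current round
}

impl LinearHashTable {
    fn new(initial_buckets: u64) -> Self {
        LinearHashTable { initial_buckets, level: 0, split: 0 }
    }

    /// Map a key hash to its bucket index.
    fn bucket_for(&self, hash: u64) -> u64 {
        let low = self.initial_buckets << self.level; // N * 2^level
        let b = hash % low;
        if b < self.split {
            // This bucket was already split in the current round,
            // so address it with the next level's modulus.
            hash % (low << 1)
        } else {
            b
        }
    }

    /// Total number of buckets currently in use.
    fn bucket_count(&self) -> u64 {
        (self.initial_buckets << self.level) + self.split
    }

    /// Record that the bucket at `split` has been split.
    /// This is the mutation that readers and writers must agree on.
    fn advance_split(&mut self) {
        self.split += 1;
        if self.split == self.initial_buckets << self.level {
            // Every bucket of this round has been split: start the next round.
            self.split = 0;
            self.level += 1;
        }
    }
}
```

With `initial_buckets = 4`, a hash of 4 maps to bucket 0 before the first split and to bucket 4 after it, which shows how a single split relocates only the entries of one bucket.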

Edit: Oh. I guess read amplification is what makes LSM not great here. I think my mindset was too focused on initial blockchain indexing performance, assuming everything was already validated, and I ignored the lookups.
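To make the read amplification concrete, here is a toy sketch, assuming an LSM modeled as a stack of immutable sorted runs with no bloom filters and no compaction (real LSM stores use both to reduce the cost). `ToyLsm` and its methods are hypothetical names, not an API of RocksDB or hammersbald.

```rust
use std::collections::BTreeMap;

/// Toy LSM: immutable sorted runs, oldest first, newest last.
/// Hypothetical illustration only; maps a key to a data-file offset.
struct ToyLsm {
    runs: Vec<BTreeMap<u64, u64>>,
}

impl ToyLsm {
    /// Append a new immutable run (writes never touch older runs).
    fn flush(&mut self, run: BTreeMap<u64, u64>) {
        self.runs.push(run);
    }

    /// Look up a key, newest run first.
    /// Returns the offset (if any) and how many runs had to be probed.
    fn get(&self, key: u64) -> (Option<u64>, usize) {
        let mut probed = 0;
        for run in self.runs.iter().rev() {
            probed += 1;
            if let Some(&offset) = run.get(&key) {
                return (Some(offset), probed);
            }
        }
        (None, probed)
    }
}
```

In this model a recently written key is found after one probe, but an old key or a missing key costs one probe per run, whereas a hash-addressed table goes to a single bucket.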
