In-memory database with persistence #189

rhashimoto · 2024-06-29T19:36:26Z

rhashimoto
Jun 29, 2024
Maintainer

TL;DR IDBMirrorVFS is a new example VFS that keeps all SQLite files in memory while persisting to IndexedDB. It allows multiple concurrent readers with a writer (like WAL), and works in all contexts (Window, dedicated worker, shared worker, service worker). It is very fast, but is limited to databases that can fit into available memory.

One of the key tricks in OPFSPermutedVFS is using BroadcastChannel to send write transaction information from the writing connection to all other connections (in addition to storing it in IndexedDB). I noted earlier that this could also be used to implement a smart page cache where only the pages that actually changed are invalidated.
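A minimal sketch of the smart page cache idea, not code from OPFSPermutedVFS: the `WriteNotice` message shape, the `SmartPageCache` class, and the `readPage` callback are all hypothetical names made up for illustration. It only shows that a write notice evicts the pages it lists and leaves everything else cached.

```typescript
// Hypothetical message shape; the real broadcast format is not shown in this post.
interface WriteNotice {
  changedPages: number[];
}

class SmartPageCache {
  private pages = new Map<number, Uint8Array>();
  private readPage: (index: number) => Uint8Array;
  hits = 0;
  misses = 0;

  // readPage stands in for a (slow) read from the underlying storage.
  constructor(readPage: (index: number) => Uint8Array) {
    this.readPage = readPage;
  }

  get(index: number): Uint8Array {
    const cached = this.pages.get(index);
    if (cached !== undefined) {
      this.hits++;
      return cached;
    }
    this.misses++;
    const page = this.readPage(index);
    this.pages.set(index, page);
    return page;
  }

  // Called when another connection announces a committed write.
  // Only the listed pages are dropped; untouched pages stay cached.
  onWrite(notice: WriteNotice): void {
    for (const index of notice.changedPages) {
      this.pages.delete(index);
    }
  }
}

// In a browser this could be wired to a real channel, for example:
//   const channel = new BroadcastChannel('hypothetical-db-name');
//   channel.onmessage = (event) => cache.onWrite(event.data);
```

The payoff depends on the access pattern: the cache only helps when readers mostly touch pages that writers leave alone.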

I did make a brief attempt to add such a smart cache to OPFSPermutedVFS, but it wasn't obvious how to test and demo it. Showing it off requires multiple connections and an access pattern that mainly reads the unchanged parts of the database (which would still be in the cache). So I never finished it because I didn't have a ready way to see if and how well it worked.

Nevertheless, that is the basic idea behind the new example IDBMirrorVFS. It can be thought of as a VFS that keeps everything in cache. Changes to the database are broadcast and each connection updates its cache.
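Roughly, "everything in cache, changes are broadcast" could look like the sketch below. This is an illustration under assumptions, not the IDBMirrorVFS source: `LocalBus` is a hypothetical in-process stand-in for BroadcastChannel so the example runs anywhere, `MirrorConnection` and `PageUpdate` are made-up names, and the IndexedDB write is omitted.

```typescript
type PageUpdate = { index: number; data: Uint8Array };
type Listener = (updates: PageUpdate[]) => void;

// Hypothetical stand-in for BroadcastChannel. Like the real thing, it does not
// deliver a message back to its sender. Unlike the real thing, delivery here
// is synchronous, which keeps the example deterministic.
class LocalBus {
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(sender: Listener, updates: PageUpdate[]): void {
    this.listeners.forEach((listener) => {
      if (listener !== sender) listener(updates);
    });
  }
}

class MirrorConnection {
  // The whole file lives in memory, indexed by page number.
  private pages: Uint8Array[] = [];
  private bus: LocalBus;
  private onUpdates: Listener;

  constructor(bus: LocalBus) {
    this.bus = bus;
    this.onUpdates = (updates) => this.apply(updates);
    bus.subscribe(this.onUpdates);
  }

  // Reads are plain memory lookups, so they are synchronous.
  read(index: number): Uint8Array | undefined {
    return this.pages[index];
  }

  // Commit a write transaction: update local memory, then tell everyone else.
  // A real VFS would also persist the updates (to IndexedDB in this post).
  commit(updates: PageUpdate[]): void {
    this.apply(updates);
    this.bus.publish(this.onUpdates, updates);
  }

  private apply(updates: PageUpdate[]): void {
    for (const { index, data } of updates) {
      // Copy so connections never share a buffer, as with structured clone.
      this.pages[index] = data.slice();
    }
  }
}
```

Compared with an invalidating cache, nothing is ever evicted here: a broadcast carries the new page contents, so receivers never go back to storage after opening.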

The advantages of this approach are speed and concurrency. Since everything is cached in memory, reads and writes are synchronous and fast. It also allows WAL-like synchronization, i.e. where read transactions don't need to sync with write transactions or each other. The disadvantage is that everything must fit into memory, so it can only accommodate databases up to a certain size (depending on browser limits and application memory usage), and opening a database requires reading the entire file.
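The post does not say how a reader keeps a stable view while a writer commits. One possible approach, sketched below purely as an assumption, is for a connection to queue incoming broadcasts while it is inside a read transaction and apply them when the transaction ends. `SnapshotReader` and its methods are hypothetical names, not the IDBMirrorVFS API.

```typescript
type PageUpdate = { index: number; data: Uint8Array };

class SnapshotReader {
  private pages: Uint8Array[] = [];
  private pending: PageUpdate[][] = [];
  private inReadTransaction = false;

  // Called for each write transaction broadcast by another connection.
  receive(transaction: PageUpdate[]): void {
    if (this.inReadTransaction) {
      // Defer: the open read transaction keeps seeing its snapshot.
      this.pending.push(transaction);
    } else {
      this.apply(transaction);
    }
  }

  beginRead(): void {
    this.inReadTransaction = true;
  }

  endRead(): void {
    this.inReadTransaction = false;
    // Catch up on everything that arrived during the read, in arrival order.
    for (const transaction of this.pending) {
      this.apply(transaction);
    }
    this.pending = [];
  }

  read(index: number): Uint8Array | undefined {
    return this.pages[index];
  }

  private apply(transaction: PageUpdate[]): void {
    for (const { index, data } of transaction) {
      this.pages[index] = data;
    }
  }
}
```

Under this scheme a reader never waits on a writer or on another reader, which matches the WAL-like behavior described above. The cost is extra memory for queued transactions during long reads.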

This idea could be used with any browser storage, but because its main feature is speed, it seemed like a good match for IndexedDB. Using IndexedDB for persistence means that the database can always live in the same context as the application code that needs it, so messaging costs can be avoided. OPFS, by contrast, offers its synchronous access handles only in a dedicated Worker.

Here are some performance comparisons with IDBBatchAtomicVFS using the contention demo (which mainly measures latency, not throughput). All runs use PRAGMA synchronous=normal to trade durability for performance.

Here IDBMirrorVFS is vastly faster with a single reader, roughly 15x by iteration count:

[12:02:50.944] build: asyncify
[12:02:50.944] config: IDBMirrorVFS
[12:02:50.944] nWriters: 0
[12:02:50.944] nReaders: 1
[12:02:50.944] nSeconds: 1
[12:02:57.047] launch workers
[12:02:57.609] start
[12:02:58.609] worker 0 reader 39628 iterations
[12:02:58.609] complete

[12:03:35.408] build: asyncify
[12:03:35.408] config: IDBBatchAtomicVFS
[12:03:35.408] nWriters: 0
[12:03:35.408] nReaders: 1
[12:03:35.408] nSeconds: 1
[12:04:44.020] launch workers
[12:04:44.570] start
[12:04:45.570] worker 0 reader 2602 iterations
[12:04:45.570] complete

Both allow concurrent readers, and with two readers IDBMirrorVFS extends its massive edge to roughly 19x per reader:

[12:08:09.281] build: asyncify
[12:08:09.281] config: IDBMirrorVFS
[12:08:09.281] nWriters: 0
[12:08:09.281] nReaders: 2
[12:08:09.281] nSeconds: 1
[12:08:11.834] launch workers
[12:08:12.156] start
[12:08:13.156] worker 0 reader 37095 iterations
[12:08:13.156] worker 1 reader 37479 iterations
[12:08:13.156] complete

[12:08:51.469] build: asyncify
[12:08:51.469] config: IDBBatchAtomicVFS
[12:08:51.469] nWriters: 0
[12:08:51.469] nReaders: 2
[12:08:51.469] nSeconds: 1
[12:08:53.851] launch workers
[12:08:54.194] start
[12:08:55.194] worker 0 reader 1904 iterations
[12:08:55.194] worker 1 reader 1911 iterations
[12:08:55.194] complete

With a single writer, IDBMirrorVFS is still significantly faster (roughly 2.7x), but not by nearly as much:

[12:17:30.109] build: asyncify
[12:17:30.109] config: IDBMirrorVFS
[12:17:30.109] nWriters: 1
[12:17:30.109] nReaders: 0
[12:17:30.109] nSeconds: 1
[12:17:32.227] launch workers
[12:17:32.386] start
[12:17:33.386] worker 0 writer 1487 iterations
[12:17:33.386] complete

[12:18:20.210] build: asyncify
[12:18:20.210] config: IDBBatchAtomicVFS
[12:18:20.210] nWriters: 1
[12:18:20.210] nReaders: 0
[12:18:20.210] nSeconds: 1
[12:18:22.082] launch workers
[12:18:22.510] start
[12:18:23.510] worker 0 writer 547 iterations
[12:18:23.511] complete

Writes are not concurrent (for either VFS):

[12:19:47.339] build: asyncify
[12:19:47.339] config: IDBMirrorVFS
[12:19:47.339] nWriters: 2
[12:19:47.339] nReaders: 0
[12:19:47.339] nSeconds: 1
[12:19:50.347] launch workers
[12:19:50.675] start
[12:19:51.677] worker 0 writer 556 iterations
[12:19:51.680] worker 1 writer 555 iterations
[12:19:51.680] complete

[12:20:42.459] build: asyncify
[12:20:42.459] config: IDBBatchAtomicVFS
[12:20:42.459] nWriters: 2
[12:20:42.459] nReaders: 0
[12:20:42.459] nSeconds: 1
[12:20:44.669] launch workers
[12:20:45.002] start
[12:20:46.004] worker 1 writer 344 iterations
[12:20:46.006] worker 0 writer 74 iterations
[12:20:46.006] complete

Here is a mix of 2 readers and 1 writer, where IDBMirrorVFS shows off its reader/writer concurrency:

[12:21:36.010] build: asyncify
[12:21:36.010] config: IDBMirrorVFS
[12:21:36.010] nWriters: 1
[12:21:36.010] nReaders: 2
[12:21:36.010] nSeconds: 1
[12:21:37.875] launch workers
[12:21:38.212] start
[12:21:39.212] worker 1 reader 23087 iterations
[12:21:39.212] worker 0 writer 1263 iterations
[12:21:39.214] worker 2 reader 21847 iterations
[12:21:39.214] complete

[12:22:03.430] build: asyncify
[12:22:03.430] config: IDBBatchAtomicVFS
[12:22:03.430] nWriters: 1
[12:22:03.430] nReaders: 2
[12:22:03.430] nSeconds: 1
[12:22:05.340] launch workers
[12:22:05.692] start
[12:22:06.694] worker 1 reader 585 iterations
[12:22:06.695] worker 2 reader 541 iterations
[12:22:06.695] worker 0 writer 376 iterations
[12:22:06.695] complete

In case you're interested, here is OPFSPermutedVFS, the previous holder of the concurrency crown, with 2 readers and 1 writer:

[12:29:45.496] build: asyncify
[12:29:45.496] config: OPFSPermutedVFS
[12:29:45.496] nWriters: 1
[12:29:45.496] nReaders: 2
[12:29:45.496] nSeconds: 1
[12:29:47.843] launch workers
[12:29:48.210] start
[12:29:49.210] worker 2 reader 1600 iterations
[12:29:49.210] worker 0 writer 839 iterations
[12:29:49.210] worker 1 reader 1585 iterations
[12:29:49.210] complete
