feat(misc-tools): create a selective db clone tool #8494 #10528
base: master
ref: #8494
The tool SwingSet/misc-tools/db-clone.js allows the user to clone only a desired subset of content from a SwingStore db into a separate db file, based on filtering criteria.
Command-line options:
--backup
creates a consistent full backup of a source db, even
when the db is opened and in active use by SwingSet.
Cannot be combined with any other option.
Remaining options can be combined together and with selection filters:
--transcripts[=all]
copy transcripts, all or according to other filters.
--snapshots[=all|auto|<id1>[,<id2>[, ...]]]
copy snapshots, all or according to other filters.
"auto" selects only snapshots mentioned within the transcript
items being copied.
--bundles[=all|auto|<id1>[,<id2>[, ...]]]
copy bundles, all or according to other filters.
"auto" selects only bundles mentioned within the transcript
items being copied.
Selection filters:
--vats=[all|<id1>[,<id2>[, ...]]]
Select only vats listed.
"all" has the same effect as not using --vats option at all.
--startPos=<pos>
Select only transcript items with position >= startPos.
--endPos=<pos>
Select only transcript items with position <= endPos.
Other options:
--stats
print out a short before/after summary.
--debug
print debugging information.
Examples:
Clone whole database:
node db-clone.js SwingStore.sqlite clone.sqlite --backup
Clone transcripts of vat v123 starting from position 1000 and through 1200, include only snapshots and bundles found within those transcripts:
node db-clone.js SwingStore.sqlite clone.sqlite \
--vat=v123 --startPos=1000 --endPos=1200 \
--transcripts --snapshots=auto --bundles=auto
Clone all bundles and all snapshots, show stats:
node db-clone.js SwingStore.sqlite clone.sqlite \
--snapshots=all --bundles=all --stats
Didn't go through the actual implementation yet, but I see a decent amount of duplication with statements that already exist in the swing-store package. Any reason not to move that tool there and refactor to import common abstractions instead?
// create tables regardless of whether any items had been found or not
destDb.exec(`CREATE TABLE IF NOT EXISTS transcriptItems (
  vatID TEXT, position INTEGER, item TEXT, incarnation INTEGER, PRIMARY KEY (vatID, position)
)`);
destDb.exec(`CREATE TABLE IF NOT EXISTS transcriptSpans (
  vatID TEXT, startPos INTEGER, endPos INTEGER, hash TEXT, isCurrent INTEGER CHECK (isCurrent = 1), incarnation INTEGER,
  PRIMARY KEY (vatID, startPos), UNIQUE (vatID, isCurrent)
)`);

destDb.exec(
  `CREATE INDEX IF NOT EXISTS currentTranscriptIndex ON transcriptSpans (vatID, isCurrent)`,
);

const insertTranscriptItems = destDb.prepare(
  `INSERT OR IGNORE INTO transcriptItems (vatID, position, item, incarnation) VALUES (?, ?, ?, ?)`,
);
const insertTranscriptSpans = destDb.prepare(
  `INSERT OR IGNORE INTO transcriptSpans (vatID, startPos, endPos, hash, isCurrent, incarnation) VALUES (?, ?, ?, ?, ?, ?)`,
);
I'm worried about maintenance. Is there any way to replicate the schema of the source db?
Yes and no.
Yes: well, it is obviously available in the source DB itself.
No: we are filtering and looking at very particular columns to massage the data, which implies that we do know the schema upfront.
And now back to yes: this low-level data massaging should be abstracted away into a separate layer, probably below the *Store layer. Ideally it should also allow for SQL-native copy between the databases without having to go through JavaScript. See ATTACH DATABASE for example.
I wanted to see what the actual tool would end up looking like and what functionality it would need before getting bogged down with all of the details of such a critical component as SwingStore and its *Store subcomponents. The main requirement was to be able to get the data I need out of production with as little chance of interference (in both performance and code dependencies) as possible. The key stumbling block is that the current *Store implementations require write access to the source DB.
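To make the ATTACH DATABASE idea concrete, here is a minimal sketch of what an SQL-native copy of transcript items could look like. It only builds the statements; `buildAttachCopy`, `sqlString`, `asInt` and the `src` schema alias are hypothetical names, not part of db-clone.js, and the sketch assumes the destination tables have already been created as in the diff above.

```javascript
// Sketch only: builds SQL for a db-to-db copy via ATTACH DATABASE.
// All helper names here are hypothetical, not part of db-clone.js.

/** Quote a value as an SQL string literal (doubles embedded single quotes). */
const sqlString = value => `'${String(value).replace(/'/g, "''")}'`;

/** Reject anything that is not a safe integer before inlining it into SQL. */
const asInt = (label, n) => {
  if (!Number.isSafeInteger(n)) {
    throw new TypeError(`${label} must be an integer, got ${n}`);
  }
  return n;
};

/**
 * @param {string} srcPath path of the source SwingStore db
 * @param {{ vats?: string[], startPos?: number, endPos?: number }} [filters]
 * @returns {string[]} statements to run, in order, on the destination connection
 */
const buildAttachCopy = (srcPath, { vats, startPos, endPos } = {}) => {
  const where = [];
  if (vats && vats.length > 0) {
    where.push(`vatID IN (${vats.map(sqlString).join(', ')})`);
  }
  if (startPos !== undefined) {
    where.push(`position >= ${asInt('startPos', startPos)}`);
  }
  if (endPos !== undefined) {
    where.push(`position <= ${asInt('endPos', endPos)}`);
  }
  const whereClause = where.length > 0 ? ` WHERE ${where.join(' AND ')}` : '';
  return [
    // ATTACH and DETACH must run outside of an explicit transaction.
    `ATTACH DATABASE ${sqlString(srcPath)} AS src`,
    `INSERT OR IGNORE INTO main.transcriptItems (vatID, position, item, incarnation) ` +
      `SELECT vatID, position, item, incarnation FROM src.transcriptItems${whereClause}`,
    `DETACH DATABASE src`,
  ];
};
```

Each returned statement could be passed to `destDb.exec(...)` in order. Note that a plain ATTACH does not by itself open the source read-only, so the requirement of not needing write access to the source DB would still have to be handled separately.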
Description
ref: #8494
The db-clone.js tool allows the user to clone only a desired subset of content from a SwingStore db into a separate db file, based on filtering criteria. It is intended to be used as the primary data extraction and packaging tool for the transcript replay tool[s].
Command-line options:
--backup
creates a consistent full backup of a source db, even
when the db is opened and in active use by SwingSet.
Cannot be combined with any other option.
Remaining options can be combined together and with selection filters:
--transcripts[=all]
copy transcripts, all or according to other filters.
--snapshots[=all|auto|<id1>[,<id2>[, ...]]]
copy snapshots, all or according to other filters.
"auto" selects only snapshots mentioned within the transcript
items being copied.
--bundles[=all|auto|<id1>[,<id2>[, ...]]]
copy bundles, all or according to other filters.
"auto" selects only bundles mentioned within the transcript
items being copied.
Selection filters:
--vats=[all|<id1>[,<id2>[, ...]]]
Select only vats listed.
"all" has the same effect as not using --vats option at all.
--startPos=<pos>
Select only transcript items with position >= startPos.
--endPos=<pos>
Select only transcript items with position <= endPos.
Other options:
--stats
print out a short before/after summary.
--debug
print debugging information.
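The option syntax above (a bare `--name`, or `--name=value` with an optional comma-separated list) can be sketched as a small parser. This is an illustration only: `parseArgs` is a hypothetical helper, not the parser db-clone.js actually uses, and it does not interpret what `all` or `auto` mean.

```javascript
// Sketch only: splits argv into positional arguments and "--name[=v1,v2,...]" options.
// parseArgs is a hypothetical helper, not the parser used by db-clone.js.
const parseArgs = argv => {
  const positional = [];
  const options = {};
  for (const arg of argv) {
    const match = /^--([^=]+)(?:=(.*))?$/.exec(arg);
    if (!match) {
      // Not an option: treat it as a positional argument (e.g. a db file name).
      positional.push(arg);
      continue;
    }
    const [, name, value = ''] = match;
    // A bare flag yields an empty list; otherwise split the value on commas.
    options[name] = value
      .split(',')
      .map(v => v.trim())
      .filter(v => v !== '');
  }
  return { positional, options };
};

const parsed = parseArgs([
  'SwingStore.sqlite',
  'clone.sqlite',
  '--snapshots=all',
  '--bundles=all',
  '--stats',
]);
// parsed.positional is ['SwingStore.sqlite', 'clone.sqlite']
// parsed.options is { snapshots: ['all'], bundles: ['all'], stats: [] }
```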
Examples:
Clone whole database:
node db-clone.js SwingStore.sqlite clone.sqlite --backup
Clone transcripts of vat v123 starting from position 1000 and through 1200, include only snapshots and bundles found within those transcripts:
node db-clone.js SwingStore.sqlite clone.sqlite \
--vat=v123 --startPos=1000 --endPos=1200 \
--transcripts=yes --snapshots=auto --bundles=auto
Clone all bundles and all snapshots, show stats:
node db-clone.js SwingStore.sqlite clone.sqlite \
--snapshots=all --bundles=all --stats
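The selection filters used in the examples are inclusive on both ends (position >= startPos and position <= endPos). As a rough illustration of those semantics, not of how the tool implements them, a hypothetical predicate `selectsItem` might look like this:

```javascript
// Sketch only: selectsItem is a hypothetical predicate, not code from db-clone.js.
// It mirrors the documented filter semantics: an optional vat list plus an
// inclusive [startPos, endPos] position range.
const selectsItem = (
  { vatID, position },
  { vats = 'all', startPos, endPos } = {},
) => {
  if (vats !== 'all' && !vats.includes(vatID)) return false;
  if (startPos !== undefined && position < startPos) return false;
  if (endPos !== undefined && position > endPos) return false;
  return true;
};

const filters = { vats: ['v123'], startPos: 1000, endPos: 1200 };
selectsItem({ vatID: 'v123', position: 1000 }, filters); // true (lower bound included)
selectsItem({ vatID: 'v123', position: 1200 }, filters); // true (upper bound included)
selectsItem({ vatID: 'v123', position: 1201 }, filters); // false
selectsItem({ vatID: 'v5', position: 1100 }, filters); // false (vat not listed)
```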
Security Considerations
The tool opens a source database in read-only mode and requires access only to what a local node admin would have access to.
Scaling Considerations
The tool holds a read transaction open for the duration of the operation, which can affect performance if it is used to clone a live production database.
Documentation Considerations
The tool is intended to be used by developers. See description above.
Testing Considerations
TBD; Manual testing so far.
Upgrade Considerations
None.