Currently, updating the database is the same as creating a new one: all records are deleted and re-ingested. This works well for ensuring that deleted objects are handled properly. However, once the database reaches a certain size, it becomes an expensive operation.
Instead, we may want to figure out how to handle a delta-style ingest, that is, only processing the JSON documents that have been updated since the last ingest. This may be tricky and may require several iterations.
I do think that for production-level databases and for testing one may want to build the entire database from scratch, so this is aimed more at development and at users' local copies, where they may not want or need a production-ready instance. It would also apply to cases where the production-level database is so large that we only want to apply deltas.
I can think of several approaches we may want to explore:
Perform no deletions and only insert the JSON files that have been produced. This is already supported, but saving the database currently writes out all JSON output by default. It's also not clear whether we'd hit foreign key violations, particularly if reference tables have been updated.
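A minimal sketch of what an insert-only (upsert) pass could look like, assuming a SQLite backend purely for illustration; the `records` table and `id`/`payload` columns are placeholders, not the tool's real schema:

```python
import json
import sqlite3
from pathlib import Path

def ingest_json_files(db_path: str, json_dir: str) -> None:
    """Upsert JSON documents into the database without deleting anything."""
    conn = sqlite3.connect(db_path)
    # Enforce foreign keys so any violations surface immediately instead of
    # leaving dangling references to stale reference-table rows.
    conn.execute("PRAGMA foreign_keys = ON")
    with conn:
        for path in Path(json_dir).glob("*.json"):
            doc = json.loads(path.read_text())
            # Insert new records; overwrite ones that already exist.
            conn.execute(
                "INSERT INTO records (id, payload) VALUES (?, ?) "
                "ON CONFLICT(id) DO UPDATE SET payload = excluded.payload",
                (doc["id"], json.dumps(doc)),
            )
    conn.close()
```

This side-steps the "delete everything first" step, but it does nothing about records whose source JSON was removed, which is exactly the case the full rebuild handles well.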
Use `git diff` to determine which JSON documents have changed, and do not delete or modify anything else. This requires git to be installed and the data to be under version control, both of which are likely true in development situations.
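A rough sketch of how this could work, assuming the previous ingest recorded the commit it was built from (e.g. in a metadata table); the function and parameter names here are hypothetical:

```python
import subprocess

def changed_json_files(repo_dir: str, last_ingested_commit: str) -> list[str]:
    """Return paths of JSON documents changed since the last ingested commit."""
    result = subprocess.run(
        ["git", "diff", "--name-only", last_ingested_commit, "HEAD", "--", "*.json"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    # Note: deleted files also appear here and would need separate handling
    # (e.g. filtering with --diff-filter) if we ever delete their records.
    return [line for line in result.stdout.splitlines() if line.strip()]
```

The returned list could then be fed straight into the existing ingest path so only those documents get re-processed.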
Figure out a way to export only the records that have changed when saving the database. I do not know whether there is a way to capture all changes to the database since a connection was made; it is probably backend-dependent, and it starts to overlap with database migration tools, of which several exist, but our DB/tool architecture wasn't built with them in mind.
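One possible way to capture changes without a migration tool is an audit/changelog table populated by triggers, which could later be exported as the delta. A sketch, again assuming SQLite and a placeholder `records` table:

```python
import sqlite3

def install_changelog(conn: sqlite3.Connection) -> None:
    """Record every insert/update/delete on `records` into a changelog table."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS changelog (
            change_id  INTEGER PRIMARY KEY AUTOINCREMENT,
            table_name TEXT NOT NULL,
            row_id     INTEGER NOT NULL,
            operation  TEXT NOT NULL,  -- 'INSERT', 'UPDATE', or 'DELETE'
            changed_at TEXT DEFAULT CURRENT_TIMESTAMP
        );

        CREATE TRIGGER IF NOT EXISTS records_insert AFTER INSERT ON records
        BEGIN
            INSERT INTO changelog (table_name, row_id, operation)
            VALUES ('records', NEW.id, 'INSERT');
        END;

        CREATE TRIGGER IF NOT EXISTS records_update AFTER UPDATE ON records
        BEGIN
            INSERT INTO changelog (table_name, row_id, operation)
            VALUES ('records', NEW.id, 'UPDATE');
        END;

        CREATE TRIGGER IF NOT EXISTS records_delete AFTER DELETE ON records
        BEGIN
            INSERT INTO changelog (table_name, row_id, operation)
            VALUES ('records', OLD.id, 'DELETE');
        END;
        """
    )
```

Exporting would then mean dumping only the rows referenced in `changelog` since the last save, but this requires a trigger set per table and ties us to whatever change-capture features the chosen backend offers.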