User Story
As noted in this ticket (#57), the loading performance of the owl-load task is rather poor when the database is large. Fuseki offers the ability to load the data at startup, and experiments have shown that this is several orders of magnitude faster. What we need is a new Gradle plugin that can load the database from a set of OWL files.
Detailed Description
The intent is for this step to replace owl-load in a workflow. The omlToOwl step already puts all of the OWL files in the build/owl folder. Given a list of OWL files, the Jena tdbloader can create a TDB database in the filesystem before Fuseki starts.
So this plugin adapter needs to take the build/owl folder as one argument and perhaps the .fuseki folder as the other. It then needs to enumerate all of the OWL files in the given folder and build the TDB database from them.
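The enumeration step described above could be sketched as follows. This is only an illustration in Python, not the plugin itself: the function names are hypothetical, the `.owl` extension is an assumption about what omlToOwl writes, and the command uses only the `--loc` option of the Jena tdbloader to name the database directory. The sketch builds the argument list and does not run the loader.

```python
from pathlib import Path


def find_owl_files(owl_dir):
    """Return every *.owl file under owl_dir, sorted for a reproducible load order.

    Assumption: the OWL files carry the .owl extension.
    """
    return sorted(p for p in Path(owl_dir).rglob("*.owl") if p.is_file())


def build_loader_command(owl_dir, tdb_dir, loader="tdbloader"):
    """Build one loader invocation that covers all OWL files (hypothetical helper).

    owl_dir: folder holding the generated OWL files (e.g. build/owl)
    tdb_dir: folder where the TDB database should be created
    """
    files = find_owl_files(owl_dir)
    if not files:
        raise FileNotFoundError(f"no .owl files found under {owl_dir}")
    # A single command line, so the loader is only called once.
    return [loader, f"--loc={tdb_dir}", *(str(f) for f in files)]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the loader is on the PATH.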
Our configuration uses a union graph, and Maged says we also need to load data into the union graph. If the loader can't get that detail from the Fuseki config file (fuseki.ttl), then it should simply be an option for the plugin. I don't want to have to call the loader twice; one invocation should do all of the loading.
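One possible shape for that fallback is sketched below, under a loud assumption: that the union graph is declared in fuseki.ttl with Jena's `tdb:unionDefaultGraph` (or `tdb2:unionDefaultGraph`) assembler property written as a bare `true`/`false` literal. The function name and the `override` parameter are hypothetical, and the check is a plain text search, not a real Turtle parse.

```python
import re

# Assumption: the setting appears as "tdb:unionDefaultGraph true" (or the tdb2: form).
# This is a text heuristic; it does not parse Turtle and ignores prefix remapping.
_UNION_FLAG = re.compile(r"\btdb2?:unionDefaultGraph\s+(true|false)\b")


def union_graph_enabled(config_text, override=None):
    """Decide whether the union graph is in use.

    override: explicit plugin option (True/False); when given, it wins.
    config_text: contents of the Fuseki config file, used only as a fallback.
    """
    if override is not None:
        return override
    match = _UNION_FLAG.search(config_text)
    return match is not None and match.group(1) == "true"
```

Letting the explicit option take precedence keeps the behaviour predictable when the config file cannot be interpreted.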
(I expect this would make owl-load obsolete: I can't think of any use case where we would want the slow method if the fast method works, nor one where we would build the database and then load more OWL files.)
Acceptance Criteria
All queries produce the same results in a workflow where the only change is replacing owl-load with the new owl-build-database step.
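A check along these lines could be sketched as below. It assumes each query result has already been fetched as a list of rows of hashable values (for example tuples of strings); the function name is hypothetical. Rows are compared as a multiset because SPARQL gives no ordering guarantee without ORDER BY. Blank node labels may differ between two separately built databases, so results containing them would likely need extra handling.

```python
from collections import Counter


def same_results(rows_a, rows_b):
    """Compare two query results while ignoring row order but keeping duplicates.

    rows_a, rows_b: iterables of rows, each row a sequence of hashable values.
    """
    return Counter(map(tuple, rows_a)) == Counter(map(tuple, rows_b))
```

For example, `same_results(old_rows, new_rows)` would be evaluated once per query, with `old_rows` coming from the owl-load workflow and `new_rows` from the owl-build-database workflow.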
Sub-task List
Task 1
Task 2
Task 3
Note that creating the TDB database in the build folder adds several gigabytes of binary data to that folder. We don't need or want to save this data to the normalized or auxiliary branch, so the simplest thing to do is delete it at the end of the build.
(Maybe also list it in .gitignore.)
We also need to make sure that the container volumes are big enough to hold this data.
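The two housekeeping points above (deleting the database after the build and checking volume size) could look roughly like this. It is a sketch with hypothetical function names; the required size is something the caller would have to estimate, since no figure beyond "several gigabytes" is given here.

```python
import shutil
from pathlib import Path


def has_room_for(volume_path, required_bytes):
    """Return True if the volume holding volume_path has enough free space."""
    return shutil.disk_usage(volume_path).free >= required_bytes


def remove_tdb_dir(tdb_dir):
    """Delete the generated database folder; return True if something was removed."""
    path = Path(tdb_dir)
    if path.is_dir():
        shutil.rmtree(path)
        return True
    return False
```

Calling `has_room_for` before the load and `remove_tdb_dir` as the last build step would keep the binary data out of anything that gets committed.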