Consistent graph replication - RDF Dataset Canonicalization #51
Comments
I think this can be applied generically to TREE (tree client)?
Hmm, I was thinking more of including a hash on each member (version object) that represents the state of the full graph after applying the change. Of course, the hashes would only be valid in the tail of the log, because retention deletes objects that have a newer state further along in the log.
I actually use that over here to transform data dumps into an LDES feed: https://github.com/pietercolpaert/DCAT-AP-Dumps-To-Feeds/blob/main/index.ts#L59 However, I'm not sure what the influence on the LDES spec itself would be. Do you expect this hash to be present in the member? Do you want a path to point to that property?
Yes, I would see it as metadata of an event, similar to its timestamp. The hash would indicate the state of the graph after applying the member (or members, in the case of a transaction). This way we can assure graph integrity over time: the client can validate that it holds an exact replica of the graph the publisher intended.
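To make the idea concrete, here is a minimal Python sketch of how a publisher might attach such a state hash to each member. Everything in it is an assumption for illustration: the member shape (`entity`, `quads`, `deleted`), the `stateHash` key, and the helper names are hypothetical and not part of LDES or TREE. The hash is a SHA-256 over the sorted N-Quads lines, which is only a stand-in for RDF Dataset Canonicalization; it gives a stable result only when the data contains no blank nodes and the lines are already in canonical N-Quads form.

```python
import hashlib


def state_hash(state):
    """Hash the full graph state.

    `state` maps an entity IRI to the set of N-Quads lines of its latest
    version. ASSUMPTION: no blank nodes, so sorting the lines by code point
    is enough; real data would need a proper RDFC-1.0 implementation.
    """
    lines = {line for quads in state.values() for line in quads}
    canonical = "".join(sorted(line + "\n" for line in lines))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def apply_member(state, member):
    """Return the state after applying one member (hypothetical shape)."""
    new_state = dict(state)
    if member.get("deleted"):
        # Tombstone: the entity disappears from the graph.
        new_state.pop(member["entity"], None)
    else:
        # A new version object replaces the previous version of the entity.
        new_state[member["entity"]] = frozenset(member["quads"])
    return new_state


def annotate(members):
    """Attach a `stateHash` (hypothetical key) to every member in log order."""
    state = {}
    annotated = []
    for member in members:
        state = apply_member(state, member)
        annotated.append({**member, "stateHash": state_hash(state)})
    return annotated
```

A client that applies the same members in the same order can recompute `state_hash` over its own copy and compare it with the value carried by the member.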
When a client requires hard guarantees on consistency, the logic described in RDF Dataset Canonicalization could be used to provide hashes of the state that should be reached after applying a fragment or, even better, a transaction.
This becomes relevant in cases where LDES is used as a replication protocol for named graphs (the client should have an exact copy of the named graph the publisher intended). For instance, consistency could be lost if a client is offline longer than the retention period allows, which could result in missed delete operations (tombstone events). If a checksum mismatch is detected, the client must restart replication from the start of the log to arrive at a consistent state.
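The client-side check could look roughly like the following Python sketch. It assumes a hypothetical event shape with `add` and `remove` sets of N-Quads lines and an optional `hash` field; none of these names come from the LDES spec. The digest is a simplified stand-in for RDF Dataset Canonicalization (SHA-256 over sorted lines), which is only adequate for data without blank nodes.

```python
import hashlib


def digest(quads):
    """Simplified state hash. ASSUMPTION: no blank nodes, so sorting suffices."""
    canonical = "".join(sorted(q + "\n" for q in quads))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replicate(log, start=0, graph=None):
    """Apply log[start:] to a copy of `graph`; return (graph, consistent)."""
    graph = set(graph or ())
    for event in log[start:]:
        graph -= event["remove"]
        graph |= event["add"]
        expected = event.get("hash")
        if expected is not None and digest(graph) != expected:
            # Local state diverged from what the publisher intended.
            return graph, False
    return graph, True


def sync(log, start, graph):
    """Resume from `start`; on a checksum mismatch, replay the log from the start."""
    graph, ok = replicate(log, start, graph)
    if not ok:
        graph, ok = replicate(log, 0, None)
    return graph, ok
```

Checking after every event rather than only at the end lets the client stop early, but it costs one hash computation per event; checking once per fragment or transaction, as suggested above, would be cheaper.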
Reference: https://www.w3.org/TR/rdf-canon/