An experiment to use Blazegraph as a data backend for an ActivityPub server.
ActivityPub is a decentralized social networking protocol based on ActivityStreams. ActivityStreams specifies the representation of activities and actors in social networks. Activities and actors are represented as Linked Data, serialized as JSON-LD.
Using Linked Data as the foundational data model in an ActivityPub server removes any limitations on the type of data the server can handle and allows interesting queries on data in the social network (as well as data that is linked to from the network).
However, not all data in an ActivityPub server is public. Activities might be addressed to individual actors or groups and should not be publicly accessible.
The goal of this experiment is to annotate individual RDF triples with access control information, enforcing the privacy of non-public data while still offering full Linked Data query abilities to actors (that is, the ability to run SPARQL queries on the data the actor is authorized to access).
Blazegraph is a graph database/RDF triplestore. It has many interesting features, among them Reification Done Right (RDR), a fancy word for adding metadata to triples. We add access control information with RDR.
Blazegraph supports annotating triples with metadata by reifying them. That is, a triple can itself become the subject or object of another triple.
Blazegraph can be obtained from GitHub.
To start the database:
java -jar blazegraph.jar
This document is written in Emacs Org-mode and uses restclient (and ob-restclient) for inline requests to the REST API.
Blazegraph supports multiple namespaces that correspond to dedicated repositories. The default namespace is not in RDR mode.
We create an RDR-enabled namespace named rdr:
POST http://localhost:9999/blazegraph/namespace
Content-Type: text/plain
com.bigdata.rdf.sail.namespace=rdr
com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers = true
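The same request could also be issued from a script. The following is a minimal Python sketch using only the standard library. The helper name `create_namespace_request` is hypothetical; the endpoint and the two properties are taken from the request above. The function only builds the request and does not send it.

```python
import urllib.request

# Endpoint used throughout this document.
BLAZEGRAPH = "http://localhost:9999/blazegraph"


def create_namespace_request(name, base_url=BLAZEGRAPH):
    """Build (but do not send) the POST that creates an RDR-enabled namespace.

    Hypothetical helper that mirrors the restclient request above.
    """
    properties = "\n".join([
        f"com.bigdata.rdf.sail.namespace={name}",
        "com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=true",
    ])
    return urllib.request.Request(
        url=f"{base_url}/namespace",
        data=properties.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
```

Passing the result of `create_namespace_request("rdr")` to `urllib.request.urlopen` would send it, which requires a running Blazegraph instance.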
While playing around it might be useful to delete the entire namespace (and re-create it):
DELETE http://localhost:9999/blazegraph/namespace/rdr
The Web Access Control (WAC) specification is a way of describing an access control system.
We create two authorizations (full access and read-only access) for each of our users, Alice and Bob:
POST http://localhost:9999/blazegraph/namespace/rdr/sparql
Content-Type: application/x-turtle
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix ex: <http://example.com/> .
<http://example.com/alice/authorization/full>
a acl:Authorization;
acl:agent ex:alice;
acl:mode acl:Read,
acl:Write,
acl:Control.
<http://example.com/alice/authorization/read>
a acl:Authorization;
acl:agent ex:alice;
acl:mode acl:Read.
<http://example.com/bob/authorization/full>
a acl:Authorization;
acl:agent ex:bob;
acl:mode acl:Read,
acl:Write,
acl:Control.
<http://example.com/bob/authorization/read>
a acl:Authorization;
acl:agent ex:bob;
acl:mode acl:Read.
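The per-user authorizations above follow a fixed pattern, so for more users they could be generated instead of written by hand. A minimal Python sketch follows; the function name `authorizations_turtle` and the `base` parameter are assumptions for illustration, and the IRI layout simply copies the one used for Alice and Bob.

```python
ACL_PREFIX = "@prefix acl: <http://www.w3.org/ns/auth/acl#> ."


def authorizations_turtle(user, base="http://example.com/"):
    """Return Turtle declaring a full and a read-only authorization for `user`.

    Hypothetical helper; it reproduces the pattern used for Alice and Bob.
    """
    agent = f"<{base}{user}>"
    full = f"<{base}{user}/authorization/full>"
    read = f"<{base}{user}/authorization/read>"
    return "\n".join([
        ACL_PREFIX,
        f"{full} a acl:Authorization ;",
        f"    acl:agent {agent} ;",
        "    acl:mode acl:Read, acl:Write, acl:Control .",
        f"{read} a acl:Authorization ;",
        f"    acl:agent {agent} ;",
        "    acl:mode acl:Read .",
    ])
```

The returned string can be used as the body of the POST request shown above (Content-Type: application/x-turtle).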
We can also define a mode of access for public data (by setting acl:agentClass to foaf:Agent). We could reuse the special ActivityPub public collection as the Authorization.
POST http://localhost:9999/blazegraph/namespace/rdr/sparql
Content-Type: application/x-turtle
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix as: <https://www.w3.org/ns/activitystreams#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
as:Public
a acl:Authorization;
acl:agentClass foaf:Agent;
acl:mode acl:Read.
If Alice wants to create a note addressed to the public, the ActivityPub Create activity might look like this:
{
"@context": "https://www.w3.org/ns/activitystreams",
"type": "Create",
"id": "http://example.com/activity/1",
"actor": "http://example.com/alice",
"object": {
"type": "Note",
"id": "http://example.com/note/1",
"content": "This is a note"
},
"to": ["https://www.w3.org/ns/activitystreams#Public"]
}
In turtle:
@prefix as: <https://www.w3.org/ns/activitystreams#> .
@prefix ex: <http://example.com/> .
<http://example.com/activity/1> a as:Create .
<http://example.com/activity/1> as:actor ex:alice .
<http://example.com/activity/1> as:object <http://example.com/note/1> .
<http://example.com/activity/1> as:to as:Public .
<http://example.com/note/1> a as:Note .
<http://example.com/note/1> as:content "This is a note" .
We annotate all these triples indicating that they are readable by the public and Alice has full authorization.
POST http://localhost:9999/blazegraph/namespace/rdr/sparql
Content-Type: application/x-turtle-RDR
@prefix as: <https://www.w3.org/ns/activitystreams#> .
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex: <http://example.com/> .
as:Public acl:accessTo <<<http://example.com/activity/1> rdf:type as:Create>> .
as:Public acl:accessTo <<<http://example.com/activity/1> as:actor ex:alice>> .
as:Public acl:accessTo <<<http://example.com/activity/1> as:object <http://example.com/note/1>>> .
as:Public acl:accessTo <<<http://example.com/activity/1> as:to as:Public>> .
as:Public acl:accessTo <<<http://example.com/note/1> rdf:type as:Note>> .
as:Public acl:accessTo <<<http://example.com/note/1> as:content "This is a note">> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/activity/1> rdf:type as:Create>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/activity/1> as:actor ex:alice>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/activity/1> as:object <http://example.com/note/1>>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/activity/1> as:to as:Public>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/note/1> rdf:type as:Note>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/note/1> as:content "This is a note">> .
Notes:
- Using a does not seem to work in the << >> parts of RDR. Using rdf:type explicitly works.
- It should be possible to do this more elegantly with a SPARQL Update query.
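The annotation statements above are highly repetitive: one line per triple and per authorization. Independently of the SPARQL Update idea, they could be generated by a small script. The following Python sketch is one way to do that. The function name `annotate` and the convention of passing terms that are already serialized as Turtle are assumptions made for illustration. The output contains only the statements; the @prefix declarations still have to be prepended before posting.

```python
# Triples of the public note, written as already-serialized Turtle terms.
PUBLIC_NOTE_TRIPLES = [
    ("<http://example.com/activity/1>", "rdf:type", "as:Create"),
    ("<http://example.com/activity/1>", "as:actor", "ex:alice"),
    ("<http://example.com/activity/1>", "as:object", "<http://example.com/note/1>"),
    ("<http://example.com/activity/1>", "as:to", "as:Public"),
    ("<http://example.com/note/1>", "rdf:type", "as:Note"),
    ("<http://example.com/note/1>", "as:content", '"This is a note"'),
]


def annotate(triples, authorizations):
    """Return one Turtle-RDR statement per (authorization, triple) pair.

    Hypothetical helper: each statement has the form
    AUTH acl:accessTo <<S P O>> .
    """
    lines = []
    for auth in authorizations:
        for s, p, o in triples:
            lines.append(f"{auth} acl:accessTo <<{s} {p} {o}>> .")
    return "\n".join(lines)
```

Calling `annotate(PUBLIC_NOTE_TRIPLES, ["as:Public", "<http://example.com/alice/authorization/full>"])` yields the same twelve statements as the request above, each on a single line.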
If Alice wants to send Bob a private note (not addressed to the public), the Create activity might look like this:
{
"@context": "https://www.w3.org/ns/activitystreams",
"type": "Create",
"id": "http://example.com/activity/2",
"actor": "http://example.com/alice",
"object": {
"type": "Note",
"id": "http://example.com/note/2",
"content": "This is a note"
},
"to": ["http://example.com/bob"]
}
We write this down in turtle and annotate with access control information saying that Bob has read access to the note (and activity) whereas Alice has full access:
POST http://localhost:9999/blazegraph/namespace/rdr/sparql
Content-Type: application/x-turtle-RDR
@prefix as: <https://www.w3.org/ns/activitystreams#> .
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex: <http://example.com/> .
<http://example.com/bob/authorization/read>
acl:accessTo <<<http://example.com/activity/2> rdf:type as:Create>> .
<http://example.com/bob/authorization/read>
acl:accessTo <<<http://example.com/activity/2> as:actor ex:alice>> .
<http://example.com/bob/authorization/read>
acl:accessTo <<<http://example.com/activity/2> as:object <http://example.com/note/2>>> .
<http://example.com/bob/authorization/read>
acl:accessTo <<<http://example.com/activity/2> as:to ex:bob>> .
<http://example.com/bob/authorization/read>
acl:accessTo <<<http://example.com/note/2> rdf:type as:Note>> .
<http://example.com/bob/authorization/read>
acl:accessTo <<<http://example.com/note/2> as:content "This is a note">> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/activity/2> rdf:type as:Create>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/activity/2> as:actor ex:alice>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/activity/2> as:object <http://example.com/note/2>>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/activity/2> as:to ex:bob>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/note/2> rdf:type as:Note>> .
<http://example.com/alice/authorization/full>
acl:accessTo <<<http://example.com/note/2> as:content "This is a note">> .
Now that we have some data in the database, let’s query.
The easiest way is to point your browser to http://localhost:9999/blazegraph/#query and copy-paste the queries.
PREFIX acl: <http://www.w3.org/ns/auth/acl#>
PREFIX as: <https://www.w3.org/ns/activitystreams#>
SELECT ?s ?p ?o WHERE {
  BIND( <<?s ?p ?o>> AS ?t ) .
  as:Public acl:accessTo ?t .
}
PREFIX acl: <http://www.w3.org/ns/auth/acl#>
PREFIX as: <https://www.w3.org/ns/activitystreams#>
SELECT ?s ?p ?o WHERE {
  BIND( <<?s ?p ?o>> AS ?t ) .
  <http://example.com/bob/authorization/read> acl:accessTo ?t .
}
TODO This does not include public data. What would a query for all data Bob can read, including public data, look like?
Say Bob wants to run an arbitrary SPARQL query. How can the query be transformed to account for access control?
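For the first question above (all data Bob can read, including public data), one possible direction is to select the authorizations through the WAC data itself instead of naming a single authorization: any authorization with acl:mode acl:Read that either names the agent via acl:agent or applies to everyone via acl:agentClass foaf:Agent. The Python sketch below builds such a query and wraps it in a standard SPARQL Protocol POST request. The function names are hypothetical, and the query has not been verified against Blazegraph; it assumes the authorizations are stored exactly as defined earlier in this document.

```python
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "http://localhost:9999/blazegraph/namespace/rdr/sparql"


def readable_triples_query(agent_iri):
    """Build a query for all triples `agent_iri` may read (untested sketch)."""
    return f"""PREFIX acl: <http://www.w3.org/ns/auth/acl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?s ?p ?o WHERE {{
  BIND( <<?s ?p ?o>> AS ?t ) .
  ?auth acl:accessTo ?t .
  ?auth acl:mode acl:Read .
  {{ ?auth acl:agent <{agent_iri}> }}
  UNION
  {{ ?auth acl:agentClass foaf:Agent }}
}}"""


def sparql_query_request(query, endpoint=SPARQL_ENDPOINT):
    """Build (but do not send) a SPARQL Protocol query via form-encoded POST."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )
```

SELECT DISTINCT is used because a triple can be reachable through more than one authorization (for example, Alice's full authorization and the public one). The second question, rewriting arbitrary queries, is not addressed by this sketch.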
- Evaluation of Metadata Representations in RDF stores
- An experiment to use Annotated RDF for Access Control
- DATAtourisme: An open data platform for touristic information that uses Blazegraph.
- CommonsPub: A project to build a generic ActivityPub server
This experiment is conducted as part of the openEngiadina project.
For questions, feedback and comments, contact pukkamustard (pukkamustard [at] posteo [dot] net).