We added an extension to YARRRML to generate RML that generates a Linked Data Event Stream (LDES). LDES specifies how to model and publish changes in documents as a stream of events. Each event, called member in LDES speak, is a version of an original document.
We provide an ldes key in subjects mappings to generate the necessary LDES members and metadata.
We will explain the different options for generating LDES by showing some examples. The YARRRML mappings and output are abbreviated (prefixes and sources are omitted) to focus on the relevant parts, but the complete examples can be found here.
All examples use the same input data, temperature readings from two sensors:
SensorID,Timestamp,Temperature
1,2023-01-01T08:00:00,8
2,2023-01-01T08:00:00,9
1,2023-01-01T09:00:00,9
2,2023-01-01T09:00:00,9
A basic LDES can be generated by providing just the ldes key without values.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
By default, the generated subject IRI is checked for uniqueness to determine if a new member needs to be generated.
In this case the subject IRI is based on the SensorID, so there are only two members: one with id 1 and one with id 2.
Before diving into the details of every property, we show an example that uses all properties that define how an LDES gets generated, except memberIdFunction.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        targets:
          - [ out.ttl~void, turtle ]
        ldes:
          id: ex:myldes
          # basically generate a member for each record
          watchedProperties: [$(SensorID), $(Timestamp), $(Temperature)]
          shape: ex:shape.shacl
          timestampPath: [ex:ts, $(Timestamp), xsd:dateTime]
          versionOfPath: [ex:hasVersion, ex:$(SensorID)]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
Output:
<1#0>
ex:hasVersion <1> ;
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00" ;
a ex:Thermometer .
<1#1>
ex:hasVersion <1> ;
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00" ;
a ex:Thermometer .
<2#0>
ex:hasVersion <2> ;
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00" ;
a ex:Thermometer .
<2#1>
ex:hasVersion <2> ;
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00" ;
a ex:Thermometer .
ex:myldes
a ldes:EventStream ;
ldes:timestampPath ex:ts ;
ldes:versionOfPath ex:hasVersion ;
tree:member <1#0>, <1#1>, <2#0>, <2#1> ;
tree:shape <shape.shacl> .
The id key sets the IRI of the LDES EventStream metadata to ex:myldes, and the shape key sets a custom shape IRI.
We also define a timestampPath and a versionOfPath and specify what member triples they generate.
The watchedProperties key is used to define which data records end up as members in the LDES.
The watched properties, given as an array, are compared between members that would have the same subject IRI generated by the subject value template:
- If at least one of these properties changes, the generated subject IRI will be made unique and the member is added.
- If the watched properties remain the same, or if none are given:
  - If the subject IRI template generates a unique IRI: add the new member.
  - If the subject IRI template doesn't generate a unique IRI: discard the new member, because in this case it is considered a duplicate of a previous one.
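The rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not the code the parser or mapper actually runs: the function name generate_members and its parameters are hypothetical, the "#n" suffix is copied from the example outputs in this document, and comparing against the most recent member of the same subject is an assumption.

```python
import csv
import io

# The input data used by all examples in this document.
READINGS = """
SensorID,Timestamp,Temperature
1,2023-01-01T08:00:00,8
2,2023-01-01T08:00:00,9
1,2023-01-01T09:00:00,9
2,2023-01-01T09:00:00,9
""".strip()


def generate_members(records, subject_key, watched=()):
    """Return member identifiers such as '1#0' in order of generation.

    Hypothetical model of the watchedProperties rules; it only tracks
    identifiers, not the triples of each member.
    """
    last_seen = {}  # subject -> watched values of its most recent member
    versions = {}   # subject -> version number of its most recent member
    members = []
    for record in records:
        subject = record[subject_key]
        state = tuple(record[name] for name in watched)
        if subject not in versions:
            # The subject template yields an IRI not seen before: add it.
            versions[subject] = 0
        elif watched and state != last_seen[subject]:
            # A watched property changed: make the IRI unique and add it.
            versions[subject] += 1
        else:
            # Same subject, nothing watched changed: treat as duplicate.
            continue
        last_seen[subject] = state
        members.append(f"{subject}#{versions[subject]}")
    return members


records = list(csv.DictReader(io.StringIO(READINGS)))
print(generate_members(records, "SensorID"))                   # ['1#0', '2#0']
print(generate_members(records, "SensorID", ["Temperature"]))  # ['1#0', '2#0', '1#1']
print(generate_members(records, "SensorID", ["Timestamp"]))    # ['1#0', '2#0', '1#1', '2#1']
```

The three calls correspond to no watched properties, watching the temperature, and watching the timestamp. The members are listed in the order the records arrive, which differs from the sorted order in the Turtle output.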
Here are some examples:
In this case we're only interested in generating a new member if the temperature changes for a sensor:
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          watchedProperties: [$(Temperature)]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<1#1>
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
tree:member <1#0>, <1#1>, <2#0> ;
tree:shape <shape.shacl> .
There are two members for sensor 1 (two readings with different temperature values) and only one for sensor 2 (same values for temperature in each reading).
The previous example showed how to create new members if temperature changes. This example creates new members if the timestamp changes.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          watchedProperties: [$(Timestamp)]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<1#1>
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#1>
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
tree:member <1#0>, <1#1>, <2#0>, <2#1> ;
tree:shape <shape.shacl> .
This time every reading produces a member because for every sensor each reading has a different timestamp.
In the first example, which has no configuration, no watchedProperties are given, so member generation depends on the subject template, given by the value key in subjects.
Since the template uses the SensorID, it only generates members when a reading from a new sensor arrives.
versionOfPath specifies the predicate and object of the LDES's ldes:versionOfPath.
- If not present, no ldes:versionOfPath is generated.
- If a predicate and IRI template are given, then ldes:versionOfPath is defined by the predicate and the value that is defined by that template. E.g.:
  versionOfPath: [dcterms:isVersionOf, ex:$(SensorID)]
- If only a predicate is given, then the versionOfPath is defined by that predicate and the value is defined by:
  - the corresponding object mapping for the predicate, if any, or
  - the subject template.
  E.g.:
  versionOfPath: [dcterms:isVersionOf]
  In this case the value template is the subject template: ex:$(SensorID)
- If an empty array is given, then the predicate defaults to dcterms:isVersionOf and the value template defaults to:
  - the corresponding object mapping for the predicate, if any, or
  - the subject value template.
  E.g.:
  versionOfPath: []
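The fallback order described above can be summarised in a small Python sketch. It is a hypothetical model for illustration only: resolve_version_of_path and its arguments are made-up names, and the po argument is assumed to be a list of [predicate, object, ...] entries as written in the YARRRML examples.

```python
DEFAULT_PREDICATE = "dcterms:isVersionOf"


def resolve_version_of_path(version_of_path, subject_template, po=()):
    """Return (predicate, value_template), or None if nothing is generated.

    version_of_path is None when the key is absent, otherwise a list with
    zero, one or two entries, mirroring the YARRRML array.
    """
    if version_of_path is None:
        # Key not present: no ldes:versionOfPath is generated.
        return None
    if len(version_of_path) >= 2:
        # Predicate and IRI template are both given explicitly.
        return version_of_path[0], version_of_path[1]
    # Empty array falls back to the default predicate.
    predicate = version_of_path[0] if version_of_path else DEFAULT_PREDICATE
    # Prefer an existing object mapping for that predicate, if any.
    for mapped_predicate, obj, *_ in po:
        if mapped_predicate == predicate:
            return predicate, obj
    # Otherwise fall back to the subject template.
    return predicate, subject_template


subject = "ex:$(SensorID)"
print(resolve_version_of_path(None, subject))
print(resolve_version_of_path([], subject))
print(resolve_version_of_path(["ex:hasOriginal"], subject))
print(resolve_version_of_path(["ex:hasOriginal", "ex:original/$(SensorID)"], subject))
```

The four calls cover the four cases in the list: an absent key, an empty array, a predicate only, and a predicate with an IRI template.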
Here are some examples:
This example shows that the default ldes:versionOfPath with the predicate dcterms:isVersionOf is generated.
The corresponding predicate and objects are generated for each member.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          versionOfPath: []
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
dcterms:isVersionOf <1> ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
dcterms:isVersionOf <2> ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:versionOfPath dcterms:isVersionOf ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
The next example shows how a versionOfPath property with a given predicate results in members using that predicate, without having to define it in the predicateobject mappings.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          versionOfPath: [ex:hasOriginal]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:hasOriginal <1> ;
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:hasOriginal <2> ;
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:versionOfPath ex:hasOriginal ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
This example shows a versionOfPath with a custom predicate and an object referring to an IRI other than the one derived from the subject template.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          versionOfPath: [ex:hasOriginal, ex:original/$(SensorID)]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:hasOriginal <original/1> ;
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:hasOriginal <original/2> ;
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:versionOfPath ex:hasOriginal ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
timestampPath specifies the predicate and, optionally, the object used to indicate the LDES's ldes:timestampPath.
- If no timestampPath is present, no ldes:timestampPath will be generated.
- If only a predicate is given, it has to be present in the predicateobject mappings. In that case the object is defined there. E.g.:
  timestampPath: [ex:ts]
  In this case a predicateobject mapping must exist, e.g.:
  po: [[ex:ts, $(Timestamp)]]
- If a predicate and an object are given, an implicit predicateobject mapping with the given object will be added. E.g.:
  timestampPath: [ex:ts, $(Timestamp)]
  This is equivalent to the previous example, but no explicit predicateobject mapping needs to be defined.
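As a rough illustration of these three cases, the following Python sketch models how a timestampPath could be resolved against the predicateobject mappings. The function resolve_timestamp_path is a hypothetical name, and two behaviours are assumptions that the text above does not specify: raising an error when a predicate-only timestampPath has no matching mapping, and treating an empty array like an absent key.

```python
def resolve_timestamp_path(timestamp_path, po):
    """Return (predicate, po) where po may gain an implicit mapping.

    timestamp_path is None when the key is absent, otherwise a list such
    as ['ex:ts'] or ['ex:ts', '$(Timestamp)', 'xsd:dateTime'].
    """
    po = [list(entry) for entry in po]  # copy, so the input is not mutated
    if not timestamp_path:
        # No timestampPath: no ldes:timestampPath is generated.
        # (An empty array is handled the same way here; that is an assumption.)
        return None, po
    predicate = timestamp_path[0]
    if len(timestamp_path) == 1:
        # Predicate only: the object must come from an existing mapping.
        if not any(entry[0] == predicate for entry in po):
            # Assumption: the real tooling may react differently.
            raise ValueError(f"no predicateobject mapping for {predicate}")
        return predicate, po
    # Predicate and object given: add an implicit predicateobject mapping.
    po.append(list(timestamp_path))
    return predicate, po


explicit = [["ex:temp", "$(Temperature)"], ["ex:ts", "$(Timestamp)"]]
implicit = [["ex:temp", "$(Temperature)"]]
print(resolve_timestamp_path(["ex:ts"], explicit))
print(resolve_timestamp_path(["ex:ts", "$(Timestamp)"], implicit))
```

Both calls end up with the same predicate and the same set of predicateobject mappings, which is the equivalence the last bullet describes.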
Here are some examples:
This example defines a timestampPath using an existing predicateobject mapping for ex:ts:
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          timestampPath: [ex:ts]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:timestampPath ex:ts ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
It is possible to define a custom timestampPath, where the predicate and object are not present in the predicateobject mappings.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          timestampPath: [ex:ts, $(Timestamp), xsd:dateTime]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:timestampPath ex:ts ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
The IRI of the generated ldes:EventStream object defaults to http://example.org/eventStream. This is often not what you want. This IRI is easily customized with the id key:
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          id: http://ldes.org/thisisanldeswithacustomid
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<http://ldes.org/thisisanldeswithacustomid>
a ldes:EventStream ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
The shape key lets you refer to a SHACL shape that can be used to validate members, for instance by an LDES Server implementation.
It defaults to ex:shape.shacl, but can be customized.
Note that the shape itself is not generated by the LDES extension.