Skip to content

Commit

Permalink
# Conflicts:
Browse files Browse the repository at this point in the history
#	CHANGELOG.md
#	README.md
#	docker/compose.yaml
#	pom.xml
#	src/main/scala/io/rml/framework/Main.scala
#	src/main/scala/io/rml/framework/core/function/model/Function.scala
#	src/main/scala/io/rml/framework/engine/statement/FunctionMapGeneratorAssembler.scala
#	src/test/scala/io/rml/framework/SandboxTests.scala
  • Loading branch information
ghsnd committed Oct 3, 2022
1 parent 73e83e8 commit 836a329
Show file tree
Hide file tree
Showing 36 changed files with 631 additions and 97 deletions.
16 changes: 16 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,21 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [2.4.1] - 2022-09-03

### Added
* Possibility to run stand-alone, with Flink embedded.

### Changed
* Updated Flink from version 1.14.4 to 1.14.5
* Updated Docker example

### Fixed
* Passing omitting optional parameter `function-descriptions` resulted in program crash.
* Functions used on other levels than Predicate Map or Object Map caused RMLStreamer to crash
* Functions returning `null` no longer cause RMLStreamer logging an error message.
* Multiple values for same function argument (array) were not passed correctly to function

## [2.4.0] - 2022-05-30

### Changed
Expand Down Expand Up @@ -189,3 +204,4 @@ can be set with the program argument `--baseIRI`.
[2.2.2]: https://github.com/RMLio/RMLStreamer/compare/v2.2.1...v2.2.2
[2.3.0]: https://github.com/RMLio/RMLStreamer/compare/v2.2.2...v2.3.0
[2.4.0]: https://github.com/RMLio/RMLStreamer/compare/v2.3.0...v2.4.0
[2.4.1]: https://github.com/RMLio/RMLStreamer/compare/v2.4.0...v2.4.1
4 changes: 4 additions & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
FROM eclipse-temurin:19_36-jre-alpine
RUN mkdir /opt/app
COPY target/RMLStreamer-*.jar /opt/app/RMLStreamer.jar
ENTRYPOINT ["java", "-jar", "/opt/app/RMLStreamer.jar"]
73 changes: 69 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,10 +7,54 @@ using [RML](http://rml.io/). The difference with other RML implementations is th

Documentation regarding the use of (custom) functions can be found [here](documentation/README_Functions.md).

### Quick start
### Quick start (standalone)

* Download `RMLStreamer-<version>-standalone.jar` from the [latest release](https://github.com/RMLio/RMLStreamer/releases/latest).
* Run it as
```
$ java -jar RMLStreamer-<version>-standalone.jar <commands and options>
```

See [Basic commands](#basic-commands) (where you replace `$FLINK_BIN run <path to RMLStreamer jar>` with `java -jar RMLStreamer-<version>-standalone.jar`)
and [Complete RMLStreamer usage](#complete-rmlstreamer-usage) for
examples, possible commands and options.

### Quick start (Docker - the fast way to test)

This runs the stand-alone version of RMLStreamer in a Docker container.
This is a good way to quickly test things or run RMLStreamer on a single machine,
but you don't have the features of a Flink cluster set-up (distributed, failover, checkpointing).
If you need those features, see [docker/README.md](docker/README.md).

#### Example usage:

```
$ docker run -v $PWD:/data --rm rmlio/rmlstreamer toFile -m /data/mapping.ttl -o /data/output
```

#### Build your own image:

This option builds RMLStreamer from source and puts that build into a Docker container ready to run.
The main purpose is to have a one-time job image.

```
$ ./buildDocker.sh
```

If the build succeeds, you can invoke it as follows.
If you go to the directory where your data and mappings are,
you can run something like (change tag to appropriate version):

```
$ docker run -v $PWD:/data --rm rmlstreamer:2.4.1 toFile -m /data/mapping.ttl -o /data/output.ttl
```

### Moderately quick start (Docker - the recommended way)

If you want to get the RMLStreamer up and running within 5 minutes using Docker, check out [docker/README.md](docker/README.md)

### Not so quick start (deploying on a cluster)

If you want to deploy it yourself, read on.

If you want to develop, read [these instructions](documentation/README_DEVELOPMENT.md).
Expand All @@ -19,9 +63,13 @@ If you want to develop, read [these instructions](documentation/README_DEVELOPME
RMLStreamer runs its jobs on Flink clusters.
More information on how to install Flink and getting started can be found [here](https://ci.apache.org/projects/flink/flink-docs-release-1.14/try-flink/local_installation.html).
At least a local cluster must be running in order to start executing RML Mappings with RMLStreamer.
Please note that this version works with Flink 1.14.4 with Scala 2.11 support, which can be downloaded [here](https://archive.apache.org/dist/flink/flink-1.14.4/flink-1.14.4-bin-scala_2.11.tgz).
Please note that this version works with Flink 1.14.5 with Scala 2.11 support, which can be downloaded [here](https://archive.apache.org/dist/flink/flink-1.14.5/flink-1.14.5-bin-scala_2.11.tgz).

### Grabbing RMLStreamer...

### Building RMLStreamer
Download `RMLStreamer-<version>.jar` from the [latest release](https://github.com/RMLio/RMLStreamer/releases/latest).

### ... or building RMLStreamer

In order to build a jar file that can be deployed on a Flink cluster, you need:
- a Java JDK >= 11 and <= 13 (We develop and test on JDK 11)
Expand All @@ -45,6 +93,11 @@ so.

The resulting `RMLStreamer-<version>.jar`, found in the `target` folder, can be deployed on a Flink cluster.

**Note**: To build a *stand-alone* RMLStreamer jar, add `-P 'stand-alone'` to the build command, e.g.:
```
$ mvn clean package -DskipTests -P 'stand-alone'
```

### Executing RML Mappings

*This section assumes the use of a CLI. If you want to use Flink's web interface, check out
Expand Down Expand Up @@ -272,4 +325,16 @@ To adjust the log level for RMLStreamer specifically, add two lines like this:
logger.rmlstreamer.name = io.rml.framework
logger.rmlstreamer.level = DEBUG

```
```

### Benchmark

RMLStreamer is benchmarked with this [repo.](https://github.com/s-minoo/rmlstreamer-benchmark-rust)

### References ([preprint](./paper/RMLStreamer_ISWC.pdf))
[1] S. Min Oo, G. Haesendonck, B. De Meester, A. Dimou. RMLStreamer - an RDF stream
generator from streaming heterogeneous data. The Semantic Web – ISWC 2022. Springer International Publishing, (2022)




11 changes: 11 additions & 0 deletions buildDocker.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

version=$(grep -F '<version>' pom.xml | head -n 1 | cut -d '>' -f 2 | cut -d '<' -f 1)

### 1. Build RMLStreamer stand-alone
echo "Building stand-alone RMLStreamer version $version"
mvn clean package -DskipTests -P 'stand-alone'

### 2. Build the docker container
docker build --tag "rmlstreamer:$version" . && \
echo "Successfully built rmlstreamer:$version !"
11 changes: 5 additions & 6 deletions docker/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,28 +6,27 @@ Get RMLStreamer up and running using Docker. No Java, no Flink, no Maven require
Check out Flink's elaborate [documentation](https://ci.apache.org/projects/flink/flink-docs-release-1.14/docs/deployment/resource-providers/standalone/docker/) on this topic for more options, configurations, ...*

## 1. Prerequisites
- [Docker or Docker Engine](https://www.docker.com/), version 19 or higher
- [docker-compose](https://docs.docker.com/compose/), version 3 or higher
- Have the `docker-compose.yml` file locally (either by cloning this repo or just by copying the file).
- [Docker or Docker Engine](https://www.docker.com/), version 20 or higher (which includes the `docker compose` command)
- Have the `compose.yaml` file locally (either by cloning this repo or just by copying the file).
This is just *a* possible configuration; you can adjust it to your needs.
- An RMLStreamer jar file, either from the [latest release](https://github.com/RMLio/RMLStreamer/releases/latest),
or one that you build yourself (in which case you *do* need Java and Maven).

## 2. Start the Flink cluster

Go to the directory where you put the `docker-compose.yml` file.
Go to the directory where you put the `compose.yaml` file.

Run:

```
$ docker-compose up
$ docker compose up
```

This will start one Flink Job Manager and one Flink Task Manager. If you want to scale up and add more Task Managers,
just run the following command (in another terminal):

```
$ docker-compose up --scale taskmanager=2
$ docker compose up --scale taskmanager=2
```

Replace the `2` with the number of Task Managers you actually want to have.
Expand Down
7 changes: 3 additions & 4 deletions docker/docker-compose.yml → docker/compose.yaml
Original file line number Diff line number Diff line change
@@ -1,8 +1,7 @@
version: '3'
services:

jobmanager:
image: flink:1.14.4-scala_2.11-java11
image: flink:1.14.5-scala_2.11-java11
expose:
- "6123"
ports:
Expand All @@ -14,7 +13,7 @@ services:
- data:/mnt/data

taskmanager:
image: flink:1.14.4-scala_2.11-java11
image: flink:1.14.5-scala_2.11-java11
expose:
- "6121"
- "6122"
Expand All @@ -30,4 +29,4 @@ services:

volumes:
# This volume will show with 'docker volume ls' as 'docker_data'
data:
data: {}
Binary file added paper/RMLStreamer_ISWC.pdf
Binary file not shown.
18 changes: 8 additions & 10 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -28,15 +28,15 @@ SOFTWARE.

<groupId>io.rml</groupId>
<artifactId>RMLStreamer</artifactId>
<version>2.4.0</version>
<version>2.4.1</version>
<packaging>jar</packaging>

<name>RMLStreamer</name>
<url>https://rml.io/</url>

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<flink.version>1.14.4</flink.version>
<flink.version>1.14.5</flink.version>
<slf4j.version>1.7.32</slf4j.version>
<log4j.version>2.17.0</log4j.version>
<jena.version>4.3.1</jena.version>
Expand Down Expand Up @@ -387,7 +387,7 @@ SOFTWARE.
<dependency>
<groupId>com.github.FnOio</groupId>
<artifactId>function-agent-java</artifactId>
<version>v0.1.0</version>
<version>v0.2.0</version>
<exclusions>
<exclusion>
<groupId>org.apache.jena</groupId>
Expand All @@ -398,22 +398,22 @@ SOFTWARE.
<dependency>
<groupId>com.github.fnoio</groupId>
<artifactId>grel-functions-java</artifactId>
<version>v0.7.3</version>
<version>v0.8.2</version>
</dependency>
<dependency>
<groupId>com.github.fnoio</groupId>
<artifactId>idlab-functions-java</artifactId>
<version>v0.1.0</version>
<version>v0.1.2</version>
</dependency>

</dependencies>

<!-- This profile helps to make things run out of the box in IntelliJ -->
<!-- This profile helps to make things run stand-alone or in IntelliJ -->
<!-- Its adds Flink's core classes to the runtime class path. -->
<!-- Otherwise they are missing in IntelliJ, because the dependency is 'provided' -->
<!-- Otherwise they are missing because the dependency is 'provided' -->
<profiles>
<profile>
<id>add-dependencies-for-IDEA</id>
<id>stand-alone</id>
<activation>
<property>
<name>idea.version</name>
Expand Down Expand Up @@ -478,8 +478,6 @@ SOFTWARE.
<artifactSet>
<excludes>
<exclude>com.google.code.findbugs:jsr305</exclude>
<exclude>org.slf4j:*</exclude>
<exclude>log4j:*</exclude>
</excludes>
</artifactSet>
<filters>
Expand Down
71 changes: 71 additions & 0 deletions src/main/java/io/rml/framework/flink/util/DummyFunctionAgent.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
package io.rml.framework.flink.util;

import be.ugent.idlab.knows.functions.agent.Agent;
import be.ugent.idlab.knows.functions.agent.Arguments;

import java.io.IOException;
import java.lang.reflect.Method;
import java.util.List;

/**
* This is a "dummy" function agent that just returns the value of the
* first argument, or null of there are no arguments.
* <p>
* MIT License
* <p>
* Copyright (C) 2017 - 2022 RDF Mapping Language (RML)
* <p>
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
* <p>
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
* <p>
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
**/
public class DummyFunctionAgent implements Agent {
@Override
public Object execute(String functionId, Arguments arguments) throws Exception {
return arguments.size() == 0 ? null : arguments.get(arguments.getArgumentNames().iterator().next());
}

@Override
public Object execute(String s, Arguments arguments, boolean b) throws Exception {
return execute(s, arguments, false);
}

@Override
public void executeToFile(String s, Arguments arguments, String s1) throws Exception {
// not used
}

@Override
public void executeToFile(String s, Arguments arguments, String s1, boolean b) throws Exception {
// not used
}

@Override
public void writeModel(String s) throws IOException {
// not used
}

@Override
public String loadFunction(Method method) {
return null;
}

@Override
public List<String> getParameterPredicates(String s) {
return null;
}
}
48 changes: 48 additions & 0 deletions src/main/java/io/rml/framework/flink/util/IOUtils.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
package io.rml.framework.flink.util;

import org.apache.commons.io.input.BOMInputStream;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

/**
* MIT License
* <p>
* Copyright (C) 2017 - 2022 RDF Mapping Language (RML)
* <p>
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
* <p>
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
* <p>
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
**/
public class IOUtils {

public static String readFirstLineFromUTF8FileWithBOM(final Path filePath) throws IOException {
try(BufferedReader in = new BufferedReader(
new InputStreamReader(
new BOMInputStream(
new FileInputStream(filePath.toFile())), StandardCharsets.UTF_8)
)
){
return in.readLine();
}
}

}
2 changes: 1 addition & 1 deletion src/main/resources/log4j.properties
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@

log4j.rootLogger=WARN, console
log4j.logger.io.rml=DEBUG
log4j.logger.io.rml=INFO
log4j.logger.org.apache.flink=WARN
log4j.logger.kafka=ERROR
log4j.logger.org.I0Itec.zkclient=ERROR
Expand Down
Loading

0 comments on commit 836a329

Please sign in to comment.