All dataset-related classes are located in the package org.silkframework.dataset
.
Used to represent a dataset task in a project. Extends TaskSpec
and is on the same level as LinkSpec
, TransformSpec
, etc.
Currently, provides two properties
- The Dataset Plugin instance (e.g.,
CsvDataset
) - The URI-property
In the future, it can be extended with other dataset type agnostic properties.
A dataset plugin, e.g., CsvDataset
.
In the future, should only be declarative (i.e., hold all dataset specific properties). It should not contain execution/access methods.
Executes a specific Dataset plugin. There might be multiple executors for the same dataset (e.g., for different execution types).
Contains access methods to read and write data from/to a dataset.
Given a dataset, the corresponding DatasetAccess
instance is provided by its DatasetExecutor
. It can be retrieved using the ExecutorRegistry
.
Multiple dataset executors may be defined for the same Dataset.
For instance, given the CsvDataset
class, LocalCsvDatasetExecutor
and SparkCsvDatasetExecutor
my provide different DatasetAccess implementations.