From 7e193341de85a1e3459e6208cd2d553d28d217fc Mon Sep 17 00:00:00 2001
From: Daniel Obraczka
Date: Wed, 13 Mar 2024 14:14:00 +0100
Subject: [PATCH] Update README

---
 README.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/README.md b/README.md
index 477bc27..fac8650 100644
--- a/README.md
+++ b/README.md
@@ -49,6 +49,8 @@ print(ds.intra_ent_links[0])
 print(ds.intra_ent_links[1])
 ```
 
+Alternatively this dataset (among others) is also available in [`sylloge`](https://github.com/dobraczka/sylloge).
+
 # Dataset structure
 There are 3 entity resolution tasks in this repository: imdb-tmdb, imdb-tvdb, tmdb-tvdb, all contained in the `data` folder.
 The data structure mainly follows the structure used in [OpenEA](https://github.com/nju-websoft/OpenEA).
@@ -56,6 +58,12 @@ Each folder contains the information of the knowledge graphs (`attr_triples_*`,`
 Furthermore, there exists a file for each dataset with intra-dataset links called `*_intra_ent_links`.
 For the binary cases each dataset has a `cluster` file in the respective folder. Each line here is a cluster with comma-seperated members of the cluster. This includes intra- and inter-dataset links.
 For the multi-source setting, you can use the `multi_source_cluster` file in the `data` folder.
+Using [`sylloge`](https://github.com/dobraczka/sylloge) you can also easily load this dataset as a multi-source task:
+
+```
+from sylloge import MovieGraphBenchmark
+ds = MovieGraphBenchmark(graph_pair='multi')
+```
 
 # Citing
 This dataset was first presented in this paper: