Skip to content

Latest commit

 

History

History
69 lines (50 loc) · 2.43 KB

README.md

File metadata and controls

69 lines (50 loc) · 2.43 KB

Datasets (Tasks)

This repository contains supplementary material for Bermuda Triangles: GNNs Fail to Detect Simple Topological Structures paper at ICLR 2021 Workshop on Geometrical and Topological Representation Learning. This data set is same as the one available on Dataset Platter

Triangle Presense

In this task we generate relatively complex graphs which contain either exactly 1 or exactly 0 cycles of length 3 (triangle structures).

Clique Distance

In this task we generate a BA graph with two cliques attached. We assign class 1 to graphs where the distance between cliques is larger than a threshold and class 0 othrewise.

Structure

  • scripts folder contains code for creation of datasets and converting to other formats. See the readme inside the folder for more information.
  • triangles folder contains triangles dataset: 10k training data, 1k test data and 1k validation data.
  • clique folder contains clique distance dataset with the same data distribution as clique distance.
  • images folder contains visualizations of graphs (non-filtered)

Data Format

Both datasets are stored in our internal format. It consists of two files: sp (adjacency matrix) and cl (graph classification labels). We provide a pair of these files for each split. Script folder also contains a converter to TU dataset format.

SP file

SP file contains adjacency matrices for each data set. It is a tab-separated file where each line correspond to an edge in a graph. The SP file has the following columns:

  • Graph ID (arbitrary string)
  • Start Node ID
  • End Node ID
  • Edge weight (always 1)

Graphs corresponding to each ID must be stored successively without interleaving with other graphs, but the order of edges can be arbitrary.

CL file

It is also a tab-separated file with two columns:

  • Graph ID
  • Graph Label

The order of entries must be the same with the SP file.

Citation

@inproceedings{
tolmachev2021bermuda,
title={Bermuda Triangles: {GNN}s Fail to Detect Simple Topological Structures},
author={Arseny Tolmachev and Akira Sakai and Masaru Todoriki and Koji Maruhashi},
booktitle={ICLR 2021 Workshop on Geometrical and Topological Representation Learning},
year={2021},
url={https://openreview.net/forum?id=Vz_Nl9MSQnu}
}

License Information

All data is licensed under CC0. All code is licensed under MIT license.