This repository contains code accompanying the paper A Deep Learning Blueprint for Relational Databases
TL;DR: A modular message-passing scheme reflecting the relational model for end-to-end deep learning from databases (or see the video).
The system allows you to easily connect to any database through a simple connection string (with SQLAlchemy), load information from the DB (with Pandas), automatically analyze its schema structure and the semantics of its data columns, and efficiently load and embed the data into learnable (torch) tensor representations.
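For illustration, here is a minimal sketch of this connection and loading step using plain SQLAlchemy and Pandas (the connection string and table name below are placeholder assumptions; the repository provides its own wrappers for these steps in the `db_transformer` module):

```python
import pandas as pd
from sqlalchemy import create_engine, inspect

# the connection string and database/table names are placeholders -- use your own
engine = create_engine("mysql+pymysql://user:password@localhost:3306/mutagenesis")

# inspect the schema: tables, their columns, and foreign-key links
inspector = inspect(engine)
for table in inspector.get_table_names():
    columns = [col["name"] for col in inspector.get_columns(table)]
    foreign_keys = [fk["referred_table"] for fk in inspector.get_foreign_keys(table)]
    print(table, columns, foreign_keys)

# load one table into a DataFrame for the subsequent embedding step
df = pd.read_sql_table("molecule", con=engine)
print(df.head())
```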
The subsequent modular neural message-passing scheme, operating on top of the resulting (two-level) multi-relational hypergraph representation, then builds on PyTorch Geometric, allowing you to easily utilize any of its modules in the respective functional interfaces (transformation, combination, aggregation) of the deep relational blueprint.
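To give a feel for how these functional interfaces map onto PyTorch Geometric concepts, the following is a simplified, hedged sketch of a single message-passing layer (an assumed minimal instance, not the repository's actual transformer module):

```python
import torch
from torch_geometric.nn import MessagePassing

class BlueprintLayer(MessagePassing):
    """Toy layer illustrating the (transformation, aggregation, combination) interfaces."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__(aggr="mean")  # aggregation: any PyG aggregator (sum/mean/max/...)
        self.transform = torch.nn.Linear(in_dim, out_dim)           # transformation of messages
        self.combine = torch.nn.Linear(in_dim + out_dim, out_dim)   # combination with receiver state

    def forward(self, x, edge_index):
        return self.propagate(edge_index, x=x)

    def message(self, x_j):
        # transformation: applied to each sender (neighbor) representation
        return self.transform(x_j)

    def update(self, aggr_out, x):
        # combination: merge the aggregated messages with the receiver's own representation
        return torch.relu(self.combine(torch.cat([x, aggr_out], dim=-1)))

# usage on a tiny toy graph with 3 nodes and 2 directed edges
x = torch.randn(3, 16)
edge_index = torch.tensor([[0, 1], [1, 2]])  # edges 0->1 and 1->2
out = BlueprintLayer(16, 32)(x, edge_index)
print(out.shape)  # torch.Size([3, 32])
```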
For more information, please read the paper and/or feel free to reach out directly to us!
If you like the idea, you can cite the paper as:
```bibtex
@inproceedings{zahradnik2023deep,
  title={A Deep Learning Blueprint for Relational Databases},
  author={Zahradn{\'\i}k, Luk{\'a}{\v{s}} and Neumann, Jan and {\v{S}}{\'\i}r, Gustav},
  booktitle={NeurIPS 2023 Second Table Representation Learning Workshop},
  year={2023}
}
```
- `db_transformer` - the main module containing the:
  - `data` - loading, analysis, conversion, and embedding
  - `db` - connection, inspection, and schema detection
  - and the transformer-based instantiation of the blueprint
- `experiments` - presented in the paper, including baselines from:
  - Tabular models
  - Propositionalization
  - Statistical Relational Learning
  - Neural-symbolic integration

and additionally some:

- `datasets` - some selected DB datasets for debugging
- `examples` - example scripts on data schema detection/conversion
There is also the PyNeuraLogic framework, which allows for more flexible deep relational learning with the DB relations, operations, queries, and more:
- using differentiable relational logic, it allows you to skip the intermediate transformation into (hyper)graph tensors and operate directly with the relational (DB) representation.
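To give a feel for this rule-based alternative, here is a rough, hedged sketch of a PyNeuraLogic template (the relation names and weight dimensions are hypothetical illustrations, not taken from the repository or the paper):

```python
from neuralogic.core import Template, R, V

template = Template()

# a learnable rule embedding each customer by aggregating over its orders,
# joined directly through the (foreign-key) relation order(Order, Customer);
# the relation names and the 8-dimensional feature assumption are hypothetical
template += R.customer_embed(V.C)[8, 8] <= (
    R.order(V.O, V.C),
    R.order_features(V.O)[8, 8],
)

# the target predicate learned on top of the customer embeddings
template += R.predict(V.C)[1, 8] <= R.customer_embed(V.C)
```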