AREkit (Attitude and Relation Extraction Toolkit) is a Python toolkit devoted to document-level Attitude and Relation Extraction between text objects in mass-media news.
The toolkit aims at memory-efficient data processing in Relation Extraction (RE) related tasks.
Figure: AREkit pipeline design. For more, see the paper "ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction".
In particular, this framework provides the following features:
- ➿ pipelines and iterators for serializing large-scale collections without out-of-memory issues,
- 🔗 EL (entity-linking) API support for objects,
- ➰ avoidance of cyclic connections,
- 📏 distance consideration between relation participants (in terms or sentences),
- 📑 relations annotations and filtering rules,
- *️⃣ entities formatting or masking, and more.
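Several of the features above (lazy iteration, avoiding cyclic connections, distance limits) can be pictured with a short generator-based sketch. This is not AREkit's API: `Entity`, `iter_candidate_pairs`, and `max_term_distance` are hypothetical names introduced for illustration, and "cyclic" is assumed here to mean a pair whose source and target resolve to the same entity-linking group.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity mention (not an AREkit class)."""
    value: str       # surface text of the mention
    term_index: int  # position of the mention in the document, counted in terms
    group_id: int    # entity-linking (synonymy) group the mention belongs to


def iter_candidate_pairs(
    docs: Iterable[List[Entity]], max_term_distance: int
) -> Iterator[Tuple[Entity, Entity]]:
    """Lazily yield ordered (source, target) pairs, one document at a time."""
    for entities in docs:
        for source, target in permutations(entities, 2):
            # Assumed meaning of "cyclic": both ends are the same linked object.
            if source.group_id == target.group_id:
                continue
            # Drop pairs whose participants are too far apart (distance in terms).
            if abs(source.term_index - target.term_index) > max_term_distance:
                continue
            yield source, target
```

Because both the input and the output are iterators, only the current document and the current pair are held in memory, which is the general idea behind processing large collections without out-of-memory issues.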
The core functionality includes:
- API for document presentation with EL (Entity Linking, i.e. Object Synonymy) support, used to prepare sentence-level relations (dubbed contexts);
- API for contexts extraction;
- Transfer of relations from the sentence level to the document level, and more.
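As a rough illustration of the last item, sentence-level (context) labels can be aggregated into one label per document-level pair. The sketch below uses a simple majority vote, which is an assumption made for illustration and not necessarily the rule AREkit applies; `ContextLabel` and `to_document_level` are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record: (source_group_id, target_group_id, label) for one context.
ContextLabel = Tuple[int, int, str]


def to_document_level(
    context_labels: Iterable[ContextLabel],
) -> Dict[Tuple[int, int], str]:
    """Pick the most frequent context label for every (source, target) group pair."""
    votes: Dict[Tuple[int, int], Counter] = defaultdict(Counter)
    for source_group, target_group, label in context_labels:
        votes[(source_group, target_group)][label] += 1
    # most_common(1) returns a one-element list holding the (label, count) winner.
    return {pair: counter.most_common(1)[0][0] for pair, counter in votes.items()}
```

Keying the votes by entity-linking group rather than by individual mention lets different mentions of the same object contribute to a single document-level relation.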
pip install git+https://github.com/nicolay-r/[email protected]
Please follow the tutorial section on the project Wiki for more details.
Great research is also accompanied by a faithful reference. If you use or extend our work, please cite as follows:
@inproceedings{rusnachenko2024arelight,
title={ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction},
author={Rusnachenko, Nicolay and Liang, Huizhi and Kolomeets, Maxim and Shi, Lei},
booktitle={European Conference on Information Retrieval},
year={2024},
organization={Springer}
}