SSAN

Introduction

This is the PaddlePaddle implementation of the SSAN model (see our AAAI2021 paper: Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction).
SSAN (Structured Self-Attention Network) is a novel extension of the Transformer that incorporates structural dependencies between input elements. In the scenario of document-level relation extraction, the structure we consider is that of entities. Specifically, we propose a transformation module that produces attentive biases from the structure prior, so as to adaptively regularize the attention flow within and throughout the encoding stage. We achieve SOTA results on several document-level relation extraction tasks.
This implementation is adapted from the ERNIE repo; the main revision for the SSAN model is in ./model/SSAN_encoder.py#L123-L150.
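
To make the idea of an attentive bias concrete, here is a minimal, dependency-free sketch. It is not the code in ./model/SSAN_encoder.py. It assumes the biaffine form suggested by the SSAN_Biaffine name, where the bias for a token pair is q A_s k^T + b_s and s is the structure type of the pair. The function name, argument names, and toy inputs are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention_weights(queries, keys, structure, biaffine, bias):
    """Attention weights with an additive structural bias (illustrative only).

    queries, keys: lists of n vectors of dimension d.
    structure[i][j]: integer id of the structure type between tokens i and j.
    biaffine[s]: d x d matrix A_s for structure type s (assumed parameterization).
    bias[s]: scalar b_s for structure type s (assumed parameterization).
    """
    d = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        row = []
        for j, k in enumerate(keys):
            s = structure[i][j]
            # Standard content-based score q . k
            content = sum(qa * ka for qa, ka in zip(q, k))
            # Structural bias q A_s k^T + b_s, selected by the pair's structure type
            structural = sum(
                q[a] * biaffine[s][a][b] * k[b] for a in range(d) for b in range(d)
            ) + bias[s]
            row.append((content + structural) / math.sqrt(d))
        weights.append(softmax(row))
    return weights


# Toy usage: two tokens, two structure types (0 = same token, 1 = different token).
tokens = [[1.0, 0.0], [0.0, 1.0]]
pair_types = [[0, 1], [1, 0]]
zero_matrix = [[0.0, 0.0], [0.0, 0.0]]
toy_weights = biased_attention_weights(
    tokens, tokens, pair_types, [zero_matrix, zero_matrix], [0.0, 2.0]
)
```

With all A_s and b_s set to zero the function reduces to ordinary scaled dot-product attention, so the structural term acts purely as a learned adjustment on top of the content score.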

Requirements

Python 3.7, paddlepaddle-gpu==1.6.3.post107, dataclasses
This implementation was tested on a single 32G V100 GPU with CUDA 10.2 and driver version 440.33.01.
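
A quick way to check the interpreter and the importable modules before training is sketched below. It is not part of this repo. The helper name check_environment is hypothetical, and it only checks that the modules can be found, not that their versions match the ones listed above.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 7), required_modules=("paddle", "dataclasses")):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    found = tuple(sys.version_info[:2])
    if found != tuple(expected_python):
        problems.append(
            "Python {}.{} found, but the README lists {}.{}".format(
                found[0], found[1], expected_python[0], expected_python[1]
            )
        )
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append("module '{}' is not importable".format(name))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```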

Prepare Model and Dataset

Download the pretrained ERNIE models.

cd ./pretrained_lm/
wget https://ernie.bj.bcebos.com/ERNIE_Base_en_stable-2.0.0.tar.gz
mkdir -p ./ernie2_base_en && tar -zxvf ERNIE_Base_en_stable-2.0.0.tar.gz -C ./ernie2_base_en
wget https://ernie.bj.bcebos.com/ERNIE_Large_en_stable-2.0.0.tar.gz
mkdir -p ./ernie2_large_en && tar -zxvf ERNIE_Large_en_stable-2.0.0.tar.gz -C ./ernie2_large_en

Download the DocRED dataset into ./data. The directory should contain train_annotated.json, dev.json and test.json.
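
The following sketch checks that the three files are in place and prints basic counts. It is not part of this repo. The helper name summarize_docred is hypothetical, and the field names vertexSet and labels are assumed from the public DocRED release rather than stated in this README.

```python
import json
import os

EXPECTED_FILES = ("train_annotated.json", "dev.json", "test.json")


def summarize_docred(data_dir="./data"):
    """Return {filename: counts} for each expected file; raise if a file is missing."""
    summary = {}
    for name in EXPECTED_FILES:
        path = os.path.join(data_dir, name)
        if not os.path.isfile(path):
            raise FileNotFoundError("missing DocRED file: " + path)
        with open(path, encoding="utf-8") as f:
            docs = json.load(f)
        summary[name] = {
            "documents": len(docs),
            # Assumed field: one vertexSet entry per entity in a document
            "entities": sum(len(doc["vertexSet"]) for doc in docs),
            # The test split has no gold labels, so default to an empty list
            "relations": sum(len(doc.get("labels", [])) for doc in docs),
        }
    return summary


if __name__ == "__main__":
    if os.path.isdir("./data"):
        for filename, counts in summarize_docred("./data").items():
            print(filename, counts)
```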

Train

sh train.sh

This trains and evaluates SSAN. The model is saved in ./checkpoints, and the best threshold for relation prediction is searched on the dev set during evaluation.
By default, this runs SSAN based on ERNIE Base. Set --with_ent_structure to false and the model falls back to the ERNIE Base baseline. To train ERNIE Large models, set the model path to ./pretrained_lm/ernie2_large_en.
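
For intuition, a threshold search of this kind can be sketched as follows. This is not the search implemented in this repo. It assumes each candidate relation has a confidence score and a gold label, and it picks the cut-off that maximizes F1. The function names are hypothetical.

```python
def f1_at_threshold(scores, gold, threshold):
    """F1 when every candidate with score >= threshold is predicted as a relation."""
    predicted = [s >= threshold for s in scores]
    true_positives = sum(1 for p, g in zip(predicted, gold) if p and g)
    if true_positives == 0:
        return 0.0
    precision = true_positives / sum(predicted)
    recall = true_positives / sum(gold)
    return 2 * precision * recall / (precision + recall)


def search_threshold(scores, gold):
    """Try each distinct score as the cut-off and return (best_threshold, best_f1)."""
    best_threshold, best_f1 = None, -1.0
    for candidate in sorted(set(scores)):
        f1 = f1_at_threshold(scores, gold, candidate)
        # Strict comparison keeps the lowest threshold among ties
        if f1 > best_f1:
            best_threshold, best_f1 = candidate, f1
    return best_threshold, best_f1


# Toy usage with made-up dev-set scores and gold labels.
dev_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
dev_gold = [True, True, False, True, False]
threshold, dev_f1 = search_threshold(dev_scores, dev_gold)
```

The threshold found on the dev set is then reused unchanged at prediction time, since the test labels are not available.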

Predict

sh predict.sh

Set your checkpoint directory and threshold for prediction. The result will be saved as ./data/result.json.
You can compress and upload it to the official competition leaderboard at CodaLab.

cd ./data/
zip result.zip result.json
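
If you prefer to package the file from Python, the sketch below does the same thing as the zip command and first runs a light sanity check. It is not part of this repo. The helper name package_result is hypothetical, and the required keys (title, h_idx, t_idx, r) are assumed from the public DocRED submission format, not confirmed by this README.

```python
import json
import os
import zipfile

# Assumed from the public DocRED submission format; adjust if the leaderboard differs.
REQUIRED_KEYS = ("title", "h_idx", "t_idx", "r")


def package_result(result_path="./data/result.json", zip_path="./data/result.zip"):
    """Check each prediction for the assumed keys, then zip the file. Returns the count."""
    with open(result_path, encoding="utf-8") as f:
        predictions = json.load(f)
    for index, item in enumerate(predictions):
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            raise ValueError("prediction {} lacks keys: {}".format(index, missing))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname keeps result.json at the archive root, like `zip result.zip result.json`
        archive.write(result_path, arcname=os.path.basename(result_path))
    return len(predictions)


if __name__ == "__main__":
    if os.path.isfile("./data/result.json"):
        print("packaged", package_result(), "predictions")
```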

Results

Results on the DocRED dataset (gains over the corresponding baseline in parentheses):

Model	Dev F1	Test Ign F1	Test F1
ERNIE Base Baseline	58.54	55.58	57.71
SSAN_Biaffine (ERNIE Base)	59.12 (+0.58)	57.07 (+1.49)	59.05 (+1.34)
ERNIE Large Baseline	60.25	57.87	60.11
SSAN_Biaffine (ERNIE Large)	61.58 (+1.33)	58.96 (+1.09)	61.17 (+1.06)

We set the learning rate to 3e-5 and the batch size to 4, and search for the best number of epochs among (40, 60, 80, 100) on the development set.

Citation

If you use any source code included in this project in your work, please cite the following paper:

@article{xu2021entity,
  title={Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction},
  author={Xu, Benfeng and Wang, Quan and Lyu, Yajuan and Zhu, Yong and Mao, Zhendong},
  journal={arXiv preprint arXiv:2102.10249},
  year={2021}
}

Copyright and License

Copyright 2021 Baidu.com, Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
