Skip to content
/ spwim Public

Tree-structured Pairwise Word Interaction Modeling

Notifications You must be signed in to change notification settings

likicode/spwim

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SPWIM

Code for the paper Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling accepted at EMNLP 2019.

We start with the Pairwise Word Interaction Model (PWIM) of He and Lin (2016) as our base architecture. Our code is based on the implementation of PWIM at https://github.com/castorini/castor .

Datasets

  • SICK: We preprocessed the original dataset and generated the train, dev and test directories, each containing the following layout:

    • a.toks: Each line contains one question (the same question repeats in subsequent lines for as many times as the number of candidate answers).

    • b.toks: Each line contains one candidate answer.

    • sim.txt: Each line contains the label (0 or 1) for the question-answer pair in the corresponding line of a.toks and b.toks respectively.

    • id.txt: Each line contains the question-id for question at the corresponding line in a.toks

    • a.parents/ b.parents: Each line contains the output of Standford dependency parser, for head words and dependency arcs separately.

    • a.txt or b.txt: Each line contains original untokenized texts from the original source.

  • Word Embeddings: glove.840B.300d

Usage

Generate Tree Pickle File

$ cd vdpwi
$ python tree_loader.py --dataset_path data/sick/ --dataset sick --output_path ../

Training

$ python -m vdpwi --lr 0.0005 --optimizer rmsprop --momentum 0.05 --epochs 15  --dataset sick  --batch-size 16 --rnn-hidden-dim 256 --resultLoc result_file/ --fileID 0000 --log-interval 100 --treeFile sick_toks_tree.pkl --model_outfile sick_model

License

MIT

About

Tree-structured Pairwise Word Interaction Modeling

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published