- HMM
- BILSTM-CRF
The first column is the char, the second column is the label(BMEO), there is an empty line between two sentences
N B
B M
A E
D O
an empty line
Z O
Z O
Z O
Z O
Z O
NBAD<@>NBA
ZZZZZ<@>
python train.py train.in model -v validation.in -c char_emb -e 10 -g 2
- train.in, the path of the train file
- model, the path of the saved model
- v, the path of the validation file(optional, otherwise split the train set into train and val)
- c, the char embedding file(optional)
- e, the number of epoch(optional, default 100)
- g, the id of gpu(optional, default 0)
python test.py model test.in test.out -c char_emb -g 2
- model, the path of model file
- test.in, the path of test file
- test.out, the path of predict file of test
- c, the char embedding file(optional)
- g, the id of gpu(optional, default 0)
The first line of the embedding file is the number of char and embedding dimension, seperating by space, e.g 5 10. The remaining line is the char and embedding vector, seperating by space, e.g N dim1 ... dim 10
- python 2.7
- tensorflow 0.8
- numpy
- pandas