From 8383a87aaac353293adfbf8001b006b76ecd2244 Mon Sep 17 00:00:00 2001
From: panchbhai1969
Date: Mon, 21 Jan 2019 03:36:40 +0530
Subject: [PATCH] Readme change for #20, #12 and other issues encountered
 while running the code by following the instructions in the current
 README.md (these issues and changes are commented in the README.md file)

---
 README.md | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 8810806..266cd2f 100644
--- a/README.md
+++ b/README.md
@@ -23,7 +23,9 @@ Install TensorFlow (e.g., `pip install tensorflow`).
 
 The template used in the paper can be found in a file such as `annotations_monument.tsv`. To generate the training data, launch the following command.
+
 ```bash
+mkdir data/monument_300
 python generator.py --templates data/annotations_monument.csv --output data/monument_300
 ```
@@ -35,16 +37,19 @@ python build_vocab.py data/monument_300/data_300.sparql > data/monument_300/voca
 ```
 
 Count lines in `data_.*`
+
 ```bash
-NUMLINES= $(echo awk '{ print $1}' | cat data/monument_300/data_300.sparql | wc -l)
+NUMLINES=$(echo awk '{ print $1}' | cat data/monument_300/data_300.sparql | wc -l)
 echo $NUMLINES
 # 7097
 ```
 
 Split the `data_.*` files into `train_.*`, `dev_.*`, and `test_.*` (usually 80-10-10%).
+
+
 ```bash
 cd data/monument_300/
-python ../../split_in_train_dev_test.py --lines $NUMLINES --dataset data.sparql
+python ../../split_in_train_dev_test.py --lines $NUMLINES --dataset data_300.sparql
 ```
 
 #### Pre-generated data
@@ -53,7 +58,8 @@ Alternatively, you can extract pre-generated data from `data/monument_300.zip` a
 
 ### Training
 
-Launch `train.sh` to train the model. The first parameter is the prefix of the data directory. The second parameter is the number of training epochs.
+
+Now go back to the initial directory and launch `train.sh` to train the model. The first parameter is the prefix of the data directory and the second parameter is the number of training epochs.
 ```bash
 sh train.sh data/monument_300 120000