We split our scripts into 3 separate folders:

- `knowledge_embed` for dataset generation
- `modeling` for preprocessing, training, and automatic evaluation
- `others` for additional scripts such as preparing the dataset, inserting data into `neo4j`, human evaluation, and other auxiliary scripts
The provided code is intended to give a better understanding of the workflow and may not be directly runnable. We will publish the complete code in a public source code repository once the work is published.
Our dependencies are listed in `requirements.txt`; you can install them by running:

```
pip install -r requirements.txt
```
In addition, our code includes `fp16` support via `apex`. You can find the package at https://github.com/NVIDIA/apex.
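For reference only, here is a minimal sketch of how `apex` mixed-precision training is typically enabled; the model, optimizer, and `opt_level` below are placeholders, not taken from our training scripts:

```python
import torch
from apex import amp

# placeholder model and optimizer; the actual training script defines its own
model = torch.nn.Linear(10, 10).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# wrap both with amp to enable fp16 mixed-precision training
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

loss = model(torch.randn(4, 10).cuda()).sum()
# scale the loss so fp16 gradients do not underflow
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```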
Inside the `knowledge_embed` folder, the scripts are split per dataset used in the experiments. Below are the details of the files inside each folder:
**BABI5**

| File | Description |
| ---- | ----------- |
| `generate_delexicalization_babi.py` | script for generating templates given BABI5 dialogues |
| `generate_dialogues_babi.py` | script for generating dialogues from BABI5 templates; knowledge is retrieved from the provided knowledge base using `pandas` (see the sketch below) |
| `revertible_string.py` | string wrapper class for relexicalization |
| `utils.py` | common functions used in the knowledge embedding phase |
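To illustrate the `pandas`-based retrieval, here is a minimal, hypothetical sketch of filling a delexicalized template from a tabular knowledge base; the column names and placeholder tokens are assumptions, not the ones used by `generate_dialogues_babi.py`:

```python
import pandas as pd

# hypothetical knowledge base of restaurants (column names are assumptions)
kb = pd.DataFrame([
    {"name": "resto_rome_cheap_1", "cuisine": "italian", "location": "rome", "price": "cheap"},
    {"name": "resto_paris_expensive_2", "cuisine": "french", "location": "paris", "price": "expensive"},
])

# delexicalized template with placeholder slots
template = "i recommend @name , a @price @cuisine place in @location ."

# retrieve a matching row and relexicalize the template
row = kb[(kb.location == "rome") & (kb.price == "cheap")].iloc[0]
response = (template.replace("@name", row["name"])
                    .replace("@price", row["price"])
                    .replace("@cuisine", row["cuisine"])
                    .replace("@location", row["location"]))
print(response)
```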
**CamRest**

| File | Description |
| ---- | ----------- |
| `generate_delexicalization_CAMREST.py` | script for generating templates given CamRest dialogues |
| `generate_dialogues_CAMREST.py` | script for generating dialogues from CamRest templates; knowledge is retrieved from the provided knowledge base using `pandas` |
| `revertible_string.py` | string wrapper class for relexicalization (see the sketch below) |
| `utils.py` | common functions used in the knowledge embedding phase |
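To give intuition for what a string wrapper for relexicalization does, here is a minimal, hypothetical sketch; the class name matches `revertible_string.py`, but the interface is an assumption and the real implementation may differ:

```python
class RevertibleString:
    """Keeps the original delexicalized template while exposing a mutable,
    relexicalized copy that can be reverted at any time (interface assumed)."""

    def __init__(self, text):
        self.original = text   # delexicalized template, never modified
        self.current = text    # working copy with slots filled in

    def fill(self, slot, value):
        # replace a placeholder slot with a concrete knowledge-base value
        self.current = self.current.replace(slot, value)

    def revert(self):
        # go back to the delexicalized template to try another KB entity
        self.current = self.original


# example usage
s = RevertibleString("the phone number of @name is @phone .")
s.fill("@name", "pizza hut")
s.fill("@phone", "01223 323737")
print(s.current)
s.revert()
print(s.current)
```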
**SMD**

| File | Description |
| ---- | ----------- |
| `generate_delexicalization_SMD.py` | script for generating templates given SMD dialogues. The generated templates contain a lot of noise, so some manual fixes are needed before generating the dialogues |
| `generate_dialogues_SMD_sql.py` | script for generating dialogues from SMD templates; knowledge is retrieved from the provided knowledge base using `SQLite` (see the sketch below) |
| `revertible_string.py` | string wrapper class for relexicalization |
| `utils.py` | common functions used in the knowledge embedding phase |
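Here is a minimal sketch of SQLite-based knowledge retrieval using Python's built-in `sqlite3`; the table schema and query are hypothetical and not the ones used in `generate_dialogues_SMD_sql.py`:

```python
import sqlite3

# in-memory knowledge base with a hypothetical schedule table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedule (event TEXT, date TEXT, time TEXT, party TEXT)")
conn.execute("INSERT INTO schedule VALUES ('dentist appointment', 'friday', '10am', 'mother')")

# retrieve the knowledge needed to relexicalize a template
row = conn.execute(
    "SELECT date, time FROM schedule WHERE event = ?", ("dentist appointment",)
).fetchone()

template = "your @event is on @date at @time ."
response = (template.replace("@event", "dentist appointment")
                    .replace("@date", row[0])
                    .replace("@time", row[1]))
print(response)
```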
**MWOZ**

| File | Description |
| ---- | ----------- |
| `generate_delex_MWOZ_ATTRACTION.py` | script for generating templates given MWoZ dialogues in the attraction domain |
| `generate_delex_MWOZ_HOTEL.py` | script for generating templates given MWoZ dialogues in the hotel domain |
| `generate_delex_MWOZ_RESTAURANT.py` | script for generating templates given MWoZ dialogues in the restaurant domain |
| `generate_delex_MWOZ_TRAIN.py` | script for generating templates given MWoZ dialogues in the train domain |
| `generate_redelex_augmented_MWOZ.py` | script for generating dialogues from MWoZ templates; knowledge is retrieved from the provided knowledge base using `SQLite` |
| `generate_MWOZ_dataset.py` | script for performing normalization and splitting for the MWoZ dataset (see the sketch below) |
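As a rough illustration of normalization and splitting, here is a hypothetical sketch; the normalization rules and split ratios are assumptions, not those used by `generate_MWOZ_dataset.py`:

```python
import random
import re

def normalize(text):
    # hypothetical normalization: lowercase and collapse whitespace
    return re.sub(r"\s+", " ", text.lower()).strip()

# toy dialogues standing in for the real MWoZ data
dialogues = [{"id": i, "turn": normalize(f"I need a TRAIN   to Cambridge, please ({i}).")}
             for i in range(10)]

# hypothetical 80/10/10 train/valid/test split
random.seed(0)
random.shuffle(dialogues)
n = len(dialogues)
train = dialogues[:int(0.8 * n)]
valid = dialogues[int(0.8 * n):int(0.9 * n)]
test = dialogues[int(0.9 * n):]
print(len(train), len(valid), len(test))
```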
**OpenDialKG**

| File | Description |
| ---- | ----------- |
| `generate_delexicalization_DIALKG.py` | script for generating templates given OpenDialKG dialogues |
| `generate_dialogues_DIALKG.py` | script for generating dialogues from OpenDialKG templates; knowledge is retrieved from `neo4j` using a `CYPHER` query (see the sketch below) |
| `revertible_string.py` | string wrapper class for relexicalization |
| `dialkg_utils.py` | common functions used in the knowledge embedding phase |
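Here is a minimal sketch of retrieving knowledge from `neo4j` with a `CYPHER` query via the official Python driver; the connection details, graph schema, and query are assumptions and not taken from `generate_dialogues_DIALKG.py`:

```python
from neo4j import GraphDatabase

# hypothetical connection settings for a local neo4j instance
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# hypothetical CYPHER query: fetch entities one hop away from a seed entity
query = (
    "MATCH (a:Entity {name: $name})-[r]->(b:Entity) "
    "RETURN type(r) AS relation, b.name AS neighbor LIMIT 5"
)

with driver.session() as session:
    for record in session.run(query, name="The Lord of the Rings"):
        print(record["relation"], record["neighbor"])

driver.close()
```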
Inside the `modeling` folder, the scripts are likewise split per dataset used in the experiments. Below are the details of the files inside each folder:
**BABI5**

| File | Description |
| ---- | ----------- |
| `main.py` | script for training the model; outputs the checkpoints of the last N epochs of the trained model (see the sketch below) |
| `evaluate.py` | script for evaluating the model given a model checkpoint and the test set file; outputs the generated system responses used for scoring |
| `scorer_BABI5.py` | script for calculating the automatic evaluation score for the BABI5 dataset |
| `utils` | folder containing scripts with common functions used in the modeling phase |
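For intuition about the training and checkpointing loop, here is a minimal, hypothetical sketch of fine-tuning a GPT-2 language model with HuggingFace `transformers` and saving per-epoch checkpoints; it is not the actual `main.py`, and all hyperparameters and data are placeholders:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# toy "dialogue" corpus standing in for the real training data
texts = ["USER: book a table SYSTEM: which cuisine would you like?"]

for epoch in range(2):  # hypothetical number of epochs
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt")
        # language-modeling loss: labels are the input ids themselves
        loss = model(**inputs, labels=inputs["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    # save a checkpoint for this epoch
    model.save_pretrained(f"checkpoint_epoch_{epoch}")
```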
**CamRest**

| File | Description |
| ---- | ----------- |
| `main.py` | script for training the model; outputs the checkpoints of the last N epochs of the trained model |
| `evaluate.py` | script for evaluating the model given a model checkpoint and the test set file; outputs the generated system responses used for scoring |
| `scorer_CAMREST.py` | script for calculating the automatic evaluation score for the CamRest dataset |
| `success_scorer_CAMREST.ipynb` | jupyter notebook containing the script for calculating the success score of different models on the CamRest dataset (see the sketch below) |
| `utils` | folder containing scripts with common functions used in the modeling phase |
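For intuition, a dialogue-level success score typically checks whether the system provided all of the information the user requested; below is a minimal, hypothetical sketch of that idea, not the notebook's actual logic:

```python
def success_rate(dialogues):
    """Fraction of dialogues in which every requested value appears
    somewhere in the generated system responses (assumed definition)."""
    successes = 0
    for d in dialogues:
        generated = " ".join(d["generated_responses"]).lower()
        if all(value.lower() in generated for value in d["requested_values"]):
            successes += 1
    return successes / len(dialogues)


# toy example
dialogues = [{
    "requested_values": ["01223 323737", "cb2 1ab"],
    "generated_responses": ["the phone number is 01223 323737 and the postcode is cb2 1ab ."],
}]
print(success_rate(dialogues))  # 1.0
```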
**SMD**

| File | Description |
| ---- | ----------- |
| `main.py` | script for training the model; outputs the checkpoints of the last N epochs of the trained model |
| `evaluate.py` | script for evaluating the model given a model checkpoint (GPT and GPT+KB) and the test set file; outputs the generated system responses used for scoring |
| `time_evaluate_finetune.py` | script for measuring the fine-tuning duration on the SMD dataset (see the sketch below) |
| `evaluate_finetune.py` | script for fine-tuning and evaluating the fine-tuned model (GPT2+KE) given a model checkpoint and the test set file; outputs the generated system responses used for scoring |
| `scorer_SMD.py` | script for calculating the automatic evaluation score for the SMD dataset |
| `utils` | folder containing scripts with common functions used in the modeling phase |
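Measuring fine-tuning duration essentially amounts to timing the fine-tuning call; a minimal, hypothetical sketch follows, where `finetune_on_dialogue` is a placeholder for whatever routine the script actually runs:

```python
import time

def finetune_on_dialogue(dialogue):
    # placeholder for the actual per-dialogue fine-tuning routine
    time.sleep(0.1)

dialogues = ["dialogue_1", "dialogue_2", "dialogue_3"]  # toy data

durations = []
for d in dialogues:
    start = time.time()
    finetune_on_dialogue(d)
    durations.append(time.time() - start)

print(f"average fine-tuning time per dialogue: {sum(durations) / len(durations):.2f}s")
```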
**MWOZ**

| File | Description |
| ---- | ----------- |
| `main.py` | script for training the model; outputs the checkpoints of the last N epochs of the trained model |
| `evaluate.py` | script for evaluating the model given a model checkpoint and the test set file; outputs the generated system responses used for scoring |
| `scorer_MWOZ.py` | script for calculating the automatic evaluation score for the all-domain MWoZ dataset (see the BLEU sketch below) |
| `scorer_MWOZ_single.py` | script for calculating the automatic evaluation score for the single-domain MWoZ dataset |
| `utils` | folder containing scripts with common functions used in the modeling phase |
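Automatic evaluation of generated responses commonly includes corpus-level BLEU; here is a minimal sketch using `nltk`, intended only as an illustration of the metric rather than the actual scorer scripts:

```python
from nltk.translate.bleu_score import corpus_bleu

# toy generated responses and their references, tokenized by whitespace
hypotheses = ["the hotel is in the north and has free parking".split()]
references = [["the hotel is located in the north with free parking".split()]]

# corpus-level BLEU over all test responses
print(corpus_bleu(references, hypotheses))
```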
**OpenDialKG**

| File | Description |
| ---- | ----------- |
| `main.py` | script for training the model; outputs the checkpoints of the last N epochs of the trained model |
| `evaluate.py` | script for evaluating the model given a model checkpoint and the test set file; outputs the generated system responses used for scoring |
| `scorer_DIALKG.py` | script for calculating the automatic evaluation score for the OpenDialKG dataset |
| `utils` | folder containing scripts with common functions used in the modeling phase |
Inside the `others` folder, there are several scripts in different formats. Below are the details of each file:
| File | Description |
| ---- | ----------- |
| `setup.sh` | shell script for downloading the datasets |
| `load_neo4j.ipynb` | jupyter notebook containing the script for injecting OpenDialKG graphs into `neo4j` (see the sketch below) |
| `human_eval_script.ipynb` | jupyter notebook containing the script for calculating the human evaluation scores |
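Here is a minimal sketch of injecting graph triples into `neo4j` with the official Python driver; the triple format, node label, and relation handling are assumptions, not the notebook's actual code:

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# hypothetical (subject, relation, object) triples from the knowledge graph
triples = [("The Hobbit", "written_by", "J. R. R. Tolkien")]

# MERGE creates nodes/edges only if they do not already exist
insert_query = (
    "MERGE (a:Entity {name: $subj}) "
    "MERGE (b:Entity {name: $obj}) "
    "MERGE (a)-[:RELATION {type: $rel}]->(b)"
)

with driver.session() as session:
    for subj, rel, obj in triples:
        session.run(insert_query, subj=subj, rel=rel, obj=obj)

driver.close()
```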
For details regarding the experiments, hyperparameters, and evaluation results, please refer to the main paper and supplementary materials of our work.