Training code for the semantic parsing models is in train_model.py.
Depending on your GPU driver version, you may need to downgrade your PyTorch and CUDA versions. To create a conda environment with compatible versions, run
conda create -n droidlet_env python==3.7 pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=11.0 tensorboard -c pytorch
conda activate droidlet_env
For a list of compatible PyTorch and CUDA versions, see: https://pytorch.org/get-started/previous-versions/
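To sanity-check the install afterwards, this one-liner should print 1.7.1, 11.0, and True (the last only if your driver is compatible with the installed cudatoolkit):
$ python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"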
First, if you don't have the datasets, you will need to run
$ python droidlet/tools/artifact_scripts/fetch_artifacts_from_aws.py --agent_name craftassist --artifact_name datasets --checksum_file datasets.txt
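You can verify the fetch by listing the annotated data directory (the path is the one used by the training command below):
$ ls droidlet/artifacts/datasets/annotated_data/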
Then, to train the NLU model, run
$ python droidlet/perception/semantic_parsing/nsp_transformer_model/train_model.py --batch_size 32 --data_dir droidlet/artifacts/datasets/annotated_data/ --dtype_samples 'annotated:1.0' --tree_voc_file droidlet/artifacts/models/nlu/ttad_bert_updated/caip_test_model_tree.json --output_dir $CHECKPOINT_PATH
Remember to set CHECKPOINT_PATH to the directory where you want to store all saved models.
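For example (the checkpoint directory here is just a placeholder; any writable path works):
$ export CHECKPOINT_PATH=$HOME/nsp_checkpoints
$ mkdir -p $CHECKPOINT_PATH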
Feel free to experiment with the model parameters. The models and tree vocabulary files are saved under $CHECKPOINT_PATH, along with a log that contains training and validation accuracies after every epoch. Once training is done, choose the epoch whose checkpoint you want (typically the one with the best validation accuracy) and copy it into the models directory:
$ cp $PATH_TO_BEST_CHECKPOINT_MODEL droidlet/artifacts/models/nlu/caip_test_model.pth
You can now use that model to run the agent.
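For example, for the craftassist agent (assuming the standard agent entry point in this repo; adjust the path if your layout differs):
$ python agents/craftassist/craftassist_agent.py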
We also support an interactive way to evaluate or query the semantic parser via IPython. First start a session:
$ ipython
from droidlet.perception.semantic_parsing.nsp_transformer_model.test_model_script import *
Then run the following to parse the input arguments and build the model, tokenizer, and dataset:
model, tokenizer = model_configure(args)
dataset = dataset_configure(args, tokenizer)
To query the model, you can run
query_model("hello", args, model, tokenizer, dataset)
To evaluate the model, you can run
eval_model(args, model, tokenizer, dataset)
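Putting the pieces together, a minimal end-to-end IPython session looks like this (assuming args is populated by the wildcard import above, which parses the default arguments):

from droidlet.perception.semantic_parsing.nsp_transformer_model.test_model_script import *

# build the model, tokenizer, and dataset from the parsed arguments
model, tokenizer = model_configure(args)
dataset = dataset_configure(args, tokenizer)

# parse a single chat command into a logical form
query_model("hello", args, model, tokenizer, dataset)

# or evaluate the model on the full dataset
eval_model(args, model, tokenizer, dataset)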
The model directory contains the following files:
- train_model.py - The main training script for the NLU model.
- test_model_script.py - The evaluation script for the NLU model, which supports query and evaluate modes.
- caip_dataset.py - The CAIP dataset definition for the NLU model.
- decoder_with_loss.py - The definition of the decoder part of the NLU model.
- encoder_decoder.py - The definition of the NLU encoder-decoder model.
- modeling_bert.py - The customized BERT-related modules.
- label_smoothing_loss.py - The definition of the label smoothing loss.
- optimizer_warmup.py - A custom wrapper for the Adam optimizer with warmup training.
- tokenization_utils.py - The dictionary between spans and values.
- utils_caip.py - Utilities for the CAIP dataset.
- utils_model.py - Utilities for the NLU model.
- utils_parsing.py - Utilities for semantic parsing.
- query_model.py - The definition of the NLU query model.
This is a suite of data processing scripts used to process datasets for training the semantic parser and for ground truth lookup at agent runtime. It includes:
- process_templated_for_gt.py - Deduplicates templated datasets for use in ground truth.
- create_annotated_split.py - Creates a train/test/valid split of a data type, i.e. annotated or templated.
- process_templated_generations_for_train.py - Creates a train/test/valid split of templated data from generations produced by the generation script.
- remove_static_valid_commands.py - Sets aside commands to be used for validation.
- update_valid_test_sets.py - Updates the valid splits with updated action dictionaries.
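The exact flags differ per script; assuming they expose argparse CLIs (the path below is a placeholder, since the scripts' directory isn't listed here), you can inspect each one's options with --help, e.g.
$ python path/to/create_annotated_split.py --help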