Roadmap

For bugs and other issues, see the repository issues on GitHub.

General Improvements

  • Load training data from a file instead of storing it in memory, so that training can resume quickly without regenerating the dataset from the source corpus.

  • Read training data from a file in chunks to allow for datasets beyond the size of available RAM.

  • Output attention alignments correctly in the TensorBoard summary logs.
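
As a rough illustration of the chunked-reading item above, the sketch below streams question/answer pairs from a file a fixed number of lines at a time, so the whole dataset never has to sit in RAM. The tab-separated line format and the name `read_pairs_in_chunks` are assumptions made for the example, not the project's actual data format or API.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def read_pairs_in_chunks(
    lines: Iterable[str], chunk_size: int
) -> Iterator[List[Tuple[str, str]]]:
    """Yield lists of (question, answer) pairs, reading chunk_size lines at a time.

    Assumes (hypothetically) one tab-separated pair per line.
    Blank lines and lines without a tab are skipped.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(lines)
    while True:
        # Pull at most chunk_size lines from the underlying stream.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        pairs = []
        for line in chunk:
            question, separator, answer = line.rstrip("\n").partition("\t")
            if separator:
                pairs.append((question, answer))
        if pairs:
            yield pairs
```

Because an open file object is itself an iterable of lines, it can be passed in directly, for example `read_pairs_in_chunks(open(path, encoding="utf-8"), 10000)`, where `path` is a placeholder for the dataset file.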

Model Improvements

  • Support more RNN cell types, such as GRUs and LSTM variants. Nested LSTM is of particular interest.

  • Use BERT in place of the encoder (and possibly in place of the RNN component of the decoder if that makes sense).

  • Use HRED for dialog context tracking instead of the current prepending technique.

  • Support the Maximum Mutual Information (MMI) objective function to reduce generic but highly probable responses.

  • Implement a validation metric based on cosine similarity of the sentence embedding of the output sequence and the sentence embedding of the ground truth sequence.
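
The cosine-similarity validation metric above could look roughly like the following sketch. It assumes sentence embeddings are already available as plain lists of floats; `mean_pool` is a hypothetical stand-in that averages word vectors, since the roadmap does not say how sentence embeddings would be produced.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pool(word_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical sentence embedding: the element-wise mean of the word vectors."""
    if not word_vectors:
        raise ValueError("need at least one word vector")
    count = len(word_vectors)
    return [sum(column) / count for column in zip(*word_vectors)]


def validation_score(
    output_embeddings: Sequence[Sequence[float]],
    target_embeddings: Sequence[Sequence[float]],
) -> float:
    """Mean cosine similarity between each output and its ground-truth embedding."""
    if len(output_embeddings) != len(target_embeddings) or not output_embeddings:
        raise ValueError("need equally sized, non-empty embedding lists")
    scores = [
        cosine_similarity(out, tgt)
        for out, tgt in zip(output_embeddings, target_embeddings)
    ]
    return sum(scores) / len(scores)
```

A score near 1 would mean the generated responses point in nearly the same embedding direction as the ground truth, while a score near 0 would mean they are unrelated under the chosen embedding.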

Future Ideas

  • Extend the model with a binary classifier that can predict when a change in topic is occurring during a conversation. This would allow the chatbot to throttle use of dialog context (prepended conversation history or HRED in the future) more intelligently.

  • Possibly create a seq2seq2seq model, in which the "hidden" sequence is the equivalent of the "context vector" from traditional seq2seq architectures, except that it is a sequence instead of a single vector. The hidden sequence could also serve as a layer of indirection for attention alignments, since it doesn't make sense to directly align the input and output sequences in a dialog model as we do in a translation model.

  • Implement a discriminator model to encode and re-rank each candidate generated by beam search.

  • Implement an online learning mechanism and persistent storage so the bot can update its training dataset and incorporate newly learned facts on the fly.

  • Create an Alexa skill for the bot :-)
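
To make the beam-search re-ranking idea above more concrete, here is a minimal sketch. It assumes each candidate arrives as a `(text, log_probability)` pair and that the discriminator can be treated as a function returning a score in (0, 1]. The `rerank` name, the `weight` parameter, and the way the two scores are blended are all illustrative assumptions, not a design the roadmap commits to.

```python
import math
from typing import Callable, List, Sequence, Tuple


def rerank(
    candidates: Sequence[Tuple[str, float]],
    discriminator: Callable[[str], float],
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-rank beam search candidates by blending two scores.

    candidates: (text, log_prob) pairs produced by beam search.
    discriminator: hypothetical model returning a quality score in (0, 1].
    weight: share of the final score given to the discriminator (assumed knob).
    """
    rescored = []
    for text, log_prob in candidates:
        # Clamp to avoid log(0) if the discriminator returns exactly zero.
        quality = max(discriminator(text), 1e-12)
        combined = (1.0 - weight) * log_prob + weight * math.log(quality)
        rescored.append((text, combined))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

With `weight=0.0` the order falls back to plain beam search ranking, which gives a simple baseline to compare a trained discriminator against.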