This is a rough implementation of the paper "Text Segmentation by Cross Segment Attention" by Lukasik et al.
The model is to be trained on the section boundaries of Wikipedia articles in the Wiki-727K dataset.
In the future, the model will also be trained on more granularly segmented documents, for example Wikipedia sections split further into paragraphs, or class PowerPoint slides.
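
As a rough illustration of what training on segment partitions could look like, the sketch below turns a document that is already split into segments into a flat list of sentences with one binary label per sentence, where 1 marks a sentence that ends a segment. The function name `segments_to_labels`, the list-of-lists input format, and the choice to label the final sentence 0 are assumptions made for this example; they are not taken from the paper or from this repository.

```python
from typing import List, Tuple


def segments_to_labels(segments: List[List[str]]) -> Tuple[List[str], List[int]]:
    """Flatten segments into sentences with per-sentence boundary labels.

    Hypothetical helper: each inner list is one segment (for example a
    Wikipedia section, a paragraph, or a slide) given as a list of sentences.
    """
    sentences: List[str] = []
    labels: List[int] = []
    for segment in segments:
        if not segment:
            continue  # skip empty segments so they do not create boundaries
        for i, sentence in enumerate(segment):
            sentences.append(sentence)
            # 1 if a segment break follows this sentence, otherwise 0
            labels.append(1 if i == len(segment) - 1 else 0)
    if labels:
        # Assumption: the end of the document is not a break to predict
        labels[-1] = 0
    return sentences, labels


if __name__ == "__main__":
    doc = [["A.", "B."], ["C."], ["D.", "E."]]
    sents, labs = segments_to_labels(doc)
    for s, l in zip(sents, labs):
        print(l, s)
```

The same function would apply unchanged to finer-grained data: passing paragraphs or slides as the inner lists instead of whole sections yields labels for the more granular segmentation.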