This is a repository of various language resources collected over the past several years.
All resources are freely distributed under the LGPL. Each folder includes a paper that describes how the data was collected. If you use these resources, please cite the corresponding paper.
Please send an email before submitting this repository to any data cataloging, data aggregation, or benchmarking project or initiative. The authors of the papers behind these datasets would like to be acknowledged appropriately or included as co-authors.
For questions, you may reach the curator at:
Joseph Marvin Imperial
Faculty Member, Department of Computer Science
Lab Head, NU Human Language Technology Lab
[email protected]