Welcome to UI Data Science - Text Analytics! We conduct research through the University of Illinois on a variety of text analytics topics, including:
- Reconstructing documents from their n-grams
- Document classification
- Document geocoding
- Automated ontology development
Currently, we are surveying methods for obtaining and working with text data in the Python programming language. We make heavy use of NLTK for both obtaining and analyzing texts, and we are exploring alternative sources of free text data, including HathiTrust and Project Gutenberg.
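As a rough illustration of the kind of text handling described above, here is a minimal sketch that tokenizes a string and counts its bigrams. It uses only the Python standard library so it runs without any downloads. The helper names `tokenize` and `ngrams` and the sample sentence are assumptions made for this example, not code from our projects; NLTK offers a comparable helper in `nltk.util.ngrams`.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    # Hypothetical tokenizer: keeps runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return the consecutive n-token tuples found in tokens."""
    # zip stops at the shortest slice, so short inputs yield an empty list.
    return list(zip(*(tokens[i:] for i in range(n))))


# Made-up sample text, used only for illustration.
sample = "the cat sat on the mat and the cat slept"
tokens = tokenize(sample)
bigram_counts = Counter(ngrams(tokens, 2))

# Show the most frequent bigrams first.
for bigram, count in bigram_counts.most_common(3):
    print(" ".join(bigram), count)
```

The same pattern should carry over to longer texts, such as a plain-text book from Project Gutenberg, by reading the file contents into `sample`.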
We use the Wiki for more detailed descriptions of our projects (when we can be so descriptive) and the Resources directory for housing instructional materials we use to get students started with text analytics.