NLP document classification datasets
For the 20 Newsgroups datasets (data_{1} to data_{6}): datasets/newsgroup20
For the Reuters datasets (data_{7} to data_{9}): datasets/smallreturers
Three small datasets:
places vs orgs
people vs orgs
places vs people
letter_dict is a dictionary of the selected words across all documents.
dict_int maps the keys stored in the text data files to the corresponding words in letter_dict.
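
As a rough illustration of how dict_int might be used, the sketch below translates a list of keys back into words. It assumes dict_int behaves like a Python dict from integer keys to words; the actual file format and key type are not described here, so the toy mapping and the helper name decode_document are hypothetical.

```python
def decode_document(keys, dict_int):
    """Translate a sequence of keys into words, skipping keys not in the map."""
    return [dict_int[k] for k in keys if k in dict_int]


# Hypothetical toy mapping for illustration only; the real dict_int
# comes with the datasets and may use a different key type.
toy_dict_int = {0: "market", 1: "trade", 2: "city"}

# Key 7 is not in the toy mapping, so it is dropped.
print(decode_document([2, 0, 7, 1], toy_dict_int))
```

Skipping unknown keys is a design choice for the sketch; raising an error instead may be preferable if every key is expected to be present.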