tempbrucefu/nlp-doc-dataset

 
 

nlp-doc-dataset

NLP document classification datasets

For the 20 Newsgroups datasets (data_{1} to data_{6}): datasets/newsgroup20

For the Reuters datasets (data_{7} to data_{9}): datasets/smallreturers

The three small datasets are: places vs orgs, people vs orgs, and places vs people.

letter_dict is a dictionary of the selected words in all documents.

dict_int is the map between the keys stored in the text data files and the words in letter_dict.
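As a rough illustration of how such a key-to-word map might be used, the sketch below decodes a document stored as keys back into words. It rests on assumptions not stated in this README: that dict_int behaves like a mapping from integer keys to words, and that a document is a sequence of those keys. The function name `decode_document`, the `<unk>` placeholder, and the toy data are all hypothetical, and the actual file formats in this repository may differ.

```python
from typing import Dict, Iterable, List


def decode_document(
    keys: Iterable[int],
    key_to_word: Dict[int, str],
    unknown: str = "<unk>",
) -> List[str]:
    """Translate a sequence of keys into words.

    Keys missing from the map are replaced by the `unknown` placeholder
    instead of raising an error.
    """
    return [key_to_word.get(key, unknown) for key in keys]


# Hypothetical toy data; NOT taken from the repository's files.
toy_dict_int = {0: "market", 1: "oil", 2: "trade"}
toy_document = [1, 0, 2, 7]  # key 7 is deliberately absent from the map

decoded = decode_document(toy_document, toy_dict_int)
print(decoded)
```

Falling back to a placeholder rather than raising keeps decoding usable if a data file contains a key the map does not cover.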
