An R package consisting of dictionaries for text analysis and associated utilities. It is designed for use with quanteda but can be used more generally with any text analysis package (e.g. tidytext or tm).
# the devtools package needs to be installed for this to work
devtools::install_github("quanteda/dictionarytools")
The to-do list includes adding functions that:
- allow us to convert a wild-card or regex dictionary into a fixed match dictionary, for the supported languages (English, initially);
- expand a core word list through synonyms using the wordnet package;
- expand a core word list through (e.g.) cosine similarities to other words from a corpus;
- expand a core word list through word2vec vector proximities to other words from a corpus;
- allow easy editing of dictionaries via a round-trip to the editor.
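To make the first to-do item concrete, here is a minimal base R sketch of how a wildcard dictionary might be converted into a fixed match dictionary. The function `expand_glob_dictionary()`, the example dictionary, and the vocabulary are all hypothetical and are not part of this package; the sketch assumes the dictionary is a plain named list of glob patterns and that a vocabulary of known word types is available to match against.

```r
# Hypothetical helper (not part of dictionarytools): expand each glob
# pattern in a named list to the vocabulary words it matches.
expand_glob_dictionary <- function(dict, vocabulary) {
  lapply(dict, function(patterns) {
    # utils::glob2rx() turns a wildcard pattern such as "tax*" into an
    # anchored regular expression such as "^tax"
    regexes <- utils::glob2rx(patterns)
    matched <- lapply(regexes, function(rx) {
      vocabulary[grepl(rx, vocabulary, ignore.case = TRUE)]
    })
    # as.character() keeps the result a character vector even when
    # nothing matched
    sort(unique(as.character(unlist(matched))))
  })
}

# Assumed example inputs, for illustration only
dict <- list(
  economy = c("tax*", "econom*"),
  health  = c("doctor?", "hospital")
)
vocabulary <- c("tax", "taxes", "taxation", "taxi", "economy", "economic",
                "doctor", "doctors", "hospital", "hospitals")

fixed <- expand_glob_dictionary(dict, vocabulary)
print(fixed)
```

Note that `tax*` also picks up `taxi`, which shows why a fixed match dictionary produced this way would probably still need manual review before use.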
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.