Natural Language Processing

This repository contains all the natural language processing assignments for the CSE4022 lab as of Fall '18.

  • Import an inaugural speech using nltk.corpus
  • Display top five frequent words
  • Create a word cloud (Result)
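The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's own solution: the helper names `top_words` and `load_inaugural_tokens` are hypothetical, and the file ID `2009-Obama.txt` is an assumed example that requires NLTK plus a prior `nltk.download('inaugural')`.

```python
from collections import Counter


def top_words(tokens, n=5):
    """Return the n most frequent alphabetic tokens, case-folded."""
    # Drop punctuation and numbers so they do not dominate the counts.
    words = [t.lower() for t in tokens if t.isalpha()]
    return Counter(words).most_common(n)


def load_inaugural_tokens(fileid="2009-Obama.txt"):
    """Load one inaugural address as a list of tokens (assumed file ID)."""
    # Imported lazily so top_words() stays usable without NLTK installed.
    from nltk.corpus import inaugural
    return list(inaugural.words(fileid))
```

With the corpus available, `top_words(load_inaugural_tokens())` returns the five most common words as `(word, count)` pairs. Stop words are not filtered here, so function words such as "the" will likely lead the list.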

Experiment 2

  • Use nltk.corpus to plot a conditional frequency distribution (Result)
  • Import and use Stanford's Chinese segmenter (Result)
  • Explore the Corpus of Contemporary American English (COCA)
  • Remove stopwords from any corpus
  • Import CMU wordlist
  • Use WordNet
  • POS-tag tweets
  • Get two texts
  • Remove stop words
  • Map the text to vector spaces
  • Compute cosine(vec1, vec2)
  • Use SciPy
  • Decide whether or not to apply stemming
  • Use CogComp to run NLP tools such as Part-of-Speech tagging, Chunking, Named Entity Recognition, etc.
  • Original Link
  • Credit CogComp
  • A Chinese word segmenter
  • Designed using Stanford NLP
  • A Python GUI-based application
  • Academic review of Apache OpenNLP
  • Sample code of OpenNLP in R
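The text-similarity steps listed earlier (get two texts, remove stop words, map them to vector spaces, compute the cosine) might look like the sketch below. It is an illustration under stated assumptions rather than the lab's actual code: `STOP_WORDS` is a tiny made-up list standing in for a real stop-word corpus, `text_to_vector` and `cosine` are hypothetical helper names, and only the standard library is used. With SciPy, `scipy.spatial.distance.cosine` gives the cosine distance on dense vectors, which is 1 minus the similarity computed here.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real lists are much larger.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "on", "in", "and", "to"}


def text_to_vector(text):
    """Map text to a bag-of-words count vector, dropping stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine(vec1, vec2):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    # Only words present in both vectors contribute to the dot product.
    dot = sum(vec1[w] * vec2[w] for w in vec1.keys() & vec2.keys())
    norm1 = math.sqrt(sum(c * c for c in vec1.values()))
    norm2 = math.sqrt(sum(c * c for c in vec2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)
```

For example, `cosine(text_to_vector("The cat sat on the mat"), text_to_vector("The cat is on the mat"))` compares the vectors {cat, sat, mat} and {cat, mat}. No stemming is applied in this sketch, so inflected forms such as "cat" and "cats" count as different dimensions, which is the trade-off the stemming step asks you to weigh.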

Authors