An NLP model to train on imdb movie reviews to predict if they are positive or negative.

chaseadams509/NLP_review_predictions

Dependencies:

pip install keras

pip install spacy

To run:

python classify_reviews.py

Options:

Hyper-parameters are defined as constants at the top of the Python file.
To change them, edit those values before running the script.

Description:

This learner uses the positive_reviews.txt and negative_reviews.txt from
the IMDB reviews dataset to predict if a review is positive or negative.
It uses the Keras framework for training, and the spaCy framework for
word embedding. 

Keras was chosen because it is a mature high-level framework that allows
for fast deep learning model development. It also uses TensorFlow
underneath.

spaCy was chosen for word embedding because it already has a large
dictionary for embedding the reviews. Initially, Keras's sequence
class was used for testing. After that initial success, it was replaced
with spaCy to demonstrate how easily the code can be swapped over to
better modules.

One challenge was getting the data representations of spaCy and Keras to
work together. To bridge the two frameworks, objects were represented
as numpy ndarrays. This was done because numpy is ubiquitous and makes
indexing and slicing easy.
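
As a rough illustration of the ndarray hand-off described above, here is
a minimal sketch. It is not the code from classify_reviews.py: the names
reviews_to_arrays and toy_embed are hypothetical, and the embedder is a
stand-in so the example runs without a spaCy model. With spaCy, one would
likely pass something like `lambda text: nlp(text).vector` instead,
assuming a loaded pipeline `nlp` that includes word vectors.

```python
import numpy as np


def reviews_to_arrays(positive, negative, embed):
    """Embed each review and pack features and labels into ndarrays.

    `embed` is any callable mapping a string to a 1-D numeric vector
    (hypothetical interface, not taken from the original script).
    """
    pos = list(positive)
    neg = list(negative)
    # One row per review: shape (n_reviews, embedding_dim)
    features = np.stack(
        [np.asarray(embed(text), dtype="float32") for text in pos + neg]
    )
    # Label 1.0 for positive reviews, 0.0 for negative ones
    labels = np.concatenate(
        [np.ones(len(pos)), np.zeros(len(neg))]
    ).astype("float32")
    return features, labels


def toy_embed(text):
    """Hypothetical stand-in embedder: [character count, word count]."""
    return [len(text), len(text.split())]
```

Arrays in this shape can be passed directly to a Keras model's fit
method, and can be sliced with ordinary numpy indexing to split off a
validation set.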

Future Work:

The hyper-parameters could be passed as arguments into the script using 
argparse. This was not done due to the scope of the project being small.
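
A minimal sketch of what that argparse change might look like. The
option names and default values here are assumptions for illustration;
the actual constants are whatever is defined at the top of
classify_reviews.py.

```python
import argparse

# Hypothetical defaults standing in for the constants in the script
DEFAULT_EPOCHS = 10
DEFAULT_BATCH_SIZE = 32
DEFAULT_LEARNING_RATE = 0.001


def parse_args(argv=None):
    """Parse hyper-parameters from the command line, with defaults."""
    parser = argparse.ArgumentParser(
        description="Train a positive/negative review classifier."
    )
    parser.add_argument("--epochs", type=int, default=DEFAULT_EPOCHS)
    parser.add_argument("--batch-size", type=int, default=DEFAULT_BATCH_SIZE)
    parser.add_argument(
        "--learning-rate", type=float, default=DEFAULT_LEARNING_RATE
    )
    # argv=None makes argparse read sys.argv[1:]
    return parser.parse_args(argv)
```

With something like this in place, a run could look like
`python classify_reviews.py --epochs 3 --batch-size 16`, and any option
left out would fall back to its default.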

Furthermore, the functions could be put into classes to make it easier
to swap out different methods of reading, embedding, and training. This
was not done because it would have been overkill for a task of this
size.
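
For reference, one possible shape for that refactor is sketched below.
The ReviewPipeline class and its three stage arguments are hypothetical
and do not exist in the repository; the sketch only shows how each stage
could be swapped independently.

```python
class ReviewPipeline:
    """Hypothetical container wiring together three swappable stages."""

    def __init__(self, read, embed, train):
        self.read = read    # source -> (list of texts, list of labels)
        self.embed = embed  # text -> feature vector
        self.train = train  # (vectors, labels) -> trained model or result

    def run(self, source):
        """Read the reviews, embed each one, then train on the result."""
        texts, labels = self.read(source)
        vectors = [self.embed(text) for text in texts]
        return self.train(vectors, labels)
```

Swapping the embedding method would then mean passing a different
callable as `embed`, leaving the reading and training stages untouched.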
