Removing HTML tags
Removing punctuation
Converting to lower case
Lemmatization
Removing stop words
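The cleaning steps above can be sketched in plain Python. The stop-word list and lemma lookup below are tiny toy stand-ins for illustration only; a real pipeline would typically use NLTK's stopwords corpus and WordNetLemmatizer:

```python
import re
import string

# Toy stop-word set and lemma map (assumptions, not the original corpus resources)
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "of", "to", "in", "it"}
LEMMA_MAP = {"movies": "movie", "loved": "love", "acting": "act"}

def preprocess(text: str) -> list:
    # 1. Remove HTML tags
    text = re.sub(r"<[^>]+>", " ", text)
    # 2. Remove punctuation
    text = text.translate(str.maketrans("", "", string.punctuation))
    # 3. Convert to lower case
    text = text.lower()
    tokens = text.split()
    # 4. Lemmatize (toy lookup table standing in for WordNet)
    tokens = [LEMMA_MAP.get(t, t) for t in tokens]
    # 5. Remove stop words
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("<br />The movies were GREAT, and I loved the acting!"))
# → ['movie', 'were', 'great', 'i', 'love', 'act']
```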
SVM - 90.44%
Logistic Regression - 88.9%
Naive Bayes - 88.51%
Decision Tree - 69.33%
Random Forest - 85.35%
Gradient Boosting Classifier - 81.01%
XGBoost Classifier - 84.58%
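A minimal sketch of how one of the classical models above can be trained with scikit-learn, using TF-IDF features on a toy corpus (the actual feature extraction and hyperparameters used for the scores above are not stated, so this is an assumed setup, not the original code):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for the cleaned review corpus
train_texts = [
    "great movie loved it",
    "wonderful acting superb plot",
    "terrible film waste of time",
    "awful boring bad acting",
]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

# TF-IDF features fed into a linear SVM
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)

print(model.predict(["superb movie great plot"]))  # expected positive
print(model.predict(["boring terrible waste"]))    # expected negative
```

Swapping `LinearSVC()` for `LogisticRegression()`, `MultinomialNB()`, or a tree ensemble reproduces the rest of the comparison above on the same pipeline.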
Accuracy with 10 epochs - 87%-88%
Adding a Convolution layer - 87%-89% (training time decreases significantly in this case)
Accuracy with 10 epochs - 86%-88%
Adding a Convolution layer - 85%-87% (training time decreases significantly in this case)
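A sketch of the "add a Convolution layer before the recurrent layer" idea in Keras. The exact architecture and hyperparameters used above are not stated, so everything here (layer sizes, vocabulary size, sequence length) is an assumption; the point is that `Conv1D` + `MaxPooling1D` shorten the sequence the recurrent layer must process, which is why training speeds up:

```python
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN = 10000, 500  # assumed vocabulary size and padded review length

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, 32),
    # Conv1D + pooling shrink 500 timesteps down to 125 before the LSTM,
    # so the recurrent layer unrolls over far fewer steps
    layers.Conv1D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling1D(pool_size=4),
    layers.LSTM(32),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```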
NBSVM is an approach to text classification proposed by Wang and Manning (https://nlp.stanford.edu/pubs/sidaw12_simple_sentiment.pdf) that takes a linear model such as an SVM (or logistic regression) and infuses it with Bayesian information by replacing raw word-count features with Naive Bayes log-count ratios.
Accuracy - 91.5%
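A minimal NBSVM-style sketch on a toy term-count matrix, assuming scikit-learn's logistic regression as the linear model. This omits parts of the full Wang and Manning recipe (bigram features, the interpolation parameter); it only shows the core trick of scaling features by the Naive Bayes log-count ratios before fitting the linear model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy term-count matrix: rows = documents, columns = vocabulary counts
X = np.array([[2, 1, 0, 0],
              [1, 2, 0, 0],
              [0, 0, 2, 1],
              [0, 0, 1, 2]], dtype=float)
y = np.array([1, 1, 0, 0])  # 1 = positive, 0 = negative

# Naive Bayes log-count ratios with add-one smoothing:
# r = log( normalized positive counts / normalized negative counts )
p = X[y == 1].sum(axis=0) + 1.0
q = X[y == 0].sum(axis=0) + 1.0
r = np.log((p / p.sum()) / (q / q.sum()))

# "Infuse" the linear model: scale each feature by its log-count ratio
clf = LogisticRegression().fit(X * r, y)
print(clf.predict(np.array([[1, 1, 0, 0], [0, 0, 1, 1]]) * r))  # → [1 0]
```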
(Note: this code was written for practice, and hyperparameter tuning was not performed in most cases.)