Hate Speech Detection

Synopsis:

Classifies tweets as positive or negative (racist/sexist) using the RoBERTa model from the Transformers library, and compares the results with those of other classifiers.

Pipeline of the ML model:

Data Pre-processing:

Data exploration, followed by data visualization using graphs, plots and word clouds. Data cleaning using the NLTK library.
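The cleaning step could look roughly like the sketch below. The specific steps (dropping URLs and @handles, lowercasing, keeping alphabetic tokens, stemming) are assumptions for illustration and may differ from what the notebook does; `clean_tweet` is a hypothetical helper name. Stop-word removal is left out because it needs an extra NLTK corpus download.

```python
import re

from nltk.stem import PorterStemmer
from nltk.tokenize import TweetTokenizer

# TweetTokenizer and PorterStemmer work without downloading any NLTK corpora.
tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True, reduce_len=True)
stemmer = PorterStemmer()


def clean_tweet(text):
    """Return a lowercased, stemmed version of a tweet (hypothetical helper)."""
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = text.replace("#", " ")              # keep the hashtag word, drop the symbol
    tokens = tokenizer.tokenize(text)          # also removes @handles and lowercases
    words = [t for t in tokens if t.isalpha()] # drop punctuation, numbers, emoticons
    return " ".join(stemmer.stem(w) for w in words)


print(clean_tweet("@user Loving the #sunshine today!!! http://t.co/abc"))
```

Applying `clean_tweet` to every tweet before encoding or vectorisation keeps both model families working from the same cleaned text.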

RoBERTa model:

Encoding the sentences into input IDs and attention masks using the RoBERTa tokeniser. Building the classifier by adding additional layers on top of the imported pre-trained model. Fitting the model on the cleaned tweets and generating predictions. Evaluating the results using ROC scores and accuracy.
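The encoding and evaluation steps above can be sketched as follows. This is a minimal illustration, not the notebook's code: the helper names, the `roberta-base` checkpoint, the maximum length of 64 and the 0.5 decision threshold are all assumptions, and "ROC scores" is interpreted here as the area under the ROC curve. The model-building step is omitted because the README does not say which layers or framework are used.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score


def load_roberta_tokenizer(name="roberta-base"):
    """Load a pre-trained tokeniser (downloads files on first use)."""
    from transformers import RobertaTokenizerFast
    return RobertaTokenizerFast.from_pretrained(name)


def encode_tweets(tokenizer, tweets, max_len=64):
    """Turn raw strings into fixed-length input IDs and attention masks."""
    enc = tokenizer(
        list(tweets),
        padding="max_length",  # pad every tweet to max_len tokens
        truncation=True,       # cut tweets longer than max_len
        max_length=max_len,
    )
    return np.array(enc["input_ids"]), np.array(enc["attention_mask"])


def evaluate(y_true, y_prob, threshold=0.5):
    """Compute ROC AUC on probabilities and accuracy on thresholded labels."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    return {
        "roc_auc": roc_auc_score(y_true, y_prob),
        "accuracy": accuracy_score(y_true, y_pred),
    }
```

A typical call would be `ids, masks = encode_tweets(load_roberta_tokenizer(), cleaned_tweets)`, with `ids` and `masks` fed to the model and the predicted probabilities passed to `evaluate`.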

Alternative models:

Cleaning the original data, followed by vectorisation using the TF-IDF vectoriser. Applying SMOTE to balance the skewed data. Classification using Logistic Regression, Naive Bayes Classifier, Random Forest Classifier and SVM.

HrishiMak/Hate-Speech-Detection