Sentiment-analysis-of-Covid-tweets-using-Bert

Dataset Information

The dataset is taken from Kaggle and contains 6 columns, including User Id, TweetIn, Sentiment, Location, and Screen Name.

Methodology

Of all the columns, only two were considered: TweetsIn and Sentiment. Thorough data cleaning was then performed to remove emojis, hashtags, mentions, links, and other media. In a further cleaning pass, most non-English sentences were removed, and the dataframe was made ready for modeling.
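The cleaning steps described above could look roughly like the following sketch. It is not the repository's actual code: the regular expressions, the choice to drop whole hashtags rather than only the `#` sign, the ASCII-letter heuristic for spotting non-English text, and the helper names `clean_tweet`, `looks_english`, and `prepare` are all assumptions made for illustration. The column names are passed as parameters because the README spells the tweet column in two ways.

```python
import re

# Illustrative patterns (assumptions, not the project's actual rules).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Rough emoji and pictograph ranges; not exhaustive.
EMOJI_RE = re.compile(
    "[\U0001F1E6-\U0001F1FF\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]"
)


def clean_tweet(text):
    """Strip links, mentions, hashtags and emojis, then collapse whitespace."""
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE, EMOJI_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())


def looks_english(text, min_ascii_ratio=0.9):
    """Crude heuristic: treat text as English if most letters are ASCII."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    ascii_letters = sum(1 for ch in letters if ch.isascii())
    return ascii_letters / len(letters) >= min_ascii_ratio


def prepare(rows, text_col="TweetsIn", label_col="Sentiment"):
    """Keep only the text and label columns, clean the text, and drop
    rows that do not look English after cleaning."""
    prepared = []
    for row in rows:
        text = clean_tweet(row[text_col])
        if looks_english(text):
            prepared.append({text_col: text, label_col: row[label_col]})
    return prepared
```

The ASCII-ratio check is deliberately simple and will misclassify some tweets; a dedicated language-detection library would be a more robust choice if one is available.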

Model

The model used was BERT, which stands for Bidirectional Encoder Representations from Transformers. The pre-trained BERT model was imported from the Hugging Face 🤗 Transformers library.
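As a rough sketch of how a pre-trained BERT classifier can be loaded with the Transformers library: the README does not name the checkpoint or the number of sentiment classes, so the `bert-base-uncased` checkpoint and the helper names `build_label_maps` and `load_bert_classifier` below are assumptions for illustration only.

```python
def build_label_maps(labels):
    """Map each distinct sentiment label to an integer id and back."""
    unique = sorted(set(labels))
    label2id = {label: i for i, label in enumerate(unique)}
    id2label = {i: label for label, i in label2id.items()}
    return label2id, id2label


def load_bert_classifier(label2id, model_name="bert-base-uncased"):
    """Load a tokenizer and a BERT model with a classification head.

    The checkpoint name is an assumption; the README does not state it.
    transformers is imported lazily so build_label_maps works without it.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label = {i: label for label, i in label2id.items()}
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(label2id),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

With the model loaded, a batch of cleaned tweets would typically be encoded with `tokenizer(texts, padding=True, truncation=True, return_tensors="pt")` before being passed to the model for fine-tuning or prediction.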

Results

Training accuracy: 0.9522
Validation accuracy: 0.9255
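For reference, accuracy here is the fraction of tweets whose predicted sentiment matches the true label. A minimal sketch of that calculation follows; the function name `accuracy` and the sample labels are hypothetical and not taken from the project.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels: three of four predictions are correct.
print(accuracy([0, 1, 2, 1], [0, 1, 1, 1]))
```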

Confusion matrix for the predictions on the test set (image: 2022-01-13)
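A confusion matrix like the one shown can be built by counting, for each true label, how often each label was predicted. The sketch below is illustrative only: the function name `confusion_matrix` and the sample sentiment labels are hypothetical, since the README does not list the classes.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels,
    both in the order given by `labels`."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical labels and predictions, for illustration only.
labels = ["Negative", "Neutral", "Positive"]
y_true = ["Negative", "Neutral", "Positive", "Positive"]
y_pred = ["Negative", "Positive", "Positive", "Neutral"]
for label, row in zip(labels, confusion_matrix(y_true, y_pred, labels)):
    print(label, row)
```

The diagonal holds the correctly classified tweets, so off-diagonal cells show which sentiments the model confuses with each other.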