Fine-tuning-Hugging-face-text-classification-model

About project

The objective of this challenge is to develop a machine learning model to assess if a Twitter post related to vaccinations is positive, neutral, or negative. This solution could help governments and other public health actors monitor public sentiment towards COVID-19 vaccinations and help improve public health policy, vaccine communication strategies, and vaccination programs across the world.

About project data

The data comes from tweets collected and classified through Crowdbreaks.org [Muller, Martin M., and Marcel Salathe. "Crowdbreaks: Tracking Health Trends Using Public Social Media Data and Crowdsourcing." Frontiers in public health 7 (2019).]. Tweets have been classified as pro-vaccine (1), neutral (0) or anti-vaccine (-1). The tweets have had usernames and web addresses removed.

Variable definition:

tweet_id: Unique identifier of the tweet

safe_tweet: Text contained in the tweet. Some sensitive information, such as usernames and URLs, has been removed

label: Sentiment of the tweet (-1 for negative, 0 for neutral, 1 for positive)

agreement: The tweets were labeled by three people. Agreement indicates the percentage of the three reviewers that agreed on the given label. You may use this column in your training, but agreement data will not be shared for the test set.

Files available for download are:

Train.csv - Labelled tweets on which to train your model

Test.csv - Tweets that you must classify using your trained model
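As a minimal sketch of how the columns described above could be prepared for training, the snippet below reads a CSV with those column names using only the standard library, skips rows with a missing tweet or a malformed label, and shifts the labels from -1/0/1 to 0/1/2, since classification heads generally expect non-negative class ids. The function name `load_rows`, the `min_agreement` filter, and the 0/1/2 mapping are illustrative assumptions, not the author's confirmed preprocessing.

```python
import csv
import io

# Assumption: shift the dataset's -1/0/1 labels to 0/1/2 class ids.
LABEL_TO_ID = {-1: 0, 0: 1, 1: 2}


def load_rows(handle, min_agreement=0.0):
    """Read labelled tweets from an open CSV file and return (texts, class_ids).

    min_agreement is a hypothetical filter: rows whose annotator agreement
    falls below it are skipped.
    """
    texts, class_ids = [], []
    for row in csv.DictReader(handle):
        text = (row.get("safe_tweet") or "").strip()
        try:
            label = int(float(row["label"]))
            agreement = float(row["agreement"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or malformed label/agreement
        if not text or label not in LABEL_TO_ID or agreement < min_agreement:
            continue
        texts.append(text)
        class_ids.append(LABEL_TO_ID[label])
    return texts, class_ids


# Tiny in-memory sample standing in for Train.csv (made-up rows).
sample = io.StringIO(
    "tweet_id,safe_tweet,label,agreement\n"
    "A1,Vaccines save lives,1.0,1.0\n"
    "A2,Not sure what to think,0.0,0.666667\n"
    "A3,,-1.0,1.0\n"
)
texts, class_ids = load_rows(sample, min_agreement=0.5)
print(texts, class_ids)
```

With the real file, the same function could be called as `load_rows(open("Train.csv", newline="", encoding="utf-8"))`. Because agreement is not provided for the test set, it can only be used on the training side, for example as a filter like this one.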

Project Models

Two pretrained models from Hugging Face were fine-tuned in this project to arrive at the final model:

bert-base-uncased
roberta-base
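The sketch below shows one way either checkpoint could be loaded with a three-class classification head using the `transformers` Auto classes. It assumes the Hub identifiers `bert-base-uncased` and `roberta-base`, and that the dataset labels -1/0/1 have been shifted to 0/1/2. The helper name `build_model_and_tokenizer` and the label names are hypothetical; the actual training setup lives in the project notebook.

```python
# Assumption: three classes, with the dataset's -1/0/1 labels shifted to 0/1/2.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

# Hub identifiers of the two base checkpoints.
CHECKPOINTS = ("bert-base-uncased", "roberta-base")


def build_model_and_tokenizer(checkpoint):
    """Load a pretrained checkpoint with a fresh 3-class classification head."""
    # Imported lazily so the label maps above can be used without transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return model, tokenizer
```

Calling `build_model_and_tokenizer(CHECKPOINTS[0])` downloads the weights on first use; the returned pair can then be passed to a training loop or to the `Trainer` API.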

Model 1: training metrics and prediction

The first model and its performance metrics are available on Hugging Face at the link below:

https://huggingface.co/Gyimah3/Finetuned_bert

Model 2: training metrics and prediction

The second model and its performance metrics are available on Hugging Face at the link below:

https://huggingface.co/Gyimah3/Finetuned_roberta

Training metrics: time series

Training metrics: scalars

Model App

I created a Gradio app that uses the first model to predict the sentiment of a COVID-19 vaccine tweet.

The app is available at the link below:

https://huggingface.co/spaces/Gyimah3/Gradio_App_for_Finetuned_bert
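For illustration, an app of this kind can be sketched as a function that turns the model's raw logits into per-label probabilities, wrapped in a Gradio interface. This is not the code in Gradio_app.py: the label order, the helper names `scores_from_logits` and `build_demo`, and the `logits_fn` argument (any function that maps a tweet string to three logits) are assumptions made for the example.

```python
import math

# Assumption: class order of the model's output logits.
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores_from_logits(logits):
    """Map logits to a {label: probability} dict, the shape gr.Label accepts."""
    return dict(zip(LABELS, softmax(logits)))


def build_demo(logits_fn):
    """Wrap a text -> logits function in a simple Gradio interface."""
    import gradio as gr  # imported lazily; only needed to build the UI

    def predict(text):
        return scores_from_logits(logits_fn(text))

    return gr.Interface(
        fn=predict,
        inputs=gr.Textbox(label="Tweet"),
        outputs=gr.Label(num_top_classes=3),
    )
```

Given a hypothetical `logits_fn` that tokenizes the text and runs the fine-tuned model, `build_demo(logits_fn).launch()` would start the app locally.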
