Latest commit: 4353392 · Aug 5, 2024

Cyberbullying Classification

Overview

This project classifies tweets as cyberbullying or not cyberbullying using natural language processing (NLP) techniques and machine learning models.

Dataset

The dataset contains more than 47,000 tweets, each labeled with one of six categories: Age, Ethnicity, Gender, Religion, Other type of cyberbullying, and Not cyberbullying.

Link to the dataset: Cyberbullying Classification Dataset

Models Used

  1. Logistic Regression
  2. Naive Bayes
  3. Random Forest Classifier
  4. Voting Classifier (Ensemble Model): Combines predictions from the above models using a majority voting scheme.
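
The majority voting scheme used by the ensemble can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the label strings, the sample predictions, and the alphabetical tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most models for a single tweet.

    `predictions` holds one label per model. Ties go to the alphabetically
    first label (an assumption; the README does not say how ties are broken).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def vote_all(per_model_predictions):
    """Combine per-model prediction lists into one ensemble label per tweet."""
    # zip(*...) regroups the lists so each tuple holds every model's
    # label for the same tweet.
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical outputs of the three base models for three tweets.
logreg_preds = ["religion", "not_cyberbullying", "age"]
nb_preds = ["religion", "gender", "age"]
rf_preds = ["ethnicity", "not_cyberbullying", "gender"]

ensemble_preds = vote_all([logreg_preds, nb_preds, rf_preds])
print(ensemble_preds)  # ['religion', 'not_cyberbullying', 'age']
```

In scikit-learn, `VotingClassifier` with `voting="hard"` provides this kind of majority voting over fitted estimators.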

Contribution

Contributions are welcome! Feel free to submit issues, feature requests, or pull requests to improve the system.