https://drive.google.com/file/d/1TcGdGBmZaJvctPvitZOddEsPduOl6VCY/view?usp=drive_link
This repository contains code and documentation for a project that identifies and classifies sexual harassment in text using machine learning. It also includes a web application and a web scraping tool.
- Model1
  - Contains code for training a DistilBERT model to identify sexual harassment in text.
  - Files:
    - Code for training the DistilBERT model.
    - Code for making predictions using the trained model.
- Model2
  - Contains code for classifying the type of sexual harassment.
  - Subfolders:
    - Model2_part1: Predicts whether Commenting is involved (84% accuracy).
    - Model2_part2: Predicts whether Staring is involved (82% accuracy).
    - Model2_part3: Predicts whether Touching is involved (79.98% accuracy).
- FlaskServer
  - Contains the Flask application that integrates all four machine learning models.
- NodeServer
  - Contains code for connecting to the MongoDB database and handling API requests.
  - Files:
    - app.js: Handles API requests and sends responses.
- React-Server
  - Contains CSS and React components.
  - Subfolder:
    - src: React components.
- Contains Python code used to scrape tweets from Twitter.
- SHAP (SHapley Additive exPlanations) results for all three sub-models of Model2, with visual diagrams.
- LIME (Local Interpretable Model-agnostic Explanations) results for all three sub-models of Model2.
- The Hamming score of this model is around 85-90%.
- PDF document detailing the architecture of the project.
- PDF document containing visualizations related to the project.
- Directory containing the scraped tweets.
- Data used to train Model2 as a whole.
- Results of the machine learning models' predictions.
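
The Hamming score mentioned above is a multi-label metric. This README does not say which formula was used, so the sketch below assumes one common definition: the per-sample ratio of correctly predicted labels to all labels that appear in either the truth or the prediction, averaged over samples. The label order (Commenting, Staring, Touching) and the example data are made up for illustration and are not project results.

```python
from typing import Sequence

# Assumed label order for each row; not confirmed by the repository.
LABELS = ("Commenting", "Staring", "Touching")


def hamming_score(y_true: Sequence[Sequence[int]],
                  y_pred: Sequence[Sequence[int]]) -> float:
    """Mean per-sample |true AND pred| / |true OR pred| over 0/1 label rows."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of samples")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = 0.0
    for true_row, pred_row in zip(y_true, y_pred):
        true_set = {i for i, v in enumerate(true_row) if v}
        pred_set = {i for i, v in enumerate(pred_row) if v}
        union = true_set | pred_set
        if not union:
            # No labels in truth or prediction counts as a perfect match.
            total += 1.0
        else:
            total += len(true_set & pred_set) / len(union)
    return total / len(y_true)


# Made-up example: four texts, three labels each.
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 0, 1]]
score = hamming_score(y_true, y_pred)
print(f"Hamming score: {score:.4f}")
```

Some sources instead define the Hamming score as one minus the Hamming loss (the fraction of individual label slots predicted correctly). The two definitions generally give different numbers.
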
- Clone the repository.
- Navigate to the MachineLearning folder to use the BlackBox of three ML models.
- Navigate to the WebApp folder to use the MERN stack application.
- Navigate to the WebScrapping folder to use the Twitter scraping tool and the scraped tweet data.
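
As a rough illustration of how the four models (Model1 plus the three Model2 sub-models) could be chained, the sketch below runs a detector first and only then asks the three type classifiers. This gating order is an assumption and not necessarily how FlaskServer combines them. `classify_report` and `keyword_stub` are hypothetical names, and the keyword stubs merely stand in for the trained models.

```python
from typing import Callable, Dict

# A classifier here is any function mapping text to True/False.
Classifier = Callable[[str], bool]


def classify_report(text: str,
                    detector: Classifier,
                    type_models: Dict[str, Classifier]) -> Dict[str, bool]:
    """Run the detector; report a type only when the detector fires."""
    result = {"harassment": detector(text)}
    for label, model in type_models.items():
        # Assumed gating: type labels stay False for non-harassment text.
        result[label] = result["harassment"] and model(text)
    return result


def keyword_stub(*keywords: str) -> Classifier:
    """Hypothetical stand-in for a trained model: simple keyword match."""
    return lambda text: any(k in text.lower() for k in keywords)


detector = keyword_stub("stared", "touched", "comment")
type_models = {
    "Commenting": keyword_stub("comment"),
    "Staring": keyword_stub("stared"),
    "Touching": keyword_stub("touched"),
}

flagged = classify_report(
    "A man stared at me and made a comment on the bus.", detector, type_models
)
clean = classify_report("The bus was late today.", detector, type_models)
print(flagged)
print(clean)
```

In a real deployment each stub would be replaced by a call to the corresponding trained model's prediction code.
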