Gemini Generated Essays Analysis #462

Open
abhisheks008 opened this issue Dec 30, 2023 · 8 comments
Labels
Up-for-Grabs ✋ Issues are open to the contributors to be assigned

Comments

@abhisheks008
Owner

ML-Crate Repository (Proposing new issue)

🔴 Project Title : Gemini Generated Essays Analysis
🔴 Aim : The aim of this project is to analyze the essays generated by Gemini software using ML.
🔴 Dataset : https://www.kaggle.com/datasets/mouadberqia/gemini-generated-essays
🔴 Approach : Implement 3-4 algorithms and compare them by accuracy score to find the best-performing model for this dataset. Also, do not forget to carry out an exploratory data analysis before creating any model.
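
As a rough illustration of the comparison step, here is a minimal sketch that trains three classifiers on the same split and reports accuracy for each. It assumes scikit-learn is installed and that the task is text classification. The toy `texts` and `labels` below are hypothetical stand-ins, not the Kaggle data, and the choice of three models is only an example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: replace with the essay text and a label column
# from the dataset once its actual schema has been checked.
texts = [
    "the committee shall review the proposal in due course",
    "the report presents findings from the annual survey",
    "this essay examines the causes of economic growth",
    "the authors conclude that further research is required",
    "the policy was adopted following extensive consultation",
    "results indicate a significant increase in productivity",
    "hey that movie was so much fun last night",
    "lol i cannot believe we missed the bus again",
    "gonna grab some pizza want to come along",
    "that game was awesome we totally won",
    "ugh my phone died right before the call",
    "wow the weather is so nice today",
]
labels = ["formal"] * 6 + ["casual"] * 6


def compare_models(texts, labels, test_size=0.25, seed=42):
    """Fit each candidate on one shared train split and score it on the test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    candidates = {
        "naive_bayes": MultinomialNB(),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "linear_svm": LinearSVC(),
    }
    scores = {}
    for name, clf in candidates.items():
        # TF-IDF features feed every classifier so the comparison is like for like.
        model = make_pipeline(TfidfVectorizer(), clf)
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


scores = compare_models(texts, labels)
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("best:", best)
```

With so few samples the scores mean nothing; the point is only that every model sees the same split and the same features, which is what makes the accuracy figures comparable.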


📍 Follow the Guidelines to Contribute in the Project :

  • You need to create a separate folder named after the Project Title.
  • Inside that folder, there will be four main components.
    • Images - To store the required images.
    • Dataset - To store the dataset, or information about the dataset and its source.
    • Model - To store the machine learning model you've created using the dataset.
    • requirements.txt - This file will list the packages/libraries required to run the project on other machines.
  • Inside the Model folder, the README.md file must be filled in properly, with proper visualizations and conclusions.
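
The layout described above could be created with a short script like the following. This is only a sketch: `scaffold_project` is a hypothetical helper, not part of the repository, and the demo writes into a temporary directory rather than a real checkout.

```python
import tempfile
from pathlib import Path


def scaffold_project(root, title="Gemini Generated Essays Analysis"):
    """Create the folder layout from the contribution guidelines under `root`."""
    project = Path(root) / title
    for sub in ("Images", "Dataset", "Model"):
        (project / sub).mkdir(parents=True, exist_ok=True)
    # Empty placeholders, to be filled in by the contributor.
    (project / "requirements.txt").touch()
    (project / "Model" / "README.md").touch()
    return project


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    project = scaffold_project(tmp)
    created = sorted(p.relative_to(project).as_posix() for p in project.rglob("*"))
    print(created)
```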

🔴🟡 Points to Note :

  • The issues will be assigned on a first-come, first-served basis, 1 Issue == 1 PR.
  • "Issue Title" and "PR Title" should be the same. Include the issue number along with it.
  • Follow the Contributing Guidelines & Code of Conduct before you start contributing.

To be Mentioned while taking the issue :

  • Full name :
  • GitHub Profile Link :
  • Participant ID (If not, then put NA) :
  • Approach for this Project :
  • What is your participant role? (Mention the Open Source Program name. Eg. HRSoC, GSSoC, GSOC etc.)

Happy Contributing 🚀

All the best. Enjoy your open source journey ahead. 😎

@abhisheks008 abhisheks008 added the Up-for-Grabs ✋ Issues are open to the contributors to be assigned label Dec 30, 2023
@abhisheks008 abhisheks008 added Intermediate Points 30 - SSOC 2024 JWOC This issue/pull request will be considered for JWOC 2k22. labels Jan 11, 2024
@jayakedia10

Full name : Jaya Kedia
GitHub Profile Link : https://github.com/jayakedia10
Participant ID: NA
Approach for this Project : I plan to implement the models with algorithms commonly used for Natural Language Processing (NLP) tasks, such as Naive Bayes, Support Vector Machines (SVM), Random Forest, and RNNs.
What is your participant role?
JWoC 2024

@abhisheks008 Please assign me this issue.

@abhisheks008
Owner Author

Assigned under JWOC @jayakedia10

@abhisheks008 abhisheks008 added Assigned 💻 Issue has been assigned to a contributor and removed Up-for-Grabs ✋ Issues are open to the contributors to be assigned labels Jan 24, 2024
@abhisheks008 abhisheks008 added Up-for-Grabs ✋ Issues are open to the contributors to be assigned and removed Assigned 💻 Issue has been assigned to a contributor Intermediate Points 30 - SSOC 2024 JWOC This issue/pull request will be considered for JWOC 2k22. labels Mar 2, 2024
@milanprajapati571

Full name: Milan Prajapati

GitHub Profile Link: GitHub_Profile

Participant ID (If not, then put NA): NA

Approach for this Project:

1. Exploratory Data Analysis (EDA): Analyze the dataset using visualizations and summary statistics to gain insights.
2. Data Preprocessing: Clean and prepare the data by tokenizing text, removing stopwords, and applying other text preprocessing techniques.
3. Model Implementation: Implement 3-4 machine learning algorithms, such as Logistic Regression, Random Forest, SVM, and Naive Bayes, for sentiment analysis.
4. Model Comparison: Evaluate the models using accuracy scores and other relevant metrics to identify the best-performing model.
5. Documentation: Document the entire process, including EDA, preprocessing steps, model implementations, comparisons, and conclusions in the README.md file.
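
The preprocessing and summary-statistics steps in this plan could look roughly like the sketch below. It uses only the standard library. The tiny `STOPWORDS` set and the sample `essays` are hypothetical placeholders; a real run would more likely use a library stopword list and the actual dataset.

```python
import re
from collections import Counter
from statistics import mean

# Small illustrative stopword set (an assumption, not a standard list).
STOPWORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def preprocess(text):
    """Tokenize and drop stopwords."""
    return [tok for tok in tokenize(text) if tok not in STOPWORDS]


# Hypothetical sample essays standing in for the dataset.
essays = [
    "The model is trained on the essays.",
    "Essays in the dataset cover a range of topics.",
    "A good model generalizes to unseen essays.",
]

# Simple EDA: essay length in tokens and the most frequent content words.
lengths = [len(tokenize(e)) for e in essays]
top_words = Counter(tok for e in essays for tok in preprocess(e)).most_common(3)

print("mean length:", mean(lengths))
print("top words:", top_words)
```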

What is your participant role? : SSoC (Social Summer of Code)

Sir, can you please assign this project to me?

@EshuPatel

Full name: Eshu Patel

GitHub Profile Link: https://github.com/EshuPatel

Participant ID: NA

Approach for this Project:

  1. Performing EDA to analyze the dataset using visualizations and summary statistics to gain insights.
  2. Wrangling the data and removing noisy records.
  3. Implementing ML algorithms for the NLP task.
  4. Checking the model against accuracy scores and other metrics.
  5. Documentation of the entire project summary.

Participant role: SSoC (Social Summer of Code)

Sir, can you please assign this project to me?

@abhisheks008
Owner Author

@milanprajapati571 @EshuPatel I need a brief approach for this problem statement, with a plan for implementing 7-8 models.

@HarshRaj29004
Contributor

Name: Harsh Raj

GitHub Profile: https://github.com/HarshRaj29004

Participant ID: NA

Approach: I will preprocess the dataset and check the text for grammatical errors and readability.
I will then predict the quality of the text using Random Forest, Gradient Boosting Machines, and Neural Networks.
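
One way the readability check could be turned into a numeric feature is the Flesch Reading Ease score, sketched below. The comment above does not name a specific metric, so the choice of Flesch here is an assumption, and the vowel-group syllable counter is a crude heuristic rather than a dictionary lookup.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of vowels (heuristic, at least 1 per word)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


easy = flesch_reading_ease("The cat sat. The dog ran.")
hard = flesch_reading_ease("Institutional heterogeneity complicates interpretation.")
print(f"easy: {easy:.2f}, hard: {hard:.2f}")
```

A score like this could sit alongside other features as input to the Random Forest or gradient boosting models mentioned in the approach.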

What is your participant role? SSOC'24

@EshuPatel

I need a brief approach for this problem statement, with a plan for implementing 7-8 models.

Sure, sir. A brief approach is given below:

  • After cleaning, perform EDA on the dataset to gain more insight into the data.
  • After preprocessing the data, implement various sentiment analysis models, including Naive Bayes, Multinomial Naive Bayes, CNN, Random Forest, Logistic Regression, SVM, TextBlob, VADER, and spaCy.
  • Then compare the models to show which one gives the most accurate results.

@abhisheks008
Owner Author

Assigned @EshuPatel

@abhisheks008 abhisheks008 added Assigned 💻 Issue has been assigned to a contributor Intermediate Points 30 - SSOC 2024 SSOC and removed Up-for-Grabs ✋ Issues are open to the contributors to be assigned labels Jun 12, 2024
@abhisheks008 abhisheks008 added Up-for-Grabs ✋ Issues are open to the contributors to be assigned and removed Assigned 💻 Issue has been assigned to a contributor Intermediate Points 30 - SSOC 2024 SSOC labels Aug 3, 2024
5 participants