Emotion Recognition from Audio using Deep Learning #830

Open
ChethanaPotukanam opened this issue Jul 6, 2024 · 7 comments

@ChethanaPotukanam

Deep Learning Simplified Repository (Proposing new issue)

🔴 Project Title :
Emotion Recognition from Audio using Deep Learning
🔴 Aim :
To build a deep learning model that can analyze audio recordings and classify the emotions expressed. This can have applications in areas such as customer service, mental health monitoring, and entertainment.
🔴 Dataset :
Various publicly available datasets for emotion recognition in audio, such as RAVDESS, TESS, CREMA-D, etc.
🔴 Approach : Implement 3-4 algorithms, compare them, and identify the best-fitting one for this task by checking accuracy scores. Also, do not forget to perform an exploratory data analysis before creating any model.
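
The comparison step described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the model names, labels, and predictions are hypothetical placeholders, not results from any real run.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty label list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def rank_models(
    y_true: Sequence[str], predictions: Dict[str, Sequence[str]]
) -> List[Tuple[str, float]]:
    """Return (model_name, accuracy) pairs sorted from best to worst."""
    scores = [(name, accuracy(y_true, y_pred)) for name, y_pred in predictions.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical held-out labels and per-model predictions, for illustration only.
y_true = ["happy", "sad", "angry", "sad", "happy", "neutral"]
predictions = {
    "cnn": ["happy", "sad", "angry", "happy", "happy", "neutral"],
    "lstm": ["happy", "sad", "sad", "happy", "happy", "sad"],
    "bilstm": ["happy", "sad", "angry", "sad", "happy", "neutral"],
}

ranking = rank_models(y_true, predictions)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Accuracy alone can mislead when emotion classes are imbalanced, so a per-class view is worth adding alongside this ranking.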


📍 Follow the Guidelines to Contribute in the Project :

  • You need to create a separate folder named as the Project Title.
  • Inside that folder, there will be four main components.
    • Images - To store the required images.
    • Dataset - To store the dataset, or information about its source.
    • Model - To store the machine learning model you've created using the dataset.
    • requirements.txt - This file will list the packages/libraries required to run the project on other machines.
  • Inside the Model folder, the README.md file must be filled up properly, with proper visualizations and conclusions.

🔴🟡 Points to Note :

  • The issues will be assigned on a first-come, first-served basis; 1 Issue == 1 PR.
  • "Issue Title" and "PR Title" should be the same. Include the issue number along with it.
  • Follow the Contributing Guidelines & Code of Conduct before you start contributing.

To be Mentioned while taking the issue :

  • Full name : Chethana Potukanam
  • GitHub Profile Link : https://github.com/ChethanaPotukanam
  • Email ID : [email protected]
  • Participant ID (if applicable):
  • Approach for this Project :
    Load the Dataset
    Exploratory Data Analysis (EDA): Visualise common patterns and features in audio signals.
    Feature Extraction: Extract features such as MFCC, Chroma, Mel Spectrogram, etc.
    Model Implementation: Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM)
    Train and Evaluate Each Model
    Compare Performance using accuracy and loss metrics.
  • What is your participant role? (Mention the Open Source program) GSSoC24
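
As a rough sketch of the "Load the Dataset" and EDA steps above, assuming the RAVDESS dataset and its documented filename convention (seven hyphen-separated fields, with the third field encoding the emotion): the helper names and the sample file list below are hypothetical, and other datasets such as TESS or CREMA-D use different naming schemes.

```python
from collections import Counter
from pathlib import Path

# Emotion codes as documented for RAVDESS (third field of the filename).
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def ravdess_label(filename: str) -> str:
    """Map a RAVDESS filename such as '03-01-06-01-02-01-12.wav' to its emotion."""
    fields = Path(filename).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS filename: {filename!r}")
    return RAVDESS_EMOTIONS[fields[2]]


def class_distribution(filenames):
    """Count clips per emotion, a first EDA check for class imbalance."""
    return Counter(ravdess_label(name) for name in filenames)


# Hypothetical file list; in practice it would come from globbing the dataset folder.
files = [
    "03-01-06-01-02-01-12.wav",
    "03-01-03-02-01-01-05.wav",
    "03-01-06-02-01-02-07.wav",
    "03-01-01-01-01-01-01.wav",
]
distribution = class_distribution(files)
print(distribution)
```

Checking the class distribution before training helps decide whether accuracy is a fair comparison metric or whether per-class metrics are needed.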

Happy Contributing 🚀

All the best. Enjoy your open source journey ahead. 😎


github-actions bot commented Jul 6, 2024

Thank you for creating this issue! We'll look into it as soon as possible. Your contributions are highly appreciated! 😊

@abhisheks008 added the Status: Assigned, level 2, and gssoc labels Jul 7, 2024
@abhisheks008
Owner

Assigned @ChethanaPotukanam

@abhisheks008 linked a pull request Jul 14, 2024 that will close this issue
@abhisheks008 added the Status: Up for Grabs label and removed the Status: Assigned, level 2, and gssoc labels Aug 11, 2024
@Uknowme-h

can i work on this ?

@abhisheks008
Owner

can i work on this ?

Please share your approach.

@T3CH-Pyth0n

@abhisheks008 could you please assign me this issue?

My approach is as follows:

Dataset: RAVDESS emotional speech audio.
Feature engineering: convert the audio into Mel spectrograms.

  1. Use a CNN to classify the spectrograms by emotion (trying both the VGG-16 and ResNet50 architectures).

  2. Use a CNN to extract features from the spectrograms, then apply an LSTM / Bi-LSTM to the resulting encodings.

  3. Transcribe the audio with HuggingFace speech2text, convert the resulting text into encoding vectors with the spaCy universal sentence encoder, and classify those vectors with an ANN.

This will be followed by evaluating the models with standard metrics and visualizing heatmaps of the confusion matrices to analyse the error distribution.
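
The evaluation step could look roughly like the sketch below. It builds a confusion matrix and per-class recall with the standard library only; the labels and predictions are made-up examples, and in practice a library routine plus a heatmap plot would likely replace the printing.

```python
from typing import Dict, List, Sequence


def confusion_matrix(
    y_true: Sequence[str], y_pred: Sequence[str], labels: List[str]
) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels, both in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix: List[List[int]], labels: List[str]) -> Dict[str, float]:
    """Diagonal count divided by row total; 0.0 for classes with no true samples."""
    recall = {}
    for i, label in enumerate(labels):
        total = sum(matrix[i])
        recall[label] = matrix[i][i] / total if total else 0.0
    return recall


# Hypothetical labels and predictions, for illustration only.
labels = ["angry", "happy", "sad"]
y_true = ["angry", "angry", "happy", "happy", "sad", "sad"]
y_pred = ["angry", "sad", "happy", "happy", "sad", "angry"]

cm = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, cm):
    print(f"{label:>6}: {row}")
print(per_class_recall(cm, labels))
```

Off-diagonal cells show which emotions get confused with each other (here, angry and sad), which is the error distribution a heatmap would make visible.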

Name: Moksh Patel
GitHub profile link: https://github.com/T3CH-Pyth0n
Event: Kharagpur Winter of Code (KWoC)

@abhisheks008
Owner

Hi @T3CH-Pyth0n, sorry for the late reply. Assigning this issue to you.

@abhisheks008 added the Status: Assigned and KWOC labels and removed the Status: Up for Grabs label Dec 22, 2024
@T3CH-Pyth0n

@abhisheks008 I'll be altering the approach a bit, but I'll still implement 3-4 models. Does that work?


4 participants