Voice Activity Detection and Gender Classification Pipeline

This project implements an end-to-end pipeline for detecting voice activity, reducing noise, normalizing audio, and classifying the speaker's gender using pre-trained models.

Overview

The pipeline integrates the following tasks:

Voice Activity Detection (VAD): Detects segments in the audio where speech is present using the Silero VAD model.
Noise Reduction: Applies noise reduction techniques to enhance the audio quality.
Audio Normalization: Normalizes the audio for consistent volume levels.
Gender Classification: Classifies the detected voice as either male or female using a pre-trained Wav2Vec2 model.
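
As a rough illustration of the VAD step, the sketch below loads Silero VAD through `torch.hub` and converts its sample-based timestamps to seconds. The 16 kHz sampling rate and the helper names `detect_speech_segments` and `segments_to_seconds` are assumptions for illustration and may differ from what the project's notebooks do.

```python
SAMPLE_RATE = 16000  # assumed rate; not stated in this README


def detect_speech_segments(wav_path, sample_rate=SAMPLE_RATE):
    """Return Silero VAD speech timestamps (in samples) for a WAV file."""
    import torch  # imported lazily so the helper below works without torch

    # Downloads the model on first use (requires network access).
    model, utils = torch.hub.load(repo_or_dir="snakers4/silero-vad", model="silero_vad")
    get_speech_timestamps, _, read_audio, _, _ = utils
    wav = read_audio(wav_path, sampling_rate=sample_rate)
    return get_speech_timestamps(wav, model, sampling_rate=sample_rate)


def segments_to_seconds(timestamps, sample_rate=SAMPLE_RATE):
    """Convert [{'start': int, 'end': int}, ...] from samples to seconds."""
    return [(t["start"] / sample_rate, t["end"] / sample_rate) for t in timestamps]
```

An empty list from `detect_speech_segments` would mean no speech was found, which is the condition the rest of the pipeline branches on.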

Key Features

Voice Detection: Automatically identifies whether an audio file contains speech.
Noise Reduction: Improves audio clarity by reducing background noise.
Audio Normalization: Ensures the audio is at a consistent volume for accurate processing.
Gender Classification: Predicts the gender of the speaker with high accuracy using a fine-tuned speech model.
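
The noise reduction and normalization features above could look roughly like the following. This README does not say which normalization method is used, so peak normalization is an assumption here, as are the function names and the `target_peak` default. The `noisereduce` call uses that library's `reduce_noise(y=..., sr=...)` entry point.

```python
def reduce_noise(samples, sample_rate):
    """Suppress background noise; `samples` is expected to be a NumPy array."""
    import noisereduce as nr  # imported lazily; only needed for this step

    return nr.reduce_noise(y=samples, sr=sample_rate)


def peak_normalize(samples, target_peak=0.95):
    """Scale samples so the largest absolute value equals target_peak.

    Assumed method: the project may normalize differently (e.g. by RMS).
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Pure silence: nothing to scale, and avoids division by zero.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]
```

For example, `peak_normalize([0.1, -0.5, 0.25], target_peak=1.0)` scales every sample by 2 so that the loudest one reaches full scale.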

Workflow

Load an audio file (WAV format).
Detect speech segments using Silero VAD.
If speech is detected:
- Apply noise reduction.
- Normalize the audio.
- Classify the gender of the speaker.
Output the predicted gender along with any detected voice segments.
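
The control flow of the steps above can be sketched as a small function that takes each stage as a callable. The name `run_pipeline`, its parameters, and the shape of the returned dictionary are hypothetical; the point is only that noise reduction, normalization, and classification run when VAD reports at least one speech segment.

```python
def run_pipeline(audio, detect_speech, reduce_noise, normalize, classify):
    """Run the workflow on already-loaded audio.

    Each stage is passed in as a function so the flow can be tried out
    with stand-ins before wiring up the real models.
    """
    segments = detect_speech(audio)
    if not segments:
        # No speech detected: skip the remaining stages entirely.
        return {"speech_detected": False, "segments": [], "gender": None}

    cleaned = reduce_noise(audio)
    leveled = normalize(cleaned)
    return {
        "speech_detected": True,
        "segments": segments,
        "gender": classify(leveled),
    }
```

Passing stages as arguments keeps the branching logic separate from the model code, which makes the "no speech" path easy to check on its own.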

Models Used

Silero VAD: A pre-trained model for voice activity detection.
Wav2Vec2: A pre-trained speech model, fine-tuned for gender classification.

The link to the pre-trained model is included in the documentation (Docmentation.docx).
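
A minimal inference sketch for the classification model, assuming the fine-tuned checkpoint can be loaded with the Transformers `AutoModelForAudioClassification` class. The checkpoint identifier is not given in this README, so `model_name` is left as a required argument; the function names and the 16 kHz default are likewise assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify_gender(waveform, model_name, sample_rate=16000):
    """Classify a 1-D waveform; model_name must point at the fine-tuned checkpoint."""
    import torch  # heavy imports kept local to this function
    from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

    extractor = AutoFeatureExtractor.from_pretrained(model_name)
    model = AutoModelForAudioClassification.from_pretrained(model_name)
    inputs = extractor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # Label names come from the checkpoint's own config, not from this sketch.
    return top_label(logits, model.config.id2label)
```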

Dataset

The dataset used for training and testing the gender classification model can be downloaded from this link.

Applications

Speech Processing: Use this pipeline for tasks involving speech detection and speaker gender classification.
Audio Preprocessing: Clean and normalize audio before further analysis or modeling.
Gender Analytics: Gain insights into the gender of speakers in audio datasets.

Requirements

Python 3.x
Torch
Transformers
Torchaudio
Noisereduce
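
One quick way to confirm that the packages above are available is to check whether each can be imported. The helper name `missing_packages` is hypothetical; the import names (`torch`, `transformers`, `torchaudio`, `noisereduce`) are the ones these libraries are installed under.

```python
import importlib.util

REQUIRED = ["torch", "transformers", "torchaudio", "noisereduce"]


def missing_packages(names=REQUIRED):
    """Return the subset of top-level package names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```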

