A web application that recognizes and translates hand gestures into commands or text using a webcam. This project leverages computer vision and machine learning techniques to provide real-time hand gesture recognition capabilities.
- Recognizes predefined gestures.
- Displays corresponding action or text on the screen.
```
hand_gesture_recognition/
├── dataset/
│   └── <gesture_name>/
│       └── *.png
├── src/
│   ├── frontend/
│   │   ├── index.html
│   │   └── static/
│   │       ├── css/
│   │       │   └── styles.css
│   │       └── js/
│   │           └── script.js
│   ├── backend/
│   │   ├── app.py
│   │   ├── api/
│   │   │   └── endpoints.py
│   │   └── models/
│   │       └── gesture_model.h5
│   └── scripts/
│       └── add_gestures.py
├── requirements.txt
├── README.md
└── .gitignore
```
- Create a virtual environment and activate it: `python -m venv env` then `source env/bin/activate` (on Windows, use `env\Scripts\activate` instead).
- Install the dependencies: `pip install -r requirements.txt`
- Run the Flask app: `python src/backend/app.py`
Phase 1: Initial Setup and Data Collection. The initial phase involved setting up the project structure and collecting data for training the gesture recognition model. The dataset was organized into different folders, each representing a different gesture.
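The collection step can be reproduced with a short OpenCV loop like the one below. This is only a sketch: the function name, gesture label, and frame count are placeholders, and the project's actual capture logic lives in `src/scripts/add_gestures.py` and may differ.

```python
import os
import cv2  # assumes opencv-python is listed in requirements.txt

def capture_gesture_images(gesture_name, num_images=200, output_root="dataset"):
    """Save webcam frames as PNGs under dataset/<gesture_name>/ (illustrative only)."""
    out_dir = os.path.join(output_root, gesture_name)
    os.makedirs(out_dir, exist_ok=True)

    cap = cv2.VideoCapture(0)  # default webcam
    saved = 0
    while saved < num_images:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{saved:04d}.png"), frame)
        saved += 1
        cv2.imshow("Collecting gesture images (press q to stop)", frame)
        if cv2.waitKey(50) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    capture_gesture_images("thumbs_up")  # "thumbs_up" is just an example label
```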
Phase 2: Model Training. During this phase, a convolutional neural network (CNN) was developed and trained using the collected gesture images. The model was fine-tuned for better accuracy and saved in the models directory.
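For a rough idea of what this step looks like in code, the Keras sketch below trains a small CNN on the `dataset/` folder and saves it where the backend expects it. The architecture, image size, and epoch count are assumptions, not the exact settings used to produce `gesture_model.h5`.

```python
import tensorflow as tf

IMG_SIZE = (64, 64)  # assumed input resolution

# Each sub-folder of dataset/ is treated as one gesture class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", image_size=IMG_SIZE, batch_size=32)
num_classes = len(train_ds.class_names)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)
model.save("src/backend/models/gesture_model.h5")
```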
Phase 3: Backend Development. A Flask-based backend was developed to handle the gesture recognition logic. The backend includes API endpoints for processing the images and returning the recognized gestures.
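A minimal endpoint of this kind could look like the Flask sketch below. The `/predict` route, request field name, and class labels are illustrative assumptions; the real routes are defined in `src/backend/api/endpoints.py`.

```python
import numpy as np
from flask import Flask, jsonify, request
from PIL import Image
from tensorflow import keras

app = Flask(__name__)
model = keras.models.load_model("src/backend/models/gesture_model.h5")
CLASS_NAMES = ["fist", "palm", "thumbs_up"]  # placeholder gesture labels

@app.route("/predict", methods=["POST"])
def predict():
    """Accept an uploaded image and return the recognized gesture as JSON."""
    file = request.files["image"]
    img = Image.open(file.stream).convert("RGB").resize((64, 64))
    batch = np.expand_dims(np.asarray(img, dtype="float32"), axis=0)
    probs = model.predict(batch)[0]  # the sketch model above rescales inputs itself
    return jsonify({
        "gesture": CLASS_NAMES[int(np.argmax(probs))],
        "confidence": float(np.max(probs)),
    })

if __name__ == "__main__":
    app.run(debug=True)
```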
Phase 4: Frontend Development. The frontend was developed using HTML, CSS, and JavaScript. The web interface allows users to interact with the application and view the recognized gestures in real time.
Phase 5: Integration and Testing. The final phase involved integrating the frontend and backend, followed by extensive testing to ensure the application works seamlessly. Bug fixes and performance improvements were made during this phase.
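One simple end-to-end check, assuming the Flask app is running locally and exposes the hypothetical `/predict` route sketched above, is to post a sample image and inspect the JSON response:

```python
import requests  # assumes the requests package is available

# Path to any captured sample image; adjust to a file that exists locally.
with open("dataset/thumbs_up/0000.png", "rb") as f:
    resp = requests.post("http://127.0.0.1:5000/predict", files={"image": f})

resp.raise_for_status()
print(resp.json())  # e.g. {"confidence": 0.97, "gesture": "thumbs_up"}
```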
- Improved model accuracy through additional training data and hyperparameter tuning.
- Enhanced frontend with better UI/UX for a smoother user experience.
- Added more predefined gestures and corresponding actions.
- Fixed bugs related to real-time gesture recognition and model loading.
Contributions are welcome! Please follow the steps below to contribute to this project:
- Fork the repository on GitHub.
- Create a new branch for your feature or bugfix.
- Make your changes and commit them with descriptive messages.
- Push your changes to your forked repository.
- Submit a pull request to the main repository.
This project is licensed under the MIT License. See the LICENSE file for more details.
For any questions or inquiries, please contact me at [email protected].