This project integrates multiple data modalities to improve the accuracy and contextual relevance of responses in AI-driven QA systems.
The system combines textual and visual information to process user queries more effectively. By leveraging advances in AI and machine learning, it provides richer, more accurate answers that integrate seamlessly into a variety of applications.
- Multimodal Data Integration: Utilizes both text and image data for comprehensive query understanding.
- Advanced RAG Techniques: Employs state-of-the-art Retrieval-Augmented Generation (RAG) models to enhance answer quality.
- Wide Application Range: From educational tools to customer service enhancements, this system is versatile.
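At its core, a RAG pipeline retrieves the stored documents most similar to a query and grounds the generated answer in them. The sketch below is a minimal, dependency-free illustration of that retrieval step, not the project's actual implementation: the bag-of-words "embedding", the tiny corpus, and the function names are all hypothetical stand-ins for real text/image encoders and a real vector store.

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding', standing in for a real text/image encoder."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Each document pairs text with an image caption, so both modalities
# contribute to the retrieval score (hypothetical corpus).
corpus = [
    {"text": "The Eiffel Tower is in Paris.",
     "caption": "photo of the Eiffel Tower at night"},
    {"text": "The Colosseum is in Rome.",
     "caption": "photo of the Colosseum arena"},
]

def retrieve(query: str) -> dict:
    """Return the document whose combined text + caption best matches the query."""
    q = embed(query)
    return max(corpus, key=lambda d: cosine(q, embed(d["text"] + " " + d["caption"])))

def answer(query: str) -> str:
    # A real system would feed the retrieved context to a generator model;
    # here we simply return the grounding passage.
    return retrieve(query)["text"]

print(answer("Where is the Eiffel Tower?"))
```

In the actual system, `embed` would be replaced by learned multimodal encoders and the corpus by a vector database such as Qdrant (set up below).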
To get started with this project, clone this repository and follow the setup instructions below.
- Python 3.8+
- PyTorch 1.7+
- Datasets from [link to datasets]
```bash
git clone https://github.com/himanshu-skid19/Multimodal-RAG-QA.git
cd Multimodal-RAG-QA
pip install -r requirements.txt
```
- Make sure you have Docker installed and running
- Run the following:
```bash
docker pull qdrant/qdrant
docker run -d --name qdrant_server -p 6333:6333 qdrant/qdrant
```
- Himanshu Singhal - @himanshu-skid19
- Anushka Gupta - @anushkacodez
- Atul Jha - @Atul-04
- Parth Agarwal - @Parth-Agarwal216