- Julia Orteu
- Elena Alegret
- Joan Saurina
- Sergi Tomas
Our project aims to identify duplicates and similarities within a dataset of approximately 140,000 fashion images. This is part of the Inditex Hackathon Challenge 2024, where precision, efficiency, scalability, and completeness are crucial criteria.
The goal is to design an algorithm capable of finding similar clothing images, effectively identifying visual duplicates or near-duplicates. This can streamline e-commerce data management and provide more accurate recommendations for users.
The computational challenge is scale: comparing every pair among roughly 140,000 images would take nearly 10 billion comparisons. We tackle this with open-source deep learning models that map each image to an embedding, so that similarity can be computed on compact vectors rather than on raw pixels.
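To make the scale concrete, the number of unordered pairs among n images is n(n-1)/2. The short check below uses the approximate dataset size quoted above; the function name `pair_count` is only illustrative.

```python
def pair_count(n: int) -> int:
    """Return the number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


n_images = 140_000  # approximate dataset size stated above
print(f"{pair_count(n_images):,} pairwise comparisons")
# prints "9,799,930,000 pairwise comparisons"
```

This is why a brute-force comparison of raw images is impractical and a learned embedding is used instead.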
- Open-source algorithms and models preferred.
- Python is the recommended programming language.
- Solutions will be evaluated based on precision, efficiency, scalability, and presentation quality.
- Deep Learning-Based Fashion Search: Inspired by the "Tiered Deep Similarity Search for Fashion" framework, our solution uses deep metric learning and multi-task CNNs to build an attribute-guided metric learning model that efficiently identifies clothing similarity.
- Attribute-Guided Metric Learning (AGML): Incorporates multiple similarity tiers for fashion search based on categories, brands, and styles.
- Efficient Workflow: The solution follows a structured workflow that includes downloading data, defining similarity criteria, training the algorithm, inference, and visualization.
- Download Data: Accesses the dataset containing over 140,000 fashion images with varying angles and perspectives.
- Define Similarity Criteria: Incorporates color, shapes, textures, patterns, and embeddings.
- Find and Train Algorithm: Uses a triplet-based network for learning discriminative feature embeddings.
- Inference: Identifies similarities based on learned embeddings.
- Visualization: Provides visual results via a dashboard.
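The training and inference steps above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the project's actual implementation: the margin value, the squared Euclidean distance, the cosine-similarity ranking, and all function names and toy embeddings are assumptions made for the example. A real model would compute the embeddings with a CNN in a deep learning framework.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin` (an assumed hyperparameter).
    """
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, gallery, top_k=3):
    """Rank gallery embeddings by cosine similarity to the query.

    `gallery` maps an image id to its embedding; returns the `top_k`
    (image_id, score) pairs, best match first.
    """
    scored = [
        (image_id, cosine_similarity(query, embedding))
        for image_id, embedding in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
gallery = {
    "dress_front": [0.9, 0.1, 0.0],
    "dress_side": [0.8, 0.2, 0.1],
    "sneaker": [0.0, 0.1, 0.9],
}

# Training signal: two views of the same dress versus an unrelated item.
print(triplet_loss(gallery["dress_front"], gallery["dress_side"], gallery["sneaker"]))

# Inference: retrieve the two gallery images closest to a query embedding.
print(most_similar([0.85, 0.15, 0.05], gallery, top_k=2))
```

Ranking every gallery image for every query is still quadratic in the dataset size; at 140,000 images an approximate nearest-neighbor index would likely be needed, which this sketch does not cover.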
Figure 1: Similarity Network Architecture Diagram
Figure 2: Loss Model Curve
This project features a dashboard that demonstrates the results of our model's implementation. The dashboard was built using Streamlit and can be launched from the app directory.
Figure 3: Example of our Interface
- Navigate to the Application Directory
  Go to the app directory, where the application files are stored:
  cd app
- Launch the Dashboard
  Run the following command to launch the Streamlit application:
  streamlit run app.py
To set up the required dependencies for this project, ensure you have Python installed and run:
pip install -r requirements.txt