# QIANets

## Overview

QIANets is a model compression framework that leverages quantum-inspired techniques to reduce the size and inference time of deep learning models with minimal loss of accuracy. By integrating quantum-inspired pruning, tensor decomposition, and annealing-based matrix factorization, QIANets achieves highly efficient compression of convolutional neural networks (CNNs), tested on the GoogLeNet, ResNet-18, and DenseNet architectures.
## Features

- **Quantum-Inspired Pruning**: Selectively removes non-essential model weights, inspired by quantum measurement principles.
- **Tensor Decomposition**: Breaks down large weight matrices into more compact representations, inspired by quantum state factorization.
- **Annealing-Based Matrix Factorization**: Optimizes compression via a quantum-inspired annealing process, achieving higher compression without degrading model performance.
- **Low Latency & High Efficiency**: Reduces CNN inference times while maintaining accuracy comparable to the original model.
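As a rough illustration of the pruning idea, one can treat normalized squared weight magnitudes as a "measurement" probability distribution and sample which weights survive. The sketch below is a hypothetical illustration of that concept, not the QIANets implementation; the function name and defaults are invented for this example:

```python
import numpy as np

def quantum_inspired_prune(weights, keep_fraction=0.5, seed=0):
    """Sketch: treat squared weight magnitudes as a measurement
    distribution and keep weights with proportional probability."""
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    probs = flat**2 / np.sum(flat**2)          # "measurement" probabilities
    n_keep = max(1, int(keep_fraction * flat.size))
    kept = rng.choice(flat.size, size=n_keep, replace=False, p=probs)
    mask = np.zeros(flat.size, dtype=bool)
    mask[kept] = True                           # surviving weights
    return (flat * mask).reshape(weights.shape)

w = np.array([[0.9, -0.01], [0.02, 1.2]])
pruned = quantum_inspired_prune(w, keep_fraction=0.5)
```

Large-magnitude weights are far more likely to survive, so the pruned matrix stays close to the original while becoming sparse.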
## Table of Contents

- Overview
- Features
- Getting Started
- Installation
- Usage
- Quantum-Inspired Techniques
- Results
- Contributing
- License
- Contact
## Getting Started

Before running this project, ensure you have the following:

- Python 3.x
- TensorFlow or PyTorch
- Basic knowledge of quantum computing (helpful, but not mandatory)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/edwardmagongo/Quantum-Inspired-Model-Compression
   cd Quantum-Inspired-Model-Compression
   ```

2. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. (Optional) Set up your environment for quantum-inspired computations by installing a quantum computing library such as Qiskit for deeper exploration of the quantum principles.
## Usage

1. Train a base CNN model (e.g., ResNet-18) on the provided dataset:

   ```bash
   python train.py --dataset <dataset> --model <model-type>
   ```

2. Compress the trained model using the quantum-inspired techniques:

   ```bash
   python compress.py --model <trained-model-path> --compression-rate <rate>
   ```

3. Evaluate the compressed model:

   ```bash
   python evaluate.py --model <compressed-model-path> --dataset <dataset>
   ```

### Example

Train a ResNet-18 model on CIFAR-10:

```bash
python train.py --dataset cifar10 --model resnet18
```

Compress it at a 75% compression rate:

```bash
python compress.py --model models/resnet18.h5 --compression-rate 0.75
```

Evaluate the compressed model:

```bash
python evaluate.py --model models/resnet18_compressed.h5 --dataset cifar10
```
## Quantum-Inspired Techniques

- **Quantum-Inspired Pruning**: We draw from quantum measurement theory to prune unimportant weights based on probabilistic outcomes, reducing model size while maintaining fidelity.
- **Tensor Decomposition**: Inspired by the decomposition of quantum states, this technique factorizes large weight matrices into smaller, more efficient components.
- **Annealing-Based Matrix Factorization**: Employs a quantum-inspired annealing process to optimize the factorization, balancing accuracy and compression efficiency.
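For intuition, the tensor-decomposition idea can be illustrated with a truncated SVD of a weight matrix — a generic low-rank factorization, not the exact scheme QIANets uses. An annealing-style search could then tune the retained rank per layer. The function name and shapes below are assumptions for this sketch:

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Sketch: factor W (m x n) into U_r (m x rank) and V_r (rank x n),
    storing rank * (m + n) values instead of m * n."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]   # absorb singular values into U
    V_r = Vt[:rank, :]
    return U_r, V_r

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))          # toy "weight matrix"
U_r, V_r = low_rank_factorize(W, rank=8)
approx = U_r @ V_r                          # low-rank reconstruction
# parameter counts: original 64*32 = 2048, factored 8*(64+32) = 768
```

At inference time the dense layer `W x` is replaced by two cheaper products, `U_r (V_r x)`, trading a small reconstruction error for fewer parameters and multiplications.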
## Results

In extensive testing on CNN models such as GoogLeNet and DenseNet, QIANets achieved:

- A 50–70% reduction in inference time
- Compression rates of up to 80% without significant loss in accuracy
- Faster deployment of models on resource-constrained devices (e.g., mobile phones)

The approach shows significant promise for edge AI applications and large-scale deployment in real-time systems.
## Contributing

We welcome contributions! To get involved:

1. Fork the repo and create a new branch:

   ```bash
   git checkout -b feature/your-feature
   ```

2. Commit your changes:

   ```bash
   git commit -m "Add your feature"
   ```

3. Push your branch:

   ```bash
   git push origin feature/your-feature
   ```

4. Open a Pull Request for review.
## License

This project is licensed under the MIT License. See the LICENSE file for details.
## Contact

For questions or collaboration opportunities, reach out to:

Edward Magongo
Email: [email protected]

Thank you for your interest in QIANets!