This is the official repository for the ICASSP 2023 paper "On Designing Light-Weight Object Trackers Through Network Pruning: Use CNNs or Transformers?" by Saksham Aggarwal, Taneesh Gupta, Pawan K. Sahu, Arnav Chavan, Rishabh Tiwari, Dilip K. Prasad and Deepak K. Gupta.
Object trackers deployed on low-power devices need to be light-weight. However, most current state-of-the-art (SOTA) methods rely on compute-heavy backbones built from CNNs or Transformers. The large size of these models prevents their deployment in low-power conditions, so designing compressed variants of large tracking models is of great importance. This paper demonstrates how highly compressed light-weight object trackers can be designed using neural architectural pruning of large CNN- and Transformer-based trackers. Further, a comparative study of the architectural choices best suited to designing light-weight trackers is provided. SOTA trackers using CNNs, Transformers, and a combination of the two are compared to study their stability at various compression ratios. Finally, results for extreme pruning scenarios, going as low as 1% in some cases, are shown to study the limits of network pruning in object tracking. This work provides deeper insights into designing highly efficient trackers from existing SOTA methods.
Put the tracking datasets in ./data. The directory should look like this:
${PROJECT_ROOT}
 -- data
     -- lasot
         |-- airplane
         |-- basketball
         |-- bear
         ...
     -- got10k
         |-- test
         |-- train
         |-- val
     -- coco
         |-- annotations
         |-- images
     -- trackingnet
         |-- TRAIN_0
         |-- TRAIN_1
         ...
         |-- TRAIN_11
         |-- TEST
Note: We train only on the GOT10k train split and evaluate on the test data of GOT10k, LaSOT, OTB and TrackingNet.
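Before training or evaluation, it can help to confirm that ./data matches the layout above. The following is a minimal sketch, not part of this repository: the helper name `find_missing_dirs` and the `EXPECTED_LAYOUT` table are hypothetical, and the directory names are copied from the listing above. LaSOT category folders are not checked individually because the listing shows only a few of them.

```python
from pathlib import Path

# Hypothetical table of expected sub-directories, copied from the layout above.
# TrackingNet's TRAIN_0..TRAIN_11 chunks are generated rather than typed out.
EXPECTED_LAYOUT = {
    "lasot": [],  # one folder per object category (airplane, basketball, ...)
    "got10k": ["test", "train", "val"],
    "coco": ["annotations", "images"],
    "trackingnet": [f"TRAIN_{i}" for i in range(12)] + ["TEST"],
}


def find_missing_dirs(data_root):
    """Return the expected directories that do not exist under data_root."""
    root = Path(data_root)
    missing = []
    for dataset, subdirs in EXPECTED_LAYOUT.items():
        dataset_dir = root / dataset
        if not dataset_dir.is_dir():
            missing.append(dataset_dir)
            continue  # skip sub-directory checks for an absent dataset
        for sub in subdirs:
            if not (dataset_dir / sub).is_dir():
                missing.append(dataset_dir / sub)
    return missing


if __name__ == "__main__":
    # Assumes the script is run from ${PROJECT_ROOT}.
    for path in find_missing_dirs("./data"):
        print(f"missing: {path}")
```

Since training uses only GOT10k, a missing LaSOT or TrackingNet folder matters only for evaluation on that benchmark. OTB does not appear in the layout above, so this sketch does not check for it.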
You can find the tracker-specific codebases and their details below:
- Thanks to the authors of OSTrack, STARK, PyTracking and ViT-Slim, whose code helped us quickly implement our ideas.
- We use the ViT implementation from the timm repository.
If our work is useful for your research, please consider citing:
@INPROCEEDINGS{aggarwal2022designing,
title={On designing light-weight object trackers through network pruning: Use CNNs or transformers?},
author={Aggarwal, Saksham and Gupta, Taneesh and Sahu, Pawan Kumar and Chavan, Arnav and Tiwari, Rishabh and Prasad, Dilip K and Gupta, Deepak K},
booktitle={ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2023}}