## Installation

This repository exactly follows the code and training settings of PVT.

## Image classification on the ImageNet-1K dataset

| Method | Input size | #Params | #FLOPs | Acc@1 (%) | Pretrained models |
| --- | --- | --- | --- | --- | --- |
| HAT-Net-Tiny | 224 x 224 | 12.7M | 2.0G | 79.8 | Google / GitHub |
| HAT-Net-Small | 224 x 224 | 25.7M | 4.3G | 82.6 | Google / GitHub |
| HAT-Net-Medium | 224 x 224 | 42.9M | 8.3G | 84.0 | Google / GitHub |
| HAT-Net-Large | 224 x 224 | 63.1M | 11.5G | 84.2 | Google / GitHub |
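As a quick sanity check after downloading one of the checkpoints above, a single-image inference run might look like the sketch below. It assumes a PyTorch/timm environment as used by PVT; the module name `hat_net`, the constructor `HAT_Net_Tiny`, the checkpoint filename, and the checkpoint layout are placeholders for illustration only, so substitute the names actually used in this repository.

```python
# Minimal inference sketch for a pretrained classification checkpoint.
# Assumptions (adjust to the repo's actual code): the model is exposed as
# hat_net.HAT_Net_Tiny and the .pth file stores a plain state_dict (possibly
# nested under a 'model' key).
import torch
from PIL import Image
from torchvision import transforms

from hat_net import HAT_Net_Tiny  # hypothetical module/constructor names

# Standard ImageNet-1K preprocessing for 224 x 224 inputs.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = HAT_Net_Tiny(num_classes=1000)
state = torch.load('HAT-Net-Tiny.pth', map_location='cpu')  # placeholder path
state = state.get('model', state) if isinstance(state, dict) else state
model.load_state_dict(state)
model.eval()

# Run a single image through the model and report the top-1 class index.
image = preprocess(Image.open('example.jpg').convert('RGB')).unsqueeze(0)
with torch.no_grad():
    logits = model(image)
print(logits.argmax(dim=1).item())
```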

## Citation

If you use the code or models provided here in a publication, please consider citing:

@article{liu2024vision,
  title={Vision Transformers with Hierarchical Attention},
  author={Liu, Yun and Wu, Yu-Huan and Sun, Guolei and Zhang, Le and Chhatkuli, Ajad and Van Gool, Luc},
  journal={Machine Intelligence Research},
  volume={21},
  pages={670--683},
  year={2024},
  publisher={Springer}
}

@article{liu2021transformer,
  title={Transformer in Convolutional Neural Networks},
  author={Liu, Yun and Sun, Guolei and Qiu, Yu and Zhang, Le and Chhatkuli, Ajad and Van Gool, Luc},
  journal={arXiv preprint arXiv:2106.03180},
  year={2021}
}