transformer_pipeline

Support

Performance Comparisons

Performance comparisons on ImageNet-1K

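| Method | Input resolution | Top-1 accuracy (%) |
| --- | --- | --- |
| ViT-B | 384 | 77.9 |
| SwinV1-B | 384 | 84.2 |
| CSWin-B | 384 | 85.4 |
| iRPE base DeiT-B | 224 | 82.4 |
| DAT-B | 384 | 84.8 |
| CvT-21 | 384 | 84.9 |
| CrossViT-18 | 384 | 83.9 |
| SwinV2-B | 384 | 87.1 |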

Figures

Attention Is All You Need

Vision Transformer

Swin Transformer

CSWin Transformer

DETR

iRPE: Rethinking Position Encoding

Deformable Attention Transformer

CvT: Introducing Convolutions to Vision Transformers

CrossViT

SwinTrack

Stark