timm to pytorch conversion for vit model fix #26908

staghado · 2023-10-18T15:31:11Z

This PR fixes issue #26219 in the timm-to-PyTorch conversion script. It removes the hard-coded model dimensions by reading them from the attributes of the timm model, so the model name is no longer needed to infer them.

It does the following:

Extracts the model dimensions directly from the timm model, so no per-model if branches are needed.
Decides whether the converted model is a classification model or a feature extractor only, based on the num_classes attribute of the timm model.
For a feature-extractor-only model, removes the pooling layers from the PyTorch model and compares the output to the last hidden state instead.
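
As a rough illustration of the steps above, the sketch below reads the dimensions from a model object instead of branching on its name. The function name extract_vit_dims, the helper make_fake_vit, and the exact attribute paths (patch_embed.patch_size, embed_dim, blocks[0].mlp.fc1.out_features, blocks[0].attn.num_heads) are assumptions made for this example and are not quoted from the PR. A SimpleNamespace stand-in is used so the snippet runs without timm installed.

```python
from types import SimpleNamespace


def extract_vit_dims(timm_model):
    """Read ViT dimensions from the attributes of a timm-style model object."""
    first_block = timm_model.blocks[0]
    dims = {
        # patch_size and img_size are assumed to be (height, width) tuples
        "patch_size": timm_model.patch_embed.patch_size[0],
        "image_size": timm_model.patch_embed.img_size[0],
        "hidden_size": timm_model.embed_dim,
        "intermediate_size": first_block.mlp.fc1.out_features,
        "num_hidden_layers": len(timm_model.blocks),
        "num_attention_heads": first_block.attn.num_heads,
    }
    # num_classes == 0 is taken to mean "no classification head"
    dims["feature_extractor_only"] = timm_model.num_classes == 0
    if not dims["feature_extractor_only"]:
        dims["num_labels"] = timm_model.num_classes
    return dims


def make_fake_vit(num_classes):
    """Hypothetical stand-in with ViT-Base-like attribute values."""
    block = SimpleNamespace(
        mlp=SimpleNamespace(fc1=SimpleNamespace(out_features=3072)),
        attn=SimpleNamespace(num_heads=12),
    )
    return SimpleNamespace(
        patch_embed=SimpleNamespace(patch_size=(16, 16), img_size=(224, 224)),
        embed_dim=768,
        blocks=[block] * 12,
        num_classes=num_classes,
    )


print(extract_vit_dims(make_fake_vit(num_classes=1000)))
print(extract_vit_dims(make_fake_vit(num_classes=0)))
```

With a real timm model in place of the stand-in, the same attribute reads would apply only if the model exposes those attributes, which is the assumption this sketch rests on.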

This works for a large number of models in the ViT family.

@ArthurZucker, @amyeroberts, @rwightman

ArthurZucker

Thanks for the cleanup, LGTM but I need a second look from @rwightman 🤗

src/transformers/models/vit/convert_vit_timm_to_pytorch.py

HuggingFaceDocBuilderDev · 2023-10-19T09:42:39Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

src/transformers/models/vit/convert_vit_timm_to_pytorch.py

rwightman · 2023-10-19T17:40:52Z

Overall looking much better, should be more robust now.

A few things: there are a number of ViT configurations supported in timm that are not, to my knowledge, supported in transformers. Should there be an attempt to detect them? Some examples:

fc_norm is present (norm after pooling)
use of global average pooling in combination with (or without) the class token
non-overlapping position and class token embedding
CLIP-style ViT with norm_pre layer present
SigLIP-style ViT with attn_pool layer present
and soon, use of 'registers' via the reg_token param
use of layer scale in ViT model blocks
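
One possible way to detect several of the configurations listed above is to probe the model with getattr and treat a no-op Identity layer as "absent". This is only a sketch: the function find_unsupported_features, the Identity and LayerNorm stand-in classes, the attribute names ls1 and ls2 for layer scale, and the value "avg" for global_pool are assumptions for illustration. The names fc_norm, norm_pre, attn_pool and reg_token come from the list above. The position/class-token overlap case is not covered here.

```python
from types import SimpleNamespace


class Identity:
    """Stand-in for a no-op layer (torch.nn.Identity in a real model)."""


class LayerNorm:
    """Stand-in for an active normalization layer."""


def _is_active(layer):
    # A layer counts as present if it exists and is not a no-op Identity.
    return layer is not None and type(layer).__name__ != "Identity"


def find_unsupported_features(model):
    """Return a list of human-readable reasons the model may not convert."""
    problems = []
    if _is_active(getattr(model, "fc_norm", None)):
        problems.append("fc_norm (norm after pooling)")
    if getattr(model, "global_pool", None) == "avg":
        problems.append("global average pooling")
    if _is_active(getattr(model, "norm_pre", None)):
        problems.append("norm_pre (CLIP-style)")
    if _is_active(getattr(model, "attn_pool", None)):
        problems.append("attn_pool (SigLIP-style)")
    if getattr(model, "reg_token", None) is not None:
        problems.append("register tokens (reg_token)")
    first_block = model.blocks[0]
    # ls1 / ls2 are assumed names for the per-block layer-scale modules
    if _is_active(getattr(first_block, "ls1", None)) or _is_active(
        getattr(first_block, "ls2", None)
    ):
        problems.append("layer scale")
    return problems


plain_block = SimpleNamespace(ls1=Identity(), ls2=Identity())
plain = SimpleNamespace(
    fc_norm=Identity(), norm_pre=Identity(), global_pool="token",
    attn_pool=None, reg_token=None, blocks=[plain_block],
)
print(find_unsupported_features(plain))
```

Collecting all reasons in a list, rather than raising on the first one, is a design choice of this sketch so that a single error message can name every blocking feature.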

The git history is completely messed up!

ArthurZucker · 2023-10-25T08:08:07Z

Hey! Make sure to rebase to only have your changes! 😉

staghado · 2023-10-25T14:29:06Z

> Hey! Make sure to rebase to only have your changes! 😉

I have reset the branch's history and left only my changes which fix the issue here.

staghado · 2023-10-25T14:37:34Z

> Overall looking much better, should be more robust now.
>
> A few things, there are a number of vit configurations supported in timm that are not, to my knowledge, supported in transformers. Should there be an attempt to detect? Thinking of some examples
>
> if fc_norm is present (norm after pooling)
> use of global average pooling in combination (or without) class token
> non-overlapping position and class token embedding
> CLIP style vit with norm_pre layer present
> SigLIP style vit with attn_pool layer present
> and soon, use of 'registers' via reg_token param
> use of layer scale in ViT model blocks

I have added some checks that run before the model is converted from timm to Hugging Face.
Checks still to be added:
1. non-overlapping position and class token embedding
2. use of 'registers' via the reg_token param
3. detecting when a model has a convolutional feature extractor such as ResNet50

I have tested the script on the pre-trained ViTs, and only the following produce errors:

vit_base_r50_s16_224.orig_in21k (contains a ResNet block)
vit_base_r50_s16_384.orig_in21k_ft_in1k (contains a ResNet block)
vit_small_r26_s32_224.augreg_in21k
vit_small_r26_s32_224.augreg_in21k_ft_in1k
vit_small_r26_s32_384.augreg_in21k_ft_in1k
vit_tiny_r_s16_p8_224.augreg_in21k
vit_tiny_r_s16_p8_224.augreg_in21k_ft_in1k
vit_tiny_r_s16_p8_384.augreg_in21k_ft_in1k

rwightman · 2023-11-04T22:05:03Z

@staghado looking good. Those hybrid ResNet-ViT models should be possible to catch with a meaningful error (see the if below). Other than that, looks ready to go.

if not isinstance(model.patch_embed, timm.layers.PatchEmbed) ...
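
The isinstance check suggested above could be turned into a guard that raises a descriptive error, roughly as sketched here. The function name assert_plain_patch_embed and both stand-in classes are hypothetical. In a real script the type to compare against would be timm.layers.PatchEmbed, as named in the comment, and the stand-ins exist only so the snippet runs without timm.

```python
from types import SimpleNamespace


class PatchEmbed:
    """Stand-in for timm.layers.PatchEmbed (plain patch embedding)."""


class HybridEmbed:
    """Stand-in for a patch embedding that wraps a CNN backbone."""


def assert_plain_patch_embed(model, model_name, patch_embed_type=PatchEmbed):
    """Raise a readable error if the patch embedding is not the plain type."""
    if not isinstance(model.patch_embed, patch_embed_type):
        raise ValueError(
            f"{model_name} is not supported: its patch embedding is "
            f"{type(model.patch_embed).__name__}, not "
            f"{patch_embed_type.__name__} (likely a hybrid ResNet-ViT)."
        )


# A plain ViT passes silently; a hybrid model triggers the error.
assert_plain_patch_embed(SimpleNamespace(patch_embed=PatchEmbed()), "plain_vit")
try:
    assert_plain_patch_embed(
        SimpleNamespace(patch_embed=HybridEmbed()), "vit_base_r50_s16_224"
    )
except ValueError as err:
    print(err)
```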

rwightman

looks good from the timm perspective

amyeroberts

Thanks for adding this and making the script more general!

Just a small question on an outstanding to-do. Otherwise LGTM!

src/transformers/models/vit/convert_vit_timm_to_pytorch.py

staghado · 2023-11-18T10:38:55Z

@ArthurZucker

ArthurZucker

Thanks for improving this script! 🚀

staghado added 2 commits October 18, 2023 16:48

timm to pytorch conversion for vit model fix

1ed2469

remove unecessary print statments

330eaf2

ArthurZucker requested a review from rwightman October 19, 2023 09:25

ArthurZucker reviewed Oct 19, 2023

ArthurZucker previously approved these changes Oct 19, 2023

ArthurZucker mentioned this pull request Oct 19, 2023

Add FastViT model #26172

Closed

5 tasks

rwightman reviewed Oct 19, 2023

staghado force-pushed the timm-pytorch-conversion-fix branch from 44432e3 to 330eaf2 Compare October 25, 2023 14:21

Detect non-supported ViTs in transformers & better handle id2label mapping

d5ce0e8

staghado requested review from rwightman and ArthurZucker November 3, 2023 17:45

detect non supported hybrid resnet-vit models in conversion script

fcedbfd

rwightman approved these changes Nov 5, 2023

amyeroberts approved these changes Nov 6, 2023

remove check for overlap between cls token and pos embed

6448619

ArthurZucker approved these changes Nov 20, 2023

ArthurZucker merged commit 93f2de8 into huggingface:main Nov 20, 2023
3 checks passed

not-lain mentioned this pull request Dec 29, 2023

Converting TIMM to HF Vision transformers #26219

Closed

4 tasks
