Replies: 1 comment
@D0miH if you pass … (see open_clip/src/open_clip/factory.py, lines 208 to 253 at 49eac2f).
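Presumably the pointer is to the image-size handling in create_model. As a minimal sketch of what that could look like from the calling side, assuming the force_image_size argument that newer open_clip releases expose (the model name, pretrained tag, and the 112-pixel size are illustrative, not taken from the reply):

```python
# Sketch only: assumes create_model_and_transforms accepts force_image_size
# (present in newer open_clip releases); check the version you have installed.
import open_clip

# Load ViT-B/16 weights but build the vision tower for 112x112 inputs,
# i.e. a 7x7 patch grid instead of the default 14x14 grid at 224x224.
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-B-16',
    pretrained='openai',        # illustrative pretrained tag
    force_image_size=112,       # assumption: positional embeddings are resized to the new grid
)
tokenizer = open_clip.get_tokenizer('ViT-B-16')
```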
Hi all,
thank you so much for this awesome library!!
I am using the Vision Transformer CLIP models and would like to test some properties with different numbers of patches.
The problem is that only a limited number of models with different patch sizes are available. For example, there are only the ViT-B/16 and ViT-B/32 models, which have the same number of trainable parameters but different patch sizes.
Therefore, I would like to emulate different patch sizes by scaling the input images and fine-tuning the model. I know that the ViTs expect input images of size 224x224. However, in the original CLIP paper (Sec. 3.2) they use higher-resolution images for fine-tuning. So now to my question:
Thank you so much for your help!
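(For reference, a small sketch of the arithmetic behind the emulation idea described above; the helper name emulated_input_size and the example patch sizes are illustrative assumptions, not something stated in the thread.)

```python
# A ViT-B/16 at 224x224 sees 224/16 = 14 patches per side (196 tokens);
# a ViT-B/32 at 224x224 sees 224/32 = 7 patches per side (49 tokens).
# To make a /16 model behave like an "effective" patch size p_eff on the
# original image, resize the input so that (new_size / 16) == (224 / p_eff).

def emulated_input_size(target_patch_size: int,
                        model_patch_size: int = 16,
                        base_resolution: int = 224) -> int:
    """Input resolution at which the model's fixed patch size covers the same
    image fraction as `target_patch_size` would at `base_resolution`."""
    grid = base_resolution // target_patch_size   # desired patches per side
    return grid * model_patch_size                # e.g. 7 * 16 = 112

print(emulated_input_size(32))  # 112 -> ViT-B/16 on 112x112 mimics the ViT-B/32 grid
print(emulated_input_size(8))   # 448 -> a finer effective patch size needs upscaling
```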