Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Florence2 support #31506

Closed
wants to merge 35 commits into from
Closed

Add Florence2 support #31506

wants to merge 35 commits into from

Conversation

D4ve-R
Copy link

@D4ve-R D4ve-R commented Jun 20, 2024

What does this PR do?

Add support for microsoft/Florence2

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@D4ve-R
Copy link
Author

D4ve-R commented Jun 20, 2024

Command to check diff from original impl

FILE=configuration_florence2.py
git diff --no-index <(curl -s https://huggingface.co/microsoft/Florence-2-large/raw/main/$FILE) src/transformers/models/florence/$FILE

@amyeroberts
Copy link
Collaborator

@D4ve-R Let us know when it's ready for review!

Copy link
Author

@D4ve-R D4ve-R left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @amyeroberts i'm almost done. Have to finish writing the test cases.
Can you take a look at the code in src/transformers/models/florence?
I'm stuck in one place, see comment in this review, maybe you have some tips or suggestions how to fix them. Thank you!

Edit: I resolved all my questions

src/transformers/__init__.py Outdated Show resolved Hide resolved
src/transformers/models/florence/processing_florence2.py Outdated Show resolved Hide resolved
Copy link
Contributor

@SangbumChoi SangbumChoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comment since I'm very interested in this PR! Overall it seems like going in a right path :)

src/transformers/models/florence/processing_florence2.py Outdated Show resolved Hide resolved
src/transformers/models/florence/__init__.py Outdated Show resolved Hide resolved
tests/models/florence/test_modeling_florence2.py Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we can add some #Copied from logic for the definition and class also!
e.g.

# Copied from transformers.models.deformable_detr.modeling_deformable_detr.MultiScaleDeformableAttentionFunction

I linked groundingdino since it is a similar task

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what you mean. Could you point out where you want to include this in the code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some definition or class function can be directly copied from original repo. For example

# Copied from transformers.models.beit.modeling_beit.BeitDropPath with Beit->GroundingDino

We can use DropPath as same in this file and use #Copied from logic
This Copied from logic is for the make fix-copies reuse the definition for existing method.

BTW all the function is named independent to florence2. I'm not sure this is okay? (e.g. DropPath -> Florence2DropPath)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use DropPath as same in this file and use #Copied from logic
This Copied from logic is for the make fix-copies reuse the definition for existing method.

Ok now i know what you mean. Thank you for the tip! Changed it for now, but looking at the code in timm, it's slightly different, so i don't know if this is the right thing to do. This might introduce/persist already fixed bugs. https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/drop.py#L150

BTW all the function is named independent to florence2. I'm not sure this is okay? (e.g. DropPath -> Florence2DropPath)

Reading a bit more in transformers library, I think the naming conventions only apply for classes that are meant to be imported and/or when they are copied. Imo prefixing everything with Florence2 is not necessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok now i know what you mean. Thank you for the tip! Changed it for now, but looking at the code in timm, it's slightly different, so i don't know if this is the right thing to do. This might introduce/persist already fixed bugs. https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/drop.py#L150

The best way is probably to use is_timm_available and import DropPath from timm, same as in the original implementation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW all the function is named independent to florence2. I'm not sure this is okay? (e.g. DropPath -> Florence2DropPath)

Maybe @amyeroberts or some other maintainers can give us some hints how to handle this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was also explaining one of the patterns that repo should keep on it! You do not always have to use this #Copied method if it is not applicable. :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things to unpack here :)

I think the naming conventions only apply for classes that are meant to be imported and/or when they are copied. Imo prefixing everything with Florence2 is not necessary.

This isn't quite right.

There are certain objects which we import and don't add a prefix or reimplement. If we're implementing any custom class/layer/submodule within the modeling file, then it should be prefixed with the model's name. For example in CLIP, BaseModelOutputWithPooling is used directly as it's a common class. Whereas there are also model-specific outputs which are defined and prefixed with CLIP in their name.

In the case of drop path, we want to define our own class, with the Florence2 prefix instead of importing from timm (we should try to avoid dependencies form other libraries as much as possible).

If the functionality is the same, we should copy from another model like Beit. If it's different we should implement from scratch in this modeling file.

With regards to the # Copied from pattern, this is the classical way to utilise modules from other parts of the library.

We've recently introduced a new, improved way of adding new models which are very similar to other models in the library using the "diff converter", which replaces # Copied from.

This involves defining a diff_model_name.py file, which defines modules which are copied, overridden and newly implemented. The modeling file is then automatically generated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed feedback, this was really helpful!
I'll look into the "diff converter" method. Appreciate the guidance!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty new, so let us know if it's unclear or there's any unexpected behaviour!

@amyeroberts
Copy link
Collaborator

@D4ve-R Thanks for all your work on this so far!

I think we can close this PR. The Florence 2 model is available directly on the hub to be used within transformers e.g.: https://huggingface.co/microsoft/Florence-2-large

Apologies for not noticing this sooner

@D4ve-R
Copy link
Author

D4ve-R commented Jun 28, 2024

@amyeroberts no worries! It was really fun and a great experience to learn something.

@SangbumChoi
Copy link
Contributor

SangbumChoi commented Jun 29, 2024

@D4ve-R Hi, even though the PR might be closed, I would recommend you to keep this branch alive in your repo. I think this might be valuable when we also consider finetuning florence2 model! Nice work :)

@D4ve-R D4ve-R closed this Jun 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants