
Add CogVLM #27718

Closed
wants to merge 30 commits into from

Conversation

@NielsRogge (Contributor) commented Nov 27, 2023

What does this PR do?

This PR adds CogVLM natively to the Transformers library (it's already usable with trust_remote_code=True, but this PR makes it runnable without the xformers, einops and triton dependencies).
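For context, a rough before/after sketch of the usage this enables. The class names and the checkpoint id below are assumptions for illustration, not necessarily the final API introduced by this PR:

```python
import torch
from transformers import AutoModelForCausalLM

# Today: CogVLM is only usable via remote code, which pulls in the
# xformers, einops and triton dependencies.
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/cogvlm-chat-hf",  # illustrative checkpoint id
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# After this PR: the modeling code ships with the library, so no remote
# code or extra dependencies should be needed (hypothetical class names):
# from transformers import CogvlmForCausalLM, CogvlmProcessor
# processor = CogvlmProcessor.from_pretrained("THUDM/cogvlm-chat-hf")
# model = CogvlmForCausalLM.from_pretrained("THUDM/cogvlm-chat-hf")
```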

To do:

Comment on lines +50 to +57
def __init__(self, image_processor, tokenizer, image_size, patch_size):
    super().__init__(image_processor, tokenizer)
    # the two extra attributes discussed in the comment below
    self.image_size = image_size
    self.patch_size = patch_size
NielsRogge (Contributor, Author)


cc @ydshieh for this model I need to store 2 attributes on the processor; however, we currently don't have a processor_config.json file. Can we add support for this in from_pretrained and save_pretrained?
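A minimal sketch of the kind of round-tripping being asked for here, assuming a dedicated processor_config.json file; nothing below is an existing API at the time of this comment, and the attribute values are illustrative:

```python
import json
import os

def save_processor_config(save_directory, **attributes):
    # e.g. attributes = {"image_size": 490, "patch_size": 14} (illustrative values)
    with open(os.path.join(save_directory, "processor_config.json"), "w") as f:
        json.dump(attributes, f, indent=2)

def load_processor_config(save_directory):
    # read the attributes back so from_pretrained can pass them to __init__
    with open(os.path.join(save_directory, "processor_config.json")) as f:
        return json.load(f)
```

The idea being that save_pretrained would write such a file alongside the existing preprocessor/tokenizer configs, and from_pretrained would feed the stored attributes back into the processor's __init__.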


@NielsRogge (Contributor, Author)

A cleaner implementation I'm working on is here: https://github.com/NielsRogge/transformers/tree/add_cogvlm_cleaner. It implements the model like llava, adding the image tokens inside the model rather than creating them in the processor class.
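For illustration, a minimal sketch of that llava-style merge, assuming a placeholder image token id and vision features already projected into the language model's embedding space (names and shapes are assumptions, not the code on that branch):

```python
import torch

def merge_image_features(inputs_embeds, image_features, input_ids, image_token_id):
    # inputs_embeds:  (batch, seq_len, hidden) text embeddings, with placeholder
    #                 positions wherever input_ids equals image_token_id
    # image_features: (num_placeholder_positions, hidden) projected vision features
    mask = input_ids == image_token_id            # (batch, seq_len) boolean mask
    inputs_embeds = inputs_embeds.clone()
    inputs_embeds[mask] = image_features.to(inputs_embeds.dtype)
    return inputs_embeds
```

With this split, the processor only needs to know how many placeholder tokens to insert per image (which is where image_size and patch_size come in), while all feature merging happens inside the model.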

ydshieh mentioned this pull request Dec 19, 2023
NielsRogge mentioned this pull request Dec 22, 2023
@NielsRogge (Contributor, Author)

Closing this one in favor of the PR above.

@NielsRogge closed this Dec 22, 2023