Add CogVLM (cleaner) #28196
Conversation
Thanks for adding the model!
The model logic is a bit unclear to me in some parts; I've left a few comments. Maybe we can add more tests to make it clearer.
    return output

class CogvlmRotaryEmbedding(torch.nn.Module):
Copied from Mistral.
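For context, a minimal sketch (my own simplification, not the PR's actual code) of the Mistral-style rotary embedding this comment refers to; the real CogvlmRotaryEmbedding may differ in its details:

```python
import torch

class MistralStyleRotaryEmbedding(torch.nn.Module):
    """Simplified sketch of a Mistral-style rotary embedding (names are illustrative)."""

    def __init__(self, dim, base=10000):
        super().__init__()
        # Inverse frequencies for each pair of channels in the head dimension.
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    def forward(self, x, position_ids):
        # position_ids: (batch, seq_len) -> cos/sin of shape (batch, seq_len, dim)
        freqs = position_ids[..., None].float() * self.inv_freq
        emb = torch.cat((freqs, freqs), dim=-1)
        return emb.cos().to(x.dtype), emb.sin().to(x.dtype)
```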
# and we need to add 1 to take into account the one extra token that is going to
# be sent through the model
position_ids = build_position_ids(token_type_ids, attention_mask) + 2 + 1
position_ids = position_ids[:, -1:]
It would be nice to make this dependent on the input's sequence length instead of assuming 1, for cases like speculative decoding once it starts working for VLMs.
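A rough sketch of that suggestion, assuming the same build_position_ids helper and the surrounding prepare-inputs context (variable names are hypothetical, and the fixed offsets would need to be re-checked against the model's position scheme):

```python
# Instead of hardcoding a single new token, use the number of tokens actually fed at
# this decoding step, so speculative decoding (several candidate tokens at once) can
# reuse this path.
num_new_tokens = input_ids.shape[1]
# The original "+ 1" accounted for one extra token; generalize it to the step length.
position_ids = build_position_ids(token_type_ids, attention_mask) + 2 + num_new_tokens
position_ids = position_ids[:, -num_new_tokens:]
```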
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @amyeroberts, it would be great if you could review this PR. Quite a few updates were made, making it ready for review.
Let's have a first round of reviews from the people already familiar with the model and PR :) cc @zucchini-nlp
Thanks for working on it, I've left a few comments. I guess this also supports the new CogVLM?
# import xformers.ops as xops

# out = xops.memory_efficient_attention(
#     queries,
#     keys,
#     values,
#     scale=self.scale,
# )

# output = self.dense(out.view(batch_size, sequence_length, -1))
# output = self.output_dropout(output)
# return output
These commented-out lines should be removed.
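For reference, a rough standalone sketch of how the same computation can be expressed with PyTorch's built-in scaled_dot_product_attention instead of xformers (the scale argument needs PyTorch >= 2.1; the tensor layout is assumed to match what memory_efficient_attention expects):

```python
import torch
import torch.nn.functional as F

def sdpa_attention(queries, keys, values, scale):
    # queries/keys/values: (batch, seq_len, num_heads, head_dim), the layout that
    # xformers' memory_efficient_attention takes; SDPA wants (batch, num_heads, seq_len, head_dim),
    # hence the transposes in and out.
    out = F.scaled_dot_product_attention(
        queries.transpose(1, 2),
        keys.transpose(1, 2),
        values.transpose(1, 2),
        scale=scale,  # requires PyTorch >= 2.1
    ).transpose(1, 2)
    batch_size, seq_len = queries.shape[:2]
    # Merge heads back, mirroring out.view(batch_size, sequence_length, -1) above.
    return out.reshape(batch_size, seq_len, -1)
```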
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
This PR adds CogVLM in a cleaner way. Follow-up of #27718.
Debugging logits (for the branch with matching logits, see add_cogvlm_cleaner_with_matching_logits):
- self.weight * hidden_states.to(input_dtype) instead of (self.weight * hidden_states).to(input_dtype) (see the sketch below)
- EagerAttention besides SdpaAttention (only the latter gives matching logits)
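To make the first bullet concrete, a minimal RMSNorm-style sketch (illustrative only, not the PR's exact class) showing the two cast orderings; per the description, only the first variant matched the original logits:

```python
import torch
import torch.nn as nn

class RMSNormSketch(nn.Module):
    """Illustrative RMSNorm-style layer to show the dtype ordering from the description."""

    def __init__(self, hidden_size, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states):
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        # Matching variant: cast back to the input dtype first, then scale by the weight.
        return self.weight * hidden_states.to(input_dtype)
        # Non-matching variant: scale in float32, then cast the product back.
        # return (self.weight * hidden_states).to(input_dtype)
```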