feat: adding mplugdocowl #31059

Draft · wants to merge 55 commits into base: main

Conversation

danaaubakirova

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@danaaubakirova danaaubakirova marked this pull request as draft May 27, 2024 12:39
```python
self.vocab_size = config.text_config.vocab_size
# initialize LlamaAttention
# replace_llama_modality_adaptive()
transformers.models.llama.modeling_llama.LlamaAttention = MultiwayAttention
```

Author

I am not sure whether this changes the model architecture, because it still doesn't have the modules that I need, such as MultiwayAttention or a new decoder.

Contributor

@NielsRogge NielsRogge May 27, 2024

Hi, I saw this PR, so I thought I might help here :)

Usually we avoid using inheritance or classes from different models. Each model is typically implemented in a single modeling_xxx.py script which is self-contained and does not depend on any other modeling files.

This means that we'll need to define a MPLUGDocOwlMultiwayAttention class, a MPLUGDocOwlTextDecoderLayer class, and so on.

One can leverage # Copied from statements above a class or method when it is identical to another one, e.g. MistralAttention copies from LlamaAttention since they're the same.
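
As a rough illustration of that convention (the class below is only a placeholder sketch, not the final implementation), such a self-contained copy would carry a header like:

```python
from torch import nn


# Copied from transformers.models.llama.modeling_llama.LlamaAttention with Llama->MPLUGDocOwl
class MPLUGDocOwlAttention(nn.Module):
    """Attention duplicated from Llama so that modeling_mplugdocowl.py stays self-contained."""
    # ... body identical to LlamaAttention, with the names substituted accordingly
```

The repository's consistency tooling (e.g. `make fix-copies`) then keeps such copies in sync with the Llama originals.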

Author

Hello Niels! @molbap provided similar feedback :) Yes, I just created a separate file for the language model and defined separate classes. This indeed helps! Thank you!

Contributor

Great! No worries :)

Contributor

@molbap molbap left a comment

Nice progress!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

Small comment: this should be named with ...modeling... somewhere, so it's clear at a glance what the file is.

Contributor

@molbap molbap left a comment

Some comments on RoPE; I will add another review later.

Contributor

Need a checkout here to clear the diff.

Comment on lines 87 to 104
```python
class MPLUGDocOwlRotaryEmbedding(nn.Module):
    def __init__(self, dim, max_position_embeddings=2048, base=10000, device=None, scaling_factor=1.0):
        super().__init__()
        self.scaling_factor = scaling_factor
        self.dim = dim
        self.max_position_embeddings = max_position_embeddings
        self.base = base
        inv_freq = 1.0 / (self.base ** (torch.arange(0, self.dim, 2, dtype=torch.int64).float().to(device) / self.dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)
        # For BC we register cos and sin cached
        self.max_seq_len_cached = max_position_embeddings
        t = torch.arange(self.max_seq_len_cached, device=device, dtype=torch.int64).type_as(self.inv_freq)
        t = t / self.scaling_factor
        freqs = torch.outer(t, self.inv_freq)
        # Different from paper, but it uses a different permutation in order to obtain the same calculation
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("_cos_cached", emb.cos().to(torch.get_default_dtype()), persistent=False)
        self.register_buffer("_sin_cached", emb.sin().to(torch.get_default_dtype()), persistent=False)
```

Contributor

This seems to be an older implementation of RoPE. The class name suggests this is basic RoPE, but it applies linear scaling:

    t = torch.arange(self.max_seq_len_cached, device=device, dtype=torch.int64).type_as(self.inv_freq)
    t = t / self.scaling_factor

This is now done on the position_ids, and in the correct class. Better to use the more recent implementation from the Llama codebase, since this model is Llama-based.
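
For reference, a minimal sketch of that newer pattern (modeled on the recent Llama code; the MPLUGDocOwl class names are placeholders and details are trimmed, so treat this as illustrative rather than the exact upstream implementation): the base class derives cos/sin from the position_ids it receives, and linear scaling is isolated in a dedicated subclass.

```python
import torch
from torch import nn


class MPLUGDocOwlRotaryEmbedding(nn.Module):
    def __init__(self, dim, base=10000, device=None):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.int64).float().to(device) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    @torch.no_grad()
    def forward(self, x, position_ids):
        # cos/sin are computed from the position_ids passed in at call time; nothing is pre-scaled or cached
        inv_freq = self.inv_freq[None, :, None].float().expand(position_ids.shape[0], -1, 1)
        positions = position_ids[:, None, :].float()
        freqs = (inv_freq @ positions).transpose(1, 2)
        emb = torch.cat((freqs, freqs), dim=-1)
        return emb.cos().to(x.dtype), emb.sin().to(x.dtype)


class MPLUGDocOwlLinearScalingRotaryEmbedding(MPLUGDocOwlRotaryEmbedding):
    def __init__(self, dim, base=10000, device=None, scaling_factor=1.0):
        super().__init__(dim, base=base, device=device)
        self.scaling_factor = scaling_factor

    def forward(self, x, position_ids):
        # linear scaling lives only in this subclass and is applied to the position ids themselves
        position_ids = position_ids.float() / self.scaling_factor
        return super().forward(x, position_ids)
```

With this split, _init_rope would instantiate the plain class when config.rope_scaling is None and the scaling variant only when scaling is explicitly configured, matching the Llama behaviour.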

Contributor

And from the Omni configuration, it looks like the default RoPE scaling is None, which in this case would instantiate this class, which has linear scaling.

Contributor

Make sure this is not impacting generation; it could change a few things if the positional embeddings are messed up.
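
One way to sanity-check this once a converted checkpoint exists (everything below is a placeholder sketch: the checkpoint path, prompt, and the assumption of a Llava-style MPLUGDocOwlForConditionalGeneration class): run a short greedy generation and diff the output against the original repository.

```python
import torch
from PIL import Image
from transformers import AutoProcessor
from transformers import MPLUGDocOwlForConditionalGeneration  # assumes the Llava-style naming this PR seems to follow

checkpoint = "path/to/converted-mplugdocowl"  # placeholder: local output of the conversion script
processor = AutoProcessor.from_pretrained(checkpoint)
model = MPLUGDocOwlForConditionalGeneration.from_pretrained(checkpoint, torch_dtype=torch.float16, device_map="auto")

image = Image.open("document_page.png")  # any test document image
inputs = processor(text="What is the title of this document?", images=image, return_tensors="pt").to(model.device)

# greedy decoding keeps the output deterministic, so it can be diffed against the original repo's output
generated = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```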

Comment on lines 21 to 33
The mPLUGDocOwl model was proposed in [<INSERT PAPER NAME HERE>](<INSERT PAPER LINK HERE>) by <INSERT AUTHORS HERE>.
<INSERT SHORT SUMMARY HERE>

The abstract from the paper is the following:

*<INSERT PAPER ABSTRACT HERE>*

Tips:

<INSERT TIPS ABOUT MODEL HERE>

This model was contributed by [INSERT YOUR HF USERNAME HERE](https://huggingface.co/<INSERT YOUR HF USERNAME HERE>).
The original code can be found [here](<INSERT LINK TO GITHUB REPO HERE>).
Contributor

Todo

Contributor

Still todo: add the paper authors, abstract, tips, your contributor HF handle, and the original GitHub repo.

Author

Almost done; the tips still need to be added.

Contributor

@molbap molbap left a comment

My comments weren't published as I had poor connectivity... most are outdated, but I'm submitting them in case something slips by.

"past_key_values": past_key_values,
"use_cache": kwargs.get("use_cache"),
"attention_mask": attention_mask,
"pixel_values": pixel_values,
Contributor

A bit confused here: why are pixel_values passed? Images are probably not needed here, since inputs_embeds are created from the images and text and are then passed to the model's forward() call.

Author

We need to keep them because, to get the image_features, pixel_values must not be None. The same thing happens in the Llava code.
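
For context, this is roughly the Llava-style pattern being referred to (a simplified sketch, not the exact upstream code; the method is shown outside its class for brevity): pixel_values are forwarded so that forward() can compute image_features and merge them with the text embeddings.

```python
# simplified sketch of a Llava-style prepare_inputs_for_generation
def prepare_inputs_for_generation(
    self, input_ids, past_key_values=None, attention_mask=None, pixel_values=None, **kwargs
):
    if past_key_values is not None:
        # after the first step the image tokens are already merged into the cache,
        # so only the newly generated token is fed back in
        input_ids = input_ids[:, -1:]

    return {
        "input_ids": input_ids,
        "past_key_values": past_key_values,
        "use_cache": kwargs.get("use_cache"),
        "attention_mask": attention_mask,
        # kept not-None so that forward() can compute image_features from the images
        "pixel_values": pixel_values,
    }
```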


```python
def _init_rope(self):
    if self.config.rope_scaling is None:
        self.rotary_emb = MPLUGDocOwlRotaryEmbedding(
```

Contributor

Here, IIUC, this is not the correct RoPE class; it adds scaling.

Comment on lines +55 to +64
```python
def _get_unpad_data(attention_mask):
    seqlens_in_batch = attention_mask.sum(dim=-1, dtype=torch.int32)
    indices = torch.nonzero(attention_mask.flatten(), as_tuple=False).flatten()
    max_seqlen_in_batch = seqlens_in_batch.max().item()
    cu_seqlens = F.pad(torch.cumsum(seqlens_in_batch, dim=0, dtype=torch.int32), (1, 0))
    return (
        indices,
        cu_seqlens,
        max_seqlen_in_batch,
    )
```

Contributor

only useful for FA2
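
For intuition (a toy example, not code from this PR): the helper just converts a padding mask into the indices / cu_seqlens format that the varlen flash-attention kernels consume, which is why it has no use outside the FA2 path.

```python
import torch
import torch.nn.functional as F

attention_mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])  # two sequences of lengths 3 and 2

seqlens_in_batch = attention_mask.sum(dim=-1, dtype=torch.int32)              # tensor([3, 2])
indices = torch.nonzero(attention_mask.flatten(), as_tuple=False).flatten()   # positions of real tokens
cu_seqlens = F.pad(torch.cumsum(seqlens_in_batch, dim=0, dtype=torch.int32), (1, 0))  # tensor([0, 3, 5])

# flash_attn_varlen_* kernels take exactly these unpadded indices and cumulative lengths;
# the eager and SDPA paths work on the padded mask directly, so they never need this helper.
print(indices, cu_seqlens, seqlens_in_batch.max().item())
```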

danaaubakirova and others added 27 commits June 27, 2024 13:18
@danaaubakirova danaaubakirova mentioned this pull request Jul 16, 2024
@huggingface huggingface deleted a comment from github-actions bot Jul 29, 2024
@huggingface huggingface deleted a comment from github-actions bot Sep 24, 2024