feat: adding mplugdocowl #31059

Draft · wants to merge 55 commits into base: main

Commits (55) · changes shown from 7 commits
b311e5e
feat: adding mplugdocowl
danaaubakirova May 27, 2024
aa0ec04
feat: added separate file for the mPLUGDocOwl language model
danaaubakirova May 27, 2024
cc7e9b3
feat: added vision encoder for mplugdocowl
danaaubakirova May 27, 2024
204daba
fix: changed the attention mechanism in clip vision, renamed to MPLUG…
danaaubakirova May 28, 2024
6e144e5
feat: added hreducer and new things in config, changed vision embeddi…
danaaubakirova May 28, 2024
9f94d2c
fix: converted hreducer module related tensors to contiguous
danaaubakirova May 29, 2024
19ffc83
feat: added shape adaptive module
danaaubakirova May 31, 2024
85dce8d
feat: added new image_processing script
danaaubakirova Jun 3, 2024
0f5fb87
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jun 4, 2024
53aca6d
fix: small fix
danaaubakirova Jun 4, 2024
cb25b05
Merge branch 'mplugdocowl' of github.com:danaaubakirova/transformers …
danaaubakirova Jun 4, 2024
1debae3
feat: added the additional keys to the output of the data
danaaubakirova Jun 4, 2024
66b849d
feat: made major modifications to image_processing script. added the …
danaaubakirova Jun 6, 2024
1716668
feat: refactored shape_adaptive_cropping function and resolved the is…
danaaubakirova Jun 10, 2024
452ebf5
feat: testing forward
danaaubakirova Jun 11, 2024
1e7f386
feat: corrected image tag
danaaubakirova Jun 12, 2024
8577f35
fix: attention mask handling is fixed, .forward works
danaaubakirova Jun 13, 2024
f546fbc
feat: updates in vision architecture
danaaubakirova Jun 18, 2024
edc358d
Update src/transformers/models/mplugdocowl/language_modeling_mplugdoc…
danaaubakirova Jun 19, 2024
9003d59
fix: renaming the model
danaaubakirova Jun 19, 2024
9f688d9
grand fix: fixed hreducer, the first generated token is correct. forw…
danaaubakirova Jun 21, 2024
30c8a2b
fix: need to fix prepare_inputs_for_generation()
danaaubakirova Jun 24, 2024
5483f82
fix: fixed prepare_inputs_for_generation()
danaaubakirova Jun 24, 2024
413ddad
Merge branch 'main' into mplugdocowl
danaaubakirova Jun 25, 2024
7546063
testing phase
danaaubakirova Jun 25, 2024
e3cc222
removed copied from ..
danaaubakirova Jun 25, 2024
4f4f219
small fixes
danaaubakirova Jun 25, 2024
661bd75
removed some things from the config
danaaubakirova Jun 26, 2024
8aded38
small fixes
danaaubakirova Jun 27, 2024
19e0a35
update
danaaubakirova Jun 27, 2024
8300463
small fix
danaaubakirova Jun 27, 2024
f0c87d8
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
b75b2b9
Update src/transformers/models/mplugdocowl/modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
2aae5ca
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
105b5e1
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
7a2f434
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
205e345
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
0f5ba22
Update src/transformers/models/mplugdocowl/processing_mplugdocowl.py
danaaubakirova Jun 27, 2024
c0e241a
Update src/transformers/models/mplugdocowl/processing_mplugdocowl.py
danaaubakirova Jun 27, 2024
1555e04
Update src/transformers/models/mplugdocowl/processing_mplugdocowl.py
danaaubakirova Jun 27, 2024
219d866
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jun 27, 2024
4600f75
Update src/transformers/models/mplugdocowl/convert_mplugdocowl_weight…
danaaubakirova Jun 27, 2024
cb55d49
Update src/transformers/models/mplugdocowl/language_modeling_mplugdoc…
danaaubakirova Jun 27, 2024
c4c711c
model card is updated. tips to be added
danaaubakirova Jun 28, 2024
3007178
fix
danaaubakirova Jun 28, 2024
cdcf2f6
added documentation, updated rotary embedding function, added ModelTest
danaaubakirova Jun 28, 2024
cc7681f
updated
danaaubakirova Jul 1, 2024
c8c8b14
fixes
danaaubakirova Jul 2, 2024
6897da5
update
danaaubakirova Jul 2, 2024
0f0e517
deleted test.py
danaaubakirova Jul 2, 2024
046e2bd
filled in the types and docstrings
danaaubakirova Jul 2, 2024
1c498fc
nit
danaaubakirova Jul 2, 2024
6b5af5e
fixes
danaaubakirova Jul 2, 2024
e8cebb5
update
danaaubakirova Jul 2, 2024
dd0f8ce
new
danaaubakirova Jul 3, 2024
47 changes: 47 additions & 0 deletions docs/source/en/model_doc/mplugdocowl.md
@@ -0,0 +1,47 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

# mPLUGDocOwl

## Overview

The mPLUGDocOwl model was proposed in [<INSERT PAPER NAME HERE>](<INSERT PAPER LINK HERE>) by <INSERT AUTHORS HERE>.
<INSERT SHORT SUMMARY HERE>

The abstract from the paper is the following:

*<INSERT PAPER ABSTRACT HERE>*

Tips:

<INSERT TIPS ABOUT MODEL HERE>

This model was contributed by [INSERT YOUR HF USERNAME HERE](https://huggingface.co/<INSERT YOUR HF USERNAME HERE>).
The original code can be found [here](<INSERT LINK TO GITHUB REPO HERE>).
Contributor:

Todo

Contributor:

still todo, add the paper authors, abstract, tips, your contributor hf handle, original gh repo

Author:

almost done, tips need to be added



## MPLUGDocOwlConfig

[[autodoc]] MPLUGDocOwlConfig

## MPLUGDocOwlProcessor

[[autodoc]] MPLUGDocOwlProcessor

## MPLUGDocOwlForConditionalGeneration

[[autodoc]] MPLUGDocOwlForConditionalGeneration
- forward
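
Since the model card above is still a template, here is a minimal, hedged usage sketch based only on the classes this PR registers. The checkpoint id, prompt format (including any required image placeholder token), and generation settings are assumptions, not part of the PR:

```python
# Hypothetical usage sketch: only the class names come from this PR; the
# checkpoint id and prompt format are placeholders.
import requests
from PIL import Image

from transformers import MPLUGDocOwlForConditionalGeneration, MPLUGDocOwlProcessor

model_id = "<org>/<mplugdocowl-checkpoint>"  # placeholder checkpoint id
processor = MPLUGDocOwlProcessor.from_pretrained(model_id)
model = MPLUGDocOwlForConditionalGeneration.from_pretrained(model_id)

url = "https://example.com/document_page.png"  # placeholder document image
image = Image.open(requests.get(url, stream=True).raw)

prompt = "What is the title of this document?"  # real prompts may need an image token
inputs = processor(text=prompt, images=image, return_tensors="pt")

generated_ids = model.generate(**inputs, max_new_tokens=50)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```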
18 changes: 18 additions & 0 deletions src/transformers/__init__.py
@@ -486,6 +486,10 @@
"LlavaConfig",
"LlavaProcessor",
],
"models.mplugdocowl": [
"MPLUGDocOwlConfig",
"MPLUGDocOwlProcessor",
],
"models.llava_next": [
"LlavaNextConfig",
"LlavaNextProcessor",
@@ -2297,6 +2301,12 @@
"LlavaPreTrainedModel",
]
)
_import_structure["models.mplugdocowl"].extend(
[
"MPLUGDocOwlForConditionalGeneration",
"MPLUGDocOwlPreTrainedModel",
]
)
_import_structure["models.llava_next"].extend(
[
"LlavaNextForConditionalGeneration",
@@ -5037,6 +5047,10 @@
LlavaConfig,
LlavaProcessor,
)
from .models.mplugdocowl import (
MPLUGDocOwlConfig,
MPLUGDocOwlProcessor,
)
from .models.llava_next import (
LlavaNextConfig,
LlavaNextProcessor,
@@ -6692,6 +6706,10 @@
LlavaForConditionalGeneration,
LlavaPreTrainedModel,
)
from .models.mplugdocowl import (
MPLUGDocOwlForConditionalGeneration,
MPLUGDocOwlPreTrainedModel,
)
from .models.llava_next import (
LlavaNextForConditionalGeneration,
LlavaNextPreTrainedModel,
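
These four hunks add the public import surface. A short sketch of what becomes importable from the top-level package once they land (it assumes torch is installed, so the guarded modeling exports are active):

```python
# Hedged sketch: the names exported at the top level by the hunks above.
from transformers import (
    MPLUGDocOwlConfig,
    MPLUGDocOwlForConditionalGeneration,  # torch-only, per the guarded extend() block
    MPLUGDocOwlPreTrainedModel,           # torch-only, per the guarded extend() block
    MPLUGDocOwlProcessor,
)
```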
1 change: 1 addition & 0 deletions src/transformers/models/__init__.py
@@ -129,6 +129,7 @@
lilt,
llama,
llava,
mplugdocowl,
llava_next,
longformer,
longt5,
2 changes: 2 additions & 0 deletions src/transformers/models/auto/configuration_auto.py
@@ -139,6 +139,7 @@
("lilt", "LiltConfig"),
("llama", "LlamaConfig"),
("llava", "LlavaConfig"),
("mplugdocowl", "MPLUGDocOwlConfig"),
("llava_next", "LlavaNextConfig"),
("longformer", "LongformerConfig"),
("longt5", "LongT5Config"),
@@ -416,6 +417,7 @@
("llama2", "Llama2"),
("llama3", "Llama3"),
("llava", "LLaVa"),
("mplugdocowl", "mPLUGDocOwl"),
("llava_next", "LLaVA-NeXT"),
("longformer", "Longformer"),
("longt5", "LongT5"),
1 change: 1 addition & 0 deletions src/transformers/models/auto/image_processing_auto.py
@@ -79,6 +79,7 @@
("layoutlmv3", "LayoutLMv3ImageProcessor"),
("levit", "LevitImageProcessor"),
("llava", "CLIPImageProcessor"),
("mplugdocowl", "CLIPImageProcessor"),
("llava_next", "LlavaNextImageProcessor"),
("mask2former", "Mask2FormerImageProcessor"),
("maskformer", "MaskFormerImageProcessor"),
2 changes: 2 additions & 0 deletions src/transformers/models/auto/modeling_auto.py
@@ -298,6 +298,7 @@
("idefics2", "Idefics2ForConditionalGeneration"),
("layoutlm", "LayoutLMForMaskedLM"),
("llava", "LlavaForConditionalGeneration"),
("mplugdocowl", "MPLUGDocOwlForConditionalGeneration"),
("llava_next", "LlavaNextForConditionalGeneration"),
("longformer", "LongformerForMaskedLM"),
("luke", "LukeForMaskedLM"),
@@ -698,6 +699,7 @@
("instructblip", "InstructBlipForConditionalGeneration"),
("kosmos-2", "Kosmos2ForConditionalGeneration"),
("llava", "LlavaForConditionalGeneration"),
("mplugdocowl", "MPLUGDocOwlForConditionalGeneration"),
("llava_next", "LlavaNextForConditionalGeneration"),
("paligemma", "PaliGemmaForConditionalGeneration"),
("pix2struct", "Pix2StructForConditionalGeneration"),
1 change: 1 addition & 0 deletions src/transformers/models/auto/processing_auto.py
@@ -67,6 +67,7 @@
("layoutlmv2", "LayoutLMv2Processor"),
("layoutlmv3", "LayoutLMv3Processor"),
("llava", "LlavaProcessor"),
("mplugdocowl", "MPLUGDocOwlProcessor"),
("llava_next", "LlavaNextProcessor"),
("markuplm", "MarkupLMProcessor"),
("mctct", "MCTCTProcessor"),
1 change: 1 addition & 0 deletions src/transformers/models/auto/tokenization_auto.py
@@ -241,6 +241,7 @@
),
),
("llava", ("LlamaTokenizer", "LlamaTokenizerFast" if is_tokenizers_available() else None)),
("mplugdocowl", ("MPLUGDocOwlTokenizer", "MPLUGDocOwlTokenizerFast" if is_tokenizers_available() else None)),
("llava_next", ("LlamaTokenizer", "LlamaTokenizerFast" if is_tokenizers_available() else None)),
("longformer", ("LongformerTokenizer", "LongformerTokenizerFast" if is_tokenizers_available() else None)),
(
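
Taken together, the auto-mapping entries above register the new "mplugdocowl" model type with the Auto* factories. A hedged sketch of what that enables (the checkpoint id is a placeholder, and the exact AutoModel factory depends on which mappings the modeling_auto.py entries were added to, which this excerpt does not show):

```python
# Hypothetical dispatch sketch: the checkpoint id below is a placeholder.
from transformers import AutoConfig, AutoProcessor

checkpoint = "<org>/<mplugdocowl-checkpoint>"  # placeholder, not a real repo

config = AutoConfig.from_pretrained(checkpoint)        # resolves to MPLUGDocOwlConfig
processor = AutoProcessor.from_pretrained(checkpoint)  # resolves to MPLUGDocOwlProcessor
# modeling_auto.py maps "mplugdocowl" to MPLUGDocOwlForConditionalGeneration, so the
# matching AutoModelFor... factory would return that class for this checkpoint.
```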
55 changes: 55 additions & 0 deletions src/transformers/models/mplugdocowl/__init__.py
@@ -0,0 +1,55 @@
# Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import TYPE_CHECKING

from ...utils import OptionalDependencyNotAvailable, _LazyModule, is_torch_available


_import_structure = {
    "configuration_mplugdocowl": ["MPLUGDocOwlConfig"],
    "processing_mplugdocowl": ["MPLUGDocOwlProcessor"],
}


try:
    if not is_torch_available():
        raise OptionalDependencyNotAvailable()
except OptionalDependencyNotAvailable:
    pass
else:
    _import_structure["modeling_mplugdocowl"] = [
        "MPLUGDocOwlForConditionalGeneration",
        "MPLUGDocOwlPreTrainedModel",
    ]


if TYPE_CHECKING:
    from .configuration_mplugdocowl import MPLUGDocOwlConfig
    from .processing_mplugdocowl import MPLUGDocOwlProcessor

    try:
        if not is_torch_available():
            raise OptionalDependencyNotAvailable()
    except OptionalDependencyNotAvailable:
        pass
    else:
        from .modeling_mplugdocowl import (
            MPLUGDocOwlForConditionalGeneration,
            MPLUGDocOwlPreTrainedModel,
        )

else:
    import sys

    sys.modules[__name__] = _LazyModule(__name__, globals()["__file__"], _import_structure)
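
For reference, a minimal sketch of how this lazy-import structure behaves at runtime (nothing below is defined by the PR beyond the names registered above; the attribute-driven loading is the standard `_LazyModule` pattern in Transformers):

```python
# Hedged sketch: with the _LazyModule registration above, submodules are only
# imported when one of their attributes is first accessed.
from transformers.models import mplugdocowl

config_cls = mplugdocowl.MPLUGDocOwlConfig  # triggers import of configuration_mplugdocowl

# The modeling class is only exported when torch is installed, mirroring the
# is_torch_available() guard above.
try:
    model_cls = mplugdocowl.MPLUGDocOwlForConditionalGeneration
except (ImportError, AttributeError):
    model_cls = None  # torch missing: the guarded block was skipped
```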