Support Kosmos-2.5 #31711
Open
tic-top wants to merge 284 commits into huggingface:main from tic-top:main
+4,834
−2
Changes from 219 commits
71d3275
format
d9b23c4
format
3cbca06
.
8bde09f
format
2e6cad8
add procesor
6d797c6
init weight
532b1e0
.
234149a
.
05c9943
import sort
7d8783b
.
2de836d
format
9eece30
format
3a0cfaa
reformat
b72fe0a
reformat
589e9ef
reformat
fe51247
Merge remote-tracking branch 'upstream/main' into main
241b0bf
fixup
ba8b3dd
init test
9c74c61
init weight
363180b
modeling_test in progress
29d7cff
model test
42dd2ea
better initilization
9046ec5
model test
b64e300
restore ks2_test; update ks25 test
916781a
load from the config
578acce
processor test
c306325
run slow-prepare some test
b7d5ec9
skip sdpa test
f05e361
test finish
f19b06c
duplicate import
73dddc5
add mean
cd8ac6e
std
35ef655
fixup
9379458
remove tmp img
2e398f7
hi
40b4e98
init test
303e918
fix format
d5ad957
initialization test passed
e81b7fe
update readme
eb2b93c
Merge remote-tracking branch 'upstream/main' into main
6fa6221
[run-slow] kosmos2_5
ydshieh 7710f9a
[run-slow] kosmos2_5 on A10
ydshieh 63877c3
[run-slow] kosmos2_5
ydshieh 630a40d
fix copyright
ca820d0
Update create_circleci_config.py
tic-top f518e50
Update create_circleci_config.py
tic-top 5c5dd54
Revert "fix format"
d14ac7d
Merge branch 'main' of https://github.com/tic-top/transformers into main
8998e48
Revert "Revert "fix format""
c5c4864
Revert "format"
c7c52a7
Fix copyright and add arvix link
607f65e
Update create_circleci_config.py
tic-top a5c48d5
fix copyright
9b24a63
Merge branch 'main' of https://github.com/tic-top/transformers into main
625fc05
test for ks25 processor
2fba9ab
sdpa, eager, fa2 modeling test
5d1d095
fix format
7e810e2
upload doc images
ydshieh 2fe1f94
[ydshieh] update eager/sdpa ocr expected outputs
ydshieh ec82032
[ydshieh] update FA2 ocr expected outputs
ydshieh 8066ee7
[ydshieh] require_flash_attn
ydshieh 9c1539a
[ydshieh] no need eval()
ydshieh 4eca23c
[ydshieh] cuda_compute_capability_major_version
ydshieh b574b09
[ydshieh] fix FA2 deco
ydshieh d2c57cc
[ydshieh] [ydshieh] update eager ocr expected outputs
ydshieh 93b291f
[ydshieh] update FA2 md expected outputs
ydshieh b7be077
[ydshieh] fix
ydshieh d577c90
remove add_special_tokens
2537140
without grad when generating
24961cd
Update src/transformers/models/kosmos2_5/configuration_kosmos2_5.py
tic-top 6eb0683
Update src/transformers/models/kosmos2_5/convert_kosmos2_5.py
tic-top ca57f47
Update src/transformers/models/kosmos2_5/configuration_kosmos2_5.py
tic-top c23a8dd
Update src/transformers/models/kosmos2_5/configuration_kosmos2_5.py
tic-top 452b23d
add batch test
4308a40
fix document in ks25 config
2db6b88
Merge branch 'main' of https://github.com/tic-top/transformers into main
1776f31
fix foc in ks25 processor
188adbf
add comment to ks25 image processor
3cebe13
update copyright
c54f9a8
Update src/transformers/models/kosmos2_5/convert_kosmos2_5.py
tic-top 54a632e
Update src/transformers/models/kosmos2_5/convert_kosmos2_5.py
tic-top 5b3a6f7
fix doc in ks25 cfg
e9e56d0
simplify ks25 image procrssor
8b27f80
Merge branch 'main' of https://github.com/tic-top/transformers into main
5ba6d84
simplify ks25 image processor
25e3260
[ydshieh] update repo name in doc
ydshieh fbbf151
[ydshieh] images, width, height, rows, cols = ...
ydshieh 28b58ff
remove unnecessary comment
06c52ae
copied from comment added
99f0d99
add meaningful comment
2a782f0
Merge branch 'main' of https://github.com/tic-top/transformers into main
da45edd
ks25 image processor test added
0ddfe76
add more ks25 processor test
9dcacfc
fix style
0d166de
[ydshieh] 2024
ydshieh 32df418
[ydshieh] better skip
ydshieh 9fca9ca
[ydshieh] num_image_tokens
ydshieh 87ccbc7
Merge remote-tracking branch 'upstream/main' into main
ed50bbd
refractor FA2
c027a98
fix error
64f915e
fix ans
26fb969
[ydshieh] test_sdpa
ydshieh 6b82ce0
[ydshieh] better skip
ydshieh 482e5e1
[ydshieh] better skip
ydshieh bd76555
fix format
09d8b29
make style
cfaa28f
test_model_input_names need torch
ab546cc
[ydshieh] remove
ydshieh 6cae0b6
[ydshieh] add copied
ydshieh 9e0c277
[ydshieh] style
ydshieh cc17791
[ydshieh] Kosmos2_5ForConditionalGeneration
ydshieh 865fc2f
[ydshieh] docstring
ydshieh 162f569
[ydshieh] copied
ydshieh 889d9da
[ydshieh] copied
ydshieh 40dc555
[ydshieh] copied
ydshieh 7e5a91c
[ydshieh] copied
ydshieh 7dfd145
[ydshieh] copied
ydshieh d0e4fb7
[ydshieh] copied
ydshieh 60240f2
[ydshieh] copied
ydshieh 2b2fe1c
[ydshieh] copied
ydshieh 267e1d6
[ydshieh] copied
ydshieh 2ea4d4f
[ydshieh] fix
ydshieh 18fa43b
[ydshieh] fix
ydshieh 2157f31
[ydshieh] fix
ydshieh ac1968b
fix bug
29d272b
[kirp] make style
70d85cd
[ydshieh] copied
ydshieh 1424e07
[ydshieh] copied
ydshieh 6f8b2e6
[ydshieh] _init_weights
ydshieh 2cdb62a
[ydshieh] _init_weights
ydshieh f2b61c2
[ydshieh] _init_weights
ydshieh 3681119
[yilinjia] fix doc in config
7df3000
[ydshieh] update vision model class inheritance
ydshieh de6d842
[ydshieh] copied statement for vision model
ydshieh e09217e
[ydshieh] update _init_weights
ydshieh 210ccb1
[ydshieh] update _init_weights
ydshieh 4e709e5
[ydshieh] update _init_weights
ydshieh e62993c
[ydshieh] copied statement for Kosmos2_5TextModel
ydshieh e6fe2ae
[ydshieh] Kosmos2TextForCausalLM
ydshieh 703ccfd
[ydshieh] tiny tweak
ydshieh e41b875
[ydshieh] tests
ydshieh 9822d00
[ydshieh] tests
ydshieh 1e175ba
[ydshieh] tests
ydshieh e583cd4
[ydshieh] tests
ydshieh bb4c247
[ydshieh] stye
ydshieh 139e834
[ydshieh] revert
ydshieh 66af73d
remove old url
6659897
[ydshieh] fix
ydshieh 720a8ab
[ydshieh] fix
ydshieh 8ee2aa9
[ydshieh] fix
ydshieh 9d7363f
[ydshieh] update value
ydshieh 1bd02b2
[ydshieh] add to toctree
ydshieh 06cbb5d
[kirp] update the example part in readme
f4c73b3
[kirp] remove zero bias
0ae49e0
[kirp] iterate over the images only once
ef6754c
[kirp] remove cross attention
9a01f8f
[kirp] reformat
eb116ab
[kirp] use string
e1ab413
[kirp] remove creating mask in the layer
fe418d0
[kirp] remove cache
cc7d28f
Revert "[kirp] remove creating mask in the layer"
e5ffaee
[kirp] fix typo in processor
b5ebf09
[kirp] remove head mask
dd12798
[kirp] remove test file
15feaea
[kirp] cache for eager
ab687f5
[kirp] sdpa cache
87ab935
[kirp] move attention_mask maker to vision encoder
54b1984
[kirp] cache sdpa and format
5e5a9e9
[kirp] fix format
0ed8541
[kirp] fix format
df9d3ad
[kirp] use update_causal_mask
55cb12d
[kirp] check copies
d99934d
[kirp] regroup the init
c705049
[kirp] make style
806ca1b
[run-slow] kosmos2_5
9e620b6
[run-slow] fix checkpoint bug
65490b4
[run-slow] fix checkpoint bug
d0bf57e
Merge remote-tracking branch 'upstream/main' into main
f5d4439
[run-slow] kosmos2_5
40ff015
[run-slow] kosmos2_5
63603d6
[kirp] remove cross_attn in textblock
tic-top f8497ce
[run-slow] kosmos2_5
tic-top eab8e69
[run-slow] kosmos2_5
tic-top a6154db
[run-slow] kosmos2_5
tic-top 94cc6d2
[ydshieh] update loop
ydshieh 968b033
[ydshieh] remove duplication in init file
ydshieh 142604d
[ydshieh] tokenizer class
ydshieh 4b7bc95
[ydshieh] remove copied from
ydshieh 6f2bd73
[ydshieh] skip
ydshieh 08e1cb0
[ydshieh] move
ydshieh f2dae0d
Merge branch 'main' into kosmos25
ydshieh fcc095f
[ydshieh] fix copie
ydshieh f66c6ee
[ydshieh] remove
ydshieh 9a8479d
[ydshieh] Add to MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES
ydshieh 830671b
[ydshieh] new init
ydshieh 1c58c8f
[ydshieh] fix
ydshieh 0153a08
[ydshieh] remove
ydshieh ac94b57
[ydshieh] add ProcessorTesterMixin
ydshieh 52788cc
[ydshieh] add GenerationTesterMixin
ydshieh 0b9e5ad
Merge branch 'main' into kosmos25
ydshieh 925e14a
Merge branch 'main' into main
ydshieh 6ed504d
fix
ydshieh 9a841ad
fix
ydshieh dcced48
fix
ydshieh 91fa383
fix
ydshieh e3802f4
fix
ydshieh 85da449
fix
ydshieh b1db4f2
fix
ydshieh f8c98d6
it's Friday night, let cross finger
ydshieh fbb3e59
it's Friday night, let cross finger
ydshieh ce3a6b0
it's Friday night, let cross finger
ydshieh 90c4fcc
it's Friday night, let cross finger
ydshieh 00e324d
it's Friday night, let cross finger
ydshieh 9c8aff7
it's Friday night, let cross finger
ydshieh 2c47915
it's Friday night, let cross finger
ydshieh 395a636
it's Monday let's go
ydshieh 8a058d9
it's Monday let's go
ydshieh c639eeb
it's Monday let's go
ydshieh b688c4f
Merge branch 'ca03842c' into kosmos25
ydshieh d1c52f4
temp
ydshieh 3a58742
temp
ydshieh d5b8349
temp
ydshieh 9ddc86b
temp
ydshieh 39dc6ef
temp
ydshieh b2c3db2
temp
ydshieh c356a36
temp
ydshieh 55944fc
temp
ydshieh 83d600e
temp
ydshieh 2d4cbba
temp
ydshieh 6b2f7d7
temp
ydshieh 5f731a9
temp
ydshieh 0ec499a
temp
ydshieh 7f0d26c
temp
ydshieh db865db
temp
ydshieh bf14c4b
temp
ydshieh 9b29aac
temp
ydshieh ce222a6
temp
ydshieh 876cb6b
temp
ydshieh a3638ea
temp
ydshieh 30f927a
temp
ydshieh a65a9b1
temp
ydshieh 7c99fd0
temp
ydshieh ec9ea0c
fix
ydshieh 8fc9699
fix
ydshieh 22cb70d
fix
ydshieh 001fd70
fix
ydshieh d1116f5
fix
ydshieh 6f09a51
fix
ydshieh 7d0b827
Merge branch 'main' into main
ydshieh cd018b0
Merge branch 'main' into kosmos25
ydshieh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
<!--Copyright 2024 The HuggingFace Team. All rights reserved. | ||
|
||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | ||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations under the License. | ||
|
||
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be | ||
rendered properly in your Markdown viewer. | ||
|
||
--> | ||
|
||
# KOSMOS-2.5 | ||
|
||
## Overview | ||
|
||
Kosmos-2.5 is a multimodal literate model for machine reading of text-intensive images. Pre-trained on large-scale text-intensive images, Kosmos-2.5 excels in two distinct yet cooperative transcription tasks: (1) generating spatially-aware text blocks, where each block of text is assigned its spatial coordinates within the image, and (2) producing structured text output that captures styles and structures into the markdown format. This unified multimodal literate capability is achieved through a shared decoder-only auto-regressive Transformer architecture, task-specific prompts, and flexible text representations. We evaluate Kosmos-2.5 on end-to-end document-level text recognition and image-to-markdown text generation. Furthermore, the model can be readily adapted for any text-intensive image understanding task with different prompts through supervised fine-tuning, making it a general-purpose tool for real-world applications involving text-rich images. This work also paves the way for the future scaling of multimodal large language models. | ||
|
||
The abstract from the paper is the following: | ||
|
||
*We present Kosmos-2.5, a multimodal literate model for machine reading of text-intensive images. Pre-trained on large-scale text-intensive images, Kosmos-2.5 excels in two distinct yet cooperative transcription tasks: (1) generating spatially-aware text blocks, where each block of text is assigned its spatial coordinates within the image, and (2) producing structured text output that captures styles and structures into the markdown format. This unified multimodal literate capability is achieved through a shared Transformer architecture, task-specific prompts, and flexible text representations. We evaluate Kosmos-2.5 on end-to-end document-level text recognition and image-to-markdown text generation. Furthermore, the model can be readily adapted for any text-intensive image understanding task with different prompts through supervised fine-tuning, making it a general-purpose tool for real-world applications involving text-rich images. This work also paves the way for the future scaling of multimodal large language models.* | ||
|
||
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/kosmos2_5_ocr.png" | ||
alt="drawing" width="600"/> | ||
|
||
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/kosmos2_5_md.png" | ||
alt="drawing" width="600"/> | ||
|
||
<small> Overview of tasks that KOSMOS-2.5 can handle. Taken from the <a href="https://arxiv.org/abs/2309.11419">original paper</a>. </small> | ||
|
||
## Example | ||
**Markdown Task:** For usage instructions, please refer to [md.py](https://huggingface.co/microsoft/kosmos-2.5/blob/main/md.py). | ||
|
||
**OCR Task:** For usage instructions, please refer to [ocr.py](https://huggingface.co/microsoft/kosmos-2.5/blob/main/ocr.py). | ||
|
||
|
||
|
||
## Kosmos2_5Config | ||
|
||
[[autodoc]] Kosmos2_5Config | ||
|
||
## Kosmos2_5ImageProcessor | ||
|
||
[[autodoc]] Kosmos2_5ImageProcessor | ||
|
||
## Kosmos2_5Processor | ||
|
||
[[autodoc]] Kosmos2_5Processor | ||
- __call__ | ||
|
||
## Kosmos2_5Model | ||
|
||
[[autodoc]] Kosmos2_5Model | ||
- forward | ||
|
||
## Kosmos2_5ForConditionalGeneration | ||
|
||
[[autodoc]] Kosmos2_5ForConditionalGeneration | ||
- forward |
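The overview in the documentation diff above describes the first task as generating text blocks that each carry spatial coordinates. A minimal sketch of consuming such output follows. It assumes the generated string marks each block with a `<bbox><x_..><y_..><x_..><y_..></bbox>` tag followed by the block's text; that tag layout, and the names `TextBlock` and `parse_text_blocks`, are illustrative assumptions and are not taken from this diff. The linked ocr.py remains the authoritative reference.

```python
import re
from typing import List, NamedTuple


class TextBlock(NamedTuple):
    """One recognized text block with its bounding box (hypothetical type)."""

    x0: int
    y0: int
    x1: int
    y1: int
    text: str


# Assumed tag layout for a block header; not confirmed by this PR's diff.
_BBOX = re.compile(r"<bbox><x_(\d+)><y_(\d+)><x_(\d+)><y_(\d+)></bbox>")


def parse_text_blocks(generated: str) -> List[TextBlock]:
    """Split a generated string into (box, text) blocks."""
    matches = list(_BBOX.finditer(generated))
    blocks: List[TextBlock] = []
    for i, match in enumerate(matches):
        # The text of a block runs until the next tag or the end of the string.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(generated)
        text = generated[match.end():end].strip()
        x0, y0, x1, y1 = (int(group) for group in match.groups())
        if x0 >= x1 or y0 >= y1:
            continue  # skip degenerate boxes
        blocks.append(TextBlock(x0, y0, x1, y1, text))
    return blocks
```

The coordinates returned here are in whatever space the model emitted them; rescaling to the original image size is left out of this sketch.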
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -502,6 +502,11 @@ | |
"Kosmos2Config", | ||
"Kosmos2Processor", | ||
], | ||
"models.kosmos2_5": [ | ||
"Kosmos2_5Config", | ||
"Kosmos2_5ImageProcessor", | ||
"Kosmos2_5Processor", | ||
], | ||
"models.layoutlm": [ | ||
"LayoutLMConfig", | ||
"LayoutLMTokenizer", | ||
|
@@ -1195,6 +1200,7 @@ | |
_import_structure["models.idefics3"].extend(["Idefics3ImageProcessor"]) | ||
_import_structure["models.imagegpt"].extend(["ImageGPTFeatureExtractor", "ImageGPTImageProcessor"]) | ||
_import_structure["models.instructblipvideo"].extend(["InstructBlipVideoImageProcessor"]) | ||
_import_structure["models.kosmos2_5"].extend(["Kosmos2_5ImageProcessor", "Kosmos2_5Processor"]) | ||
ydshieh marked this conversation as resolved.
|
||
_import_structure["models.layoutlmv2"].extend(["LayoutLMv2FeatureExtractor", "LayoutLMv2ImageProcessor"]) | ||
_import_structure["models.layoutlmv3"].extend(["LayoutLMv3FeatureExtractor", "LayoutLMv3ImageProcessor"]) | ||
_import_structure["models.levit"].extend(["LevitFeatureExtractor", "LevitImageProcessor"]) | ||
|
@@ -2485,6 +2491,13 @@ | |
"Kosmos2PreTrainedModel", | ||
] | ||
) | ||
_import_structure["models.kosmos2_5"].extend( | ||
[ | ||
"Kosmos2_5ForConditionalGeneration", | ||
"Kosmos2_5Model", | ||
"Kosmos2_5PreTrainedModel", | ||
] | ||
) | ||
_import_structure["models.layoutlm"].extend( | ||
[ | ||
"LayoutLMForMaskedLM", | ||
|
@@ -5320,6 +5333,11 @@ | |
Kosmos2Config, | ||
Kosmos2Processor, | ||
) | ||
from .models.kosmos2_5 import ( | ||
Kosmos2_5Config, | ||
Kosmos2_5ImageProcessor, | ||
Kosmos2_5Processor, | ||
) | ||
from .models.layoutlm import ( | ||
LayoutLMConfig, | ||
LayoutLMTokenizer, | ||
|
@@ -6051,6 +6069,7 @@ | |
from .models.idefics3 import Idefics3ImageProcessor | ||
from .models.imagegpt import ImageGPTFeatureExtractor, ImageGPTImageProcessor | ||
from .models.instructblipvideo import InstructBlipVideoImageProcessor | ||
from .models.kosmos2_5 import Kosmos2_5ImageProcessor, Kosmos2_5Processor | ||
where?
ydshieh marked this conversation as resolved.
|
||
from .models.layoutlmv2 import ( | ||
LayoutLMv2FeatureExtractor, | ||
LayoutLMv2ImageProcessor, | ||
|
@@ -7130,6 +7149,11 @@ | |
Kosmos2Model, | ||
Kosmos2PreTrainedModel, | ||
) | ||
from .models.kosmos2_5 import ( | ||
Kosmos2_5ForConditionalGeneration, | ||
Kosmos2_5Model, | ||
Kosmos2_5PreTrainedModel, | ||
) | ||
from .models.layoutlm import ( | ||
LayoutLMForMaskedLM, | ||
LayoutLMForQuestionAnswering, | ||
|
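The hunks above register the Kosmos-2.5 names in `_import_structure` and extend the list again where an optional backend is required. The sketch below illustrates that guard pattern in isolation. `is_available` and `build_import_structure` are hypothetical helpers, and the exception class is a local stand-in for the one in `transformers.utils`; this is not the library's actual code.

```python
import importlib.util
from typing import Dict, List


class OptionalDependencyNotAvailable(Exception):
    """Local stand-in for the transformers exception of the same name."""


def is_available(package: str) -> bool:
    # Hypothetical helper: checks whether a top-level package can be
    # imported, without actually importing it.
    return importlib.util.find_spec(package) is not None


def build_import_structure(backend: str) -> Dict[str, List[str]]:
    # Names that need no optional backend are always registered.
    structure: Dict[str, List[str]] = {"models.kosmos2_5": ["Kosmos2_5Config"]}
    try:
        if not is_available(backend):
            raise OptionalDependencyNotAvailable()
    except OptionalDependencyNotAvailable:
        pass  # backend missing: the guarded names are simply not exposed
    else:
        structure["models.kosmos2_5"].extend(
            [
                "Kosmos2_5ForConditionalGeneration",
                "Kosmos2_5Model",
                "Kosmos2_5PreTrainedModel",
            ]
        )
    return structure
```

Calling `build_import_structure("torch")` on a machine without PyTorch would return only the configuration entry, which mirrors why the model classes sit behind the torch check in the diff.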
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -123,6 +123,7 @@ | |
jamba, | ||
jetmoe, | ||
kosmos2, | ||
kosmos2_5, | ||
layoutlm, | ||
layoutlmv2, | ||
layoutlmv3, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
# coding=utf-8 | ||
# Copyright 2024 Microsoft Research and The HuggingFace Inc. team. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
from typing import TYPE_CHECKING | ||
|
||
from ...utils import ( | ||
OptionalDependencyNotAvailable, | ||
_LazyModule, | ||
is_torch_available, | ||
is_vision_available, | ||
) | ||
|
||
|
||
_import_structure = { | ||
"configuration_kosmos2_5": ["Kosmos2_5Config"], | ||
"image_processing_kosmos2_5": ["Kosmos2_5ImageProcessor"], | ||
"processing_kosmos2_5": ["Kosmos2_5Processor"], | ||
} | ||
|
||
try: | ||
if not is_torch_available(): | ||
raise OptionalDependencyNotAvailable() | ||
except OptionalDependencyNotAvailable: | ||
pass | ||
else: | ||
_import_structure["modeling_kosmos2_5"] = [ | ||
"Kosmos2_5ForConditionalGeneration", | ||
"Kosmos2_5Model", | ||
"Kosmos2_5PreTrainedModel", | ||
] | ||
|
||
|
||
if TYPE_CHECKING: | ||
from .configuration_kosmos2_5 import Kosmos2_5Config | ||
from .image_processing_kosmos2_5 import Kosmos2_5ImageProcessor | ||
from .processing_kosmos2_5 import Kosmos2_5Processor | ||
|
||
try: | ||
if not is_torch_available(): | ||
raise OptionalDependencyNotAvailable() | ||
except OptionalDependencyNotAvailable: | ||
pass | ||
else: | ||
from .modeling_kosmos2_5 import ( | ||
Kosmos2_5ForConditionalGeneration, | ||
Kosmos2_5Model, | ||
Kosmos2_5PreTrainedModel, | ||
) | ||
|
||
else: | ||
import sys | ||
|
||
sys.modules[__name__] = _LazyModule(__name__, globals()["__file__"], _import_structure) | ||
ydshieh marked this conversation as resolved.
|
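The new `__init__.py` above ends by replacing the package entry in `sys.modules` with a `_LazyModule`, so submodules are imported only when one of their names is first accessed. A rough sketch of that idea, under stated assumptions: `LazyModule` below is a simplified, hypothetical stand-in (not the transformers class), and it resolves absolute standard-library module names so the example is self-contained, whereas the real package maps names to its own submodules.

```python
import importlib
import types
from typing import Dict, List


class LazyModule(types.ModuleType):
    """Simplified illustration of a module that imports on first attribute access."""

    def __init__(self, name: str, import_structure: Dict[str, List[str]]):
        super().__init__(name)
        # Invert {module: [names]} into {name: module} for quick lookup.
        self.__dict__["_name_to_module"] = {
            attr: module for module, attrs in import_structure.items() for attr in attrs
        }
        self.__all__ = list(self.__dict__["_name_to_module"])

    def __getattr__(self, attr: str):
        # Only called when normal attribute lookup fails, i.e. on first access.
        name_to_module = self.__dict__["_name_to_module"]
        if attr not in name_to_module:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(name_to_module[attr]), attr)
        setattr(self, attr, value)  # cache so later lookups skip __getattr__
        return value


# Demo structure using stdlib modules in place of package submodules.
lazy = LazyModule("demo_lazy", {"math": ["sqrt"], "json": ["dumps"]})
```

Accessing `lazy.sqrt` triggers the import of `math` and caches the function on the module object; names outside the structure raise `AttributeError` as usual.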
Where should I add it?