Added segmentation maps support for DPT image processor #34345
base: main
Conversation
cc @molbap as well in case bandwidth permits
LGTM - just a small refactor of the method to be more aligned with existing models!
def test_call_segmentation_maps(self):
    # Initialize image_processing
    image_processing = self.image_processing_class(**self.image_processor_dict)
nit, `image_processor` would be better
Renamed `image_processing` to `image_processor`. Should I also rename it in the other tests?
if segmentation_maps is not None:
    segmentation_maps = [to_numpy_array(segmentation_map) for segmentation_map in segmentation_maps]

    # Add channel dimension if missing - needed for certain transformations
    if segmentation_maps[0].ndim == 2:
        added_channel_dim = True
        segmentation_maps = [segmentation_map[None, ...] for segmentation_map in segmentation_maps]
        input_data_format = ChannelDimension.FIRST
    else:
        added_channel_dim = False
        if input_data_format is None:
            input_data_format = infer_channel_dimension_format(segmentation_maps[0], num_channels=1)

    if do_reduce_labels:
        segmentation_maps = [self.reduce_label(segmentation_map) for segmentation_map in segmentation_maps]

    if do_resize:
        segmentation_maps = [
            self.resize(
                image=segmentation_map,
                size=size,
                resample=resample,
                keep_aspect_ratio=keep_aspect_ratio,
                ensure_multiple_of=ensure_multiple_of,
                input_data_format=input_data_format,
            )
            for segmentation_map in segmentation_maps
        ]

    if do_pad:
        segmentation_maps = [
            self.pad_image(
                image=segmentation_map, size_divisor=size_divisor, input_data_format=input_data_format
            )
            for segmentation_map in segmentation_maps
        ]

    # Remove extra channel dimension if added for processing
    if added_channel_dim:
        segmentation_maps = [segmentation_map.squeeze(0) for segmentation_map in segmentation_maps]
    segmentation_maps = [segmentation_map.astype(np.int64) for segmentation_map in segmentation_maps]

    data["labels"] = segmentation_maps
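The channel-dimension bookkeeping is the subtle part of the diff above: a 2D `(H, W)` map temporarily gains a leading channel axis so resize/pad see a `(1, H, W)` channel-first array, and the axis is squeezed out again before the labels are cast to `int64`. A minimal self-contained numpy sketch of that round trip (hypothetical helper name; the real code also resizes and pads in between):

```python
import numpy as np

def prepare_segmentation_map(segmentation_map: np.ndarray) -> np.ndarray:
    """Sketch of the channel-dimension round trip (hypothetical helper)."""
    # Add a channel dimension if missing: (H, W) -> (1, H, W)
    added_channel_dim = segmentation_map.ndim == 2
    if added_channel_dim:
        segmentation_map = segmentation_map[None, ...]

    # ... resize / pad would run here on the channel-first array ...

    # Remove the temporary channel dimension and cast labels to int64
    if added_channel_dim:
        segmentation_map = segmentation_map.squeeze(0)
    return segmentation_map.astype(np.int64)

out = prepare_segmentation_map(np.zeros((4, 6), dtype=np.uint8))
print(out.shape, out.dtype)  # (4, 6) int64
```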
Perfect - if there isn't any difference with Beit, can this be wrapped in a `_preprocess_segmentation_map()` method in a loop, that can be flagged as `# Copied from ...` the beit image processor?
Wrapped the segmentation map preprocessing code into a `_preprocess_segmentation_map()` method, and also moved the image preprocessing into a separate `_preprocess_image()` function and the general preprocessing functionality into a `_preprocess()` function.

Could you please re-review the pull request? In the last commit I made all the changes you asked for: wrapped the segmentation map preprocessing code into separate functions, added comments, and renamed a variable in the tests. Do I need to make any other changes to the code?
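The split described in the comment above can be sketched roughly as follows. This is a toy illustration, not the actual transformers code: the real methods take many more arguments (resize, pad, normalization options), and only the division of responsibilities is meant to match.

```python
import numpy as np

class ImageProcessorSketch:
    """Toy sketch of the _preprocess / _preprocess_image / _preprocess_segmentation_map split."""

    def _preprocess(self, image: np.ndarray, do_rescale: bool) -> np.ndarray:
        # Shared transformations (resize, pad, ...) would live here.
        if do_rescale:
            image = image.astype(np.float32) / 255.0
        return image

    def _preprocess_image(self, image: np.ndarray) -> np.ndarray:
        # Pixel inputs go through the full pipeline, including rescaling.
        return self._preprocess(image, do_rescale=True)

    def _preprocess_segmentation_map(self, segmentation_map: np.ndarray) -> np.ndarray:
        # Label maps share the geometric transforms but are never rescaled,
        # and are cast to int64 so they can serve as training targets.
        return self._preprocess(segmentation_map, do_rescale=False).astype(np.int64)

proc = ImageProcessorSketch()
img = np.full((2, 2), 255, dtype=np.uint8)
print(proc._preprocess_image(img).dtype)             # float32
print(proc._preprocess_segmentation_map(img).dtype)  # int64
```

Keeping the shared logic in one `_preprocess()` also makes the `# Copied from` check meaningful, since the per-input wrappers stay small enough to compare against the BEIT processor line by line.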
hey @simonreise, will review in a moment; we were all at a team gathering last week, hence the inactivity. On my radar!
@molbap you are forbidden to work for this week 🤣 go and rest, @yonigozlan will have a look! 🤗
Thanks, but the After making the required changes, you can ensure everything is in order by running the
Thanks for iterating! You just have to rebase on main and check that the tests are still passing, then LGTM!
Most image processors for vision models that support the semantic segmentation task accept `images` and `segmentation_maps` as inputs, but for some reason the DPT image processor processes only images, not segmentation maps. This PR makes code used for training or evaluation of semantic segmentation models more reusable, as the DPT image processor can now process segmentation maps the way most other image processors do.

I also added the `do_reduce_labels` arg because the other image processors that support segmentation masks use it.

I added two new tests: one that tests `segmentation_maps` support and one that tests whether `do_reduce_labels` works as expected.

Most of the code is adapted from the BEIT image processor.
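For context, the BEIT-style label reduction that `do_reduce_labels` enables maps the background class 0 to the ignore index 255 and shifts every other label down by one. A minimal numpy sketch of that convention (my reading of the BEIT processor's behavior, worth double-checking against the actual source):

```python
import numpy as np

def reduce_label(label: np.ndarray) -> np.ndarray:
    label = label.astype(np.int64)
    # Background (0) becomes the ignore index 255 ...
    label[label == 0] = 255
    # ... and every remaining class label shifts down by one.
    label = label - 1
    label[label == 254] = 255
    return label

print(reduce_label(np.array([0, 1, 2, 255])))  # -> 255, 0, 1, 255
```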
Who can review?
@amyeroberts, @qubvel