Add strategy to store results in evaluation loop #30267

qubvel · 2024-04-16T12:00:19Z

What does this PR do?

In certain scenarios, the evaluation loop concatenates tensors across batches in a way that makes the results unusable. For instance, in the object detection evaluation loop, the label structure is as follows:

batch_1 = [
    {
        'size': tensor([ 771, 1333]),
        'image_id': tensor([1592]),
        'class_labels': tensor([0]),
        'boxes': tensor([[0.2268, 0.6586, 0.1567, 0.1480]]),
        'area': tensor([23827.1230]),
        'iscrowd': tensor([0]),
        'orig_size': tensor([561, 970])
    },
    ...
]
batch_2 = [
    {
        'size': tensor([1333,  763]),
        'image_id': tensor([44]),
        'class_labels': tensor([0, 1]),
        'boxes': tensor([[0.4216, 0.4584, 0.3794, 0.1979], [0.4216, 0.4584, 0.3794, 0.1979]]),
        'area': tensor([76371.3984, 76371.3984]),
        'iscrowd': tensor([0, 0]),
        'orig_size': tensor([926, 530])
    },
    ...
]

Each dictionary represents one image, with all its bounding boxes.

The problem arises when the trainer runs the evaluation loop and accumulates the results: it concatenates the labels of every evaluation batch using the internal nested_concat.

The result of the nested concatenation of these labels will look like this:

[
    {
        'size': tensor([ 771, 1333, 1333,  763]),
        'image_id': tensor([1592, 44]),
        'class_labels': tensor([0, 0, 1]),
        'boxes': tensor([[0.2268, 0.6586, 0.1567, 0.1480], [0.4216, 0.4584, 0.3794, 0.1979], [0.4216, 0.4584, 0.3794, 0.1979]]),
        'area': tensor([23827.1230, 76371.3984, 76371.3984]),
        'iscrowd': tensor([0, 0, 0]),
        'orig_size': tensor([561, 970, 926, 530])
    },
    ...
]

This results in the concatenation of all boxes, making it impossible to distinguish which boxes belong to each image.
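The effect can be reproduced with a minimal sketch. The helper `nested_concat_sketch` below is a hypothetical, simplified stand-in for the trainer's internal concatenation, and plain Python lists stand in for tensors; it is an illustration under those assumptions, not the library's implementation.

```python
# Simplified stand-in for the nested concatenation described above.
# Assumption: plain lists play the role of tensors; this is not library code.

def nested_concat_sketch(accumulated, new):
    """Recursively concatenate two structures that share the same layout."""
    if isinstance(accumulated, dict):
        # Dicts are merged key by key.
        return {key: nested_concat_sketch(accumulated[key], new[key]) for key in accumulated}
    if isinstance(accumulated, list) and accumulated and isinstance(accumulated[0], dict):
        # A list of per-image dicts is merged position by position.
        return [nested_concat_sketch(a, n) for a, n in zip(accumulated, new)]
    # Leaf "tensor": join along the first dimension.
    return accumulated + new


batch_1 = [{"image_id": [1592], "class_labels": [0],
            "boxes": [[0.2268, 0.6586, 0.1567, 0.1480]]}]
batch_2 = [{"image_id": [44], "class_labels": [0, 1],
            "boxes": [[0.4216, 0.4584, 0.3794, 0.1979],
                      [0.4216, 0.4584, 0.3794, 0.1979]]}]

merged = nested_concat_sketch(batch_1, batch_2)
print(merged[0]["image_id"])      # [1592, 44]
print(merged[0]["class_labels"])  # [0, 0, 1]
print(len(merged[0]["boxes"]))    # 3
```

After merging, the first dictionary holds three boxes from two different images, and nothing in it records where one image's boxes end and the next begin.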

This PR introduces an additional strategy for storing batches. By setting eval_do_concat_batches=False in the training arguments, batches will be stored as separate list items rather than being concatenated.
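The difference between the two storage strategies can be sketched as follows. `BatchStoreSketch`, its `do_concat` flag, and the `_concat` helper are hypothetical names chosen for illustration, and plain lists again stand in for tensors; the container added by this PR may be structured differently.

```python
# Hypothetical sketch of the two storage strategies; not the PR's actual class.

def _concat(accumulated, new):
    """Simplified recursive concatenation (lists stand in for tensors)."""
    if isinstance(accumulated, dict):
        return {key: _concat(accumulated[key], new[key]) for key in accumulated}
    if isinstance(accumulated, list) and accumulated and isinstance(accumulated[0], dict):
        return [_concat(a, n) for a, n in zip(accumulated, new)]
    return accumulated + new


class BatchStoreSketch:
    """Either concatenates incoming batches or keeps each one as a list item."""

    def __init__(self, do_concat=True):
        self.do_concat = do_concat
        self.items = None

    def add(self, batch):
        if self.items is None:
            # First batch: store as-is, or start a list of batches.
            self.items = batch if self.do_concat else [batch]
        elif self.do_concat:
            self.items = _concat(self.items, batch)
        else:
            self.items.append(batch)


batch_1 = [{"image_id": [1592], "class_labels": [0]}]
batch_2 = [{"image_id": [44], "class_labels": [0, 1]}]

store = BatchStoreSketch(do_concat=False)
store.add(batch_1)
store.add(batch_2)

# Each image keeps its own labels, so per-image counts are recoverable.
labels_per_image = [len(image["class_labels"]) for batch in store.items for image in batch]
print(len(store.items))   # 2
print(labels_per_image)   # [1, 2]
```

With the non-concatenating strategy, a metric function would iterate over batches and then over images, rather than indexing into one merged tensor.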

Fixes #25939

Related discussions:
https://discuss.huggingface.co/t/possible-fix-for-trainer-evaluation-with-object-detection/72307
https://discuss.huggingface.co/t/add-metrics-to-object-detection-example/31213/12

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@muellerzr and @pacman100
@amyeroberts

amyeroberts

Beautiful - the simplification in the trainer code speaks for itself. Thanks for adding this and making our trainer more flexible for different datasets!

All looks good to me, but let's get another approval from @muellerzr or @pacman100 to confirm this change to the trainer is OK before merging.

tests/trainer/test_trainer_utils.py

HuggingFaceDocBuilderDev · 2024-04-16T18:17:17Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

muellerzr

Very, very nice abstraction here! The code is much easier to read, I love it! 😍

Let's make sure those tests pass please, and I left a few recommendations for the documentation. Otherwise LG(reat)2M afterwards.

src/transformers/training_args.py

qubvel · 2024-04-17T11:26:27Z

All tests are green, all comments addressed 🙂

* Add evaluation loop container for interm. results
* Add tests for EvalLoopContainer
* Formatting
* Fix padding_index in test and typo
* Move EvalLoopContainer to pr_utils to avoid additional imports
* Fix `eval_do_concat_batches` arg description
* Fix EvalLoopContainer import

qubvel added 3 commits April 16, 2024 11:06

Add evaluation loop container for interm. results

3be12de

Add tests for EvalLoopContainer

4b9bab5

Formatting

bcbe83f

amyeroberts approved these changes Apr 16, 2024

Fix padding_index in test and typo

7deb60f

muellerzr approved these changes Apr 16, 2024

qubvel added 3 commits April 17, 2024 10:35

Move EvalLoopContainer to pr_utils to avoid additional imports

ec048e1

Fix eval_do_concat_batches arg description

b76fe22

Fix EvalLoopContainer import

d93171b

amyeroberts merged commit c15aad0 into huggingface:main Apr 17, 2024
21 checks passed

qubvel mentioned this pull request Apr 23, 2024

Add examples for detection models finetuning #30422 (merged)

SunMarc mentioned this pull request Jun 17, 2024

[fix bug] logits's shape different from label's shape in preprocess_logits_for_metrics #31447 (merged)
