
Knowledge distillation for vision guide #25619

Merged: 29 commits into huggingface:main, Oct 18, 2023

Conversation

merveenoyan (Contributor)

This is a draft PR that I opened in the past on the KD guide for CV, but I accidentally removed my fork. I prioritized TGI docs, so this PR might stay stale for a while; I will ask for a review after I iterate over the comments left by @sayakpaul in my previous PR (mainly training MobileNet with random initial weights rather than with pre-trained weights from transformers).

@merveenoyan marked this pull request as ready for review on September 12, 2023, 14:15
@merveenoyan (Contributor, Author)

@sayakpaul I changed the setup and didn't observe much of a difference, but I felt it would still be cool to show how to distill a model. WDYT?

@amyeroberts (Collaborator)

cc @rafaelpadilla for reference

@rafaelpadilla (Contributor) left a comment

Fantastic to see knowledge distillation being discussed—such an exciting topic! 🚀
Just shared a few comments and suggestions that might enhance readability. Most are related to writing style.
I appreciate the straightforward example you've provided. 👍

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@merveenoyan (Contributor, Author)

@rafaelpadilla @NielsRogge can we merge this if this looks good?

@rafaelpadilla (Contributor)

> @rafaelpadilla @NielsRogge can we merge this if this looks good?

Yes, it's OK with me.
My comments were merely about writing style.

```python
from datasets import load_dataset

dataset = load_dataset("beans")
```

We can use either of the processors given that they return the same output. We will use the `map()` method of `dataset` to apply the preprocessing to every split of the dataset.
Contributor

This sentence is actually not true; ResNet and MobileNet each have their own image processors.

@merveenoyan (Contributor, Author), Oct 5, 2023

They do return the same thing, because both processors apply the same preprocessing at the same resolution. Check this out:

```python
from transformers import AutoFeatureExtractor
from PIL import Image
import requests
import numpy as np

# Feature extractors for the teacher (ResNet-50) and the student (MobileNetV2)
teacher_extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")
student_extractor = AutoFeatureExtractor.from_pretrained("google/mobilenet_v2_1.4_224")

# Preprocess the same sample image with both extractors
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
sample = Image.open(requests.get(url, stream=True).raw)

np.array_equal(teacher_extractor(sample)["pixel_values"], student_extractor(sample)["pixel_values"])
# True
```
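Applying the preprocessing to every split of the dataset could then look roughly like this; a minimal sketch assuming a batched `process` function along these lines (the guide's actual function may differ):

```python
def process(examples):
    # Preprocess a batch of PIL images; the student's extractor would
    # produce identical pixel values, as shown above
    processed = teacher_extractor([image.convert("RGB") for image in examples["image"]])
    processed["labels"] = examples["labels"]
    return processed

processed_datasets = dataset.map(process, batched=True)
```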

@NielsRogge (Contributor) left a comment

Thanks for writing this up! ❤️

@LysandreJik requested a review from MKhalusova on October 5, 2023, 08:29
@MKhalusova (Contributor) left a comment

Great work on the guide! While reading it I had a few questions that I feel other folks may have, and it would be great to address them :)

```python
processed_datasets = dataset.map(process, batched=True)
```

Essentially, we want the student model (a randomly initialized MobileNet) to mimic the teacher model (pre-trained ResNet). To achieve this, we first get the logits output by the teacher and the student. Then, we divide each of them by the parameter `temperature`, which controls the importance of each soft target. We will use the KL loss to compute the divergence between the student and teacher. A parameter called `lambda` weighs the importance of the distillation loss. In this example, we will use `temperature=5` and `lambda=0.5`.
Contributor

It would be cool to link KL loss to a page that gives a definition of what it is, for people who are not familiar with it.
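For readers who are not familiar with it, here is a minimal sketch of the temperature-scaled KL divergence loss described in the excerpt above (an illustration in PyTorch, not necessarily the guide's exact code):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=5.0):
    # Soften both distributions with the temperature
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between the softened distributions; scaling by temperature**2
    # keeps gradient magnitudes comparable across temperatures
    return F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature**2
```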

Contributor

Since you're customizing the Trainer, it would also be nice to link to this page https://huggingface.co/docs/transformers/en/main_classes/trainer#trainer
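As an illustration of the customization under discussion, here is a minimal sketch of a distillation `Trainer` subclass. The names `ImageDistilTrainer`, `temperature`, and `lambda_param` are assumptions for this sketch, not necessarily the guide's exact implementation:

```python
import torch
import torch.nn.functional as F
from transformers import Trainer

class ImageDistilTrainer(Trainer):
    def __init__(self, teacher_model=None, temperature=5.0, lambda_param=0.5, **kwargs):
        super().__init__(**kwargs)
        # Teacher is frozen during distillation (in practice it also needs
        # to be moved to the training device)
        self.teacher = teacher_model.eval()
        self.temperature = temperature
        self.lambda_param = lambda_param

    def compute_loss(self, model, inputs, return_outputs=False):
        student_output = model(**inputs)
        with torch.no_grad():
            teacher_output = self.teacher(**inputs)
        # Temperature-scaled KL divergence between teacher and student logits
        soft_teacher = F.softmax(teacher_output.logits / self.temperature, dim=-1)
        soft_student = F.log_softmax(student_output.logits / self.temperature, dim=-1)
        distill_loss = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * self.temperature**2
        # Mix the distillation loss with the student's own cross-entropy loss
        loss = (1.0 - self.lambda_param) * student_output.loss + self.lambda_param * distill_loss
        return (loss, student_output) if return_outputs else loss
```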

Contributor

The first sentence would be great to have somewhere in the introduction - how the distillation works. Something like: "To distill knowledge from one model to another, we take a pre-trained teacher model, and randomly initialize a student model. Next, we train the student model to minimize the difference between its outputs and the teacher's outputs, thus making it mimic the behavior. "


```python
trainer.evaluate(processed_datasets["test"])
```
Contributor

Maybe also push the final model to the Hub with `trainer.push_to_hub()`?

@merveenoyan (Contributor, Author)

I think the final model is already pushed when we set `push_to_hub` to `True` (I also have the save strategy set to every epoch, so it's triggered at every epoch as well), no?

Contributor

AFAIK `trainer.push_to_hub()` also creates a basic model card, e.g. with metrics and some training results.
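For reference, a minimal sketch of the two options being discussed, with illustrative argument values rather than the guide's exact configuration:

```python
from transformers import TrainingArguments

# Option 1: push checkpoints to the Hub automatically during training
training_args = TrainingArguments(
    output_dir="distilled-mobilenet",  # hypothetical output directory / repo name
    save_strategy="epoch",
    push_to_hub=True,
)

# Option 2: after training, push the final model and let the Trainer
# generate a basic model card with metrics and training results
# trainer.push_to_hub()
```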

@MKhalusova (Contributor) left a comment

Thank you for iterating on this! This revision looks fantastic :)

@rafaelpadilla (Contributor) left a comment

Looks good to me. :)

@merveenoyan (Contributor, Author)

@LysandreJik can you give a review or ask for another reviewer if needed?

@LysandreJik (Member) left a comment

Thank you @merveenoyan!

@LysandreJik (Member)

Please resolve the merge conflicts and merge, @merveenoyan.

@LysandreJik merged commit 280c757 into huggingface:main on Oct 18, 2023
3 checks passed
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request on Nov 19, 2023:
* Knowledge distillation for vision guide

* Update knowledge_distillation_for_image_classification.md

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: Rafael Padilla <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: Rafael Padilla <[email protected]>

* Iterated on Rafael's comments

* Added to toctree

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: Rafael Padilla <[email protected]>

* Addressed comments

* Update knowledge_distillation_for_image_classification.md

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: Rafael Padilla <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: NielsRogge <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: NielsRogge <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: NielsRogge <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: NielsRogge <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: NielsRogge <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: NielsRogge <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: NielsRogge <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: NielsRogge <[email protected]>

* Update knowledge_distillation_for_image_classification.md

* Update knowledge_distillation_for_image_classification.md

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: Maria Khalusova <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: Maria Khalusova <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: Maria Khalusova <[email protected]>

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: Maria Khalusova <[email protected]>

* Address comments

* Update knowledge_distillation_for_image_classification.md

* Explain KL Div

---------

Co-authored-by: Rafael Padilla <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: Maria Khalusova <[email protected]>