Mexican NLP 2024 Summer School Tutorial on Knowledge Distillation and Parameter Efficient Fine-tuning
The slides can be found here.
- Baseline Training: This notebook shows an example of fine-tuning BERT for a standard classification task (a minimal setup is sketched after this list).
- Small Model Training: Similar to Baseline Training, but we now train a significantly smaller model from scratch (see the small-configuration sketch below).
- Small Model Training + KD: We train the small model from scratch with knowledge distillation, using the baseline model as the teacher (the distillation loss is sketched below).
- Small Model Training + Optimized KD: We speed up knowledge distillation by computing the teacher's logits once and caching them, so the teacher does not need a forward pass on every training step (see the caching sketch below).
- Baseline Training with LoRA: We fine-tune the baseline model again, this time with LoRA, updating only a small set of low-rank adapter weights (see the LoRA sketch below).
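
The baseline notebook fine-tunes BERT for classification. A minimal sketch of that kind of setup with the Hugging Face `Trainer` is below; the checkpoint, dataset, and hyperparameters are illustrative assumptions, not necessarily what the notebook uses.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Assumed choices for illustration: checkpoint, dataset, and hyperparameters.
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

dataset = load_dataset("imdb")  # any standard text-classification dataset works here

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="baseline-bert",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"], eval_dataset=dataset["test"])
trainer.train()
```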
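
For the small model, one way to obtain a much smaller architecture is to build a BERT configuration with fewer layers and a narrower hidden size and initialize it randomly. The specific sizes below are assumptions for illustration.

```python
from transformers import BertConfig, BertForSequenceClassification

# A deliberately small configuration, trained from scratch (random init)
# rather than loaded from a pretrained checkpoint. Sizes are assumed values.
small_config = BertConfig(
    num_hidden_layers=4,
    hidden_size=256,
    num_attention_heads=4,
    intermediate_size=1024,
    num_labels=2,
)
small_model = BertForSequenceClassification(small_config)
```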
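
Knowledge distillation typically combines a soft-target loss, where the student matches the teacher's temperature-softened distribution, with the usual hard-label cross-entropy. A minimal PyTorch sketch, with the temperature and mixing weight as assumed hyperparameters:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Mix the soft-target KL term with the hard-label cross-entropy term."""
    # Soften both distributions with the temperature and compare them with KL divergence.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd_loss = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    # Standard cross-entropy against the ground-truth labels.
    ce_loss = F.cross_entropy(student_logits, labels)
    return alpha * kd_loss + (1 - alpha) * ce_loss
```

During student training, the teacher runs in eval mode under `torch.no_grad()` to produce `teacher_logits` for each batch.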
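
Because the teacher's logits do not change while the student trains, they can be computed once and reused every epoch. A sketch assuming Hugging Face-style models and batches keyed by `input_ids`/`attention_mask`; the dataloader must iterate in a fixed order (no shuffling) so the cached rows line up with the examples.

```python
import torch

@torch.no_grad()
def cache_teacher_logits(teacher, dataloader, device="cpu"):
    """Run the teacher once over the dataset and store its logits for reuse."""
    teacher.eval()
    teacher.to(device)
    cached = []
    for batch in dataloader:  # dataloader must not shuffle, so order is stable
        logits = teacher(
            input_ids=batch["input_ids"].to(device),
            attention_mask=batch["attention_mask"].to(device),
        ).logits
        cached.append(logits.cpu())
    return torch.cat(cached, dim=0)
```

The cached tensor can then be stored alongside the dataset (for example as an extra column) and looked up per batch instead of calling the teacher.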
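
LoRA freezes the pretrained weights and learns small low-rank update matrices inside selected layers. A sketch using the `peft` library; the rank, scaling factor, and target modules below are assumed values.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

base_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # rank of the low-rank update matrices (assumed)
    lora_alpha=16,                      # scaling factor for the LoRA updates (assumed)
    lora_dropout=0.1,
    target_modules=["query", "value"],  # attention projections to adapt in BERT (assumed)
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters and the classifier head train
```

The wrapped `model` can be passed to the same `Trainer` setup as the baseline; only the adapter parameters receive gradient updates.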