Commit: Expand tutorial 2

alanakbik committed Dec 7, 2024
1 parent fa07cb3 commit 22cc158
Showing 5 changed files with 32 additions and 4 deletions.
3 changes: 3 additions & 0 deletions docs/tutorial/tutorial-training/how-to-load-custom-dataset.md
@@ -159,3 +159,6 @@ example we chose `label_type='topic'` to denote that we are loading a corpus wi
 
 
 
+## Next
+
+Next, learn [how to train a sequence tagger](how-to-train-sequence-tagger.md).
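
The hunk above references the custom-dataset loading step of this tutorial. As a refresher, a minimal sketch of loading such a corpus with `label_type='topic'` (the data folder and file names are placeholders):

```python
from flair.data import Corpus
from flair.datasets import ClassificationCorpus

# placeholder folder containing train.txt, dev.txt and test.txt in FastText format
data_folder = '/path/to/data/folder'

# label_type='topic' denotes that this corpus carries topic labels
corpus: Corpus = ClassificationCorpus(data_folder,
                                      train_file='train.txt',
                                      dev_file='dev.txt',
                                      test_file='test.txt',
                                      label_type='topic')

# shows the train/dev/test split sizes
print(corpus)
```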
4 changes: 4 additions & 0 deletions docs/tutorial/tutorial-training/how-to-load-prepared-dataset.md
@@ -193,3 +193,7 @@ The following datasets are supported:
 | Universal Dependency Treebanks | [flair.datasets.treebanks](#flair.datasets.treebanks) |
 | OCR-Layout-NER | [flair.datasets.ocr](#flair.datasets.ocr) |
 
+
+## Next
+
+Next, learn how to load a [custom dataset](how-to-load-custom-dataset.md).
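
Any dataset from the table above loads in one line. A sketch using one of the Universal Dependency treebanks (the specific dataset is an example choice):

```python
import flair.datasets

# downloads and caches the English Universal Dependencies treebank on first use
corpus = flair.datasets.UD_ENGLISH()

# shows the train/dev/test split sizes
print(corpus)
```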
3 changes: 3 additions & 0 deletions docs/tutorial/tutorial-training/how-to-train-sequence-tagger.md
@@ -223,3 +223,6 @@ trainer.train('resources/taggers/example-universal-pos',
 This gives you a multilingual model. Try experimenting with more languages!
 
 
+## Next
+
+Next, learn [how to train a text classifier](how-to-train-text-classifier.md).
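
After training completes, the tagger saved under the path above can be reloaded and applied to sentences in any of the training languages. A sketch, assuming Flair's default `final-model.pt` save name:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load the multilingual POS tagger trained in the tutorial
tagger = SequenceTagger.load('resources/taggers/example-universal-pos/final-model.pt')

# predict POS tags for a German sentence
sentence = Sentence('Ich liebe Berlin.')
tagger.predict(sentence)
print(sentence)
```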
4 changes: 4 additions & 0 deletions docs/tutorial/tutorial-training/how-to-train-text-classifier.md
@@ -58,3 +58,7 @@ classifier.predict(sentence)
 print(sentence.labels)
 ```
 
+
+## Next
+
+Next, learn [how to train an entity linker](how-to-train-span-classifier.md).
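
For context around the `classifier.predict(sentence)` fragment above, a minimal end-to-end sketch (the model path and example sentence are placeholders):

```python
from flair.data import Sentence
from flair.models import TextClassifier

# load a trained text classifier from its save path (placeholder)
classifier = TextClassifier.load('resources/taggers/example-classifier/final-model.pt')

# predict the label of a new sentence and print it
sentence = Sentence('Who built the Eiffel Tower?')
classifier.predict(sentence)
print(sentence.labels)
```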
22 changes: 18 additions & 4 deletions docs/tutorial/tutorial-training/train-vs-fine-tune.md
@@ -14,23 +14,37 @@ Since in this case, the vast majority of parameters in the model is already trai
 model. This means: Very small learning rate (LR) and just a few epochs. You are essentially just minimally modifying
 the model to adapt it to the task you want to solve.
 
-Most models in Flair are trained using fine-tuning. So this is likely the approach you'll want to use.
+Use this method by calling [`ModelTrainer.fine_tune()`](#flair.trainers.ModelTrainer.fine_tune).
+Since most models in Flair were trained this way, this is likely the approach you'll want to use.
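
To make this concrete, a minimal fine-tuning sketch with `ModelTrainer.fine_tune()` (the corpus, transformer model, and output path are example choices):

```python
from flair.datasets import TREC_6
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer

# load a classification corpus and build its label dictionary
corpus = TREC_6()
label_dict = corpus.make_label_dictionary(label_type='question_class')

# fine_tune=True makes the transformer weights trainable
embeddings = TransformerDocumentEmbeddings('distilbert-base-uncased', fine_tune=True)
classifier = TextClassifier(embeddings, label_dictionary=label_dict, label_type='question_class')

# fine_tune() defaults to a very small learning rate and few epochs
trainer = ModelTrainer(classifier, corpus)
trainer.fine_tune('resources/taggers/example-classifier', learning_rate=5.0e-5, max_epochs=10)
```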


## Training

 On the other hand, you should use the classic training approach if the majority of the trainable parameters in your
-model is randomly initialized. This is essentially the "old way", before fine-tuning of transformers.
-
+model is randomly initialized. This can happen, for instance, if you freeze the weights of the pre-trained language
+model, leaving only the randomly initialized prediction head as trainable parameters. This training approach is also
+referred to as "feature-based" or "probing" in some papers.

Since the majority of parameters is randomly initialized, you need to fully train the model. This means: high learning
rate and many epochs.

Use this method by calling [`ModelTrainer.train()`](#flair.trainers.ModelTrainer.train).

+```{note}
+Another application of classic training is linear probing of pre-trained language models. In this scenario, you
+"freeze" the weights of the language model (meaning that they cannot be changed) and add a prediction head that is
+trained from scratch. So, even though a language model is involved, its parameters are not trainable. This means that
+all trainable parameters in this scenario are randomly initialized, therefore necessitating the use of the classic
+training approach.
+```
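
A sketch of this frozen-transformer ("probing") setup with `ModelTrainer.train()` (the task, model name, and hyperparameters are example choices):

```python
from flair.datasets import UD_ENGLISH
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# load a corpus and build the label dictionary for universal POS tags
corpus = UD_ENGLISH()
label_dict = corpus.make_label_dictionary(label_type='upos')

# fine_tune=False freezes the transformer, so only the randomly
# initialized prediction head has trainable parameters
embeddings = TransformerWordEmbeddings('xlm-roberta-base', fine_tune=False)
tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=label_dict,
                        tag_type='upos')

# classic training: high learning rate and many epochs
trainer = ModelTrainer(tagger, corpus)
trainer.train('resources/taggers/example-probing', learning_rate=0.1, max_epochs=150)
```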


## Paper

-Our paper
+If you are interested in an experimental comparison of the two above-mentioned approaches, check out [our paper](https://arxiv.org/pdf/2011.06993)
 that compares fine-tuning to the feature-based approach.


+## Next
+
+Next, learn how to load a [training dataset](how-to-load-prepared-dataset.md).
