Fine Tuning with a new model #37

tednaseri · 2022-11-09T03:47:52Z

I am trying to fine-tune different pre-trained DistilBERT models. If the number of labels in the checkpoint does not match the number Tner expects, I get the following error:

RuntimeError: Error(s) in loading state_dict for DistilBertForTokenClassification:
	size mismatch for classifier.weight: copying a param with shape torch.Size([9, 768]) from checkpoint, the shape in current model is torch.Size([15, 768]).
	size mismatch for classifier.bias: copying a param with shape torch.Size([9]) from checkpoint, the shape in current model is torch.Size([15]).
	You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

The error message suggests passing ignore_mismatched_sizes=True when loading the model, like:
from_pretrained(path, num_labels=num_labels, ignore_mismatched_sizes=True)
What do you think about it?
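
For illustration, here is a minimal offline sketch of what that flag does. It is not Tner's actual loading code. It assumes a tiny, randomly initialised DistilBERT config (the sizes are hypothetical, chosen only so the example runs quickly without downloading anything) saved with 9 labels and then reloaded for 15 labels:

```python
import tempfile

from transformers import DistilBertConfig, DistilBertForTokenClassification

# Hypothetical tiny config standing in for a checkpoint trained with 9 labels.
config = DistilBertConfig(
    vocab_size=100, dim=32, hidden_dim=64, n_layers=1, n_heads=2, num_labels=9
)
checkpoint = DistilBertForTokenClassification(config)

with tempfile.TemporaryDirectory() as path:
    checkpoint.save_pretrained(path)
    # Reload for a 15-label task. The encoder weights are restored from the
    # checkpoint; the classifier head, whose shape no longer matches, is
    # skipped and freshly initialised instead of raising a RuntimeError.
    model = DistilBertForTokenClassification.from_pretrained(
        path, num_labels=15, ignore_mismatched_sizes=True
    )

classifier_shape = tuple(model.classifier.weight.shape)
print(classifier_shape)  # (15, 32)
```

The re-initialised classifier head is untrained, so the model still has to be fine-tuned on the new label set before it gives useful predictions.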

Thank you.


asahi417 · 2022-11-09T06:06:36Z

It looks like that's an error specific to the DistilBERT model class. Let me test the solution locally, and I will merge it once I have confirmed it works.
