
fix trainer final evaluation with tstv & best-model #3363

Merged

Conversation

helpmefindaname (Collaborator)

No description provided.

@alanakbik merged commit c375fba into master on Oct 27, 2023
1 check passed
@alanakbik deleted the fix_transformer_smaller_training_vocab_with_best_model branch on October 27, 2023 at 14:25
@alanakbik (Collaborator)

Fixes the main issue, but the following script throws an error:

```python
# 1. get the corpus
from flair.data import Corpus
from flair.datasets import TREC_6
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer

corpus: Corpus = TREC_6()

# 2. what label do we want to predict?
label_type = "question_class"

# 3. create the label dictionary
label_dict = corpus.make_label_dictionary(label_type=label_type)

# 4. initialize transformer document embeddings (many models are available)
document_embeddings = TransformerDocumentEmbeddings("xlm-roberta-large", fine_tune=True)

# 5. create the text classifier
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict, label_type=label_type)

# 6. initialize trainer
trainer = ModelTrainer(classifier, corpus)

# 7. fine-tune the model, but **reduce the vocabulary** for faster training
trainer.train(
    "resources/taggers/question-classification-with-transformer_final",
    reduce_transformer_vocab=True,  # set this to False for slow version
    max_epochs=2,
    learning_rate=0.0001,
    use_final_model_for_eval=False,
)
```
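
For context, `reduce_transformer_vocab=True` temporarily shrinks the transformer's token-embedding matrix to just the tokens that occur in the corpus, which speeds up fine-tuning for large-vocabulary models. A minimal sketch of the idea (illustrative names only, not Flair's internals, which delegate to the transformer-smaller-training-vocab package):

```python
import torch


def reduce_embedding_to_corpus(embedding: torch.nn.Embedding, used_ids: set):
    # Keep only the embedding rows for token ids that occur in the corpus,
    # and return the old-id -> new-id mapping needed to re-encode inputs.
    # Illustrative sketch only; the real logic lives in the
    # transformer-smaller-training-vocab package.
    kept = sorted(used_ids)
    id_map = {old: new for new, old in enumerate(kept)}
    smaller = torch.nn.Embedding(len(kept), embedding.embedding_dim)
    with torch.no_grad():
        smaller.weight.copy_(embedding.weight[torch.tensor(kept)])
    return smaller, id_map
```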

It works for `distilbert-base-uncased` and `bert-base-uncased`, but throws an error for `xlm-roberta-large`:

```
RuntimeError: Error(s) in loading state_dict for TextClassifier:
        size mismatch for embeddings.model.embeddings.word_embeddings.weight: copying a param with shape torch.Size([8426, 1024]) from checkpoint, the shape in current model is torch.Size([250003, 1024]).
```
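
The mismatch can be reproduced in isolation with the shapes from the traceback (a minimal sketch, not Flair code): the checkpoint was written while the embedding matrix was reduced to 8,426 rows, but the model it is loaded back into has already been restored to the full 250,003-row vocabulary.

```python
import torch

# Shapes taken from the traceback above: the best-model checkpoint holds the
# reduced-vocabulary embedding, while the freshly restored model is full size.
reduced_state = {"weight": torch.zeros(8426, 1024)}
full_model = torch.nn.Embedding(250003, 1024)

try:
    full_model.load_state_dict(reduced_state)
except RuntimeError as e:
    print(e)  # size mismatch for weight: [8426, 1024] vs [250003, 1024]

# Loading succeeds only if the target embedding is shrunk back first:
reduced_model = torch.nn.Embedding(8426, 1024)
reduced_model.load_state_dict(reduced_state)
```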

@helpmefindaname any idea why `xlm-roberta-large` is causing a problem here?
