Update doc for metric_for_best_model when save_strategy="best" #35389

base: main
Changes from all commits: 048250e, a903136, 84e91d1, 8accf30, 20b276a, dd79c84, ce46012
@@ -4220,7 +4220,9 @@ def test_save_best_checkpoint(self):
                 total=total,
             )

-        # Case 3: Metric name not provided; throw error.
+    def test_metric_for_best_model_behavior(self):
+        # Case 1: Metric name not provided when `save_strategy == "best"`.
+        # Should raise ValueError.
         with tempfile.TemporaryDirectory() as tmpdir:
             with self.assertRaises(ValueError) as context:
                 trainer = get_regression_trainer(

@@ -4232,9 +4234,22 @@ def test_save_best_checkpoint(self):
                     save_strategy="best",
                     compute_metrics=AlmostAccuracy(),
                 )

         self.assertIn("`args.metric_for_best_model` must be provided", str(context.exception))

+        # Case 2: Metric name not provided when `load_best_model_at_end == True`.
+        # `metric_for_best_model` should be set to `"loss"` by default.
+        with tempfile.TemporaryDirectory() as tmpdir:
+            trainer = get_regression_trainer(
+                a=1.5,
+                b=2.5,
+                output_dir=tmpdir,
+                learning_rate=0.1,
+                eval_strategy="steps",
+                save_strategy="steps",
+                load_best_model_at_end=True,
+            )
+            self.assertTrue(trainer.args.metric_for_best_model == "loss")


 @require_torch
 @is_staging_test
Review thread on the new test method:

not critical / minor: tbh, it seems a bit out of place in `test_save_best_checkpoint`.

I guess I agree that it logically does seem a bit out of place. I think cases 3 and 4 could be grouped into their own methods, since the point isn't so much to test the best-checkpoint saving itself. I'm not sure if actually running training would be necessary, though. Case 3 is simply to check whether a ValueError is properly thrown at Trainer initialization time, and case 4 is likewise simply to check whether the `metric_for_best_model` default is applied.

Agreed, I would also split it into a separate test (or two tests). And yes, we are testing the init here; that's why it was looking out of place.

No strong opinion. We can split it into separate tests for cases 3 and 4.
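For reference, the fallback exercised by Case 2 can be reproduced with TrainingArguments alone. A minimal sketch (the output path is a placeholder; it assumes a transformers version recent enough to use the `eval_strategy` argument name):

```python
from transformers import TrainingArguments

# With load_best_model_at_end=True and no metric specified,
# TrainingArguments.__post_init__ falls back to metric_for_best_model="loss".
args = TrainingArguments(
    output_dir="tmp_out",  # placeholder path
    eval_strategy="steps",
    save_strategy="steps",
    load_best_model_at_end=True,
)
assert args.metric_for_best_model == "loss"
```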
my 2cs (Disclaimer! I'm not very familiar with the whole scope of the initial change, or the reason behind it!): it's a bit hard to read and understand what is going on here and why. E.g. why can't it default to `loss` when `save_strategy == "best"`? What is the major difference with the `load_best_model_at_end` (and `save_strategy != "best"`) case? Again, apologies if I'm missing some obvious context here. Please feel free to disregard my comment / question then.
I didn't find the place where we set `metric_for_best_model = "loss"` when `save_strategy != "best"`. Can you explain a bit why you changed the description?
@shcheklein That was a design decision made here (#31817 (comment)). It was deemed easier to debug if we don't add a hard-coded value and rather raise an error.

@SunMarc Hmm, I'm starting to think that maybe the problem is that we're not able to set `load_best_model_at_end = True` when `save_strategy = "best"`, since `load_best_model_at_end` requires `eval_strategy == save_strategy` but `eval_strategy` doesn't have a `"best"` option. This means that if we want to use `save_strategy = "best"` then we have to have `load_best_model_at_end = False`, which in turn means that when `save_strategy != "best"` and `load_best_model_at_end = True`, the `__post_init__` method of TrainingArguments sets `metric_for_best_model` to `"loss"`: https://github.com/huggingface/transformers/blob/main/src/transformers/training_args.py#L1676:L1679

The modified docstring aims to add a bit more detail as to when `metric_for_best_model` is set to a default of `"loss"`.
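The constraint described above, that `load_best_model_at_end` requires matching strategies, is enforced when the arguments are constructed. A small sketch, with a placeholder path and the error text paraphrased:

```python
from transformers import TrainingArguments

# Mismatched strategies with load_best_model_at_end=True are rejected
# in TrainingArguments.__post_init__, so this raises at construction time.
try:
    TrainingArguments(
        output_dir="tmp_out",  # placeholder path
        eval_strategy="epoch",
        save_strategy="steps",
        load_best_model_at_end=True,
    )
except ValueError as err:
    print(err)  # roughly: "--load_best_model_at_end requires the save and eval strategy to match ..."
```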
should we also add `best` for `eval_strategy` then?
I feel like that might sound a bit awkward since it means we'd "perform evaluation at every best checkpoint."

Maybe we could check if `save_strategy == "best"` and then bypass the `eval_strategy == save_strategy` condition? That would mean that here we would change the code to something like the sketch below.
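A sketch of what that bypass could look like; the argument names follow TrainingArguments, but this standalone helper is hypothetical and not the PR's actual code:

```python
def check_best_model_strategies(eval_strategy: str, save_strategy: str, load_best_model_at_end: bool) -> None:
    """Hypothetical stand-in for the guard in TrainingArguments.__post_init__."""
    # When save_strategy == "best", a checkpoint is only ever written right
    # after an evaluation, so the eval/save equality guard can be skipped.
    if load_best_model_at_end and save_strategy != "best":
        if eval_strategy != save_strategy:
            raise ValueError(
                "--load_best_model_at_end requires the saving and evaluation strategies to match, "
                f"but got eval_strategy={eval_strategy!r} and save_strategy={save_strategy!r}."
            )


# e.g. this combination would no longer be rejected:
check_best_model_strategies(eval_strategy="steps", save_strategy="best", load_best_model_at_end=True)
```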
I'm not 100% sure about the history of why `eval_strategy == save_strategy` is required, but I'm assuming it's a safeguard to prevent situations where we want to load the best model at the end of training but we never saved it because the two didn't match. If `save_strategy == "best"`, I don't think we'd have that problem, since saving is guaranteed to follow evaluation.

I think this also means that we may have to either remove the default loss error we're raising or change it to a warning (#31817 (comment)).
Sounds good to me for the new condition; just add a comment to explain why we don't need to perform the check when `self.save_strategy == "best"`. Also, which default loss error are you talking about? I'm not sure why we need to remove it.