-
Notifications
You must be signed in to change notification settings - Fork 27.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🚨🚨🚨Deprecate evaluation_strategy
to eval_strategy
🚨🚨🚨
#30190
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
This even goes further: in many of the example scripts the dataset is checked for "train", "test", "validation" splits (and not "eval" or "evaluation"), which may be a tad confusing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this! Agreed - we definitely want to aim for more consistency.
I'd prefer if we did a proper deprecation cycle and pointed users to using eval_strategy
instead of having these two arguments working in tandem.
src/transformers/training_args.py
Outdated
# Deal with alias | ||
if self.eval_strategy is not None: | ||
self.evaluation_strategy = self.eval_strategy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to deal with both evaluation_strategy
and eval_strategy
being set (even if very unlikely)
Sounds good @amyeroberts, mostly wanted this PR to make sure if we wanted a blanket-wide |
@muellerzr Sounds good. It think accepting both is fine, but we ideally want to prioritise one. As you mentioned, |
evaluation_strategy
to eval_strategy
🚨🚨🚨
@amyeroberts (still working on getting tests to pass but) what version should we deprecate to? v4.44? Or v5 (Current pypi is v4.39) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mammoth piece of work - thanks for making sure all of the examples are updated as well!
Regarding the deprecation version, I'd say let's go for a longer window as this is such a public API, but let's not wait until v5 as that could be a while away! We should be releasing today, which will be v4.40. What do you think of a 6 months window i.e to v4.46?
6 months sounds great to me! |
* Alias * Note alias * Tests and src * Rest * Clean * Change typing? * Fix tests * Deprecation versions
* Alias * Note alias * Tests and src * Rest * Clean * Change typing? * Fix tests * Deprecation versions
* Alias * Note alias * Tests and src * Rest * Clean * Change typing? * Fix tests * Deprecation versions
What does this PR do?
Something I've found rather discouraging when working with the
TrainingArguments
is we have made it known thateval_X
is a thing, and yet there are some cases of usingevaluation_X
instead.Case in point:
Eval:
eval_steps
per_device_eval_batch_size
eval_accumulation_steps
eval_delay
Problematic eval:
evaluation_strategy
This PR creates an alias,
eval_strategy
, which aligns to the prior design decisions. I'm not sure a full deprecation is in order, but we are leading users to X and then presenting them Y. We should allow them to do X still for Y, but I don't think a full overhaul is needed (e.g. addingevaluation_accumulation_steps
etc etc) for one simple flag.I'll write some tests after we're okay with this being an option.
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@amyeroberts