Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 #30190

Merged
merged 8 commits into from
Apr 18, 2024

Conversation

muellerzr
Copy link
Contributor

What does this PR do?

Something I've found rather discouraging when working with the TrainingArguments is we have made it known that eval_X is a thing, and yet there are some cases of using evaluation_X instead.

Case in point:
Eval:

  • eval_steps
  • per_device_eval_batch_size
  • eval_accumulation_steps
  • eval_delay

Problematic eval:

  • evaluation_strategy

This PR creates an alias, eval_strategy, which aligns to the prior design decisions. I'm not sure a full deprecation is in order, but we are leading users to X and then presenting them Y. We should allow them to do X still for Y, but I don't think a full overhaul is needed (e.g. adding evaluation_accumulation_steps etc etc) for one simple flag.

I'll write some tests after we're okay with this being an option.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@amyeroberts

@muellerzr muellerzr requested a review from amyeroberts April 11, 2024 15:32
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BramVanroy
Copy link
Collaborator

This even goes further: in many of the example scripts the dataset is checked for "train", "test", "validation" splits (and not "eval" or "evaluation"), which may be a tad confusing.

Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this! Agreed - we definitely want to aim for more consistency.

I'd prefer if we did a proper deprecation cycle and pointed users to using eval_strategy instead of having these two arguments working in tandem.

Comment on lines 1400 to 1402
# Deal with alias
if self.eval_strategy is not None:
self.evaluation_strategy = self.eval_strategy
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to deal with both evaluation_strategy and eval_strategy being set (even if very unlikely)

@muellerzr
Copy link
Contributor Author

muellerzr commented Apr 15, 2024

Sounds good @amyeroberts, mostly wanted this PR to make sure if we wanted a blanket-wide eval or evaluation. (aka, shorthand for everything)

@amyeroberts
Copy link
Collaborator

@muellerzr Sounds good. It think accepting both is fine, but we ideally want to prioritise one. As you mentioned, eval_xxx for consistency seems like a good choice

@muellerzr muellerzr changed the title Alias evaluation_strategy for eval_strategy 🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 Apr 16, 2024
@muellerzr
Copy link
Contributor Author

muellerzr commented Apr 16, 2024

@amyeroberts (still working on getting tests to pass but) what version should we deprecate to? v4.44? Or v5 (Current pypi is v4.39)

@muellerzr muellerzr requested a review from amyeroberts April 18, 2024 13:29
Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mammoth piece of work - thanks for making sure all of the examples are updated as well!

Regarding the deprecation version, I'd say let's go for a longer window as this is such a public API, but let's not wait until v5 as that could be a while away! We should be releasing today, which will be v4.40. What do you think of a 6 months window i.e to v4.46?

@muellerzr
Copy link
Contributor Author

6 months sounds great to me!

@muellerzr muellerzr merged commit 60d5f8f into main Apr 18, 2024
22 checks passed
@muellerzr muellerzr deleted the muellerzr-allow-for-shorthand branch April 18, 2024 16:49
ArthurZucker pushed a commit that referenced this pull request Apr 22, 2024
* Alias

* Note alias

* Tests and src

* Rest

* Clean

* Change typing?

* Fix tests

* Deprecation versions
ydshieh pushed a commit that referenced this pull request Apr 23, 2024
* Alias

* Note alias

* Tests and src

* Rest

* Clean

* Change typing?

* Fix tests

* Deprecation versions
itazap pushed a commit that referenced this pull request May 14, 2024
* Alias

* Note alias

* Tests and src

* Rest

* Clean

* Change typing?

* Fix tests

* Deprecation versions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants