RewardConfig's max_length argument docstring should indicate that it filters out dataset, rather than truncating it #2488

Open
Kallinteris-Andreas opened this issue Dec 16, 2024 · 1 comment
Labels
📚 documentation Improvements or additions to documentation 👶 good first issue Good for newcomers 🙋 help from community wanted Open invitation for community members to contribute 🏋 Reward Related to Reward modelling

Comments

@Kallinteris-Andreas

RewardConfig's max_length argument docstring should indicate that it filters out dataset entries that exceed the limit, rather than truncating them

Maybe the current documentation is already clear, and I just misunderstood it at first glance.

@Kallinteris-Andreas changed the title from "RewardConfig's max_length argument docstring should indicate that it filters out dataset, rows rather than truncating it" to "RewardConfig's max_length argument docstring should indicate that it filters out dataset, rather than truncating it" on Dec 16, 2024
@qgallouedec
Member

Good point, given that for other trainers (like DPO), it's a truncation.

In fact, the best thing would be to have a common behavior for all trainers (truncation), but the urgent thing is to clarify the documentation.
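To make the distinction discussed above concrete, here is a minimal sketch in plain Python of the two behaviors. It is not TRL's actual implementation: the helper names `filter_by_max_length` and `truncate_to_max_length`, the `chosen`/`rejected` keys, and the rule that a pair is dropped when either side is too long are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical representation: each example is a pair of token-id lists.
Pair = Dict[str, List[int]]


def filter_by_max_length(pairs: List[Pair], max_length: int) -> List[Pair]:
    """Drop every pair in which either sequence exceeds max_length."""
    return [
        p
        for p in pairs
        if len(p["chosen"]) <= max_length and len(p["rejected"]) <= max_length
    ]


def truncate_to_max_length(pairs: List[Pair], max_length: int) -> List[Pair]:
    """Keep every pair, cutting each sequence down to at most max_length tokens."""
    return [
        {"chosen": p["chosen"][:max_length], "rejected": p["rejected"][:max_length]}
        for p in pairs
    ]


pairs = [
    {"chosen": [1, 2, 3], "rejected": [4, 5]},           # fits within the limit
    {"chosen": [1, 2, 3, 4, 5, 6], "rejected": [7, 8]},  # "chosen" is too long
]

filtered = filter_by_max_length(pairs, max_length=4)
truncated = truncate_to_max_length(pairs, max_length=4)

print(len(filtered))   # 1: the over-length pair is removed from the dataset
print(len(truncated))  # 2: both pairs survive, the long one is shortened
```

With filtering, the dataset can shrink silently, which is the behavior the docstring should spell out. With truncation, the row count stays the same but long examples lose their tail.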

@qgallouedec qgallouedec added 📚 documentation Improvements or additions to documentation 🙋 help from community wanted Open invitation for community members to contribute 👶 good first issue Good for newcomers 🏋 Reward Related to Reward modelling labels Dec 16, 2024
2 participants