`RewardConfig`'s `max_length` argument docstring should indicate that it filters out dataset, rather than truncating it #2488

Kallinteris-Andreas · 2024-12-16T10:57:56Z

`RewardConfig`'s `max_length` argument docstring should indicate that it filters out dataset entries that exceed the limit, rather than truncating them.

Maybe the current documentation is obvious and I just misunderstood it at first glance.
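To make the difference concrete, here is a minimal sketch in plain Python. It is not TRL's actual code: the helper names `filter_by_length` and `truncate_to_length`, the toy token ids, and the inclusive `<=` boundary are all assumptions made for illustration.

```python
# Toy "tokenized" dataset: each entry is a list of token ids (made-up values).
dataset = [
    [1, 2, 3],
    [4, 5, 6, 7, 8, 9],
    [10, 11],
]
max_length = 4


def filter_by_length(examples, max_length):
    """Drop every example longer than max_length (the behavior described in this issue).

    Assumption: the boundary is inclusive, so an example of exactly max_length is kept.
    """
    return [ex for ex in examples if len(ex) <= max_length]


def truncate_to_length(examples, max_length):
    """Keep every example, but cut each one down to at most max_length tokens."""
    return [ex[:max_length] for ex in examples]


filtered = filter_by_length(dataset, max_length)
truncated = truncate_to_length(dataset, max_length)

print(len(filtered), filtered)    # 2 [[1, 2, 3], [10, 11]]
print(len(truncated), truncated)  # 3 [[1, 2, 3], [4, 5, 6, 7], [10, 11]]
```

With filtering, the six-token entry disappears from the dataset entirely. With truncation, it stays but loses its last two tokens. A reader expecting truncation could therefore end up training on fewer rows than they thought.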


qgallouedec · 2024-12-16T11:04:09Z

Good point, given that for other trainers (like DPO), it's a truncation.

In fact, the best thing would be to have a common behavior for all trainers (truncation), but the urgent thing is to clarify the documentation.

Kallinteris-Andreas changed the title ~~RewardConfig's max_length argument docstring should indicate that it filters out dataset, rows rather than truncating it~~ RewardConfig's max_length argument docstring should indicate that it filters out dataset, rather than truncating it Dec 16, 2024

qgallouedec added the 📚 documentation (Improvements or additions to documentation), 🙋 help from community wanted (Open invitation for community members to contribute), 👶 good first issue (Good for newcomers), and 🏋 Reward (Related to Reward modelling) labels Dec 16, 2024
