`BCOTrainer` conversational dataset support #2107

qgallouedec · 2024-09-24T11:56:04Z

What does this PR do?

Part of #2071

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

HuggingFaceDocBuilderDev · 2024-09-24T12:02:18Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

lewtun

Thanks for the nice refactor! I left a comment about using this as an opportunity to switch the model/dataset to something more standard, but I'm happy to leave that for a follow-up PR if you wish.

docs/source/bco_trainer.mdx

lewtun · 2024-09-24T13:33:47Z

examples/scripts/bco.py

@@ -18,6 +18,8 @@
 # Full training:
 python examples/scripts/bco.py \
    --model_name_or_path=nnheui/stablelm-2-1_6b-sft-full \
+    --trust_remote_code \
+    --dataset_name trl-lib/ultrafeedback-gpt-3.5-turbo-helpfulness \


I wonder if we can switch this to an unpaired version of trl-lib/ultrafeedback_binarized? That way we have two simple datasets that should "just" work

Good idea. How is trl-lib/ultrafeedback_binarized generated? Which "aspect" (helpfulness, ...) is used?

I'll merge the PR, but we can work on it in a follow-up PR.
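
For readers following the thread, here is a minimal sketch of what an "unpaired" version of a paired preference dataset could look like. It assumes paired rows carry `prompt`, `chosen`, and `rejected` fields, and that the unpaired format uses `prompt`, `completion`, and a boolean `label`. These column names and the helpers `unpair_preference_example` and `unpair_preference_rows` are illustrative assumptions, not code from this PR or from the library.

```python
from typing import Dict, List


def unpair_preference_example(example: Dict[str, str]) -> List[Dict[str, object]]:
    """Split one paired preference row into two unpaired rows.

    Assumed (hypothetical) input columns: prompt, chosen, rejected.
    Assumed (hypothetical) output columns: prompt, completion, label.
    """
    return [
        # The chosen answer becomes a desirable completion.
        {"prompt": example["prompt"], "completion": example["chosen"], "label": True},
        # The rejected answer becomes an undesirable completion.
        {"prompt": example["prompt"], "completion": example["rejected"], "label": False},
    ]


def unpair_preference_rows(rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Apply the split to every paired row and flatten the result."""
    unpaired: List[Dict[str, object]] = []
    for row in rows:
        unpaired.extend(unpair_preference_example(row))
    return unpaired


# Toy data for illustration only; not taken from any real dataset.
paired_rows = [
    {"prompt": "What is 2 + 2?", "chosen": "4", "rejected": "5"},
]
unpaired_rows = unpair_preference_rows(paired_rows)
```

Each paired row yields two rows, so the unpaired dataset is twice as long and has balanced labels by construction. This sketch does not address the open question in the thread about which "aspect" the source dataset was built from.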

examples/scripts/bco.py

Co-authored-by: lewtun <[email protected]>

qgallouedec added 4 commits September 24, 2024 11:30

update test

641d7a2

maybe_apply_chat_template

351ed55

simplify bco example

d273e43

Update documentation

8cc574c

qgallouedec mentioned this pull request Sep 24, 2024

[Tracking issue] General dataset support #2071

Open

29 tasks

qgallouedec requested review from kashif and lewtun September 24, 2024 12:13

lewtun reviewed Sep 24, 2024


qgallouedec and others added 4 commits September 24, 2024 15:54

Update examples/scripts/bco.py

0058f47

Update docs/source/bco_trainer.mdx

d545d0d

Co-authored-by: lewtun <[email protected]>

Merge branch 'main' into bco-conversational-dataset

53c72ad

Merge branch 'main' into bco-conversational-dataset

88cf41a

kashif approved these changes Sep 24, 2024


qgallouedec merged commit 44a06fc into main Sep 24, 2024
10 checks passed

qgallouedec deleted the bco-conversational-dataset branch September 24, 2024 16:16
