[SFTTrainer] Support logging response in wandb #851

vwxyzjn · 2023-10-10T17:47:57Z

What does this PR do?

This PR attempts to log some sample and reference responses in wandb, which gives concrete examples for our inspection and also makes training more informative. Basically, it's going to log something as follows:

Some datasets, such as timdettmers/openassistant-guanaco, do not really have a query/response structure, so I give the model the first half of each example's tokens and let it generate the remaining tokens; I also keep the second half of the tokens as the reference response to test out the SFT policy.
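The half-and-half split described above could look roughly like the sketch below. It is not the code from this PR: `split_example` and the `query_input_ids` key are hypothetical names chosen for illustration, and only `remaining_input_ids` is taken from the error message quoted later in this thread. It assumes each example is already tokenized into a flat list of token ids.

```python
from typing import Dict, List


def split_example(input_ids: List[int]) -> Dict[str, List[int]]:
    """Split one tokenized example into a query half and a reference half.

    The first half is what the model would be prompted with; the second half
    is kept as the reference response to compare against the generation.
    """
    midpoint = len(input_ids) // 2  # odd lengths give the extra token to the reference
    return {
        "query_input_ids": input_ids[:midpoint],      # hypothetical key name
        "remaining_input_ids": input_ids[midpoint:],  # key name from the error message
    }


example = split_example([101, 7, 8, 9, 10, 102])
print(example)
```

Because examples in a dataset have different lengths, both halves will generally differ in length from one example to the next, which matters once they are batched.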

Question:

I wasn't quite sure how to get the batching working with dataset and generate, though... The dataloader complains

ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`remaining_input_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected).

if I try to use a batch_size > 1 for the dataloader.
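One possible way around this kind of error, sketched here under assumptions rather than taken from the PR, is to pad every variable-length field to a common length before the batch is turned into tensors. The helper names `pad_sequences` and `collate`, the `pad_token_id=0` default, and the choice to left-pad the query and right-pad the reference are all illustrative assumptions; the sketch works on plain Python lists so it runs without any library.

```python
from typing import Dict, List, Sequence


def pad_sequences(
    sequences: Sequence[Sequence[int]], pad_token_id: int, side: str = "right"
) -> List[List[int]]:
    """Pad every sequence to the length of the longest one in the batch."""
    max_len = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        padding = [pad_token_id] * (max_len - len(seq))
        # Left padding keeps the end of the prompt adjacent to the generated tokens.
        padded.append(padding + list(seq) if side == "left" else list(seq) + padding)
    return padded


def collate(features: List[Dict[str, List[int]]], pad_token_id: int = 0) -> Dict[str, List[List[int]]]:
    """Hypothetical collate function: pads both fields so each is rectangular."""
    return {
        "input_ids": pad_sequences(
            [f["input_ids"] for f in features], pad_token_id, side="left"
        ),
        "remaining_input_ids": pad_sequences(
            [f["remaining_input_ids"] for f in features], pad_token_id, side="right"
        ),
    }


batch = collate(
    [
        {"input_ids": [5, 6, 7], "remaining_input_ids": [8]},
        {"input_ids": [9], "remaining_input_ids": [10, 11]},
    ]
)
print(batch)
```

A function of this shape could be passed as `collate_fn` to a PyTorch `DataLoader`, with the padded lists converted to tensors at the end; an attention mask would also be needed for generation and is left out here for brevity.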

HuggingFaceDocBuilderDev · 2023-10-10T17:55:11Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

github-actions · 2023-11-10T15:05:06Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

vwxyzjn · 2023-12-01T14:55:03Z

Update: after chatting with @lvwerra, we think the change in SFTTrainer may be unnecessary. I will check out how we can further improve it.

github-actions · 2023-12-25T15:05:16Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

JamesSand · 2024-01-09T09:48:41Z

Hi, I have run into the same problem. Has anyone solved it?

I tried the demo code provided in the README, but it still does not work.

Support logging reference response in wandb

8f1df46

vwxyzjn requested a review from lewtun October 10, 2023 17:47

github-actions bot closed this Nov 19, 2023

vwxyzjn reopened this Nov 20, 2023

github-actions bot closed this Nov 28, 2023

lvwerra reopened this Nov 29, 2023

github-actions bot closed this Jan 3, 2024
