How to do c-RLFT? #227

whosaytree · 2024-07-17T04:11:28Z

As the essay said, the fine-tuning proccess consists two parts: c-SFT and c-RLFT. However, it seems like the train process in code is only c-SFT. I don't know how to do c-RLFT. I'll be very grateful if someone can give me some help, thanks!

imoneoi · 2024-08-10T04:09:41Z

C-RLFT is weighted C-SFT. You can use the "weight" field in the training data format to set up weights for different conditions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to do c-RLFT? #227

How to do c-RLFT? #227

whosaytree commented Jul 17, 2024

imoneoi commented Aug 10, 2024

How to do c-RLFT? #227

How to do c-RLFT? #227

Comments

whosaytree commented Jul 17, 2024

imoneoi commented Aug 10, 2024