Recommended values for modifiers #48

eu9ene · 2024-02-06T20:02:57Z

It's not clear from the examples in the README or from the paper what a good first choice of modifier probabilities would be. I understand that it likely depends a lot on the language and data. However, developing an intuition for these probabilities and the other settings will take a lot of experimentation. It would help if the paper disclosed the full OpusTrainer config for the French-English case study, to provide a good starting point and improve reproducibility (a config is listed in the paper, but it's not clear whether it's a real training config or just an example).

For context, we're trying to reproduce the results from the paper by adding the same methods to our training pipeline to make our models more robust. We've successfully integrated UpperCase, TitleCase and SentencePiece sampling so far.
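
To make the question concrete, here is a minimal sketch of what a per-sentence modifier probability means in practice. It is not OpusTrainer's implementation: `title_case`, `augment_pair`, and the `(probability, function)` list are hypothetical names, each modifier is assumed to fire independently, and the 0.05 values are placeholders rather than recommended settings.

```python
import random


def title_case(sentence):
    """Upper-case the first character of each space-separated token."""
    return " ".join(w[:1].upper() + w[1:] for w in sentence.split(" "))


def augment_pair(src, trg, modifiers, rng):
    """Apply each (probability, function) modifier independently to both sides.

    Assumption: modifiers are tried in order and each one fires on its own
    coin flip, so several can apply to the same sentence pair.
    """
    for prob, fn in modifiers:
        if rng.random() < prob:
            src, trg = fn(src), fn(trg)
    return src, trg


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder probabilities, NOT values taken from the paper or README.
    modifiers = [(0.05, str.upper), (0.05, title_case)]
    pairs = [("bonjour le monde", "hello world")] * 1000
    augmented = [augment_pair(s, t, modifiers, rng) for s, t in pairs]
    changed = sum(1 for a, p in zip(augmented, pairs) if a != p)
    print(f"{changed} of {len(pairs)} pairs were modified")
```

Counting how many pairs change at a given setting is a cheap way to sanity-check a probability before committing to a full training run.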

jelmervdl · 2024-02-11T19:31:25Z

CC @XapaJIaMnu
