
3X GPT-like Augmentation #17

Open
3 tasks
heisguyy opened this issue Jan 25, 2024 · 0 comments
Labels
experiment Experiments, modeling, etc.

Comments

heisguyy commented Jan 25, 2024

We will use swa, yor, hau, fon, wolof, and x for our experiments.
All sentences should be randomly sampled from the training set, and the data should be shuffled before fine-tuning.
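A minimal sketch of the sampling-and-shuffling step described above. One reading of "generate 3x of random samples" is drawing three times the training-set size with replacement; the function name, seed, and toy sentences are assumptions for illustration, not fixed by this issue:

```python
import random

def sample_and_shuffle(train_sentences, multiplier=3, seed=42):
    """Draw `multiplier` x the training-set size at random (with
    replacement, an assumption) and shuffle before fine-tuning."""
    rng = random.Random(seed)
    # Sampling with replacement lets us draw more items than exist.
    samples = rng.choices(train_sentences, k=multiplier * len(train_sentences))
    rng.shuffle(samples)
    return samples

# Toy example: 3 training sentences -> pool of 9 sampled sentences.
sentences = ["sent a", "sent b", "sent c"]
aug_pool = sample_and_shuffle(sentences)
print(len(aug_pool))  # 9
```

These sampled sentences would then be fed to the augmentation strategy before fine-tuning.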

  • Run your augmentation to generate 3x random samples from the training data.
  • Create a folder inside data/ named after your augmentation strategy and put the files there. The format should be JSON Lines, as in the Mafand dataset.
  • Fine-tune NLLB for 5 epochs on the combined data (training samples + augmentation samples) and report results.
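A sketch of writing augmented pairs into `data/<strategy>/` in the JSON Lines layout used by the Mafand dataset (one `{"translation": {...}}` object per line). The strategy name, split name, and language codes below are placeholders, and the exact field layout should be checked against the Mafand files in this repo:

```python
import json
import os

def write_jsonl(pairs, strategy, split, src="en", tgt="yor"):
    """Write (source, target) pairs as one JSON object per line,
    mirroring Mafand's {"translation": {src: ..., tgt: ...}} layout."""
    out_dir = os.path.join("data", strategy)
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{split}.json")
    with open(path, "w", encoding="utf-8") as f:
        for source, target in pairs:
            record = {"translation": {src: source, tgt: target}}
            # ensure_ascii=False keeps diacritics readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path

# Hypothetical usage: one augmented en-yor pair for a "gpt_like_3x" strategy.
path = write_jsonl([("hello", "bawo")], "gpt_like_3x", "train")
```

Each augmentation strategy gets its own folder, so fine-tuning runs can point at `data/<strategy>/` directly.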