
3X GPT-like Augmentation #17

Open
3 tasks
heisguyy opened this issue Jan 25, 2024 · 0 comments
Labels
experiment Experiments, modeling, etc.

Comments

heisguyy commented Jan 25, 2024

We will use swa, yor, hau, fon, wolof, and x for our experiments.
All sentences should be randomly sampled from the training set, and the data should be shuffled before fine-tuning.
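A minimal sketch of the sampling-and-shuffling step described above. One reading of "generate 3x of random samples" is drawing three times the training-set size with replacement; the function name, seed, and toy sentences are assumptions for illustration, not fixed by this issue:

```python
import random

def sample_and_shuffle(train_sentences, multiplier=3, seed=42):
    """Draw `multiplier` x the training-set size at random (with
    replacement, an assumption) and shuffle before fine-tuning."""
    rng = random.Random(seed)
    # Sampling with replacement lets us draw more items than exist.
    samples = rng.choices(train_sentences, k=multiplier * len(train_sentences))
    rng.shuffle(samples)
    return samples

# Toy example: 3 training sentences -> pool of 9 sampled sentences.
sentences = ["sent a", "sent b", "sent c"]
aug_pool = sample_and_shuffle(sentences)
print(len(aug_pool))  # 9
```

These sampled sentences would then be fed to the augmentation strategy before fine-tuning.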

  • Run your augmentation to generate 3x random samples from the training data.
  • Create a folder inside data/ named after your augmentation strategy and put the files there. The format should be JSON Lines, as in the Mafand dataset.
  • Fine-tune NLLB for 5 epochs on the combined data (training samples + augmentation samples) and report results.
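A sketch of writing augmented pairs into `data/<strategy>/` in the JSON Lines layout used by the Mafand dataset (one `{"translation": {...}}` object per line). The strategy name, split name, and language codes below are placeholders, and the exact field layout should be checked against the Mafand files in this repo:

```python
import json
import os

def write_jsonl(pairs, strategy, split, src="en", tgt="yor"):
    """Write (source, target) pairs as one JSON object per line,
    mirroring Mafand's {"translation": {src: ..., tgt: ...}} layout."""
    out_dir = os.path.join("data", strategy)
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{split}.json")
    with open(path, "w", encoding="utf-8") as f:
        for source, target in pairs:
            record = {"translation": {src: source, tgt: target}}
            # ensure_ascii=False keeps diacritics readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path

# Hypothetical usage: one augmented en-yor pair for a "gpt_like_3x" strategy.
path = write_jsonl([("hello", "bawo")], "gpt_like_3x", "train")
```

Each augmentation strategy gets its own folder, so fine-tuning runs can point at `data/<strategy>/` directly.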