To add a new task, create a new folder for that task directly under `Std-Indic-NLP`.
1.) That folder must have a `README.md` detailing the most common file structure for that particular task.
2.) It must also have a `CONTRIBUTING.md` and a `template.py` detailing how other datasets can be added.
3.) A script for combining multiple datasets must be added eventually.
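The folder layout described above could be scaffolded with a few lines of Python. This is only a minimal sketch: `scaffold_task` and `REQUIRED_FILES` are hypothetical names, not part of the repository, and including `cleaners.py` and `utils.py` assumes that task-specific helpers live inside the task folder.

```python
from pathlib import Path

# Files the guidelines ask for. cleaners.py and utils.py are included on the
# assumption that task-specific helpers live inside the task folder.
REQUIRED_FILES = (
    "README.md",
    "CONTRIBUTING.md",
    "template.py",
    "cleaners.py",
    "utils.py",
)


def scaffold_task(repo_root, task_name):
    """Create an empty skeleton for a new task folder (hypothetical helper).

    Returns the names of the files that were newly created.
    """
    task_dir = Path(repo_root) / task_name
    task_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in REQUIRED_FILES:
        path = task_dir / name
        if not path.exists():  # never overwrite existing work
            path.touch()
            created.append(name)
    return created
```

Calling `scaffold_task("Std-Indic-NLP", "my-new-task")` from the directory containing the repository would create the empty files, where `my-new-task` is a placeholder name. The files still need to be filled in by hand as described in the list above.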
1.) Add task-specific utilities and cleaning processes to that task's `utils.py` and `cleaners.py` files, respectively.
2.) Add general utilities and functions to `Std-Indic-NLP/utils.py` or `Std-Indic-NLP/cleaners.py`.
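As an illustration of what a general-purpose cleaner might look like, here is a small sketch. The function name `normalize_text` is hypothetical, and the choice of NFC normalization plus whitespace collapsing is an assumption about what counts as "general" rather than something the repository prescribes.

```python
import re
import unicodedata

_WHITESPACE = re.compile(r"\s+")


def normalize_text(text):
    """Hypothetical general cleaner: Unicode NFC + whitespace collapsing."""
    # NFC gives canonically equivalent character sequences a single form,
    # which helps when datasets come from differently encoded sources.
    text = unicodedata.normalize("NFC", text)
    # Collapse any run of whitespace (spaces, tabs, newlines) to one space.
    return _WHITESPACE.sub(" ", text).strip()
```

A helper like this is not tied to any one task, so it would fit in the shared files; anything that depends on a task's label scheme or file format belongs in that task's own folder instead.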
Even if you have write access to the repository, please open pull requests so that the changes can be discussed.