-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closes #24 | Add dataset loader for UIT-ViSD4SA #72
Conversation
To reviewers: Please let me know of the possible adjustments or any suggestions regarding the implementation of the dataloader. |
Hi @jen-santoso, thanks for your contribution!! 🥳 Before the review, please let me throw a quick question to the other maintainers: I'll make a new task |
Sounds perfect to me, @holylovenia! Thanks! |
@holylovenia @jen-santoso : Does the task can simply be framed as sequence labelling using the IOB tag? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @SamuelCahyawijaya, thanks for the brilliant suggestion. @jen-santoso and I discussed the possibility of using the schema seq_label
for this Tasks.SPAN_BASED_ABSA
and agreed with your suggestion.
I have changed the schema-task mapping accordingly in the master
branch and @jen-santoso will change the kb
schema to seq_label
for this dataloader implementation.
@holylovenia @SamuelCahyawijaya done changing the schema to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jen-santoso, thanks for the dataloader! It works well. I have a minor request though, could you please provide the _LANGUAGES
constant?
@holylovenia updated |
Ah sorry I wasn't clear before, array form please. 🙏 |
oops, ok now in array form >.< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your help, @jen-santoso!! Looks good to me~ Let's wait for @SamuelCahyawijaya's review before we proceed to merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jen-santoso : thank you for updating the schema. LGTM!.
Based on our internal discussion, we decided to give +2 bonus points for this contribution.
Please name your PR after the issue it closes. You can use the following line: "Closes #24 " where you replace the ISSUE-NUMBER with the one corresponding to your dataset.
Checkbox
seacrowd/sea_datasets/my_dataset/my_dataset.py
(please use only lowercase and underscore for dataset naming)._CITATION
,_DATASETNAME
,_DESCRIPTION
,_HOMEPAGE
,_LICENSE
,_URLs
,_SUPPORTED_TASKS
,_SOURCE_VERSION
, and_SEACROWD_VERSION
variables._info()
,_split_generators()
and_generate_examples()
in dataloader script.BUILDER_CONFIGS
class attribute is a list with at least oneSEACrowdConfig
for the source schema and one for a seacrowd schema.datasets.load_dataset
function.python -m tests.test_seacrowd seacrowd/sea_datasets/<my_dataset>/<my_dataset>.py
.