Create dataset loader for Thai-NNER #95

Open
SamuelCahyawijaya opened this issue Nov 20, 2023 · 8 comments · May be fixed by #326
Labels
pr-ready A PR that closes this issue is Ready to be reviewed

Comments

@SamuelCahyawijaya
Collaborator

SamuelCahyawijaya commented Nov 20, 2023

Dataloader name: thai_nner/thai_nner.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?thai_nner

Dataset thai_nner
Description This work presents the first Thai Nested Named Entity Recognition (N-NER) dataset. Thai N-NER consists of 264,798 mentions, 104 classes, and a maximum depth of 8 layers obtained from news articles and restaurant reviews, a total of 4894 documents. Our work, to the best of our knowledge, presents the largest non-English N-NER dataset and the first non-English one with fine-grained classes.
Subsets -
Languages tha
Tasks Named Entity Recognition
License Creative Commons Attribution Share Alike 3.0 (cc-by-sa-3.0)
Homepage https://github.com/vistec-AI/Thai-NNER
HF URL -
Paper URL https://aclanthology.org/2022.findings-acl.116/
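For illustration only, nested entity mentions of the kind described above can be thought of as character spans that may contain one another. The sketch below is an assumption-laden example, not the Thai-NNER file format or the SEACrowd schema: the `Span` class, its field names, the labels, and the English sample text are all hypothetical. It computes how many layers deep a set of properly nested spans goes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Span:
    """Hypothetical span record; not the dataset's actual schema."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # made-up label name


def nesting_depth(spans: List[Span]) -> int:
    """Return the largest number of spans contained inside one another.

    Assumes spans are properly nested or disjoint (no crossing spans).
    """
    # Outer spans first: earlier start, and for equal starts the longer span.
    ordered = sorted(spans, key=lambda s: (s.start, -s.end))
    stack: List[Span] = []
    depth = 0
    for span in ordered:
        # Drop spans from the stack that do not contain the current span.
        while stack and not (stack[-1].start <= span.start and span.end <= stack[-1].end):
            stack.pop()
        stack.append(span)
        depth = max(depth, len(stack))
    return depth


# Made-up English example, used only to show the idea of nesting.
text = "Bank of Thailand governor"
example_spans = [
    Span(0, 25, "role"),          # "Bank of Thailand governor"
    Span(0, 16, "organization"),  # "Bank of Thailand"
    Span(8, 16, "country"),       # "Thailand"
]
```

Here `nesting_depth(example_spans)` counts three layers, since the country span sits inside the organization span, which sits inside the role span.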
@SamuelCahyawijaya SamuelCahyawijaya converted this from a draft issue Nov 20, 2023
@bp-high
Contributor

bp-high commented Nov 22, 2023

#self-assign


Hi, may I know if you are still working on this issue? Please let @holylovenia @SamuelCahyawijaya @sabilmakbar know if you need any help.

@pavaris-pm

pavaris-pm commented Dec 15, 2023

@bp-high are you currently working on this? If not, I could assign myself and create the dataloader for this dataset instead.

@pavaris-pm

#self-assign

@bp-high
Contributor

bp-high commented Dec 15, 2023

Hi @pavaris-pm, I am planning to finish this task this weekend. If I am not able to, I will let you know, and you can feel free to pick it up then.

@pavaris-pm

Roger that, anyway please let me know 🫡


Hi, may I know if you are still working on this issue? Please let @holylovenia @SamuelCahyawijaya @sabilmakbar know if you need any help.

@sabilmakbar
Collaborator

Hi @bp-high, may I know the progress of this dataloader? Since it is now past the 2+2 weeks expected for completing a dataloader, I will clear the assignee if no update is received by Monday 12 PM UTC.

@bp-high bp-high linked a pull request Jan 15, 2024 that will close this issue
8 tasks
@sabilmakbar sabilmakbar added the pr-ready A PR that closes this issue is Ready to be reviewed label Jan 15, 2024