From a55956556856ebe85819903398ab0d8c064c515c Mon Sep 17 00:00:00 2001
From: Holy Lovenia
Date: Tue, 18 Jun 2024 12:58:36 +0800
Subject: [PATCH] Update README.md

---
 profile/README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/profile/README.md b/profile/README.md
index 98e2792..6cb8a13 100644
--- a/profile/README.md
+++ b/profile/README.md
@@ -5,6 +5,8 @@ This movement is co-initiated by SEA researchers and practitioners from various

 See what SEA indigenous and non-indigenous languages we accept [here](https://github.com/SEACrowd/seacrowd-datahub/blob/master/LANGUAGES.md).

+> Our first publication is out: ["SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages"](https://arxiv.org/pdf/2406.10118)!
+
 ## Why Is It Important?

 It is essential to greatly increase the accessibility of SEA datasets, promote research in SEA languages and cultures, as well as build more AI models that represent SEA.
@@ -53,6 +55,8 @@ Definitely. Please feel free to ask in `#general` on Discord or message one of t

 ### SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages

+> Check out [our paper](https://arxiv.org/pdf/2406.10118)!
+
 Our first collaboration ran from 1 November 2023 to 15 June 2024 with a total of [86 contributors](https://docs.google.com/spreadsheets/d/e/2PACX-1vQDZtJjA6i7JsxS5IlMtVuwOYjr2Pbl_b47yMSH4aAdHDBIpf-CiJQjNQAzcJPEu_aE7kwH4ZvKvPm0/pubhtml?gid=225616890&single=true). We managed to consolidate 498 datasheets in SEACrowd Catalogue ([web](https://seacrowd.github.io/seacrowd-catalogue/)/[csv](https://docs.google.com/spreadsheets/d/1ibbywsC1tQ_sLPX8bUAjC-vrTrUqZgZA46W_sxWw4Ss/edit?usp=sharing)) and standardize 399 dataloaders in [SEACrowd Data Hub](https://github.com/SEACrowd/seacrowd-datahub/), covering 980 out of 1308 SEA languages. 
 Through our SEACrowd benchmarks, we assess the quality of AI models on 36 indigenous languages across 13 tasks, offering valuable insights into the current AI landscape in SEA. Furthermore, we propose strategies to facilitate greater AI advancements, maximizing potential utility and resource equity for the future of AI in SEA.
@@ -79,4 +83,4 @@ Once their points reached 20, they would be rewarded with **merchandise and co-a

 The contribution point tracking for this past project is available at [this sheet](https://docs.google.com/spreadsheets/d/e/2PACX-1vQDZtJjA6i7JsxS5IlMtVuwOYjr2Pbl_b47yMSH4aAdHDBIpf-CiJQjNQAzcJPEu_aE7kwH4ZvKvPm0/pubhtml?gid=225616890&single=true)!

-Contribution Progress
\ No newline at end of file
+Contribution Progress