From e210a5338face591209bc7d82eae11de8578e718 Mon Sep 17 00:00:00 2001
From: Fajri Koto
Date: Tue, 31 Oct 2023 08:21:43 +0400
Subject: [PATCH] Update UPLOADING.md

---
 UPLOADING.md | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/UPLOADING.md b/UPLOADING.md
index 0440af361..a30f39a5a 100644
--- a/UPLOADING.md
+++ b/UPLOADING.md
@@ -8,19 +8,19 @@ Please do the following before getting started:
 - [Make](https://huggingface.co/join) an account on 🤗's Hub and [login](https://huggingface.co/login). **Choose a good password, as you'll need to authenticate your credentials**.

-- Join the Indobenchmark initiative [here](https://huggingface.co/indobenchmark).
+- Join the SEACrowd initiative [here](https://huggingface.co/SEACrowd).
     - click the "Request to join this org" button in the upper right corner.

 - Make a github account; you can follow instructions to install git [here](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).

-**Note - your permissions will be set to READ. Please contact an admin in your dataset's github issue to be granted WRITE access; this should be given after your PR is accepted**.
+**Note - your permissions will be set to READ. Please contact an admin in your dataset's GitHub issue to be granted WRITE access; this should be given after your PR is accepted**.

 ### 2) Activate the Huggingface hub

-You can find the official instructions [here](https://huggingface.co/welcome). We will provide what you need for the nusantara-datasets hackathon environment.
+You can find the official instructions [here](https://huggingface.co/welcome). We will provide what you need for the seacrowd-datasets hackathon environment.

-With your active `nusantara` environment, use the following command:
+With your active `seacrowd` environment, use the following command:

 ```
 huggingface-cli login
@@ -32,7 +32,7 @@ Login with your 🤗 Hub account username and password.
 Make a repository via the 🤗 Hub [here](https://huggingface.co/new-dataset) with the following details.

-+ Set Owner: nusantara-datasets
++ Set Owner: seacrowd-datasets
 + Set Dataset name: the name of the dataset
 + Set License: the license that applies to this dataset
 + Select Private
@@ -44,10 +44,10 @@ If there is no appropriate license available in the provided options (for exampl
 ### 4. Clone the dataset repository

-Using terminal access, find a location to place your github repository. In this location, use the following command:
+Using terminal access, find a location to place your GitHub repository. In this location, use the following command:

 ```
-git clone https://huggingface.co/datasets/indobenchmark/
+git clone https://huggingface.co/datasets/SEACrowd/
 ```

 ### 5. Commit your changes
@@ -64,14 +64,14 @@ git push origin
 Run the following command **in a folder that does not include your data-loading script**:

-Test both the original dataset schema/config and the nusantara schema/config.
+Test both the original dataset schema/config and the seacrowd schema/config.

 **Public Dataset**
 ```python
 from datasets import load_dataset

-dataset_orig = load_dataset("indobenchmark/", name="source", use_auth_token=True)
-dataset_indobenchmark= load_dataset("indobenchmark/", name="indobenchmark", use_auth_token=True)
+dataset_orig = load_dataset("SEACrowd/", name="source", use_auth_token=True)
+dataset_SEACrowd = load_dataset("SEACrowd/", name="SEACrowd", use_auth_token=True)
 ```

 **Private Dataset**
@@ -80,13 +80,13 @@ dataset_indobenchmark= load_dataset("indobenchmark/", name="i
 from datasets import load_dataset

 dataset_orig = load_dataset(
-    "indobenchmark/",
+    "SEACrowd/",
     name="source",
     data_dir="/local/path/to/data/files",
     use_auth_token=True)

-dataset_indobenchmark = load_dataset(
-    "indobenchmark/",
+dataset_SEACrowd = load_dataset(
+    "SEACrowd/",
     name="indobenchmark",
     data_dir="/local/path/to/data/files",
     use_auth_token=True)