Update template.py instruction for dataloader class name (SEACrowd#334)

* Add documentation for dataloader class name
* Update template.py
* Update REVIEWING.md: make the "Dataset" suffix optional and reference templates/templates.py as an example
* Update REVIEWING.md: fix file reference name

---------

Co-authored-by: Salsabil Maulana Akbar <[email protected]>
raileymontalan · Feb 28, 2024 · db035c6
1 parent 911a582
commit db035c6

Showing 2 changed files with 4 additions and 3 deletions.
diff --git a/REVIEWING.md b/REVIEWING.md
@@ -66,8 +66,8 @@ The objective of datasheet review is to ensure that all dataloaders in SEACrowd
  b. Execute `datasets.load_dataset` check based on config list (a)
  c. Check on the dataset schema & few first examples for plausibility.
 5. Follows some general rules/conventions:
-    1. `PascalCase` for dataloader class name (and “Dataset” is contained in the suffix of the class name).
-    2. Lowercase word characters (regex identifier: `\w`) for schema column names, including the `source` schema if the original dataset doesn’t follow it.
+    1. Use `PascalCase` for the dataloader class name (optional: “Dataset” can be appended to the Dataloader class name, see `templates/template.py` for example).
+    2. Use lowercase word characters (regex identifier: `\w`) for schema column names, including the `source` schema if the original dataset doesn’t follow it.
 6. The code aligns with the `black` formatter:
 use this `make check_file=seacrowd/sea_datasets/{dataloader}/{dataloader}.py`
 7. Follows Dataloader Config Rule
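
Reviewer note: rule 5.2 above could be checked mechanically with something like the sketch below. It assumes "lowercase word characters" means names made only of `a-z`, `0-9`, and `_`; that reading, and the helper name `invalid_column_names`, are illustrative assumptions and not part of the SEACrowd tooling.

```python
import re

# Assumption: "lowercase word characters" = \w restricted to a-z, 0-9 and "_".
_LOWERCASE_WORD = re.compile(r"[a-z0-9_]+")


def invalid_column_names(names):
    """Return the column names that break the lowercase `\\w` convention."""
    return [name for name in names if not _LOWERCASE_WORD.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical schema columns, for illustration only.
    print(invalid_column_names(["id", "text_1", "Label", "lang-code"]))
```

An empty result would mean every column name passes; any returned name is worth flagging in the review.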

diff --git a/templates/template.py b/templates/template.py
@@ -101,7 +101,8 @@
 _SEACROWD_VERSION = "1.0.0"
 
 
-# TODO: Name the dataset class to match the script name using CamelCase instead of snake_case
+# TODO: Name the dataset class to match the script name using PascalCase instead of snake_case. 
+# optional: class name can append "Dataset" as suffix to provide better clarity (e.g. OSCAR 2201 --> Oscar2201Dataset/Oscar2201)
 class NewDataset(datasets.GeneratorBasedBuilder):
     """TODO: Short description of my dataset."""