SEACrowd · holylovenia · Feb 28, 2024 · Jan 20, 2024 · Feb 25, 2024 · Feb 25, 2024
@@ -66,8 +66,8 @@ The objective of datasheet review is to ensure that all dataloaders in SEACrowd
  b. Execute `datasets.load_dataset` check based on config list (a)
  c. Check on the dataset schema & few first examples for plausibility.
 5. Follows some general rules/conventions:
-    1. `PascalCase` for dataloader class name (and “Dataset” is contained in the suffix of the class name).
-    2. Lowercase word characters (regex identifier: `\w`) for schema column names, including the `source` schema if the original dataset doesn’t follow it.
+    1. Use `PascalCase` for the dataloader class name (optionally, “Dataset” can be appended as a suffix; see `templates/template.py` for an example).
+    2. Use lowercase word characters (regex identifier: `\w`) for schema column names, including the `source` schema if the original dataset doesn’t follow it.
 6. The code aligns with the `black` formatter:
 use this `make check_file=seacrowd/sea_datasets/{dataloader}/{dataloader}.py`
 7. Follows Dataloader Config Rule

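The two naming conventions in rule 5 can be sketched as small regex checks. This is a minimal sketch, not part of the SEACrowd tooling: the helper names `is_pascal_case` and `is_valid_column_name` are hypothetical, and restricting `\w` to ASCII lowercase letters, digits, and underscores is an assumption about how the rule is meant to be read.

```python
import re

# Assumption: PascalCase means one or more "words", each starting with an
# uppercase ASCII letter followed by lowercase letters or digits.
PASCAL_CASE = re.compile(r"(?:[A-Z][a-z0-9]*)+")

# Assumption: "lowercase word characters" means ASCII a-z, 0-9 and underscore.
LOWERCASE_WORD = re.compile(r"[a-z0-9_]+")


def is_pascal_case(name: str) -> bool:
    """Return True if `name` looks like a PascalCase class name."""
    return PASCAL_CASE.fullmatch(name) is not None


def is_valid_column_name(name: str) -> bool:
    """Return True if `name` uses only lowercase word characters."""
    return LOWERCASE_WORD.fullmatch(name) is not None
```

Under these assumptions, `Oscar2201Dataset` and `Oscar2201` both pass the class-name check, while `new_dataset` does not; `text_id` passes the column-name check, while `TextID` does not.
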
@@ -101,7 +101,8 @@
 _SEACROWD_VERSION = "1.0.0"
 
 
-# TODO: Name the dataset class to match the script name using CamelCase instead of snake_case
+# TODO: Name the dataset class to match the script name using PascalCase instead of snake_case.
+# Optional: append "Dataset" as a suffix to the class name for clarity (e.g. OSCAR 2201 --> Oscar2201Dataset or Oscar2201).
 class NewDataset(datasets.GeneratorBasedBuilder):
     """TODO: Short description of my dataset."""