From f7595760ed5c1d29211f363ff57816c27d1f345b Mon Sep 17 00:00:00 2001
From: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Date: Thu, 7 Dec 2023 10:47:35 -0800
Subject: [PATCH] [docs] Custom semantic segmentation dataset (#27859)

* custom dataset

* fix link

* feedback
---
 docs/source/en/tasks/semantic_segmentation.md | 57 ++++++++++++++++++-
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/docs/source/en/tasks/semantic_segmentation.md b/docs/source/en/tasks/semantic_segmentation.md
index ce6fb70c82441f..f422fef9aeb566 100644
--- a/docs/source/en/tasks/semantic_segmentation.md
+++ b/docs/source/en/tasks/semantic_segmentation.md
@@ -23,8 +23,8 @@ rendered properly in your Markdown viewer.
 Image segmentation models separate areas corresponding to different areas of interest in an image. These models work by assigning a label to each pixel. There are several types of segmentation: semantic segmentation, instance segmentation, and panoptic segmentation.
 
 In this guide, we will:
-1. [Take a look at different types of segmentation](#Types-of-Segmentation),
-2. [Have an end-to-end fine-tuning example for semantic segmentation](#Fine-tuning-a-Model-for-Segmentation). 
+1. [Take a look at different types of segmentation](#types-of-segmentation).
+2. [Have an end-to-end fine-tuning example for semantic segmentation](#fine-tuning-a-model-for-segmentation).
 
 Before you begin, make sure you have all the necessary libraries installed:
 
@@ -256,6 +256,59 @@ You'll also want to create a dictionary that maps a label id to a label class wh
 >>> num_labels = len(id2label)
 ```
 
+#### Custom dataset
+
+You could also create and use your own dataset if you prefer to train with the [run_semantic_segmentation.py](https://github.com/huggingface/transformers/blob/main/examples/pytorch/semantic-segmentation/run_semantic_segmentation.py) script instead of a notebook instance. The script requires:
+
+1. a [`~datasets.DatasetDict`] with two [`~datasets.Image`] columns, "image" and "label"
+
+     ```py
+     from datasets import Dataset, DatasetDict, Image
+
+     image_paths_train = ["path/to/image_1.jpg", "path/to/image_2.jpg", ..., "path/to/image_n.jpg"]
+     label_paths_train = ["path/to/annotation_1.png", "path/to/annotation_2.png", ..., "path/to/annotation_n.png"]
+
+     image_paths_validation = [...]
+     label_paths_validation = [...]
+
+     def create_dataset(image_paths, label_paths):
+         dataset = Dataset.from_dict({"image": sorted(image_paths),
+                                     "label": sorted(label_paths)})
+         dataset = dataset.cast_column("image", Image())
+         dataset = dataset.cast_column("label", Image())
+
+         return dataset
+
+     # step 1: create Dataset objects
+     train_dataset = create_dataset(image_paths_train, label_paths_train)
+     validation_dataset = create_dataset(image_paths_validation, label_paths_validation)
+
+     # step 2: create DatasetDict
+     dataset = DatasetDict({
+          "train": train_dataset,
+          "validation": validation_dataset,
+          }
+     )
+
+     # step 3: push to Hub (assumes you have run the huggingface-cli login command in a terminal/notebook)
+     dataset.push_to_hub("your-name/dataset-repo")
+
+     # optionally, you can push to a private repo on the Hub
+     # dataset.push_to_hub("your-name/dataset-repo", private=True)
+     ```
+
+2. an id2label dictionary mapping the class integers to their class names
+
+     ```py
+     import json
+     # simple example
+     id2label = {0: 'cat', 1: 'dog'}
+     with open('id2label.json', 'w') as fp:
+         json.dump(id2label, fp)
+     ```
+
+For reference, take a look at this [example dataset](https://huggingface.co/datasets/nielsr/ade20k-demo), which was created with the steps shown above.
+
 ### Preprocess
 
 The next step is to load a SegFormer image processor to prepare the images and annotations for the model. Some datasets, like this one, use the zero-index as the background class. However, the background class isn't actually included in the 150 classes, so you'll need to set `reduce_labels=True` to subtract one from all the labels. The zero-index is replaced by `255` so it's ignored by SegFormer's loss function: