Improve object detection task guideline (#29967)
* Add improvements

* Address comment
NielsRogge authored May 1, 2024
1 parent d2feb54 commit dc401d3
Showing 1 changed file with 14 additions and 9 deletions.
23 changes: 14 additions & 9 deletions docs/source/en/tasks/object_detection.md
@@ -41,11 +41,11 @@ To see all architectures and checkpoints compatible with this task, we recommend
Before you begin, make sure you have all the necessary libraries installed:

```bash
- pip install -q datasets transformers evaluate timm albumentations
+ pip install -q datasets transformers accelerate evaluate albumentations
```

You'll use 🤗 Datasets to load a dataset from the Hugging Face Hub, 🤗 Transformers to train your model,
- and `albumentations` to augment the data. `timm` is currently required to load a convolutional backbone for the DETR model.
+ and `albumentations` to augment the data.

We encourage you to share your model with the community. Log in to your Hugging Face account to upload it to the Hub.
When prompted, enter your token to log in:
@@ -342,6 +342,7 @@ and `id2label` maps that you created earlier from the dataset's metadata. Additi
...     id2label=id2label,
...     label2id=label2id,
...     ignore_mismatched_sizes=True,
+ ...     revision="no_timm",  # DETR models can be loaded without timm
... )
```
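
For context on the `id2label` and `label2id` arguments in the hunk above: they are plain dictionaries that map class indices to class names and back. A minimal sketch, assuming a hypothetical list of category names (the names below are illustrative and are not taken from this commit):

```python
# Hypothetical category names, for illustration only.
categories = ["Coverall", "Face_Shield", "Gloves", "Goggles", "Mask"]

# Map integer class index -> name, then build the inverse name -> index map.
id2label = {index: name for index, name in enumerate(categories)}
label2id = {name: index for index, name in id2label.items()}
```

Keeping the two maps as exact inverses of each other means a predicted class index can always be turned back into a readable label.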

@@ -357,7 +358,7 @@ Face to upload your model).
>>> training_args = TrainingArguments(
...     output_dir="detr-resnet-50_finetuned_cppe5",
...     per_device_train_batch_size=8,
- ...     num_train_epochs=10,
+ ...     num_train_epochs=100,
...     fp16=True,
...     save_steps=200,
...     logging_steps=50,
@@ -487,10 +488,10 @@ Next, prepare an instance of a `CocoDetection` class that can be used with `coco
...         return {"pixel_values": pixel_values, "labels": target}


- >>> im_processor = AutoImageProcessor.from_pretrained("devonho/detr-resnet-50_finetuned_cppe5")
+ >>> image_processor = AutoImageProcessor.from_pretrained("devonho/detr-resnet-50_finetuned_cppe5")

>>> path_output_cppe5, path_anno = save_cppe5_annotation_file_images(cppe5["test"])
- >>> test_ds_coco_format = CocoDetection(path_output_cppe5, im_processor, path_anno)
+ >>> test_ds_coco_format = CocoDetection(path_output_cppe5, image_processor, path_anno)
```

Finally, load the metrics and run the evaluation.
@@ -505,10 +506,13 @@ Finally, load the metrics and run the evaluation.
...     test_ds_coco_format, batch_size=8, shuffle=False, num_workers=4, collate_fn=collate_fn
... )

+ >>> device = torch.device("cuda") if torch.cuda.is_available() else "cpu"
+ >>> model.to(device)
+
>>> with torch.no_grad():
...     for idx, batch in enumerate(tqdm(val_dataloader)):
- ...         pixel_values = batch["pixel_values"]
- ...         pixel_mask = batch["pixel_mask"]
+ ...         pixel_values = batch["pixel_values"].to(device)
+ ...         pixel_mask = batch["pixel_mask"].to(device)

...         labels = [
...             {k: v for k, v in t.items()} for t in batch["labels"]
@@ -518,8 +522,9 @@ Finally, load the metrics and run the evaluation.
...         outputs = model(pixel_values=pixel_values, pixel_mask=pixel_mask)

...         orig_target_sizes = torch.stack([target["orig_size"] for target in labels], dim=0)
- ...         results = im_processor.post_process(outputs, orig_target_sizes)  # convert outputs of model to Pascal VOC format (xmin, ymin, xmax, ymax)
-
+ ...         # convert outputs of model to Pascal VOC format (xmin, ymin, xmax, ymax)
+ ...         results = image_processor.post_process_object_detection(outputs, threshold=0, target_sizes=orig_target_sizes)
+ ...
...         module.add(prediction=results, reference=labels)
...         del batch
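
The Pascal VOC conversion mentioned in the comment in this hunk can be illustrated without any libraries. The sketch below assumes boxes arrive as normalized `(center_x, center_y, width, height)` values, which is how DETR-style models predict them, and that each target size is given as `(height, width)`. The function name `center_to_pascal_voc` is hypothetical and is not part of 🤗 Transformers; the library's own post-processing also handles scores, labels, and thresholding, which are left out here.

```python
def center_to_pascal_voc(box, image_size):
    """Convert a normalized (cx, cy, w, h) box to absolute (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    img_h, img_w = image_size  # assumption: sizes are ordered (height, width)
    xmin = (cx - w / 2) * img_w
    ymin = (cy - h / 2) * img_h
    xmax = (cx + w / 2) * img_w
    ymax = (cy + h / 2) * img_h
    return (xmin, ymin, xmax, ymax)


# A box centered in a 480x640 (height x width) image, spanning half of each side.
print(center_to_pascal_voc((0.5, 0.5, 0.5, 0.5), (480, 640)))
```

Scaling by the original image size rather than the resized one is why the evaluation loop collects `orig_size` from the labels before post-processing.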
