Add zero-shot-object-detection w/ OwlViT #392

Merged
merged 16 commits into main from add-owlvit on Nov 20, 2023
Conversation

xenova (Collaborator) commented Nov 15, 2023

Example usage:

Code adapted from the transformers docs.

Example 1

(showing the same results as the Python library, using the unquantized model)

import { pipeline } from '@xenova/transformers';

let detector = await pipeline('zero-shot-object-detection', 'Xenova/owlvit-base-patch32', { quantized: false });

let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/astronaut.png';
let candidate_labels = ['human face', 'rocket', 'nasa badge', 'star-spangled banner'];
let output = await detector(url, candidate_labels);
See output
// [
//   {
//     score: 0.3585130274295807,
//     label: 'human face',
//     box: { xmin: 180, ymin: 71, xmax: 271, ymax: 178 }
//   },
//   {
//     score: 0.28625914454460144,
//     label: 'nasa badge',
//     box: { xmin: 129, ymin: 348, xmax: 206, ymax: 428 }
//   },
//   {
//     score: 0.2107662707567215,
//     label: 'rocket',
//     box: { xmin: 351, ymin: -1, xmax: 468, ymax: 288 }
//   },
//   {
//     score: 0.13869591057300568,
//     label: 'star-spangled banner',
//     box: { xmin: 1, ymin: 0, xmax: 105, ymax: 509 }
//   },
//   {
//     score: 0.1277477741241455,
//     label: 'nasa badge',
//     box: { xmin: 277, ymin: 339, xmax: 326, ymax: 380 }
//   },
//   {
//     score: 0.12643635272979736,
//     label: 'rocket',
//     box: { xmin: 358, ymin: 64, xmax: 424, ymax: 280 }
//   }
// ]

[image: predicted boxes and labels drawn on astronaut.png]

Example 2

(different labels + using the quantized model, the default)

import { pipeline } from '@xenova/transformers';

let detector = await pipeline('zero-shot-object-detection', 'Xenova/owlvit-base-patch32');

let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/astronaut.png';
let candidate_labels = ['human face', 'rocket', 'helmet', 'american flag'];
let output = await detector(url, candidate_labels);
See output
// [
//   {
//     score: 0.24392342567443848,
//     label: 'human face',
//     box: { xmin: 180, ymin: 67, xmax: 274, ymax: 175 }
//   },
//   {
//     score: 0.15129457414150238,
//     label: 'american flag',
//     box: { xmin: 0, ymin: 4, xmax: 106, ymax: 513 }
//   },
//   {
//     score: 0.13649864494800568,
//     label: 'helmet',
//     box: { xmin: 277, ymin: 337, xmax: 511, ymax: 511 }
//   },
//   {
//     score: 0.10262022167444229,
//     label: 'rocket',
//     box: { xmin: 352, ymin: -1, xmax: 463, ymax: 287 }
//   }
// ]

[image: predicted boxes and labels drawn on astronaut.png]

Example 3

(different image, and passing pipeline parameters: topk caps the number of returned detections, and threshold sets the minimum score)

import { pipeline } from '@xenova/transformers';

let detector = await pipeline('zero-shot-object-detection', 'Xenova/owlvit-base-patch32');

let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/beach.png';
let candidate_labels = ['hat', 'book', 'sunglasses', 'camera'];
let output = await detector(url, candidate_labels, { topk: 4, threshold: 0.05 });
See output
// [
//   {
//     score: 0.1606510728597641,
//     label: 'sunglasses',
//     box: { xmin: 347, ymin: 229, xmax: 429, ymax: 264 }
//   },
//   {
//     score: 0.08935828506946564,
//     label: 'hat',
//     box: { xmin: 38, ymin: 174, xmax: 258, ymax: 364 }
//   },
//   {
//     score: 0.08530698716640472,
//     label: 'camera',
//     box: { xmin: 187, ymin: 350, xmax: 260, ymax: 411 }
//   },
//   {
//     score: 0.08349756896495819,
//     label: 'book',
//     box: { xmin: 261, ymin: 280, xmax: 494, ymax: 425 }
//   }
// ]

[image: predicted boxes and labels drawn on beach.png]
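
For reference, here is a minimal browser-side sketch (not part of this PR) of one way to render the returned pixel-coordinate boxes; it assumes an <img> element already displaying the same image at its natural size, and the output array from the example above:

// Draw each detection onto a canvas copy of the image.
const img = document.querySelector('img');
const canvas = document.createElement('canvas');
canvas.width = img.naturalWidth;
canvas.height = img.naturalHeight;
const ctx = canvas.getContext('2d');
ctx.drawImage(img, 0, 0);
ctx.strokeStyle = 'red';
ctx.fillStyle = 'red';
ctx.font = '16px sans-serif';
for (const { box, label, score } of output) {
  const { xmin, ymin, xmax, ymax } = box;
  ctx.strokeRect(xmin, ymin, xmax - xmin, ymax - ymin);
  // Keep the label readable even when the box touches the top edge.
  ctx.fillText(`${label}: ${score.toFixed(2)}`, xmin, Math.max(ymin - 4, 12));
}
document.body.appendChild(canvas);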

xenova merged commit 7cf8a2c into main on Nov 20, 2023 (4 checks passed)
tobiascornille commented

@xenova Why do the boxes from the object-detection pipeline have values between 0 and 1, while the boxes from this pipeline are in absolute pixels? I wanted to adapt the static template on HF, but now I first need to get the image height and width somehow.

xenova (Collaborator, Author) commented Dec 6, 2023

@tobiascornille You can use the percentage option to choose: set it to false to return pixel values, or to true to return percentage values. See the docs for more information.
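
A minimal sketch of the option in use, assuming percentage behaves as described in the comment above (everything else follows the earlier examples):

import { pipeline } from '@xenova/transformers';

let detector = await pipeline('zero-shot-object-detection', 'Xenova/owlvit-base-patch32');

let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/astronaut.png';
let candidate_labels = ['human face', 'rocket'];

// Assumption per the comment above: with percentage set to true, box
// coordinates come back as fractions of the image size rather than pixels.
let output = await detector(url, candidate_labels, { percentage: true });
// box values now lie in [0, 1] (illustrative), e.g.
// box: { xmin: 0.28, ymin: 0.11, xmax: 0.42, ymax: 0.28 }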

xenova deleted the add-owlvit branch on December 6, 2023