awesome-ocr-resources/datasets/IRREGULAR_DATA at master · ZumingHuang/awesome-ocr-resources

Overview

| Dataset | Train | Validation | Test | Character-Level Annotation | Word-Level Annotation | Line-Level Annotation |
| --- | --- | --- | --- | --- | --- | --- |
| Total-Text | 1,255 | No | 300 | Yes | Yes (Polygon) | No |
| SCUT-CTW1500 | 1,000 | No | 500 | No | No | Yes (Polygon) |
| Uber-Text | 59,001 | 23,606 | 35,362 | No | No | Yes (Polygon) |

Total-Text

Demo images of Total-Text dataset.

The Total-Text dataset aims to be more comprehensive than earlier text datasets. It consists of 1,555 images whose text spans at least three orientations: horizontal, multi-oriented, and curved.

SCUT-CTW1500

Demo images of SCUT-CTW1500 dataset.

The CTW1500 dataset contains 1,500 images with 10,751 bounding boxes, of which 3,530 are curved, and at least one curved text instance per image. The images were manually harvested from the internet, from image libraries such as Google Open-Image, and from private data collected with phone cameras; they also contain many horizontal and multi-oriented text instances. The images are diverse, covering indoor, outdoor, born-digital, blurred, and perspective-distorted text, among others. In addition, the dataset is multilingual, with mainly Chinese and English text.
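
To illustrate why polygon annotations matter for curved text, the sketch below compares a polygon's area with the area of its axis-aligned bounding box. It assumes a polygon is simply an ordered list of `(x, y)` vertices; this representation, the function names, and the sample coordinates are made up for illustration and do not reflect the actual annotation file format of any dataset listed here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bounding_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) of the axis-aligned box around the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(polygon: List[Point]) -> float:
    """Area of a simple (non-self-intersecting) polygon via the shoelace formula."""
    twice_area = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical chevron-shaped text region (coordinates are invented).
curved = [(0, 0), (4, 2), (8, 0), (8, 2), (4, 4), (0, 2)]

x_min, y_min, x_max, y_max = bounding_box(curved)
box_area = (x_max - x_min) * (y_max - y_min)
fill_ratio = polygon_area(curved) / box_area  # share of the box covered by text

if __name__ == "__main__":
    print(f"polygon area = {polygon_area(curved)}, box area = {box_area}, "
          f"fill ratio = {fill_ratio:.2f}")
```

For this sample shape the polygon covers only half of its bounding box, which is the kind of background a rectangular annotation would wrongly include around curved text.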

Uber-Text

Demo images of Uber-Text dataset.

Uber-Text is a large-scale OCR dataset that contains street-level images collected from car-mounted sensors, with ground truths annotated by a team of image analysts. The characteristics of the dataset include (1) streetside images with their text region polygons and the corresponding transcriptions, (2) nine categories that label the text as business name, street name, street number, and so on, (3) a set of over 110k images, and (4) 4.84 text instances per image on average.
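
The figures above give a rough sense of scale. The following back-of-the-envelope sketch multiplies the image count by the average number of text instances per image. It assumes the 4.84 average applies to all images in the three splits from the overview table, which the source does not confirm, so the result is an estimate and not an official statistic.

```python
# Split sizes for Uber-Text, taken from the overview table.
train, validation, test = 59001, 23606, 35362
images = train + validation + test

# Average stated in the dataset description.
avg_instances_per_image = 4.84

# Assumption: the average was computed over all three splits.
estimated_instances = round(images * avg_instances_per_image)

if __name__ == "__main__":
    print(f"{images} images, roughly {estimated_instances} text instances")
```

Under that assumption, the dataset would contain on the order of 570 thousand annotated text instances.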
