Home

Tesserwrap

Tesserwrap is a Python binding to the Tesseract-OCR API (provided by libtesseract_api). It is currently alpha quality and comes with no warranty. The goal is a simple way to OCR images from Python without invoking the Tesseract application directly or creating temporary files.

Currently Supports:

OCRing an image in one shot.
Setting an image and a bounding rectangle.
Setting segmentation modes.
Large pictures. (All of my test images are 600 dpi; I have not yet tested lower-resolution images.)

Future Support/Ideas:

Leptonica image format conversion. (Leptonica PIX is the native image format for Tesseract.)
Processing OCR boundaries in C++ and returning an array of strings. (This may give a slight speed boost.)
Thread safety. For now, I would NOT access the Tesserwrap object from multiple threads without guarding it with a semaphore.
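
Until thread safety is supported, one way to follow the semaphore advice above is to funnel every call through a binary semaphore. The sketch below is only an illustration: `FakeOcrEngine` and its `ocr_image` method are hypothetical stand-ins for a non-thread-safe OCR object, not Tesserwrap's actual API, and `safe_ocr` and `run_workers` are names made up for this example.

```python
import threading


class FakeOcrEngine:
    """Hypothetical stand-in for an OCR object that is not thread-safe."""

    def __init__(self):
        self.calls = 0

    def ocr_image(self, image):
        # Pretend to do OCR; a real engine would mutate internal state here.
        self.calls += 1
        return "text for %s" % image


engine = FakeOcrEngine()
# A semaphore with a count of 1 lets only one thread use the engine at a time.
engine_guard = threading.Semaphore(1)


def safe_ocr(image):
    """Run OCR while holding the semaphore, releasing it even on error."""
    with engine_guard:
        return engine.ocr_image(image)


def run_workers(images):
    """OCR each image on its own thread and collect results by image name."""
    results = {}

    def worker(name):
        results[name] = safe_ocr(name)

    threads = [threading.Thread(target=worker, args=(name,)) for name in images]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_workers(["a.png", "b.png"])` starts one thread per image, but only one of them is inside `ocr_image` at any moment. This serializes the OCR work, so it protects the shared object without giving any speedup.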

Installation: Installation
Examples: Examples
