Home
gregjurman edited this page Aug 14, 2011 · 11 revisions
Tesserwrap is a Python binding to the Tesseract-OCR API (provided by libtesseract_api). Tesserwrap is currently alpha quality and comes with no warranty. The goal is to provide a simple way to OCR images in Python without calling the Tesseract application directly or creating temporary files.
Currently Supports:
- OCRing an image in one shot.
- Setting an image and bounding rectangle.
- Setting segmentation modes.
- Large pictures. (All of the test images I use are 600 dpi. I have yet to test with lower-resolution images.)
Future Support/Ideas:
- Leptonica image format conversion. (Leptonica PIX is the native format for Tesseract)
- Processing OCR boundaries in C++ and returning an array of strings. (This may give a slight speed boost)
- Thread safety. Currently I would NOT access this object from multiple threads without guarding it with a semaphore.
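Until thread safety is built in, one way to follow the semaphore advice is to wrap the OCR object so that only one thread can call it at a time. The following is a minimal sketch, not part of Tesserwrap: `FakeOCR`, `GuardedOCR`, `ocr_all`, and the `ocr` method name are all hypothetical stand-ins used for illustration, and the real Tesserwrap object would take the place of `FakeOCR`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeOCR:
    """Hypothetical stand-in for a non-thread-safe OCR object.

    This is NOT the Tesserwrap API; it only records how many callers
    were inside ocr() at the same time.
    """

    def __init__(self):
        self.active = 0
        self.max_active = 0

    def ocr(self, image):
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        text = "text:" + image  # pretend to recognise the image
        self.active -= 1
        return text


class GuardedOCR:
    """Serialises every call to the wrapped engine with a semaphore."""

    def __init__(self, engine):
        self._engine = engine
        # A semaphore with one permit lets a single thread in at a time.
        self._sem = threading.Semaphore(1)

    def ocr(self, image):
        with self._sem:
            return self._engine.ocr(image)


def ocr_all(guarded, images, workers=4):
    """Run OCR over many images from a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(guarded.ocr, images))
```

The worker threads still run concurrently for everything outside the guarded call (loading or preprocessing images, for example), but the OCR object itself only ever sees one caller at a time.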
Installation: Installation
Examples: Examples