opticr

A Python library that exposes a single interface and API to a few OCR tools (Google Vision, Tesseract).

Install

The following binaries must be available in the $PATH:

poppler-utils (pdf2image)

https://github.com/Belval/pdf2image#how-to-install

tesseract

https://tesseract-ocr.github.io
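
Before installing, it can help to confirm that both binaries are actually reachable. The snippet below is a minimal sketch using only the standard library. The executable names are assumptions: `pdftoppm` is one of the tools shipped with poppler-utils, and `tesseract` is the usual name of the Tesseract command-line program. The helper `missing_binaries` is hypothetical and not part of opticr.

```python
import shutil

# Assumed executable names: pdftoppm ships with poppler-utils,
# tesseract is the Tesseract command-line program.
REQUIRED_BINARIES = ("pdftoppm", "tesseract")


def missing_binaries(names=REQUIRED_BINARIES):
    """Return the names that cannot be found on the current $PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Missing from $PATH:", ", ".join(missing))
    else:
        print("All required binaries found.")
```

If anything is reported missing, follow the install links above for that tool.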

Install OpticR

With pip

pip install opticr

With poetry

poetry add opticr

Or, to get the latest 'dangerous' version straight from the main branch:

poetry add git+https://github.com/lzayep/opticr@main

Usage

from opticr import OpticR

ocr = OpticR("tesseract")
pathtofile = "test/contract.pdf"
pages: list[str] = ocr.get_pages(pathtofile)

With google-vision:

from opticr import OpticR

ocr = OpticR("google-vision", options={"google-vision": {"auth": {"token": ""}}})

# the file can also come from a URL
pathtofile = "https://example.com/contract.pdf"
pages: list[str] = ocr.get_pages(pathtofile)

Cache the result: if the file has already been OCRed, the previous result is returned immediately. Results are stored temporarily in local storage or in shared storage such as Redis.

from opticr import OpticR

ocr = OpticR("tesseract", options={"cache":
                         {"backend": "redis", "redis": "redis://"}})

# the file can also come from a URL
pathtofile = "https://example.com/contract.pdf"
pages: list[str] = ocr.get_pages(pathtofile, cache=True)
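
To illustrate the idea behind the cache, here is a minimal sketch of content-keyed caching. It is not opticr's implementation: the class `CachedOcr`, the helper `content_key`, and the in-memory dictionary are all hypothetical, and a real shared backend such as Redis would replace the dictionary. The sketch assumes the cache key is a SHA-256 hash of the file bytes, which may differ from what the library actually does.

```python
import hashlib
from typing import Callable


def content_key(data: bytes) -> str:
    """Derive a cache key from the file content (assumed scheme: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


class CachedOcr:
    """Hypothetical wrapper that runs OCR only once per distinct file content."""

    def __init__(self, run_ocr: Callable[[bytes], list[str]]) -> None:
        self._run_ocr = run_ocr
        # In-memory store; a shared backend would take its place.
        self._store: dict[str, list[str]] = {}
        self.calls = 0  # number of times the underlying OCR actually ran

    def get_pages(self, data: bytes) -> list[str]:
        key = content_key(data)
        if key not in self._store:
            self.calls += 1
            self._store[key] = self._run_ocr(data)
        return self._store[key]
```

With a stand-in OCR function, calling `get_pages` twice on the same bytes runs the expensive step only once; the second call is served from the store.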
