Extractor module for OCR'ing image files.
To use the Google Vision input module, follow these steps:
- Create a new Google Cloud Platform project
- Activate the Google Vision API
- Create new credentials and download the JSON file containing your credentials.
- Copy the contents of the downloaded JSON file and paste everything as the "credentials" value:

    "extractor": {
        "pdf": "...",
        "ocr": "google-vision",
        "credentials": {
            "type": "...",
            "project_id": "...",
            "private_key_id": "...",
            "private_key": "...",
            "client_email": "...",
            "client_id": "...",
            "auth_uri": "...",
            "token_uri": "...",
            "auth_provider_x509_cert_url": "...",
            "client_x509_cert_url": "..."
        }
    },

- Change your configuration to use google-vision for OCR.
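
The credential and configuration steps above can be sketched as a small script. This is only an illustration, not part of the module: the helper name `build_extractor_config` and the `EXPECTED_KEYS` list are hypothetical, and the key names are simply those shown in the sample configuration. The "pdf" value is left for the caller to supply, since it is elided in the sample.

```python
import json

# Keys that appear under "credentials" in the sample configuration.
# Treating all of them as required is an assumption made for this sketch.
EXPECTED_KEYS = [
    "type", "project_id", "private_key_id", "private_key",
    "client_email", "client_id", "auth_uri", "token_uri",
    "auth_provider_x509_cert_url", "client_x509_cert_url",
]


def build_extractor_config(credentials_json: str, pdf_extractor: str) -> dict:
    """Hypothetical helper: wrap downloaded credentials in an "extractor" block."""
    credentials = json.loads(credentials_json)
    missing = [key for key in EXPECTED_KEYS if key not in credentials]
    if missing:
        raise ValueError("credentials are missing keys: " + ", ".join(missing))
    return {
        "extractor": {
            "pdf": pdf_extractor,        # value is elided in the sample config
            "ocr": "google-vision",      # selects the Google Vision OCR module
            "credentials": credentials,  # pasted verbatim from the JSON file
        }
    }


if __name__ == "__main__":
    # Placeholder credentials; a real run would read the downloaded JSON file.
    placeholder = json.dumps({key: "..." for key in EXPECTED_KEYS})
    print(json.dumps(build_extractor_config(placeholder, "..."), indent=4))
```

Checking for missing keys before writing the configuration is a design choice of this sketch; it catches a partially pasted credentials file early.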
NB: This module currently works only with images. Supporting PDF and TIFF files would require Batch File Annotation and Google Cloud Storage. See the official guide.
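
Given that limitation, a caller might want to filter out PDF and TIFF inputs before handing files to this module. The following is a minimal sketch under that assumption; the function name `needs_batch_annotation` and the suffix set are hypothetical, and the check relies on the file extension only, not on the file contents.

```python
from pathlib import Path

# Formats the note above says are not handled by the image-only module.
BATCH_ONLY_SUFFIXES = {".pdf", ".tif", ".tiff"}


def needs_batch_annotation(path: str) -> bool:
    """Return True if the file extension marks a PDF or TIFF file."""
    return Path(path).suffix.lower() in BATCH_ONLY_SUFFIXES
```

For example, `needs_batch_annotation("scan.PDF")` is true, while `needs_batch_annotation("page.png")` is false.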