s3-ocr file command to process a single PDF #12

Open
simonw opened this issue Jun 30, 2022 · 1 comment
Labels
enhancement: New feature or request

simonw commented Jun 30, 2022

This would still require a bucket, since PDFs sent to Textract need to go through a bucket.

Maybe it could have an option to block and poll for completion?

Default operation can be to put the object to the bucket and then start an OCR run against it.

It can use the same filename as the object key, but should return an error if an object of that name already exists in the bucket.
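A minimal sketch of that default flow, assuming boto3-style clients are passed in (`s3` from `boto3.client("s3")` and `textract` from `boto3.client("textract")`). The function name `upload_and_start_ocr` is hypothetical, and a real OCR run may pass extra options to Textract that this sketch leaves out.

```python
import os


def upload_and_start_ocr(s3, textract, bucket, path):
    """Upload a local PDF under its own filename and start a Textract job.

    Hypothetical helper: `s3` and `textract` are assumed to be boto3 clients.
    Raises FileExistsError if the key is already present in the bucket.
    Returns the Textract job ID.
    """
    key = os.path.basename(path)
    try:
        s3.head_object(Bucket=bucket, Key=key)
    except s3.exceptions.ClientError as ex:
        # A missing key is the expected case; re-raise anything else
        # (for example permission errors).
        if ex.response["Error"]["Code"] not in ("404", "NoSuchKey", "NotFound"):
            raise
    else:
        raise FileExistsError(f"{key} already exists in bucket {bucket}")

    s3.upload_file(path, bucket, key)
    response = textract.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return response["JobId"]
```

The existence check and the upload are two separate requests, so this does not protect against two uploads racing each other. For a single-user command-line tool that is probably acceptable.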

simonw added the enhancement label Jun 30, 2022

simonw commented Jun 30, 2022

s3-ocr file my-bucket document.pdf

Default mode outputs a message saying that the file has been uploaded and put in the OCR queue.

The --wait option waits for the OCR job to complete and then outputs the text version of the result.
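The blocking behaviour could be sketched as a polling loop around Textract's `get_document_text_detection` call. This assumes a boto3 Textract client and a job ID from an earlier `start_document_text_detection` call. The name `wait_for_text`, the five-second default interval, and the choice to join `LINE` blocks with newlines are illustrative assumptions, not the tool's actual behaviour.

```python
import time


def wait_for_text(textract, job_id, poll_interval=5.0, sleep=time.sleep):
    """Block until a Textract text-detection job finishes, then return its text.

    Hypothetical helper: `textract` is assumed to be a boto3 Textract client.
    """
    # Poll until the job leaves the IN_PROGRESS state.
    while True:
        response = textract.get_document_text_detection(JobId=job_id)
        status = response["JobStatus"]
        if status == "IN_PROGRESS":
            sleep(poll_interval)
            continue
        if status == "FAILED":
            raise RuntimeError(response.get("StatusMessage", "Textract job failed"))
        break

    # Results are paginated; follow NextToken and keep only LINE blocks.
    lines = []
    while True:
        for block in response.get("Blocks", []):
            if block["BlockType"] == "LINE":
                lines.append(block["Text"])
        token = response.get("NextToken")
        if not token:
            return "\n".join(lines)
        response = textract.get_document_text_detection(
            JobId=job_id, NextToken=token
        )
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.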

--wait --json blocks and then writes the output of fetch --combine to standard output.
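The proposed interface can be sketched as an argument parser plus a small function that picks the output mode. This uses the standard library's `argparse` only to stay dependency-free; the real tool may be structured differently. The names `build_parser` and `output_mode` are hypothetical, and treating `--json` without `--wait` as the default queued mode is an assumption, since that combination is not specified above.

```python
import argparse


def build_parser():
    """Build a parser for the proposed `s3-ocr file BUCKET PATH` command."""
    parser = argparse.ArgumentParser(prog="s3-ocr")
    subparsers = parser.add_subparsers(dest="command", required=True)
    file_cmd = subparsers.add_parser(
        "file", help="Upload a single PDF and start OCR against it"
    )
    file_cmd.add_argument("bucket", help="Bucket to upload the PDF to")
    file_cmd.add_argument("path", help="Path to the local PDF")
    file_cmd.add_argument(
        "--wait", action="store_true", help="Block until OCR has completed"
    )
    file_cmd.add_argument(
        "--json",
        dest="as_json",
        action="store_true",
        help="With --wait, output JSON instead of text",
    )
    return parser


def output_mode(args):
    """Return 'queued', 'text' or 'json' for the parsed arguments."""
    if not args.wait:
        # Assumption: --json on its own is ignored and the job is just queued.
        return "queued"
    return "json" if args.as_json else "text"
```

For example, `output_mode(build_parser().parse_args(["file", "my-bucket", "document.pdf", "--wait"]))` selects the text mode.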
