ohmantics: Seems like this should be reported to tesseract then?
stumpylog: Not likely. It's a problem with an image in the PDF. 40364 × 15220 pixels is about 614 megapixels, before even accounting for color depth.
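To put that figure in context, here is a rough sketch of the arithmetic. The bytes-per-pixel values are assumptions about how the raster might be held in memory uncompressed; they are not measurements of what tesseract or leptonica actually allocate.

```python
def image_footprint(width, height, bytes_per_pixel):
    """Return (pixel count, uncompressed size in bytes) for a raster image."""
    pixels = width * height
    return pixels, pixels * bytes_per_pixel


# Dimensions quoted in the comment above.
WIDTH, HEIGHT = 40364, 15220

# Assumed layouts: 1 byte per pixel (8-bit grayscale), 3 bytes per pixel (24-bit RGB).
pixels, gray_bytes = image_footprint(WIDTH, HEIGHT, 1)
_, rgb_bytes = image_footprint(WIDTH, HEIGHT, 3)

print(f"{pixels / 1e6:.0f} megapixels")
print(f"{gray_bytes / 2**20:.0f} MiB at 8-bit grayscale")
print(f"{rgb_bytes / 2**20:.0f} MiB at 24-bit RGB")
```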
So you were already told what the problem is and that reporting an issue for tesseract is not a good idea.
In addition, you are using an old version of tesseract.
There is already an issue for large images, see #3184.
Current Behavior
Carrying over paperless-ngx/paperless-ngx#3142 to here. The linked PDF causes trouble.
'tesseract -l eng --psm 2 /tmp/ocrmypdf.io.mg5udwao/000003_rasterize.png stdout' returns 2.
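For anyone wanting to check a rasterized page before handing it to tesseract, the dimensions can be read straight from the PNG header. This is only a sketch: `png_dimensions` and `is_oversized` are hypothetical helper names, and the 100-megapixel threshold is an arbitrary assumption, not a documented tesseract or OCRmyPDF limit.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(header: bytes):
    """Read (width, height) from the first 24 bytes of a PNG file.

    Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    then width and height as big-endian unsigned 32-bit integers.
    """
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def is_oversized(width, height, max_pixels=100_000_000):
    """Hypothetical check; max_pixels is an assumed threshold, tune as needed."""
    return width * height > max_pixels


# Usage with a file on disk (path shown is the one from the error above):
#   with open("/tmp/ocrmypdf.io.mg5udwao/000003_rasterize.png", "rb") as f:
#       width, height = png_dimensions(f.read(24))
#   if is_oversized(width, height):
#       print(f"{width} x {height} is likely too large to OCR as-is")
```

Reading only the header avoids decoding the whole image, which matters when the image itself is the problem.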
Expected Behavior
No response
Suggested Fix
No response
tesseract -v
tesseract 4.1.1
leptonica-1.79.0
libgif 5.1.9 : libjpeg 6b (libjpeg-turbo 2.0.6) : libpng 1.6.37 : libtiff 4.2.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.4.0
Found AVX512BW
Found AVX512F
Found AVX2
Found AVX
Found FMA
Found SSE
Found libarchive 3.4.3 zlib/1.2.11 liblzma/5.2.5 bz2lib/1.0.8 liblz4/1.9.3 libzstd/1.4.8
Operating System
No response
Other Operating System
Docker on Debian Bullseye on Proxmox
uname -a
Linux paperless 5.15.85-1-pve #1 SMP PVE 5.15.85-1 (2023-02-01T00:00Z) x86_64 GNU/Linux
Compiler
No response
CPU
No response
Virtualization / Containers
Docker version 23.0.1, build a5ee5b1 on top of an LXC container of Debian Bullseye on Proxmox 7.3-6.
Other Information
No response