document scanner rotates the image when it recognizes the document #798

Open
Tavorc opened this issue Mar 26, 2024 · 9 comments

Tavorc commented Mar 26, 2024

When I use the ML Kit document scanner, most of the time (about 95%) the library rotates the image of the document it recognizes.
It doesn't matter whether I use automatic or manual mode.

Any idea how to solve it?
Does this happen to anyone else?
[Screenshot attached: Screenshot_20240326_095202_Google Play services]


listvin commented Apr 3, 2024

Does the unwanted rotation also happen with documents in LTR scripts? Are all of your documents receipts?

I suspect that the ML Kit document scanner "UI flow" may be loosely tied to the Text Recognition API, which in turn does not support Hebrew or any other RTL script. This may turn out to be irrelevant, though.

As you may have noticed, in real life people who don't know Hebrew often try to read Hebrew documents upside down. I don't know about other RTL scripts, but I guess the fact that almost all Hebrew letters have the same height does not help at all.

These are just thoughts; I am not affiliated with Google in any way. I believe it is indeed a bug, since the API seems to be designed to be text-agnostic, especially in manual mode.

If all of your data are receipts of a similar format, you can probably postprocess them at a low level or with Tesseract.
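
A minimal sketch of what such low-level postprocessing could look like, assuming the unwanted rotation is always a multiple of 90 degrees and that its angle has already been estimated somehow (for example by an orientation detector such as Tesseract's; that step is not shown). The class `QuarterTurn` and its methods are hypothetical names for illustration, not part of ML Kit or Tesseract, and the image is modelled as a plain row-major `int[][]` so the example stays self-contained.

```java
/** Hypothetical helper: undoes a rotation that is a multiple of 90 degrees. */
final class QuarterTurn {

    /** Rotates a row-major pixel grid clockwise by a multiple of 90 degrees. */
    static int[][] rotateClockwise(int[][] pixels, int degrees) {
        if (degrees % 90 != 0) {
            throw new IllegalArgumentException("degrees must be a multiple of 90");
        }
        // Normalize to 0..3 quarter turns, also for negative angles.
        int turns = ((degrees / 90) % 4 + 4) % 4;
        int[][] out = pixels;
        for (int t = 0; t < turns; t++) {
            out = rotate90(out);
        }
        return out; // same array instance when no turn is needed
    }

    /** Reverts a clockwise rotation that was applied to the image. */
    static int[][] undo(int[][] pixels, int appliedClockwiseDegrees) {
        return rotateClockwise(pixels, 360 - appliedClockwiseDegrees);
    }

    /** One clockwise quarter turn: source (y, x) moves to (x, height - 1 - y). */
    private static int[][] rotate90(int[][] src) {
        if (src.length == 0) {
            return src;
        }
        int h = src.length;
        int w = src[0].length;
        int[][] dst = new int[w][h];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                dst[x][h - 1 - y] = src[y][x];
            }
        }
        return dst;
    }
}
```

In an Android app the same correction would more likely be applied to a `Bitmap` with `android.graphics.Matrix.postRotate`; the grid version above only illustrates the index mapping.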


listvin commented Apr 3, 2024

Inspired by:

#784 (comment)

It seems that your Android language is English. Can you try switching it to Hebrew?

Author

Tavorc commented Apr 3, 2024

First of all, thank you.
Yes, all of the documents are receipts; it's a fintech app.
I tried changing the language to Hebrew, but it doesn't help.

There is the OpenCV library, which I could use to crop the image, but I would rather not because ML Kit is more innovative.

Collaborator

ai-plays commented Apr 5, 2024

Thanks for the feedback.

There is an auto-rotation step in the scanning flow. The intention is to correct cases where holding the phone parallel to the table triggers the phone's and camera's auto-rotation logic and results in images taken with the wrong orientation. However, the text-based model apparently doesn't work very well in this case.

What behavior would work better for you? Ideally, the model just handles everything. If that isn't possible, would an option to turn auto-rotation on or off help, or do you have something else in mind?

Author

Tavorc commented Apr 7, 2024

I think you can detect the orientation of the device. For example, the camera UI has a "1x" label that represents the zoom level; when I rotate the device, the "1x" label rotates too, so you could probably use the same signal.
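
A rough sketch of that idea, assuming the angle comes from Android's `OrientationEventListener.onOrientationChanged`, which reports 0 to 359 degrees or `ORIENTATION_UNKNOWN` (-1) when the device is close to flat. The class `OrientationSnap` and its method are hypothetical names for illustration; this is not how the ML Kit scanner works internally. Falling back to the last known value is one possible policy for the flat-on-the-table case raised earlier in the thread, not a confirmed fix.

```java
/** Hypothetical helper: snaps a sensor angle to the nearest quarter turn. */
final class OrientationSnap {

    /** Same value as android.view.OrientationEventListener.ORIENTATION_UNKNOWN. */
    static final int ORIENTATION_UNKNOWN = -1;

    /**
     * Returns 0, 90, 180 or 270 for a sensor angle in degrees.
     * When the sensor cannot tell (device lying flat), the previous
     * known orientation is kept instead of guessing.
     */
    static int snapToQuarterTurn(int degrees, int lastKnown) {
        if (degrees == ORIENTATION_UNKNOWN) {
            return lastKnown;
        }
        // Bring the angle into 0..359, then round to the nearest multiple of 90.
        int normalized = ((degrees % 360) + 360) % 360;
        return ((normalized + 45) / 90 % 4) * 90;
    }
}
```

In an app, `snapToQuarterTurn` would be called from the listener callback and the result stored next to the captured frame, so that a later step can compare it with whatever rotation the scanner applied.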


tamirla commented Sep 29, 2024

Any update on this? We're also experiencing the same issue with a Flutter library that uses ML Kit when scanning documents in Hebrew:

jachzen/cunning_document_scanner#74

It would be great if it were possible to disable auto-rotation... :)

@ai-plays ai-plays assigned mebjas and unassigned ai-plays Sep 30, 2024
Collaborator

mebjas commented Oct 1, 2024

Thanks for flagging this issue. We have been working on this and are improving our rotation classifier.

@Tavorc @tamirla - is this mostly happening with Hebrew documents, or have you faced the issue with English documents as well?

Would it be possible to share some full image examples (pre-crop, taken with the stock camera) where this issue is clearly reproducible? (Try the import-from-gallery option.)


tamirla commented Oct 2, 2024

@mebjas Thanks for the update. So far we have seen it only with Hebrew documents and only on Android. See the attached images for an example.

[Attached images: pic1, pic]

Collaborator

mebjas commented Oct 21, 2024

Got it, thanks for sharing!

This issue is on our radar and is pretty high in our stack rank. However, we don't have an active solution for it, and the team has prioritised other quality improvement efforts for Q4 2024.

We will share an update on this in Q1 2025.
