Add support for GOT-OCR2.0 #34173
Comments
Hello, if someone from the core team is not already working on this, or if there is interest in it, I would really love to contribute this model to transformers with some help!
Hi @VladOS95-cyber, if you don't mind, can I help you with this issue if you are working on it?
Hi @GargDivanshu, I don't mind at all, let's wait for a decision from @qubvel @LysandreJik
Hey @VladOS95-cyber @GargDivanshu !
cool 🙌
+1
Any movement on this? Looking forward to trying it out
Been testing stepfun's demo code along with the model; would love to see this in transformers!
Hey all!
@yonigozlan if you remember, please paste the PR link in this thread, would love to subscribe
Hi again!
Model description
As an OCR-2.0 model, GOT can handle all artificial optical signals (e.g., plain texts, math/molecular formulas, tables, charts, sheet music, and even geometric shapes) under various OCR tasks. On the input side, the model supports commonly used scene- and document-style images in slice and whole-page styles. On the output side, GOT can generate plain or formatted results (markdown/tikz/smiles/kern) via an easy prompt. Besides, the model enjoys interactive OCR features, i.e., region-level recognition guided by coordinates or colors.
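To make the option space in the description concrete, here is a rough sketch of how a request could be modelled: one output mode (plain or formatted) plus an optional region given either as box coordinates or as a box color. Every name below (`OcrRequest`, `describe`, `OUTPUT_MODES`, `REGION_COLORS`) is hypothetical, and the color set is an illustrative assumption; this is not the GOT-OCR2.0 or transformers API, only a way to reason about which combinations make sense.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical names throughout: this is NOT the GOT-OCR2.0 or transformers API.
OUTPUT_MODES = ("plain", "format")        # plain text vs. formatted output
REGION_COLORS = ("red", "green", "blue")  # assumed set of box colors, for illustration


@dataclass(frozen=True)
class OcrRequest:
    """One OCR call: an image, an output mode, and an optional region hint."""

    image_path: str
    output_mode: str = "plain"
    box: Optional[Tuple[int, int, int, int]] = None  # (x1, y1, x2, y2) in pixels
    color: Optional[str] = None

    def __post_init__(self) -> None:
        if self.output_mode not in OUTPUT_MODES:
            raise ValueError(f"unknown output mode: {self.output_mode!r}")
        # A region is guided by coordinates or by color, not both at once.
        if self.box is not None and self.color is not None:
            raise ValueError("give either a box or a color, not both")
        if self.box is not None:
            x1, y1, x2, y2 = self.box
            if x1 >= x2 or y1 >= y2:
                raise ValueError("box must satisfy x1 < x2 and y1 < y2")
        if self.color is not None and self.color not in REGION_COLORS:
            raise ValueError(f"unknown box color: {self.color!r}")


def describe(request: OcrRequest) -> str:
    """Return a human-readable summary of what the request asks for."""
    if request.box is not None:
        region = f"box {list(request.box)}"
    elif request.color is not None:
        region = f"{request.color} box"
    else:
        region = "whole image"
    return f"{request.output_mode} OCR on {region} of {request.image_path}"


if __name__ == "__main__":
    print(describe(OcrRequest("page.png")))
    print(describe(OcrRequest("page.png", output_mode="format", color="red")))
```

Treating the box and the color as mutually exclusive is a design choice of this sketch, based on the description's wording "coordinates or colors"; an actual integration may handle these differently.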
Open source status
Provide useful links for the implementation
Implementation: https://github.com/Ucas-HaoranWei/GOT-OCR2.0/
Paper: https://arxiv.org/abs/2409.01704