The results without pre-training #2
Actually, I want to try to reproduce the results of the paper, but the problems are:
However, since I have added the pre-training portion of the script, I will soon add a script for training on any custom dataset, which I hope will help the community as well. And I will surely update the repo and add weights if I find a way to pre-train and fine-tune the model to achieve the results mentioned in the paper. Regards,
Thanks for your wonderful work. Looking forward to the training scripts for downstream TextVQA datasets!
Hey @uakarsh, I am the first author of LaTr. Thank you for the implementation, since I couldn't publish the code myself (well, it is Amazon's code). Here are some things I can offer, though:
Sorry I can't do much more, since it is mostly out of my hands. PS: I will try to get the Amazon-OCR results on TextVQA and ST-VQA sometime soon, hopefully.
Hi @furkanbiten, thanks for your reply and for appreciating the work. Looking forward to having a great conversation with you about these doubts. Regarding your points:
I have two open questions, so I thought I would ask:
Regards,
Hi again. If you would like, we can move the discussion to a pinned issue so that it gets more visibility. Your call. Here are the answers to the questions:
I hope this makes it a bit clearer.
Thank you for your detailed answer; a lot of things became clear, and yes, you are right. I will open a new issue and link it to this issue's discussion. I am also working right now on a step-by-step walkthrough of training LaTr on TextVQA, and hopefully, as we proceed, a lot more will become clear along the way! Regards,
Glad it helped! You can contact me at any time to ask about anything that is unclear or needs clarification.
Thank you for your responses and contributions. @uakarsh
Hey @furkanbiten, thank you for your excellent work and detailed suggestions. May I ask when you will release the Amazon-OCR results for the TextVQA and ST-VQA datasets? I would like to give them a try.
Hey @Gyann-z, I am actually trying to write my thesis, and in the meantime I am trying to run Amazon-OCR on TextVQA and ST-VQA at my university, since I couldn't get the data out of Amazon. I will also ask @uakarsh to link to the repo so that more people know about it.
Thanks! Looking forward to getting Amazon-OCR results soon. |
I have some good news. I finally had the time to run Amazon-OCR on ST-VQA and TextVQA. I have created a repo where you can find a small code snippet and the raw JSON files returned from the Amazon-OCR pipeline. Here is the repo: https://github.com/furkanbiten/stvqa_amazon_ocr Let me know if you have any problems.
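For anyone wanting to consume those raw JSON files, here is a minimal sketch of pulling out word tokens and bounding boxes. The exact schema in the stvqa_amazon_ocr repo is not shown in this thread, so this assumes the standard Amazon Textract `DetectDocumentText` response layout (`Blocks` / `BlockType` / `Geometry.BoundingBox`); adjust the keys if the repo's dumps differ.

```python
import json  # in practice: response = json.load(open("some_image.json"))

def extract_words(response):
    """Return (word, (left, top, width, height)) pairs from a
    Textract-style OCR response. Coordinates are normalized to [0, 1],
    as in Textract's BoundingBox convention (an assumption here)."""
    words = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") == "WORD":
            box = block["Geometry"]["BoundingBox"]
            words.append((block["Text"],
                          (box["Left"], box["Top"],
                           box["Width"], box["Height"])))
    return words

# Tiny hand-made example in the assumed schema:
sample = {
    "Blocks": [
        {"BlockType": "LINE", "Text": "STOP"},
        {"BlockType": "WORD", "Text": "STOP",
         "Geometry": {"BoundingBox": {"Left": 0.1, "Top": 0.2,
                                      "Width": 0.3, "Height": 0.1}}},
    ]
}
print(extract_words(sample))  # [('STOP', (0.1, 0.2, 0.3, 0.1))]
```

The (word, box) pairs are exactly what layout-aware models like LaTr expect as OCR input, so a loader along these lines is a natural first step before tokenization.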
Thank you very much! @furkanbiten That's really good news for me. |
Thanks for your implementation. Have you tried TextVQA training without the layout-aware pre-training? Can you reproduce the results of the paper? E.g., LaTr-base achieves 44.06 on Rosetta-en and 52.29 on Amazon-OCR.
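For context on comparing against those numbers: TextVQA scores like 44.06 are soft VQA accuracies computed against the 10 human answers per question. A minimal sketch of the commonly used simplification of that metric (the official evaluator also applies answer normalization such as lowercasing and article removal, which is omitted here):

```python
def vqa_accuracy(prediction, human_answers):
    # Soft VQA accuracy: a prediction scores min(#matching answers / 3, 1),
    # so agreeing with at least 3 of the 10 annotators counts as full credit.
    # (The official metric averages over leave-one-annotator-out subsets;
    # this min(n/3, 1) form is the widely used simplification.)
    matches = sum(a == prediction for a in human_answers)
    return min(matches / 3.0, 1.0)

print(vqa_accuracy("yes", ["yes"] * 10))                             # 1.0
print(round(vqa_accuracy("red", ["red", "red"] + ["blue"] * 8), 3))  # 0.667
```

A dataset-level score such as 52.29 is then just this per-question accuracy averaged over the validation or test set.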