Florence2 VS BLIP2 #61
Comments
Florence 2 was trained after we put the paper on arXiv. We don't have benchmark numbers yet, but from the examples I have tried so far, it seems to be at least comparable to (if not better than) BLIP2.
It also tends to produce the same gibberish output:
Interesting, can you post a link to the original screenshot?
@aliencaocao I noticed some issue with the demo on the Hugging Face space, fixing it now. Update: OK, seems to be some transient issue. Resumed the demo now.
Edit: issue persists in #52. @aliencaocao Thank you for your feedback.
@abrichr please stop posting your PR everywhere on unrelated issues. Unless you can show that your code somehow does not produce gibberish using the Florence2 model on my image, you are not contributing to the discussion here. This isn't the first time you have posted something irrelevant in other people's issues.
@yadong-lu So can I assume the gibberish with Florence is normal, and just caused by limited model capacity, not some implementation bug?
@aliencaocao When the detected icons are neither text nor app icons, I think Florence has some trouble captioning them.
Is there any benchmark/comparison between the two models you released? I cannot find any info regarding Florence 2 in your paper. The inference cost differs quite significantly.
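For anyone who wants a rough number on the inference cost side in the meantime, something like the sketch below could work. It assumes each captioner is already wrapped as a plain Python callable that takes one icon crop and returns a string; `time_captioner`, `caption_fn`, and `crops` are hypothetical names for illustration, not part of this repo.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def time_captioner(caption_fn: Callable[[object], str],
                   crops: Sequence[object],
                   warmup: int = 1) -> Dict[str, float]:
    """Measure per-crop wall-clock latency (seconds) of one captioner.

    caption_fn and crops are placeholders: wrap whichever model you
    want to measure in a function that captions a single crop.
    """
    if not crops:
        raise ValueError("need at least one crop to time")

    # Warm-up calls are not timed, so one-time setup cost is excluded.
    for crop in crops[:warmup]:
        caption_fn(crop)

    latencies: List[float] = []
    for crop in crops:
        start = time.perf_counter()
        caption_fn(crop)
        latencies.append(time.perf_counter() - start)

    return {
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
        "n": float(len(latencies)),
    }
```

Run it once per model on the same list of crops and compare the `mean_s` / `median_s` values. If the model runs on a GPU, the wrapper should synchronize before returning (for PyTorch, `torch.cuda.synchronize()`), otherwise the timings may only reflect kernel launch time. This only measures speed, not caption quality.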