Llava Next crashes on certain image sizes #1777
Comments
Sometimes it works with landscape images of certain sizes; sometimes it crashes. Do image sizes have to be multiples of 336? |
Same problem |
It seems that the current implementation counts the tokens generated from the encoded image as part of the prompt length. |
Same issue: only images with width == height work. |
I have the same issue, and it seems to be linked to image size. I found that some sizes work in TGI v2.0.1 but not in TGI v2.0.2, and vice versa. Here is a recap of the image sizes I tested. Note that image 2-bis is image 2 cropped, to confirm that the dimensions are what cause the issue.
When the image doesn't have the right dimensions, the server encounters an error and crashes. Here are the logs I get:
v2.0.1 (image 1 crash)
v2.0.2 (image 2 crash, not happening at warmup)
My model info
|
Experiencing crashes too.
cURL:
Docker
Nvidia
Logs
|
cc @Narsil any ideas how hard a fix here would be? We're considering moving to TGI for our Llava-Next traffic, but the entire Docker container crashes and stops on the very first image we tried. |
Same issue here. Is there any fix? @Narsil |
Hit the same issue with the idefics2 model. It crashed hard.
Unsure whether it is about image sizes or something else. It happened on one request out of many hundreds. |
@pseudotensor Is this with the latest version? |
No, 2.0.3. I will try 2.0.4 thanks! |
We could get shape mismatches with non-square images, resulting in an exception that crashed the backend.
When post-processing an image, features corresponding to padding are removed when padding was needed. This is also reflected in the calculation of the number of image tokens to get the correct number of slots. However, there was a mismatch between the post-processing and the slot calculation: the image post-processing could exclude fewer padding features due to rounding. This change updates the image token calculation to correspond to the image post-processing.
Fixes #1777.
While investigating this, I found another issue where the upstream code contains a bug that swaps the height and width dimensions after computing the image grid shape. Since the models were also trained with this bug, we should reproduce the same bug to ensure that we are generating the same features.
This still needs to be rebased and reviewed, but this should be fixed with PR #2097 if anyone wants to try. |
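For readers who want to see how a rounding mismatch of this kind can arise, here is a minimal sketch. It is not the TGI or upstream code. The function names (`estimated_rows`, `sliced_rows`, `feature_counts`) and the 48 × 48 feature grid are assumptions made for illustration, and it only covers the case where the padding sits above and below the image (an image that is wider than the grid's aspect ratio). One path estimates the unpadded height from the aspect ratio alone; the other slices symmetric padding off the top and bottom, as a post-processing step might. When the amount of padding is odd, the two disagree by one row of features.

```python
def estimated_rows(grid_w: int, orig_h: int, orig_w: int) -> int:
    """Rows the image occupies, estimated from the aspect ratio alone.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding.
    """
    return orig_h * grid_w // orig_w


def sliced_rows(grid_h: int, grid_w: int, orig_h: int, orig_w: int) -> int:
    """Rows left after cutting the same padding off the top and the bottom.

    The per-side padding is rounded down, so an odd total padding
    leaves one extra row in place.
    """
    new_h = estimated_rows(grid_w, orig_h, orig_w)
    padding = (grid_h - new_h) // 2
    return grid_h - 2 * padding


def feature_counts(grid_h: int, grid_w: int, orig_h: int, orig_w: int) -> tuple[int, int]:
    """Return (features from the estimate, features after slicing)."""
    return (
        estimated_rows(grid_w, orig_h, orig_w) * grid_w,
        sliced_rows(grid_h, grid_w, orig_h, orig_w) * grid_w,
    )


if __name__ == "__main__":
    # Hypothetical 640 x 360 (width x height) image on a 48 x 48 feature grid:
    # the estimate gives 27 rows, slicing leaves 28.
    print(feature_counts(48, 48, orig_h=360, orig_w=640))
    # A square image needs no padding, so both paths agree.
    print(feature_counts(48, 48, orig_h=554, orig_w=554))
```

If the slot count is derived from the first number while the features are produced by the second, the shapes no longer line up. That would be consistent with square images working while some non-square sizes fail.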
System Info
Running in docker
CLI Arguments
Info
Information
Tasks
Reproduction
Here is a script that I run on this image with the prompt "Describe the image?". Note that the image is 286 × 524. It returns an error and the service crashes.
Logs from the tgi service
Expected behavior
When I run the same script on an image that's square (554x554), it behaves as expected.
Response
Logs from tgi