We've tried `inference.py` with `--batch-size` 2, 4, and 8. We expected the inference time to be constant, or at least nearly constant, across these batch sizes. Instead, we observed a linear increase in time with batch size: when we doubled the batch size, the inference time also doubled.

Is this the expected behavior? If so, why? Is it possible to leverage true parallelism (constant time) in batch inference at all with CatVTON? It would be really helpful if you could guide us through a solution to this problem.
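For reference, here is roughly how we took the timings: a minimal sketch with explicit CUDA synchronization, where the small `Sequential` model is only a self-contained placeholder for CatVTON's actual pipeline.

```python
import time
import torch

# Placeholder model so the sketch is self-contained; in practice this
# would be the actual CatVTON pipeline call from inference.py.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(64, 3, 3, padding=1),
).cuda().eval()

@torch.no_grad()
def time_batch(batch):
    torch.cuda.synchronize()          # drain any pending GPU work
    start = time.perf_counter()
    model(batch)
    torch.cuda.synchronize()          # wait until the GPU has finished
    return time.perf_counter() - start

for bs in (1, 2, 4, 8):
    x = torch.randn(bs, 3, 1024, 768, device="cuda")
    time_batch(x)                     # warm-up (kernel selection, allocator)
    print(f"batch={bs}: {time_batch(x):.3f}s")
```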
Increasing the batch size does yield some speedup, as in your example, where 2.46s x 2 > 4.64s. However, this all runs on the same GPU, so the speedup will not be particularly significant. If you want to maintain constant processing time, you need to implement parallel processing across multiple GPUs.
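A minimal sketch of that idea, assuming one model replica per GPU (the `Conv2d` here is a placeholder for the real pipeline): split each batch into one chunk per device and launch the chunks concurrently, so wall-clock time stays close to that of a single chunk.

```python
import torch
from concurrent.futures import ThreadPoolExecutor

devices = [f"cuda:{i}" for i in range(torch.cuda.device_count())]

# One replica per GPU; the Conv2d stands in for the actual pipeline.
replicas = {
    d: torch.nn.Conv2d(3, 3, 3, padding=1).to(d).eval() for d in devices
}

@torch.no_grad()
def run_chunk(device, chunk):
    out = replicas[device](chunk.to(device, non_blocking=True))
    torch.cuda.synchronize(device)    # finish this device's work
    return out.cpu()

def parallel_infer(batch):
    # One chunk per GPU, launched concurrently: wall-clock time is
    # roughly the time of the largest chunk, not the sum over chunks.
    chunks = batch.chunk(len(devices))
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        outs = list(pool.map(run_chunk, devices, chunks))
    return torch.cat(outs)

print(parallel_infer(torch.randn(8, 3, 256, 256)).shape)
```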
One thing I noticed while trying out IDM-VTON, though: although it takes around 19 GB of VRAM at batch size 2 (compared to only about 8.5 GB for CatVTON), it takes almost the same time (around 18.5 seconds) as a single inference at batch size 1, on a single GPU, not multiple. Can you please guide me on whether this kind of true parallelism during batch inference is possible for CatVTON at all on a single GPU?
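If I understand correctly, this comes down to GPU saturation: when batch size 1 already fills the device's compute, latency must grow roughly linearly with batch size, whereas a model that under-utilizes the GPU at batch size 1 can absorb a larger batch almost for free. One way to check for such headroom is to measure throughput in images per second across batch sizes; a sketch below, again with a placeholder model standing in for the pipeline.

```python
import time
import torch

model = torch.nn.Conv2d(3, 3, 9, padding=4).cuda().eval()  # placeholder

@torch.no_grad()
def throughput(bs, iters=10):
    x = torch.randn(bs, 3, 1024, 768, device="cuda")
    model(x)
    torch.cuda.synchronize()          # warm-up, then start the clock
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()
    return bs * iters / (time.perf_counter() - start)

for bs in (1, 2, 4, 8):
    # Rising images/s means the GPU still had idle capacity at the
    # smaller batch size; flat images/s means it is already saturated,
    # so latency will grow linearly with batch size.
    print(f"batch={bs}: {throughput(bs):.1f} images/s")
```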