
Cannot leverage true parallelism (constant time) in batch inference; time grows linearly with batch size #85

Open
abdullah-al-munem opened this issue Dec 4, 2024 · 2 comments


@abdullah-al-munem

We've tried inference.py with --batch-size 2, 4, and 8. We expected the inference time to be constant, or at least nearly constant, across these batch sizes. Instead, we observed a linear increase in time with batch size, i.e., when we doubled the batch size, the inference time also doubled.

Is this the expected behavior? If so, why? Is it possible to leverage true parallelism (constant time) in batch inference at all using CatVTON? It would be really helpful if you could guide us through a solution to this problem.

[Screenshots: timing logs from the batch-size 2, 4, and 8 runs described above]
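For reference, a minimal timing harness along these lines reproduces the comparison (a sketch only; any setup beyond the `--batch-size` flag is assumed):

```python
import subprocess
import time

# Rough timing harness (a sketch, not the repo's tooling): runs
# inference.py at several batch sizes and reports wall time.
# Note: wall time includes model loading, so treat the numbers as a
# coarse comparison rather than pure inference latency.
for batch_size in (2, 4, 8):
    start = time.perf_counter()
    subprocess.run(
        ["python", "inference.py", "--batch-size", str(batch_size)],
        check=True,
    )
    elapsed = time.perf_counter() - start
    print(f"batch_size={batch_size}: {elapsed:.2f}s total "
          f"({elapsed / batch_size:.2f}s per sample)")
```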

@Zheng-Chong (Owner) commented Dec 4, 2024

Increasing the batch size can yield some speedup, as in your example, where 2.46 s × 2 > 4.64 s. However, this all runs on the same GPU, so the speedup will not be particularly significant. If you want processing time to stay constant as the batch size grows, you need to implement parallel processing across multiple GPUs.
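A minimal sketch of that multi-GPU approach, with one pipeline replica per device (`load_pipeline` here is a hypothetical loader, not the repo's actual API, and the pipeline is assumed to map a tensor batch to a tensor):

```python
import torch
from concurrent.futures import ThreadPoolExecutor

# Sketch of naive multi-GPU data parallelism: one pipeline replica per
# device, each handling a slice of the batch in its own thread.
# `load_pipeline` is a hypothetical loader; the repo's real API may differ.

def run_chunk(pipeline, device, chunk):
    with torch.no_grad():
        return pipeline(chunk.to(device)).cpu()

def parallel_inference(load_pipeline, batch):
    n_gpus = torch.cuda.device_count()
    devices = [f"cuda:{i}" for i in range(n_gpus)]
    replicas = [load_pipeline().to(d) for d in devices]
    chunks = batch.chunk(n_gpus)  # split along the batch dimension
    with ThreadPoolExecutor(max_workers=n_gpus) as pool:
        futures = [
            pool.submit(run_chunk, rep, dev, chk)
            for rep, dev, chk in zip(replicas, devices, chunks)
        ]
        return torch.cat([f.result() for f in futures])
```

Note that each replica holds a full copy of the weights, so VRAM usage scales with the number of GPUs.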

@abdullah-al-munem (Author)

But I noticed one thing when trying out IDM-VTON: although it uses around 19 GB of VRAM (compared to only about 8.5 GB for CatVTON) when inferencing with batch size 2, it takes almost the same time (around 18.5 seconds) as a single inference, i.e., batch size 1, on a single GPU, not multiple. Can you please guide me on whether this type of true parallelism during batch inference is achievable with CatVTON at all (on a single GPU)?
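My understanding (possibly wrong) is that this depends on whether batch size 1 already saturates the GPU: batched work is nearly free while the device is underutilized, but scales linearly once it is compute-bound. A toy experiment like the one below (assuming a CUDA device; plain batched matmuls standing in for the model) shows both regimes:

```python
import time
import torch

# Toy illustration: small matmuls (dim=256) are launch-overhead bound,
# so time stays roughly flat as batch size grows; large matmuls
# (dim=2048) saturate the GPU, so time grows roughly linearly.

def bench(batch_size, dim, iters=50):
    x = torch.randn(batch_size, dim, dim, device="cuda")
    w = torch.randn(dim, dim, device="cuda")
    for _ in range(10):  # warm-up
        _ = x @ w
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        _ = x @ w
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

for dim in (256, 2048):  # underutilized vs. saturated regime
    for bs in (1, 2, 4, 8):
        print(f"dim={dim} batch={bs}: {bench(bs, dim) * 1e3:.3f} ms")
```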
