
GPU Memory Overflow Issues During Inference with Large Input Data #1663

Open
ylnhari opened this issue Apr 24, 2024 · 0 comments

ylnhari commented Apr 24, 2024

I am encountering memory overflow issues when processing large data batches with the inference module of the digital fingerprinting (DFP) pipeline, specifically while generating predictions on the GPU for data of roughly 10 million rows by 10 columns. The problem arises during inference after the model is loaded: when the get_results() method of the autoencoder class is invoked, PyTorch appears to load the model graph and the input data tensors onto the GPU simultaneously. Consequently, when the input exceeds around 7 million rows, it consumes 77 GB of the available 81 GB of GPU memory, leading to out-of-memory errors.

A potential solution would be to add a batch size setting to the DFP inference pipeline or to the autoencoder's predict functions. With such a setting, predictions would be processed in fixed-size batches regardless of the total number of rows passed to the predict function, preventing memory overflow errors.
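As a rough illustration of the proposed fix, here is a minimal sketch of batching the prediction step so that only one chunk of rows is resident on the GPU at a time. The wrapper name, the batch_size default, and the exact signature of autoencoder.get_results() are assumptions for illustration; the real DFP/autoencoder API may differ.

```python
# Sketch only: chunked inference to cap per-call GPU memory usage.
import pandas as pd
import torch


def get_results_batched(autoencoder, df: pd.DataFrame,
                        batch_size: int = 500_000) -> pd.DataFrame:
    """Run get_results() over fixed-size row chunks and concatenate the outputs."""
    results = []
    with torch.no_grad():
        for start in range(0, len(df), batch_size):
            chunk = df.iloc[start:start + batch_size]
            # Only this chunk's tensors need to be materialized on the GPU
            # for the forward pass (assumes get_results accepts a DataFrame).
            results.append(autoencoder.get_results(chunk))
            # Release cached allocator blocks before the next chunk.
            torch.cuda.empty_cache()
    return pd.concat(results, ignore_index=True)
```

Exposing batch_size as a pipeline-level configuration option would let users tune the chunk size to their GPU memory budget without changing the predict call sites.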
