Allocation of 360434219 exceeds 10% of free system memory. #588
Comments
What model are you using, how big are the inputs and how big are the outputs?
Hi @Craigacp, thank you for your response! To clarify, the model uses a protobuf file. The input consists of phrases that are transformed into six arrays. The size of these arrays depends on the number of words in the input phrases, which typically contain around 20 words. From these arrays, six tensors are created and used as input to TensorFlow. The model outputs two tensors as a result of the computation. I should also mention that the Docker container starts with approximately 5GB of memory usage. Over time, after handling thousands upon thousands of requests, the memory usage grows significantly and eventually reaches around 100GB.
Ok, so you have 6 inputs which are roughly 20-30 ints long? Or have you embedded them externally? And the output is what dimension? Presumably you're closing all the input tensors, not just the main one? You should explicitly close the session when you're done, but that wouldn't cause a leak aside from the size of the model itself unless Spring is continually reconstructing your model wrapper. Can you try using the concrete function loader rather than a bare saved model?
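A minimal sketch of that suggestion, assuming TF Java 0.5.x (where the function returned by `SavedModelBundle.function(...)` yields an `AutoCloseable` `Result`); the model path, signature key, and tensor names here are hypothetical, not taken from the issue:

```java
import java.util.Map;

import org.tensorflow.Result;
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;
import org.tensorflow.ndarray.StdArrays;
import org.tensorflow.types.TInt32;

public class ConcreteFunctionExample {
    public static void main(String[] args) {
        // Hypothetical model path and input name, for illustration only.
        try (SavedModelBundle bundle = SavedModelBundle.load("/models/ner", "serve");
             TInt32 ids = TInt32.tensorOf(StdArrays.ndCopyOf(new int[][] {{101, 2023, 102}}))) {
            var serving = bundle.function("serving_default");
            // Closing the Result releases every output tensor in one place,
            // instead of tracking each fetched tensor by hand.
            try (Result outputs = serving.call(Map.of("input_ids", ids))) {
                Tensor scores = outputs.get(0);
                System.out.println(scores.shape());
            }
        }
    }
}
```

The advantage over driving a bare `Session` is that the signature resolves input/output names for you and the `Result` gives a single close point for all outputs.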
Yes, you’re correct — the inputs are typically 20-30 integers long. The model produces two output tensors; one has a constant shape of DenseTFloat32[134][134]. Additionally, I should mention that at some point I iterate over the results and perform further operations. This involves creating a new EagerSession (or potentially reusing an existing one — I’m unsure how this works behind the scenes). Here's an example of how I’m doing this:
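For reference, per-request eager post-processing of that kind might look like this hedged sketch; the actual ops are not shown in the issue, so the softmax step is only an illustrative placeholder:

```java
import org.tensorflow.EagerSession;
import org.tensorflow.Operand;
import org.tensorflow.ndarray.StdArrays;
import org.tensorflow.op.Ops;
import org.tensorflow.types.TFloat32;

public class EagerPostProcess {
    // Hedged sketch: the real post-processing ops from the issue are unknown.
    static float[][] postProcess(float[][] scores) {
        // try-with-resources guarantees the eager session (and the tensors it
        // owns) are released after each request.
        try (EagerSession eager = EagerSession.create()) {
            Ops tf = Ops.create(eager);
            Operand<TFloat32> input = tf.constant(scores);
            Operand<TFloat32> probs = tf.nn.softmax(input);
            // Copy the values out to plain Java arrays before the session closes.
            return StdArrays.array2dCopyOf(probs.asTensor());
        }
    }
}
```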
In this example, I explicitly close the EagerSession after use. Is it possible I’m missing something here that could contribute to the issue?
The eager session might be causing trouble too. @karllessard do you remember the correct way to clean up after an eager session?
Storing your session as a private field of a Spring Bean shouldn't cause any trouble; you'll be able to reuse it for all predictions. For the eager session, since you are explicitly closing it, you should be fine too. Your error message says `Allocation of 360434219 exceeds 10% of free system memory.`
Hi @karllessard, thank you for your input! I should clarify that I trimmed the size number when posting — the actual allocation size is closer to 36GB. I’m not aware of any operations in my setup that could trigger such a massive memory allocation. Could you please elaborate on what might be causing this issue and what I should check to investigate further? Is there a possibility that the model itself is broken or improperly serialized? Thanks in advance for your help!
One idea: if you could monitor the native memory (i.e. the memory used by the JVM process itself), it could give you a better hint. Check if it leaks even outside Docker. The heap memory won't tell you much about this kind of leak. As for why the model itself needs to allocate that much memory all of a sudden — it is hard to tell without knowing more about the model architecture itself.
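One way to follow that advice is JVM Native Memory Tracking plus watching the raw process size; a sketch (the jar name and use of `ps` are illustrative):

```shell
# Start the JVM with Native Memory Tracking enabled (small overhead).
java -XX:NativeMemoryTracking=summary -jar app.jar &
APP_PID=$!

# Snapshot JVM-internal native allocations. Note: TensorFlow's own
# malloc'd memory is NOT included here, so compare against process RSS.
jcmd "$APP_PID" VM.native_memory summary

# Raw resident set size of the whole process, in kilobytes.
ps -o rss= -p "$APP_PID"
```

If RSS grows while the NMT summary stays flat, the leak is in native code outside the JVM's own allocations — consistent with a TensorFlow-side leak.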
@icrecescu You can try to use JeMalloc. You can inject and configure it using environment variables, so it's not too much of a hassle to try it. It can also profile & detect leaks. You can also try
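A hedged sketch of wiring jemalloc leak profiling in via environment variables, as suggested; the library path is the Debian/Ubuntu default and may differ in your base image:

```shell
# Preload jemalloc in place of the default allocator (path varies by distro).
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

# Enable heap profiling, leak reporting, and a final dump on exit.
export MALLOC_CONF=prof:true,prof_leak:true,lg_prof_sample:19,prof_final:true

java -jar app.jar

# After the process exits, inspect the heap profile dumps.
jeprof --show_bytes "$(which java)" jeprof.*.heap
```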
Apologies - it seems my suspicions were unfounded, and just an (unhappy) coincidence. |
Hi @pluppens, thank you for your suggestions! I’ll give JeMalloc a try. I also suspect a native memory leak, as everything seems fine within the Java heap, as I described earlier. The issue becomes apparent when the app runs inside a container. In the Kubernetes monitoring dashboard, I can observe the memory usage gradually increasing over time. This also happens with RC1 and with 0.0.5. I'll try to create a sample project which reproduces this issue.
Also @icrecescu in your eager operations (post-processing I think), do you retrieve values of the tensor via
Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub.
System information
Java version (`java -version`): openjdk version "21.0.4"

Describe the current behavior
I am using TensorFlow in a Spring Boot application, which exposes an endpoint for NER processing. The TensorFlow model is trained in Python and loaded into the Java application for inference.
To optimize performance, I initialize the TensorFlow session once during application startup using a @PostConstruct method and store it in a private field:
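The elided snippet presumably resembles this hedged sketch; the class name, model path, and annotations are assumptions, since the issue does not show the actual code:

```java
import jakarta.annotation.PostConstruct;
import jakarta.annotation.PreDestroy;
import org.springframework.stereotype.Service;
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;

@Service
public class NerModelService {

    private SavedModelBundle bundle;
    private Session session;

    @PostConstruct
    void loadModel() {
        // Hypothetical path; the real model location is not shown in the issue.
        bundle = SavedModelBundle.load("/models/ner", "serve");
        session = bundle.session();
    }

    @PreDestroy
    void unloadModel() {
        // Closing the bundle on shutdown releases the native model memory.
        bundle.close();
    }
}
```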
The session is reused in a public method for running predictions:
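A hedged sketch of such a predict method, assuming TF Java 0.5.x where `run()` returns an `AutoCloseable` `Result`; the feed/fetch names and single-input shape are hypothetical (the issue describes six inputs):

```java
import org.tensorflow.Result;
import org.tensorflow.Session;
import org.tensorflow.ndarray.StdArrays;
import org.tensorflow.types.TFloat32;
import org.tensorflow.types.TInt32;

public class Predictor {
    private final Session session;

    public Predictor(Session session) {
        this.session = session;
    }

    public float[][] predict(int[] tokenIds) {
        // Every input tensor and the Result are opened in try-with-resources,
        // so their native memory is always released, even on exceptions.
        try (TInt32 input = TInt32.tensorOf(StdArrays.ndCopyOf(new int[][] {tokenIds}));
             Result outputs = session.runner()
                     .feed("input_ids", input)   // hypothetical names
                     .fetch("scores")
                     .run()) {
            // Copy values into a plain Java array before the Result closes.
            return StdArrays.array2dCopyOf((TFloat32) outputs.get(0));
        }
    }
}
```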
During performance testing, I monitored the heap memory and found no significant issues. However, when the application runs in a Docker container, it crashes after a while, regardless of the memory allocated to the container (even with 120GB of memory). The following warning appears in the logs before the crash:
W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 34891293 exceeds 10% of free system memory.
Is it possible that the memory leak is caused by the session being stored in a private field and never explicitly closed, even though all tensors and intermediate results are properly managed (closed) in the predict method?
Describe the expected behavior
The application should not exhibit memory leaks or crashes when deployed in a Docker container, regardless of memory allocation.