
zipkin-dependencies (storage: ES) exception -> java.lang.OutOfMemoryError: Java heap space #143

Open
ldcsaa opened this issue Aug 6, 2019 · 8 comments

@ldcsaa

ldcsaa commented Aug 6, 2019

From one day onwards, my zipkin-dependencies job (storage: ES) fails, outputting logs like the attached. How can I resolve it?
My zipkin-server version: 2.12.9.
Both zipkin-dependencies versions 2.1.0 and 2.3.1 throw these exceptions.

(heap memory config: -Xmx6g -Xms6g, which I think should be enough)

exception.log
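
As a first sanity check (not something taken from the attached log), it can help to confirm that the -Xmx value actually reached the JVM, for example when the job is launched through a wrapper script. A minimal sketch:

```java
// Minimal sanity-check sketch: prints the heap limit this JVM actually
// received, to confirm that -Xmx6g really reaches the process.
public class HeapCheck {
  public static void main(String[] args) {
    long maxBytes = Runtime.getRuntime().maxMemory();
    System.out.printf("Max heap visible to this JVM: %.2f GiB%n",
        maxBytes / (1024.0 * 1024 * 1024));
  }
}
```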

@codefromthecrypt
Member

codefromthecrypt commented Aug 7, 2019 via email

@aaf1

aaf1 commented Dec 5, 2019

Hello, I have the same problem; my zipkin index size is ~9-11 GB. Must I set the heap larger than the index size?

@shakuzen
Member

Must I set the heap larger than the index size?

Not in my experience with the zipkin-dependencies job, but I'm not a Spark expert either.

@jorgheymans
Contributor

FWIW we're running zipkin-dependencies with jdk8 and the default heap; the biggest index we've seen so far is 2.5 GB and it processed fine. Perhaps the complexity/size of the trace or span data plays a role?

@jorgheymans
Contributor

jorgheymans commented May 20, 2020

Coming back to this: we recently ingested about 8.5 GB of span data in a single day, and even with a 12 GB heap I cannot get it processed; it always OOMs. Obviously (right?) the heap dump contains mostly trace data, so analysing it is pointless.

I started digging into the depths of Spark tuning and discovered there's a whole world of possible optimizations: https://spark.apache.org/docs/latest/tuning.html#determining-memory-consumption . I will try to get to the bottom of this and see what options there are to make this go through.
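
For reference, the "Determining Memory Consumption" section of that guide suggests measuring a representative sample with org.apache.spark.util.SizeEstimator. A rough sketch, with placeholder strings standing in for the job's real span objects:

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.util.SizeEstimator;

// Estimates how much heap a sample of span-like objects occupies once
// deserialized. The JSON strings below are placeholders, not the actual
// model classes used by zipkin-dependencies.
public class SpanMemoryEstimate {
  public static void main(String[] args) {
    List<String> sampleSpans = new ArrayList<>();
    for (int i = 0; i < 100_000; i++) {
      sampleSpans.add("{\"traceId\":\"" + i + "\",\"name\":\"get /api\",\"duration\":1234}");
    }
    long bytes = SizeEstimator.estimate(sampleSpans);
    System.out.printf("Estimated heap footprint of %d sample spans: %.1f MiB%n",
        sampleSpans.size(), bytes / (1024.0 * 1024));
  }
}
```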

@jorgheymans jorgheymans self-assigned this May 20, 2020
@codefromthecrypt
Member

codefromthecrypt commented May 21, 2020 via email

@jorgheymans
Contributor

jorgheymans commented Jun 1, 2020

Spark Streaming is a different thing (https://spark.apache.org/docs/latest/streaming-programming-guide.html); that is not what this job is doing (but maybe it should or could).

Hooking up jconsole shows that in order to analyze about 5.5 GB of trace data, you need up to 10 GB of memory:

[jconsole heap chart: default serializer, 5.5 GB of trace data]

Toying around with the Kryo serializer, as recommended in the Spark tuning guide, did not improve things greatly:

[jconsole heap chart: Kryo serializer, 5.5 GB of trace data]
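
For context, switching a plain Spark job to the Kryo serializer usually comes down to the SparkConf properties shown in this illustrative sketch; zipkin-dependencies assembles its SparkConf internally, so only the property names carry over:

```java
import org.apache.spark.SparkConf;

// Illustrative only: how the Kryo serializer is normally enabled in a plain
// Spark job. This class is a sketch, not the job's actual wiring.
public class KryoConfSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf()
        .setAppName("zipkin-dependencies-tuning-sketch")
        // replace the default Java serializer with Kryo
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        // raise the per-object buffer limit for very large spans (default 64m)
        .set("spark.kryoserializer.buffer.max", "128m");
    System.out.println(conf.toDebugString());
  }
}
```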

I am going to try different index sizes this week and see whether the 2x heap rule holds; we could then document it as a recommendation. Still, I imagine 10 GB of trace data is not all that big; many sites will have a lot more...

@codefromthecrypt
Member

codefromthecrypt commented Jun 2, 2020 via email
