out of memory error #102
Comments
What was the BAM file size in your case? |
Using smaller BAMs solved the memory issue, but the pipeline still seems to go down the bwa-mem route. I also tried uploading SAM files, with no more luck. I couldn't figure out how the code base distinguishes a FASTQ input from a BAM/SAM input; the pipeline seems to be the same in either case. |
Currently this feature is in a separate git branch. |
Oh yes, sorry. Using that branch, I get the following error many times: java.lang.StringIndexOutOfBoundsException: String index out of range: 23. It seems to have something to do with the location of the SAM files? |
The object location is objectId=Uploads/ICUNEW/out.sam, which shouldn't cause any issues. It looks like a problem with a trailing /, but I can't find one in this case. |
I believe I made an off-by-one error; I'll fix it now. |
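For readers following along: the object name Uploads/ICUNEW/out.sam is 22 characters long, so an end index of 23 is one past the end of the string, which fits an off-by-one in a substring call. The sketch below is only an illustration of splitting an object name into folder and file name without that kind of slip. ObjectPath, folderOf and fileNameOf are hypothetical names and are not taken from GCSSourceData.java.

```java
// Hypothetical helper for illustration only; not the project's GCSSourceData code.
final class ObjectPath {
    private ObjectPath() {}

    /** Folder part of an object name, including the trailing slash; "" if there is no slash. */
    static String folderOf(String objectName) {
        int lastSlash = objectName.lastIndexOf('/');
        // lastIndexOf returns -1 when no slash is present, so lastSlash + 1 == 0
        // and the result is the empty string rather than an exception.
        return objectName.substring(0, lastSlash + 1);
    }

    /** File-name part of an object name (everything after the last slash). */
    static String fileNameOf(String objectName) {
        return objectName.substring(objectName.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String name = "Uploads/ICUNEW/out.sam";
        System.out.println(folderOf(name));
        System.out.println(fileNameOf(name));
        try {
            // One past the end of the string: this is the off-by-one failure mode.
            name.substring(0, name.length() + 1);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("off-by-one: " + e.getMessage());
        }
    }
}
```

The exact exception message depends on the Java version, so the sketch prints whatever the runtime reports instead of assuming a particular text.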
That seems better, but now it seems to get stuck at the GroupBySamReference step: about 5.8 MB of input and no output.
On Wed, 13 Mar 2019 at 22:57, Alexander Bushkovsky wrote:
Fixed:
https://github.com/allenday/nanostream-dataflow/blob/bam_files/NanostreamDataflowMain/src/main/java/com/google/allenday/nanostream/pubsub/GCSSourceData.java#L49
--
Group leader, Institute for Molecular Bioscience, University of Queensland
Senior Lecturer, Imperial College
http://academickarma.org/0000-0002-4300-455X
http://orcid.org/0000-0002-4300-455X
|
I am testing the BAM input option and getting the following error:
java.lang.OutOfMemoryError: Java heap space
java.util.Arrays.copyOf(Arrays.java:3236)
java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
com.google.api.client.util.ByteStreams.copy(ByteStreams.java:55)
com.google.api.client.util.IOUtils.copy(IOUtils.java:94)
com.google.api.client.util.IOUtils.copy(IOUtils.java:63)
com.google.api.client.http.HttpResponse.download(HttpResponse.java:421)
com.google.cloud.storage.spi.v1.HttpStorageRpc.load(HttpStorageRpc.java:585)
com.google.cloud.storage.StorageImpl$16.call(StorageImpl.java:464)
com.google.cloud.storage.StorageImpl$16.call(StorageImpl.java:461)
com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
com.google.cloud.RetryHelper.run(RetryHelper.java:76)
com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
com.google.cloud.storage.StorageImpl.readAllBytes(StorageImpl.java:461)
com.google.cloud.storage.Blob.getContent(Blob.java:478)
com.google.allenday.nanostream.gcs.GetDataFromFastQFile.processElement(GetDataFromFastQFile.java:37)
com.google.allenday.nanostream.gcs.GetDataFromFastQFile$DoFnInvoker.invokeProcessElement(Unknown Source)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:275)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:240)
org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:325)
org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
org.apache.beam.runners.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:272)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:309)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:77)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:621)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:609)
com.google.allenday.nanostream.gcs.ParseGCloudNotification.processElement(ParseGCloudNotification.java:16)
com.google.allenday.nanostream.gcs.ParseGCloudNotification$DoFnInvoker.invokeProcessElement(Unknown Source)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:275)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:240)
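The trace shows GetDataFromFastQFile.processElement calling Blob.getContent, which reads the whole object into a byte array (StorageImpl.readAllBytes) and runs out of heap while the buffer grows. One possible way around this, sketched here under assumptions, is to read the object as a stream and handle it line by line. The code below uses only the JDK and accepts any ReadableByteChannel; the channel returned by Blob.reader() in google-cloud-storage is one such channel. StreamingLineReader and forEachLine are hypothetical names, not part of the project. This applies to text inputs such as SAM or FASTQ; BAM is binary and would need a BAM-aware reader.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of nanostream-dataflow.
final class StreamingLineReader {
    private StreamingLineReader() {}

    /**
     * Reads the channel line by line and passes each line to the sink,
     * so only a small buffer is held in memory instead of the whole object.
     * Returns the number of lines read. Closes the channel when done.
     */
    static long forEachLine(ReadableByteChannel channel, Consumer<String> sink) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Channels.newInputStream(channel), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.accept(line);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in for a remote object: a tiny in-memory SAM-like payload.
        byte[] sam = "@HD\tVN:1.6\nread1\t0\tchr1\t100\n".getBytes(StandardCharsets.UTF_8);
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(sam));
        long lines = forEachLine(channel, line -> System.out.println(line));
        System.out.println(lines + " lines");
    }
}
```

In a Beam DoFn, the sink could emit each record downstream as it is read, though whether that suits this pipeline's grouping steps is a design question for the maintainers.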