@gbggrant this error came up on the GATK Forum. Is something going wrong in ExtractIlluminaBarcodes that makes it open 120000 files? This user has a limit of 100000 and has already tried increasing --MAX_RECORDS_IN_RAM.
This request was created from a contribution made by Robert Altwasser on April 19, 2022 10:09 UTC.
Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/5461192217627-Picard-Too-many-open-files-
--
I am demultiplexing an S4 sequencing run and Picard ExtractIlluminaBarcodes opens too many files, which crashes the run. It's dual-index data with UMIs, and I need unmapped BAM files with the UMI sequence. I checked the MD5 sums of the raw data several times and I also ran a check on the BaseCalls directory. I monitored the open files of the process with 'lsof' and the count quickly exceeds 120000 files, which is the maximum that I can set with 'ulimit -n'.
Here is the RunInfo:
a) Versions:
The Genome Analysis Toolkit (GATK) v4.2.5.0
HTSJDK Version: 2.24.1
Picard Version: 2.25.4
Java: openjdk version "1.8.0_312"
b) Exact command used:
(bash) $ ulimit -n 100000
picard -Xmx110g -Djava.io.tmpdir=/data/gpfs-1/users/altwassr_c/scratch/tmp/ -Xms110g \
ExtractIlluminaBarcodes \
-B /data/gpfs-1/users/altwassr_c/scratch/data/220325_A00643/Data/Intensities/BaseCalls/ \
-L 1 \
--NUM_PROCESSORS 1 \
-M metrices/barcode_metrices1.txt \
-BARCODE_FILE /data/gpfs-1/users/altwassr_c/work/projekte/barcode1.csv \
-RS 148T8B9M8B148T \
--MAX_RECORDS_IN_RAM 1000000000 \
--TMP_DIR /data/gpfs-1/users/altwassr_c/scratch/tmp/
c) Log:
ERROR 2022-04-19 04:41:06 ExtractIlluminaBarcodes Error processing tile 2140
picard.PicardException: File not found: (/data/gpfs-1/users/altwassr_c/scratch/data/220325_A00643_0438_BH22YTDSX2/Data/Intensities/BaseCalls/L002/C237.1/L002_1.cbcl)
at picard.illumina.parser.readers.BaseBclReader.open(BaseBclReader.java:93)
at picard.illumina.parser.readers.CbclReader.readHeader(CbclReader.java:127)
at picard.illumina.parser.readers.CbclReader.readTileData(CbclReader.java:200)
at picard.illumina.parser.readers.CbclReader.advance(CbclReader.java:275)
at picard.illumina.parser.readers.CbclReader.hasNext(CbclReader.java:252)
at picard.illumina.parser.NewIlluminaDataProvider.hasNext(NewIlluminaDataProvider.java:125)
at picard.illumina.ExtractIlluminaBarcodes$PerTileBarcodeExtractor.run(ExtractIlluminaBarcodes.java:363)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: /data/gpfs-1/users/altwassr_c/scratch/data/220325_A00643_0438_BH22YTDSX2/Data/Intensities/BaseCalls/L002/C237.1/L002_1.cbcl (Too many open files)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at picard.illumina.parser.readers.BaseBclReader.open(BaseBclReader.java:90)
... 11 more
INFO 2022-04-19 04:41:06 ExtractIlluminaBarcodes Extracting barcodes for tile 2141
ERROR 2022-04-19 04:41:06 ExtractIlluminaBarcodes Error processing tile 2141
picard.PicardException: Unrecognized data type(Cbcl) found by IlluminaDataProviderFactory!
at picard.illumina.parser.IlluminaDataProviderFactory.makeParser(IlluminaDataProviderFactory.java:400)
at picard.illumina.parser.IlluminaDataProviderFactory.makeDataProvider(IlluminaDataProviderFactory.java:249)
at picard.illumina.parser.IlluminaDataProviderFactory.makeDataProvider(IlluminaDataProviderFactory.java:228)
at picard.illumina.ExtractIlluminaBarcodes$PerTileBarcodeExtractor.run(ExtractIlluminaBarcodes.java:355)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
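For anyone trying to reproduce the descriptor growth described above, here is a minimal sketch for sampling how many file descriptors a process holds. It is not part of the original report. It assumes Linux with a mounted /proc filesystem, and the function names count_open_fds and show_nofile_limits are hypothetical helpers. Note that 'lsof' output can also include entries that are not descriptors (memory-mapped files, the working directory), so its line count may differ from the number that counts against 'ulimit -n'.

```shell
#!/usr/bin/env bash
# Sketch only: assumes Linux, where /proc/<pid>/fd holds one entry
# per descriptor currently open in that process.

# Print the number of open file descriptors for the given PID.
count_open_fds() {
    local pid="$1"
    local fd_dir="/proc/${pid}/fd"
    if [ ! -d "$fd_dir" ]; then
        echo "no such process: ${pid}" >&2
        return 1
    fi
    # One line per descriptor; wc -l counts them.
    ls -1 "$fd_dir" 2>/dev/null | wc -l
}

# Print the soft and hard open-file limits of the current shell.
# Processes started from this shell inherit these values.
show_nofile_limits() {
    echo "soft: $(ulimit -Sn)"
    echo "hard: $(ulimit -Hn)"
}

# Example: inspect the current shell itself.
count_open_fds "$$"
show_nofile_limits
```

To watch the Picard process instead, pass its PID, for example the first match of `pgrep -f ExtractIlluminaBarcodes`, and repeat the call every few seconds while the tool works through the tiles.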
(created from Zendesk ticket #281653)
gz#281653