Error in matam_assembly on Docker when running example file #63

Open
rachelleLim opened this issue Jul 11, 2018 · 5 comments

@rachelleLim

Hi,
Thanks so much for this program, it seems really cool! :)
My working computer is a Mac, so I've been using Docker to run matam. I use an interactive session of matam as follows (docker run -it bonsaiteam/matam) and then run the following code:
matam_assembly.py -i examples/16sp_simulated_dataset/16sp.art_HS25_pe_100bp_50x.fq
I then get the following error:
INFO - === MATAM assembly ===
INFO - CMD: /matam/scripts/matam_assembly.py --cpu 1 --max_memory 10000 --best 10 --evalue 1.00e-05 --score_threshold 0.90 --coverage_threshold 0 --min_identity 1.00 --min_overlap_length 50 --min_read_node 1 --min_overlap_edge 1 --quorum 0.51 --read_correction auto --contig_coverage_threshold 20 --min_scaffold_length 500 --out_dir /matam/matam_assembly --ref_db /matam/db/SILVA_128_SSURef_NR95 --input_fastx /matam/examples/16sp_simulated_dataset/16sp.art_HS25_pe_100bp_50x.fq
INFO - === Input ===
INFO - Input file: /matam/examples/16sp_simulated_dataset/16sp.art_HS25_pe_100bp_50x.fq
INFO - Input file reads nb: 11650 reads
INFO - === Reads mapping against ref db ===
CRITICAL - The last command returns a non-zero return code: 1
Non-zero return code

Would it be possible to get a fix for this? We have no Linux computers in my lab, and this seems really useful!
Thank you!

Additional info:
Docker version:
Docker version 18.03.1-ce, build 9ee9f40
Matam Image:
bonsaiteam/matam latest 75143b82cd20 5 months ago 4.02GB

@loic-couderc
Member

Hi @rachelleLim,

Sorry for the poor error message.
In general, we advise running MATAM with the verbose option, -v.

The error you encountered arises because the SSU rRNA reference database is missing.
You have to get the reference database before running MATAM with the following commands:

DBDIR=/matam/db
# retrieve & index the database
index_default_ssu_rrna_db.py -d $DBDIR --max_memory 10000
# run MATAM on the default db
matam_assembly.py -d $DBDIR/SILVA_128_SSURef_NR95 -i examples/16sp_simulated_dataset/16sp.art_HS25_pe_100bp_50x.fq --cpu 4 --max_memory 10000 -v

Thank you for your interest.

@rachelleLim
Author

Dang, that makes sense haha!! Thank you so much for the quick response and clarification!! :D

@rachelleLim
Author

rachelleLim commented Jul 11, 2018

Hi Loic,
Sorry to respond again, but I ran the code as you suggested and index_default_ssu_rrna_db.py completed successfully:

2018-07-11 16:21:56,408 - INFO - -- Completed default SSU rRNA DB indexing --
2018-07-11 16:21:56,457 - DEBUG - Indexing completed in 5192.87 seconds
2018-07-11 16:21:56,459 - INFO - Indexing went well. Default SSU rRNA DB and its indexes can be found in: /matam/db/SILVA_128_SSURef_NR95*

However, I've now run into a second error when running the example dataset (matam_assembly.py -d $DBDIR/SILVA_128_SSURef_NR95 -i examples/16sp_simulated_dataset/16sp.art_HS25_pe_100bp_50x.fq --cpu 4 --max_memory 10000 -v):
ERROR: The index '/matam/db/SILVA_128_SSURef_NR95.complete.stats' does not exist.
Make sure you have constructed your index using the command `indexdb'. See `indexdb -h' for help.

The command indexdb doesn't seem to exist... any insights? Sorry for the hassle, I really appreciate your prompt responses!

@loic-couderc
Member

Hi @rachelleLim,

I'm currently unable to reproduce your error; the same commands run without error for me.
I suspect the indexing step failed, even though MATAM reports that it succeeded.
Could you paste the full output of the indexing command so we can see what happened?
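
One way to capture that output is sketched below. The helper name run_logged and the log file name index.log are hypothetical and not part of MATAM; the function only runs whatever command it is given, saves stdout and stderr to a file, and reports the command's own exit status so that a silent failure becomes visible.

```shell
# Sketch (hypothetical helper, not part of MATAM): run a command, keep its
# stdout and stderr in a log file, and report the command's exit status.
run_logged() {
    log=$1
    shift
    status=0
    # Send both output streams to the log; remember a non-zero exit code.
    "$@" > "$log" 2>&1 || status=$?
    # Show the captured output, then the exit status of the command itself.
    cat "$log"
    echo "exit status: $status"
    return "$status"
}

# Example call inside the container (indexing command as given earlier in
# this thread; index.log is an arbitrary file name):
# run_logged index.log index_default_ssu_rrna_db.py -d "$DBDIR" --max_memory 10000
```

The output is shown only after the command finishes, which keeps the helper portable across shells; the saved log file can then be pasted into the issue.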

At the end of this step, the following files must be present in your $DBDIR:

root@cbe763ba4bfd:/matam# ls -ot $DBDIR
total 8748548
-rw-r--r-- 1 root   24072998 Jul 12 11:06 SILVA_128_SSURef_NR95.complete.fasta.fai
-rw-r--r-- 1 root    1808652 Jul 12 10:02 SILVA_128_SSURef_NR95.clustered.stats
-rw-r--r-- 1 root  936894096 Jul 12 10:02 SILVA_128_SSURef_NR95.clustered.pos_0.dat
-rw-r--r-- 1 root  432201520 Jul 12 10:02 SILVA_128_SSURef_NR95.clustered.bursttrie_0.dat
-rw-r--r-- 1 root    1048576 Jul 12 10:02 SILVA_128_SSURef_NR95.clustered.kmer_0.dat
-rw-r--r-- 1 root   15120477 Jul 12 09:51 SILVA_128_SSURef_NR95.complete.stats
-rw-r--r-- 1 root 5295589644 Jul 12 09:50 SILVA_128_SSURef_NR95.complete.pos_0.dat
-rw-r--r-- 1 root  895089300 Jul 12 09:48 SILVA_128_SSURef_NR95.complete.bursttrie_0.dat
-rw-r--r-- 1 root    1048576 Jul 12 09:48 SILVA_128_SSURef_NR95.complete.kmer_0.dat
-rw-r--r-- 1 root  140567671 Jan 18  2017 SILVA_128_SSURef_NR95.tar.bz2
-rw-r--r-- 1 1000   79032798 Jan 18  2017 SILVA_128_SSURef_NR95.complete.taxo.tab
-rw-r--r-- 1 1000 1019378193 Jan 18  2017 SILVA_128_SSURef_NR95.complete.fasta
-rw-r--r-- 1 1000  116607096 Jan 18  2017 SILVA_128_SSURef_NR95.clustered.fasta
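
A quick way to compare a database directory against the listing above is sketched here. The function name check_matam_db is hypothetical, and the suffix list is copied from the listing. It deliberately leaves out the .tar.bz2 archive and the .fasta.fai file, on the assumption (not confirmed in this thread) that neither is written by the indexing step itself; the .fai timestamp in the listing is about an hour later than the index files.

```shell
# Sketch (hypothetical helper, not part of MATAM): report which of the
# expected database files are missing or empty for a given prefix.
check_matam_db() {
    prefix=$1
    missing=0
    for suffix in \
        complete.fasta complete.taxo.tab complete.stats \
        complete.kmer_0.dat complete.bursttrie_0.dat complete.pos_0.dat \
        clustered.fasta clustered.stats \
        clustered.kmer_0.dat clustered.bursttrie_0.dat clustered.pos_0.dat
    do
        # -s is true only if the file exists and is not empty.
        if [ ! -s "$prefix.$suffix" ]; then
            echo "missing or empty: $prefix.$suffix"
            missing=$((missing + 1))
        fi
    done
    echo "$missing file(s) missing"
    # Succeed only when every expected file was found.
    [ "$missing" -eq 0 ]
}

# Example call inside the container:
# check_matam_db "$DBDIR/SILVA_128_SSURef_NR95"
```

A non-zero return value together with a "missing or empty" line for SILVA_128_SSURef_NR95.complete.stats would match the error reported earlier in this thread.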

@triplem90manas

I always get this error whenever I run matam_assembly.py:
./matam_assembly.py -d DBDIR/SILVA_128_SSURef_NR95 -i /home/nada/matam-master/nohost1.fastq --cpu 4 --max_memory 10000 -v --perform_taxonomic_assignment
"No valid binary found for componentsearch"
Please help.
