mmseqs2 core dump #4

Open
jmartinsjrbr opened this issue Jun 18, 2021 · 6 comments
@jmartinsjrbr

jmartinsjrbr commented Jun 18, 2021

Hi,

We have installed Agnostos-wf, and in our first attempt to analyze our metagenomic data we got the error below (in bold) after running the 'db_creation' workflow.

Command line used:
snakemake --use-conda -j 100 --cluster-config config/cluster.yaml --cluster "sbatch --export=ALL -t {cluster.time} -c {threads} --ntasks-per-node {cluster.ntasks_per_node} --nodes {cluster.nodes} --cpus-per-task {cluster.cpus_per_task} --job-name {rulename}.{jobid} --partition {cluster.partition}" -R --until workflow_report

#################################BEGIN slurm log file##############################################
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 5
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 mmseqs_clustering
1
Select jobs to execute...

[Fri Jun 18 15:05:26 2021]
rule mmseqs_clustering:
output: /home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation/mmseqs_clustering/cluDB.tsv
log: logs/mmseqs_clustering_stdout.log, logs/mmseqs_clustering_stderr.err
jobid: 0
benchmark: benchmarks/mmseqs_clustering/clu.tsv
threads: 5

+ set -e
+ export 'OMPI_MCA_btl=^openib'
+ OMPI_MCA_btl='^openib'
+ export OMP_NUM_THREADS=5
+ OMP_NUM_THREADS=5
+ export OMP_PROC_BIND=FALSE
+ OMP_PROC_BIND=FALSE
+ /home/bioinf/progs/agnostos/agnostos-wf/bin/mmseqs createdb /home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation/mmseqs_clustering/seqDB
/usr/bin/bash: line 8: 1032725 Illegal instruction (core dumped) /home/bioinf/progs/agnostos/agnostos-wf/bin/mmseqs createdb /home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation/mmseqs_clustering/seqDB 2> logs/mm>
[Fri Jun 18 15:05:27 2021]
Error in rule mmseqs_clustering:
    jobid: 0
    output: /home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation/mmseqs_clustering/cluDB.tsv
    log: logs/mmseqs_clustering_stdout.log, logs/mmseqs_clustering_stderr.err (check log file(s) for error message)
    shell:

      set -x
      set -e
    
      export OMPI_MCA_btl=^openib
      export OMP_NUM_THREADS=5
      export OMP_PROC_BIND=FALSE
    
      /home/bioinf/progs/agnostos/agnostos-wf/bin/mmseqs createdb  /home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation/mmseqs_clustering/seqDB 2>logs/mmseqs_clustering_stderr.err 1>logs/mmseqs_clustering_stdout.log
        /home/bioinf/progs/agnostos/agnostos-wf/bin/mmseqs cluster           /home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation/mmseqs_clustering/seqDB           /home/joaquim.junior/work/projects/bagasse/analysi>
      /home/bioinf/progs/agnostos/agnostos-wf/bin/mmseqs createtsv /home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation/mmseqs_clustering/seqDB /home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creati>
      (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
#################################END slurm log file##############################################

I was not able to figure out what might be happening.

Best regards,
Joaquim

@genomewalker
Contributor

Hi Joaquim,
It looks like the workflow is not picking up the ORF files in the folder:
/home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation/gene_prediction
Can you check that you have FASTA files in there?
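
One hint in the log above is that the createdb call lists only a single path (the output seqDB), with nothing where the input FASTA would go. As far as I know, mmseqs createdb expects at least one input FASTA followed by the output database name. Below is a minimal Python sketch, not part of the workflow, that counts the arguments in a logged command; the helper name createdb_args is made up for illustration.

```python
import shlex


def createdb_args(command):
    """Return the positional arguments that follow `createdb` in a shell command."""
    tokens = shlex.split(command)
    idx = tokens.index("createdb")
    args = []
    for tok in tokens[idx + 1:]:
        # Stop at shell redirections such as 2>file or 1>file.
        if tok.startswith(("1>", "2>", ">")):
            break
        args.append(tok)
    return args


# Command copied from the "shell:" section of the Snakemake error report.
cmd = (
    "/home/bioinf/progs/agnostos/agnostos-wf/bin/mmseqs createdb  "
    "/home/joaquim.junior/work/projects/bagasse/analysis/agnostos/"
    "db_creation/mmseqs_clustering/seqDB "
    "2>logs/mmseqs_clustering_stderr.err 1>logs/mmseqs_clustering_stdout.log"
)

args = createdb_args(cmd)
if len(args) < 2:
    # Assumption: fewer than two paths means the input FASTA list was empty.
    print("createdb got", len(args), "argument(s); the input FASTA seems to be missing")
```

If the count is below two, the variable that should hold the ORF FASTA path was probably empty when the rule ran.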

@jmartinsjrbr
Author

Hi genomewalker,
In the path you mentioned there is only this folder:
combine_samples/

I have used only the metagenome contigs as input files, as described in this section:
1. DB-creation module: Start from a set of genomic/metagenomic contigs in fasta format and retrieve a database of categorised gene clusters and cluster communities.

Do I have to include extra input files?


@genomewalker genomewalker added the enhancement New feature or request label Jun 18, 2021
@genomewalker
Contributor

I had a look at the code and it seems to expect the contigs file name to have the following format:

{smp}_contigs.fasta

where smp would be the name of your sample or any other string. If you rename your contigs file like this it might work.
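
To check a file name against that format before launching the workflow, something like the following Python sketch could help. The regular expression and the helper name sample_name are illustrative assumptions, not the workflow's actual code.

```python
import re
from pathlib import Path

# Assumed pattern: anything, then the literal suffix "_contigs.fasta".
CONTIGS_RE = re.compile(r"^(?P<smp>.+)_contigs\.fasta$")


def sample_name(path):
    """Return the {smp} part of '{smp}_contigs.fasta', or None if the name does not match."""
    match = CONTIGS_RE.match(Path(path).name)
    return match.group("smp") if match else None


for candidate in ["L5prokka_contigs.fasta", "inFile_contigs.fasta", "contigs.fa"]:
    print(candidate, "->", sample_name(candidate))
```

A None result means the file would need renaming before the workflow can derive a sample name from it.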

@ChiaraVanni can you generalise this so it doesn't depend on the contigs file name?

@jmartinsjrbr
Author

jmartinsjrbr commented Jun 19, 2021

My input file is named: "inFile_contigs.fasta"

In addition, please find below my config.yaml from the db_creation folder:

Maybe you can spot a mistake I could have made.

###########################################db_creation/config/config.yaml#############################

# This file should contain everything to configure the workflow on a global scale.
# In case of sample based data, it should be complemented by a samples.tsv file that contains
# one row per sample. It can be parsed easily via pandas.
wdir: "/home/bioinf/progs/agnostos/agnostos-wf/db_creation"
rdir: "/home/joaquim.junior/work/projects/bagasse/analysis/agnostos/db_creation"
data: "/home/joaquim.junior/work/projects/bagasse/analysis/agnostos/L5prokka_contigs.fasta" # rename your data to match the format "{sample_name}_contigs.fasta"
# choose a name for your dataset
data_name: "L5prokka"

# If you want to classify the singletons into the four categories, set the following entry to "true"
singl: "true"

conda_env: "/home/bioinf/progs/agnostos/agnostos-wf/envs/workflow.yml"
# Threads configuration
threads_default: 16
threads_collect: 16
threads_cat_ref: 4
# Databases
pfam_db: "/home/bioinf/progs/agnostos/agnostos-wf/databases/Pfam-A.hmm"
pfam_clan: "/home/bioinf/progs/agnostos/agnostos-wf/databases/Pfam-A.clans.tsv.gz"
antifam_db: "/home/bioinf/progs/agnostos/agnostos-wf/databases/AntiFam.hmm"
uniref90_db: "/home/bioinf/progs/agnostos/agnostos-wf/databases/uniref90.db"
nr_db: "/home/bioinf/progs/agnostos/agnostos-wf/databases/nr.db"
uniclust_db: "/home/bioinf/progs/agnostos/agnostos-wf/databases/uniclust30_2018_08/uniclust30_2018_08"
#uniprot_db: "/home/bioinf/progs/agnostos/agnostos-wf/databases/uniprotKB.fasta.gz"
pfam_hh_db: "/home/bioinf/progs/agnostos/agnostos-wf/databases/pfam"
DPD: "/home/bioinf/progs/agnostos/agnostos-wf/databases/dpd_uniprot_sprot.fasta.gz"
db_dir: "/home/bioinf/progs/agnostos/agnostos-wf/databases/"
taxdb: "/home/bioinf/progs/agnostos/agnostos-wf/databases/uniprotKB"
gtdb_tax: "/home/bioinf/progs/agnostos/agnostos-wf/databases/gtdb-r89_54k/gtdb-r89_54k.fmi"
# Files retrieved from the databases
# List of shared reduced Pfam domain names (downloadable from Figshare..)
pfam_shared_terms: "/home/bioinf/progs/agnostos/agnostos-wf/databases/Pfam-31_names_mod_01122019.tsv"
# Created using the protein accessions and the descriptions found on the fasta headers
uniref90_prot: "/home/bioinf/progs/agnostos/agnostos-wf/databases/uniref90.proteins.tsv.gz"
nr_prot: "/home/bioinf/progs/agnostos/agnostos-wf/databases/nr.proteins.tsv.gz"
# Information downloaded from Dataset-S1 from the DPD paper:
dpd_info: "/home/bioinf/progs/agnostos/agnostos-wf/databases/dpd_ids_all_info.tsv.gz"

# Local template folder
local_tmp: "/home/bioinf/tmp"

# MPI runner (de.NBI cloud, SLURM)
mpi_runner: "srun --mpi=pmi2"

#vmtouch for the DBs
vmtouch: "vmtouch"

# Gene prediction
prodigal_mode: "meta" #"meta" for metagenomes or "normal" for genomes
prodigal_bin: "prodigal"

# Annotation
hmmer_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/hmmsearch"

# Clustering config
ffindex_apply: "/home/bioinf/progs/agnostos/agnostos-wf/bin/ffindex_apply_mpi"
mmseqs_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/mmseqs"
mmseqs_tmp: "/home/bioinf/progs/agnostos/agnostos-wf/tmp"
mmseqs_local_tmp: "/home/bioinf/tmp"
mmseqs_split_mem: "100G"
mmseqs_split: 10

# Clustering results config
seqtk_bin: "seqtk"

# Spurious and shadows config
hmmpress_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/hmmpress"

# Compositional validation config
datamash_bin: "datamash"
famsa_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/famsa"
odseq_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/OD-seq"
leonbis_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/leon-bis.tcsh"
parasail_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/parasail_aligner"
parallel_bin: "parallel"
get_stats: "/home/bioinf/progs/agnostos/agnostos-wf/db_creation/scripts/get_stats.r"
isconn: "/home/bioinf/progs/agnostos/agnostos-wf/db_creation/scripts/is_connected"
filterg: "/home/bioinf/progs/agnostos/agnostos-wf/db_creation/scripts/filter_graph"
igraph_lib: "export LD_LIBRARY_PATH=/home/bioinf/progs/agnostos/agnostos-wf/bin/igraph/lib:${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
parasail_lib: "export LD_LIBRARY_PATH=/home/bioinf/progs/agnostos/agnostos-wf/lib:${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Cluster classification config
seqkit_bin: "seqkit"
filterbyname: "filterbyname.sh"
hhcons_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/hh-suite/bin/hhconsensus"

# Cluster category refinement
hhsuite: "/home/bioinf/progs/agnostos/agnostos-wf/bin/hh-suite"
hhblits_bin_mpi: "/home/bioinf/progs/agnostos/agnostos-wf/bin/hh-suite/bin/hhblits_mpi"
hhmake: "/home/bioinf/progs/agnostos/agnostos-wf/binhh-suite/bin/hhmake"
hhblits_prob: 90
hypo_filt: 1.0

# Taxonomy
kaiju_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/kaiju"

# Cluster communities
hhblits_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/hh-suite/bin/hhblits"
hhsearch_bin: "/home/bioinf/progs/agnostos/agnostos-wf/bin/hh-suite/bin/hhsearch"

########################################db_creation/config/config.yaml################################

Thanks,
Joaquim

@genomewalker
Contributor

Hi Joaquim

here:

data: "/home/joaquim.junior/work/projects/bagasse/analysis/agnostos/L5prokka_contigs.fasta" # rename your data to match the format "{sample_name}_contigs.fasta"

should be:

data: "/home/joaquim.junior/work/projects/bagasse/analysis/agnostos/" # rename your data to match the format "{sample_name}_contigs.fasta"

That is, the "data" entry should point to the folder rather than to a single file, and that folder should contain all the contig FASTA files you want to process.
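
As a quick sanity check on the "data" value, here is a rough Python sketch that falls back to the parent folder when a file path is given and lists the contig files it finds there. The function name resolve_data_dir and the "*_contigs.fasta" glob are assumptions for illustration; the workflow itself may discover files differently.

```python
import tempfile
from pathlib import Path


def resolve_data_dir(data_value):
    """Return the directory to use for `data` and the contig files found in it."""
    path = Path(data_value)
    # If a single FASTA file was given, use the folder that contains it.
    data_dir = path.parent if path.is_file() else path
    contigs = sorted(f.name for f in data_dir.glob("*_contigs.fasta"))
    return data_dir, contigs


# Demo on a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    contig_file = Path(tmp) / "L5prokka_contigs.fasta"
    contig_file.write_text(">contig_1\nACGT\n")
    data_dir, contigs = resolve_data_dir(str(contig_file))
    print("use this folder for data:", data_dir)
    print("contig files found:", contigs)
```

An empty list of contig files would suggest the folder is wrong or the files do not follow the expected naming.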

@genomewalker
Contributor

Hi @jmartinsjrbr

Did this work?
