Use the central fDOG

Vinh Tran edited this page Jan 24, 2023 · 9 revisions

(for internal use only!)

Prerequisite

We provide a central installation of fDOG that you can use directly with your data. To use it, however, all of the dependencies must be installed on your computer. These dependencies are:

ncbi-blast+
hmmer
clustalw
mafft
muscle
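
Before running the central fDOG, it can help to confirm that these tools are reachable from your PATH. The following is a minimal sketch, not part of fDOG itself: check_tools is a hypothetical helper, and the executable names (blastp for ncbi-blast+, hmmsearch for hmmer) are assumptions that may differ on your system.

```shell
#!/bin/sh
# check_tools is a hypothetical helper, not an fDOG command.
# It prints "missing: NAME" for every executable not found on PATH
# and returns 1 if at least one is missing, 0 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names; adjust them to your installation
# (for example, clustalw is sometimes installed as clustalw2).
check_tools blastp hmmsearch clustalw mafft muscle \
    || echo "Install the missing tools before running fDOG."
```

If nothing is printed, all five executables were found.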

As with the standalone version, you still need a searchTaxa_dir(*) folder containing all of your taxa of interest, a coreTaxa_dir(*) folder containing the BLAST DBs of the taxa that should be included in the core sets, and an annotation_dir(*) folder containing the annotations of all taxa in searchTaxa_dir and coreTaxa_dir in case you want to use FAS with fDOG. These data must also follow the fDOG format. The easiest way to create them from your genome files is to use our functions /path/to/central/fdog/fdog.addTaxon or /path/to/central/fdog/fdog.addTaxa, following this instruction.

We also provide ready-to-use data for some taxa. Please use them where possible rather than spending time creating the same data again.

Usage

Using the central version of fDOG is not much different from using the standalone version; you just need to point these options to your own directories:

  • --searchpath /path/to/your/searchTaxa_dir
  • --corepath /path/to/your/coreTaxa_dir
  • --annopath /path/to/your/annotation_dir
  • --outpath /path/to/your/output_directory

Otherwise, fDOG will use the default directories, which are defined by the pathconfig.txt file in the installed fdog/bin/ folder.

For example:

/path/to/central/fDOG/fdog.run --seqFile /path/to/mySeq.fa --jobName test --refspec HUMAN@9606@3 --corepath ./coreTaxa_dir/ --searchpath ./searchTaxa_dir/ --annopath ./annotation_dir/ --outpath ./

where coreTaxa_dir, searchTaxa_dir, and annotation_dir are located in the current directory (./). The core ortholog set named test will be saved in the current directory under the core_orthologs/test/ folder. The final fDOG output will also be saved in the current directory.
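
Because fDOG falls back to its default directories when an option is omitted, it can be worth verifying that your own folders exist before starting a run. This is a small sketch under that assumption; check_dirs is a hypothetical helper, not an fDOG command, and it only tests that each path is a directory, not that its contents follow the fDOG format.

```shell
#!/bin/sh
# check_dirs is a hypothetical helper, not an fDOG command.
# It reports every argument that is not an existing directory on stderr
# and returns 1 if at least one is missing, 0 otherwise.
check_dirs() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir" >&2
            status=1
        fi
    done
    return "$status"
}

# Folder names taken from the example above; run this in the directory
# from which you intend to call fdog.run.
check_dirs ./coreTaxa_dir ./searchTaxa_dir ./annotation_dir \
    || echo "Fix the paths above before calling fdog.run." >&2
```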

Alternatively, you can pass the paths to fDOG in a path configuration file in YAML format. An example path_config.yml file:

outpath: '/home/vinh/fdog/out/'
corepath: '/home/vinh/fdog/data/coreTaxa_dir'
searchpath: '/home/vinh/fdog/data/searchTaxa_dir'
annopath: '/home/vinh/fdog/data/annotation_dir'

Then, call fdog.run (or fdogs.run) with the option --pathFile:

fdog.run --seqFile /path/to/mySeq.fa --jobName test --refspec HUMAN@9606@3 --pathFile /path/to/path_config.yml
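
If your three data folders share one parent directory, the path configuration file can be generated rather than typed by hand. The sketch below assumes that layout and the folder names used on this page; make_path_config is a hypothetical helper, and it does not escape single quotes, so it is only suitable for paths that do not contain them.

```shell
#!/bin/sh
# make_path_config is a hypothetical helper, not an fDOG command.
# Usage: make_path_config DATA_DIR OUT_DIR
# It prints the four keys shown in the example path_config.yml to stdout,
# assuming coreTaxa_dir, searchTaxa_dir and annotation_dir all sit
# directly inside DATA_DIR.
make_path_config() {
    data_dir=$1
    out_dir=$2
    printf "outpath: '%s'\n" "$out_dir"
    printf "corepath: '%s'\n" "$data_dir/coreTaxa_dir"
    printf "searchpath: '%s'\n" "$data_dir/searchTaxa_dir"
    printf "annopath: '%s'\n" "$data_dir/annotation_dir"
}

# Example call; the directories under $HOME are placeholders.
make_path_config "$HOME/fdog/data" "$HOME/fdog/out"
```

Redirect the output to a file (for example, by appending > path_config.yml to the call) and pass that file to fdog.run with --pathFile.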

To learn about all available options of fDOG, please use this command:

/path/to/central/fDOG/fdog.run -h