diff --git a/README.md b/README.md
index 3c3b8de..bf6b363 100644
--- a/README.md
+++ b/README.md
@@ -4,12 +4,14 @@ BioInstaller package
 ==============
 
 ## Introduction
-[Conda](https://conda.io/docs/intro.html) and [Bioconda](http://bioconda.github.io/) have made it easy to install many packages and bio-softwares conveniently. Yet, learning how to install and compile bioinformatics softwares were still necessary. Because, the experience will help you to improve the ability of debugging.
+[Conda](https://conda.io/docs/intro.html) and [Bioconda](http://bioconda.github.io/) have made it easy for us to install many packages and bioinformatics software. Yet learning how to install and compile bioinformatics software is still necessary, because the experience improves your debugging skills.
 
 Especialy, when start a NGS analysis work in a new computer or system, you need costs so much time and energy to establish a complete set of softwares and dependce of a analysis pipeline and set the corresponding configuration file.
 
 [BioInstaller](https://github.com/JhuangLab/BioInstaller) can be used to download/install bioinformatics tools, dependences and databases in R relatively easily, and the information of installed softwares will be saved which can be used to generate configuration file. More detail can be founded in [Document](http://bioinfo.rjh.com.cn/labs/jhuang/tools/BioInstaller/) website.
 
+Moreover, BioInstaller provides another way to distribute software downloads and installations to others.
+
 ## Installation
 
 ### CRAN
diff --git a/vignettes/BioInstaller.Rmd b/vignettes/BioInstaller.Rmd
new file mode 100644
index 0000000..5e7290e
--- /dev/null
+++ b/vignettes/BioInstaller.Rmd
@@ -0,0 +1,81 @@
+---
+title: "Introduction to BioInstaller"
+author: "Jianfeng Li"
+date: "`r Sys.Date()`"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Introduction to BioInstaller}
+  %\VignetteEngine{knitr::rmarkdown}
+  \usepackage[utf8]{inputenc}
+---
+
+```{r, echo = FALSE}
+knitr::opts_chunk$set(comment = "#>", collapse = TRUE)
+```
+
+## Introduction
+[Conda](https://conda.io/docs/intro.html) and [BioContainers](https://biocontainers.pro) have made it easy to install many packages and bioinformatics software. Yet learning how to install and compile bioinformatics software is still necessary, because the experience improves your debugging skills.
+
+In particular, when you start NGS analysis work on a new computer or system, it costs a lot of time and energy to
+ establish the complete set of software and dependencies for an analysis pipeline and to write the corresponding configuration file.
+
+[BioInstaller](https://github.com/JhuangLab/BioInstaller) makes it relatively easy to download and install bioinformatics tools, dependencies, and databases from R. Information about the installed software is saved and can be used to generate configuration files.
+
+Moreover, BioInstaller provides another way to distribute software downloads and installations to others.
+
+**Features**:
+
+- Extensible
+- Crawls the source code and version information from the original site
+- One-step installation or download of software and databases (partial dependency support)
+
+## Core function in BioInstaller
+
+```{r}
+library(BioInstaller)
+
+# Show all available software and dependencies in the default inst/extdata/github.toml
+# and inst/extdata/nongithub.toml
+install.bioinfo(show.all.names = TRUE)
+
+# Fetch the available versions of a software package
+install.bioinfo('samtools', show.all.versions = TRUE)
+
+# Install 'demo' quietly
+download.dir <- sprintf('%s/demo_1', tempdir())
+install.bioinfo('demo', download.dir = download.dir, verbose = FALSE)
+
+# Install 'demo' with debug information
+download.dir <- sprintf('%s/demo_2', tempdir())
+install.bioinfo('demo', download.dir = download.dir, verbose = TRUE)
+
+# Download the demo source code only
+download.dir <- sprintf('%s/demo_3', tempdir())
+install.bioinfo('demo', download.dir = download.dir,
+  download.only = TRUE, verbose = TRUE)
+
+# Set download.dir and destdir (destdir is like /usr/local,
+# containing bin, lib, include and others);
+# destdir takes effect only if the install step uses {{destdir}}
+download.dir <- sprintf('%s/demo_source', tempdir())
+destdir <- sprintf('%s/demo', tempdir())
+install.bioinfo('demo', download.dir = download.dir, destdir = destdir)
+```
+
+## Storing meta information of databases and software
+
+When you install and download a large number of software packages and databases, finding them again becomes a problem. If the meta information is not saved at download or install time, you will be in really dire straits.
+
+In fact, the version, install path, source code path, and update time are saved when you install software with BioInstaller. Moreover, you can use several BioInstaller functions to modify the information in the `BIO_SOFWARES_DB_ACTIVE` database, a TOML format file.
+
+```{r}
+temp.db <- tempfile()                 # throwaway database file for this demo
+set.biosoftwares.db(temp.db)          # make it the active database
+is.biosoftwares.db.active(temp.db)    # check that it is the active one
+params <- list(name = 'demo', comments = 'This is a demo.')
+do.call(change.info, params)          # add or update the 'demo' record
+get.info('demo')                      # read the record back
+del.info('demo')                      # delete the record
+```
+
+
diff --git a/vignettes/write_configuration_file.Rmd b/vignettes/write_configuration_file.Rmd
new file mode 100644
index 0000000..d8621ee
--- /dev/null
+++ b/vignettes/write_configuration_file.Rmd
@@ -0,0 +1,93 @@
+---
+title: "Examples of Template Configuration File"
+author: "Jianfeng Li"
+date: "`r Sys.Date()`"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Examples of Template Configuration File}
+  %\VignetteEngine{knitr::rmarkdown}
+  \usepackage[utf8]{inputenc}
+---
+
+```{r, echo = FALSE}
+knitr::opts_chunk$set(comment = "#>", collapse = TRUE)
+```
+
+BioInstaller uses [configr](https://github.com/Miachol/configr) to parse all of its configuration files, so any item in a configuration file can be set with the special `code` syntax that the `configr` package can parse. Examples of such `code` can be found below.
+
+## github.toml and nongithub.toml
+
+The built-in configuration files `github.toml` and `nongithub.toml` let you download and install a number of software packages and dependencies with BioInstaller's default parameters. `install.bioinfo(show.all.names = TRUE)` lists all available software and dependencies in `github.toml` and `nongithub.toml`.
+
+### GitHub Software
+Version control for GitHub software is handled by the `git2r` package. The source URL is set with `github_url`.
+
+If `use_git2r` is set to `false`, BioInstaller uses the [git](https://en.wikipedia.org/wiki/Git) installed on your system.
+
+In addition, when `use_git2r` is set to `false` and `recursive_clone` is set to `true`, the behaviour is equivalent to `git clone --recursive https://path/repo`.
+
+```toml
+[bwa]
+github_url = "https://github.com/lh3/bwa"
+after_failure = "echo 'fail!'"
+after_success = "echo 'successful!'"
+make_dir = ["./"]
+bin_dir = ["./"]
+
+[bwa.before_install]
+linux = ""
+mac = ""
+
+[bwa.install]
+linux = "make"
+mac = "make"
+```
+
+### Non-GitHub Software or Databases
+
+Version control for non-GitHub software requires writing a function that parses the available versions from a URL, and using `{{version}}` as the placeholder in `source_url`.
+
+Set `url_all_download` to `true` if multiple files need to be downloaded. The [rvest](https://cran.r-project.org/package=rvest) and [RCurl](https://cran.r-project.org/package=RCurl) packages can be used to parse the version information of non-GitHub software or databases.
+Set `version_order_fixed` to `true` if you do not want to use the built-in version reordering function.
+
+If you set `url_all_download` to `false`, multiple mirrors can be used, so that one invalid URL does not prevent the download.
+
+```toml
+[gmap]
+# {{version}} is replaced by the `version` parameter of install.bioinfo
+# or by the newest version parsed from the fetched data.
+source_url = "http://research-pub.gene.com/gmap/src/{{version}}.tar.gz"
+after_failure = "echo 'fail!'"
+after_success = "echo 'successful!'"
+make_dir = ["./"]
+bin_dir = ["./"]
+
+[gmap.before_install]
+linux = ""
+mac = ""
+
+[gmap.install]
+linux = "./configure --prefix=`pwd` && make && make install"
+mac = ["sed -i s/\"## CFLAGS='-O3 -m64' .*\"/\"CFLAGS='-O3 -m64'\"/ config.site",
+"./configure --prefix=`pwd` && make && make install"]
+```
+
+## nongithub_databases_blast.toml
+
+This configuration file can be used to download NCBI BLAST databases. You can use it as follows: `install.bioinfo(nongithub.cfg = system.file('extdata', 'nongithub_databases_blast.toml', package = 'BioInstaller'), show.all.names = TRUE)`.
+
+BioInstaller uses the `glue` feature of [configr](https://github.com/Miachol/configr) to shorten the lists of file names, so that many file names can be stored in fewer words. More useful database FTP URLs will be made available in the future. I hope you will write your own configuration files rather than rely only on the built-in ones.
+
+```{r}
+library(configr)
+library(BioInstaller)
+blast.databases <- system.file('extdata',
+  'nongithub_databases_blast.toml', package = 'BioInstaller')
+
+read.config(blast.databases)$blast_nr$source_url                     # raw value, glue syntax not expanded
+read.config(blast.databases, glue.parse = TRUE)$blast_nr$source_url  # value after glue parsing
+mask.github <- tempfile()
+file.create(mask.github)  # empty file passed as github.cfg, so only the BLAST entries are listed
+install.bioinfo(nongithub.cfg = blast.databases, github.cfg = mask.github,
+  show.all.names = TRUE)
+```