vignettes be added
Miachol committed Jun 23, 2017
1 parent db4e35b commit 52f2a97
Showing 3 changed files with 177 additions and 1 deletion.
4 changes: 3 additions & 1 deletion README.md
@@ -4,12 +4,14 @@ BioInstaller package
==============

## Introduction
[Conda](https://conda.io/docs/intro.html) and [Bioconda](http://bioconda.github.io/) have made it easy to install many packages and bioinformatics software. Yet learning how to install and compile bioinformatics software is still necessary, because the experience improves your debugging skills.

In particular, when you start NGS analysis work on a new computer or system, it costs a lot of time and energy to set up the complete set of software and dependencies of an analysis pipeline and to write the corresponding configuration file.

[BioInstaller](https://github.com/JhuangLab/BioInstaller) can be used to download and install bioinformatics tools, dependencies and databases from R relatively easily. Information about the installed software is saved and can be used to generate a configuration file. More details can be found on the [documentation](http://bioinfo.rjh.com.cn/labs/jhuang/tools/BioInstaller/) website.

Moreover, BioInstaller offers a different way to make software available for others to download and install.

## Installation

### CRAN
81 changes: 81 additions & 0 deletions vignettes/BioInstaller.Rmd
@@ -0,0 +1,81 @@
---
title: "Introduction to BioInstaller"
author: "Jianfeng Li"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to BioInstaller}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r, echo = FALSE}
knitr::opts_chunk$set(comment = "#>", collapse = TRUE)
```

## Introduction
[Conda](https://conda.io/docs/intro.html) and [BioContainer](https://biocontainers.pro) have made it easy to install many packages and bioinformatics software. Yet learning how to install and compile bioinformatics software is still necessary, because the experience improves your debugging skills.

In particular, when you start NGS analysis work on a new computer or system, it costs a lot of time and energy to set up the complete set of software and dependencies of an analysis pipeline and to write the corresponding configuration file.

[BioInstaller](https://github.com/JhuangLab/BioInstaller) can be used to download and install bioinformatics tools, dependencies and databases from R relatively easily. Information about the installed software is saved and can be used to generate a configuration file.

Moreover, BioInstaller offers a different way to make software available for others to download and install.

**Features**:

- Extensible
- Crawls the source code and version information from the original site
- One-step installation or download of software and databases (dependencies are partially supported)

## Core functions in BioInstaller

```{r}
library(BioInstaller)
# Show all available software/dependencies in the default inst/extdata/github.toml
# and inst/extdata/nongithub.toml
install.bioinfo(show.all.names = TRUE)
# Fetch the available versions of a software
install.bioinfo('samtools', show.all.versions = TRUE)
# Install 'demo' quietly
download.dir <- sprintf('%s/demo_1', tempdir())
install.bioinfo('demo', download.dir = download.dir, verbose = FALSE)
# Install 'demo' with debug information
download.dir <- sprintf('%s/demo_2', tempdir())
install.bioinfo('demo', download.dir = download.dir, verbose = TRUE)
# Download the 'demo' source code only
download.dir <- sprintf('%s/demo_3', tempdir())
install.bioinfo('demo', download.dir = download.dir,
  download.only = TRUE, verbose = TRUE)
# Set download.dir and destdir (destdir is like /usr/local,
# containing bin, lib, include and others);
# destdir takes effect only if the install step uses {{destdir}}
download.dir <- sprintf('%s/demo_source', tempdir())
destdir <- sprintf('%s/demo', tempdir())
install.bioinfo('demo', download.dir = download.dir, destdir = destdir)
```

## Storing meta information of databases and software

When you install and download a large number of software packages and databases, finding them again becomes a problem. If you do not save the meta information when you download or install them, you will be in really dire straits.

In fact, the version, path, source code path and update time are saved when you use BioInstaller to install software. Moreover, you can use several BioInstaller functions to modify the information in the `BIO_SOFWARES_DB_ACTIVE` database, a TOML-format file.

```{r}
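# Use a temporary file as the active software database
temp.db <- tempfile()
set.biosoftwares.db(temp.db)
is.biosoftwares.db.active(temp.db)
# Add or update the record of 'demo'
params <- list(name = 'demo', comments = 'This is a demo.')
do.call(change.info, params)
# Query the record, then delete it
get.info('demo')
del.info('demo')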
```


93 changes: 93 additions & 0 deletions vignettes/write_configuration_file.Rmd
@@ -0,0 +1,93 @@
---
title: "Examples of Templet Configuration File"
author: "Jianfeng Li"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Examples of Templet Configuration File}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r, echo = FALSE}
knitr::opts_chunk$set(comment = "#>", collapse = TRUE)
```

BioInstaller uses [configr](https://github.com/Miachol/configr) to parse all configuration files, so any item in a configuration file can contain `code` that the `configr` package is able to parse. Examples of such `code` can be found below.

## github.toml and nongithub.toml

The built-in configuration files `github.toml` and `nongithub.toml` let you download and install a range of software and dependencies with BioInstaller's default parameters. `install.bioinfo(show.all.names = TRUE)` lists all the software and dependencies available in `github.toml` and `nongithub.toml`.

### GitHub Software
Version control for GitHub software is handled by the `git2r` package. The source URL is set with `github_url`.

If `use_git2r` is set to `false`, BioInstaller uses the [git](https://en.wikipedia.org/wiki/Git) installed on your system.

In addition, when `use_git2r` is set to `false` and `recursive_clone` is set to `true`, the behaviour is equivalent to `git clone --recursive https://path/repo`.

```toml
[bwa]
github_url = "https://github.com/lh3/bwa"
after_failure = "echo 'fail!'"
after_success = "echo 'successful!'"
make_dir = ["./"]
bin_dir = ["./"]

[bwa.before_install]
linux = ""
mac = ""

[bwa.install]
linux = "make"
mac = "make"
```
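The `use_git2r` and `recursive_clone` options described above could be combined in an entry along the following lines. This is only a sketch: the `mytool` entry and its URL are hypothetical, and placing the two keys at the top level of the entry's table is an assumption rather than something this document confirms.

```toml
# Hypothetical entry; the placement of use_git2r and recursive_clone
# at the top level of the table is an assumption.
[mytool]
github_url = "https://github.com/example/mytool"
use_git2r = false        # use the system git instead of git2r
recursive_clone = true   # behave like `git clone --recursive`

[mytool.install]
linux = "make"
mac = "make"
```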

### Non-GitHub Software or Databases

Version control for non-GitHub software requires a function that parses a URL for the available versions; use `{{version}}` as the placeholder in `source_url`.

Set `url_all_download` to `true` if multiple files need to be downloaded. The [rvest](https://cran.r-project.org/package=rvest) and [RCurl](https://cran.r-project.org/package=RCurl) packages can be used to parse the version information of non-GitHub software or databases.
`version_order_fixed` can be set to `true` if you don't want to use the built-in version reordering function.

If you set `url_all_download` to `false`, you can list multiple mirrors so that a single invalid URL does not break the download.

```toml
[gmap]
# {{version}} is replaced by the `version` parameter of install.bioinfo,
# or by the newest version parsed from the fetched data.
source_url = "http://research-pub.gene.com/gmap/src/{{version}}.tar.gz"
after_failure = "echo 'fail!'"
after_success = "echo 'successful!'"
make_dir = ["./"]
bin_dir = ["./"]

[gmap.before_install]
linux = ""
mac = ""

[gmap.install]
linux = "./configure --prefix=`pwd` && make && make install"
mac = ["sed -i s/\"## CFLAGS='-O3 -m64' .*\"/\"CFLAGS='-O3 -m64'\"/ config.site",
"./configure --prefix=`pwd` && make && make install"]
```
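To make the `{{version}}` placeholder concrete, the base-R sketch below shows what the substitution amounts to for the `source_url` above. It only illustrates the idea and is not BioInstaller's internal implementation; the version string is a hypothetical value chosen for the example.

```r
# Template copied from the gmap entry above.
source.url <- "http://research-pub.gene.com/gmap/src/{{version}}.tar.gz"

# Hypothetical version string, used only for illustration.
version <- "gmap-gsnap-2017-05-08"

# fixed = TRUE treats "{{version}}" literally rather than as a regular expression.
resolved.url <- gsub("{{version}}", version, source.url, fixed = TRUE)
print(resolved.url)
```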

## nongithub_databases_blast.toml

This configuration file can be used to download NCBI BLAST databases. You can use it as follows: `install.bioinfo(nongithub.cfg = system.file('extdata', 'nongithub_databases_blast.toml', package = 'BioInstaller'), show.all.names = TRUE)`.

BioInstaller uses the `glue` feature of [configr](https://github.com/Miachol/configr) to shorten the lists of file names, so that more file names can be stored with less text. More useful database FTP URLs may be added in the future. I hope you will write your own configuration files rather than rely only on the ones built into BioInstaller.

```{r}
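library(configr)
library(BioInstaller)
# Path to the BLAST database configuration shipped with BioInstaller
blast.databases <- system.file('extdata',
  'nongithub_databases_blast.toml', package = 'BioInstaller')
# Raw source_url, then the same field read with glue.parse = TRUE
read.config(blast.databases)$blast_nr$source_url
read.config(blast.databases, glue.parse = TRUE)$blast_nr$source_url
# Pass an empty file as github.cfg to mask the built-in GitHub entries
mask.github <- tempfile()
file.create(mask.github)
install.bioinfo(nongithub.cfg = blast.databases, github.cfg = mask.github,
  show.all.names = TRUE)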
```
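A custom configuration file can be supplied in the same way as the BLAST one. The sketch below writes a minimal file and passes it to `install.bioinfo` through `nongithub.cfg`. The `mytool` entry and its URL are hypothetical, and the set of keys a real entry needs is an assumption modelled on the `gmap` example; treat it as a starting point rather than a verified configuration.

```r
# Write a hypothetical one-entry configuration to a temporary TOML file.
# 'mytool' and its URL are placeholders, not a real BioInstaller entry.
my.cfg <- tempfile(fileext = ".toml")
writeLines(c(
  '[mytool]',
  'source_url = "https://example.org/mytool/{{version}}.tar.gz"',
  'make_dir = ["./"]',
  'bin_dir = ["./"]',
  '',
  '[mytool.install]',
  'linux = "make"',
  'mac = "make"'
), my.cfg)

# An empty file masks the built-in GitHub configuration, as in the chunk above.
mask.github <- tempfile()
file.create(mask.github)

# Only call BioInstaller when it is installed, so the sketch stays runnable.
if (requireNamespace("BioInstaller", quietly = TRUE)) {
  BioInstaller::install.bioinfo(nongithub.cfg = my.cfg,
    github.cfg = mask.github, show.all.names = TRUE)
}
```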
