Showing 47 changed files with 3,227 additions and 522 deletions.
@@ -6,3 +6,4 @@ doc/_build
.floo
.flooignore
out
test_target
@@ -4,6 +4,40 @@

MiXCR is a universal software for fast and accurate analysis of raw T- or B- cell receptor repertoire sequencing data.

- Easy to use. Default pipeline can be executed without any additional parameters (see *Usage* section)

- TCR and IG repertoires

- Following species are supported *out-of-the-box* using built-in library:
    - human
    - mouse
    - rat (only TRB and TRA)
    - *... several new species will be available soon*

- Efficiently extract repertoires from most of (if not *all*) types of TCR/IG-containing raw sequencing data:
    - data from all specialized RepSeq sample preparation protocols
    - RNA-Seq
    - WGS
    - single-cell data
    - *etc..*

- Has optional CDR3 reconstruction step, that allows to *recover full hypervariable region from several disjoint reads*. Uses sophisticated algorithms protecting from false-positive assemblies at the same time having best in class efficiency.

- Assemble clonotypes, applying several *error-correction* algorithms to eliminate artificial diversity arising from PCR and sequencing errors

- Clonotypes can be assembled based on CDR3 sequence (default) as well as any other region, including *full-length* variable sequence (from beginning of FR1 to the end of FR4)

- Provides exhaustive output information for clonotypes and per-read alignments:
    - nucleotide and amino acid sequences of all immunologically relevant regions (FR1, CDR1, ..., CDR3, etc..)
    - identified V, D, J, C genes
    - nucleotide and amino acid mutations in germline regions
    - variable region topology (number of end V / D / J nucleotide deletions, length of P-segments, number of non-template N nucleotides)
    - sequencing quality scores for any extracted sequence
    - several other useful pieces of information

- Completely transparent pipeline, possible to track individual read fate from raw fastq entry to clonotype. Several useful tools available to evaluate pipeline performance: human readable alignments visualization, diff tool for alignment and clonotype files, etc...


## Installation / Download

#### Using Homebrew on Mac OS X or Linux (linuxbrew)

@@ -17,7 +51,7 @@ to upgrade already installed MiXCR to the newest version:

#### Manual install (any OS)

* download latest MiXCR version from [release page](https://github.com/milaboratory/mixcr/releases/latest)
* download latest stable MiXCR build from [release page](https://github.com/milaboratory/mixcr/releases/latest)
* unzip the archive
* add resulting folder to your ``PATH`` variable
* or add symbolic link for ``mixcr`` script to your ``bin`` folder

@@ -30,20 +64,35 @@ to upgrade already installed MiXCR to the newest version:

## Usage

Here is a very simple example of analysis of raw human RepSeq data:
#### Enriched RepSeq Data

Here is a very simple usage example that will extract repertoire data (in the form of clonotypes list) from raw sequencing data of enriched RepSeq library:

    mixcr align -r log.txt input_R1.fastq.gz input_R2.fastq.gz alignments.vdjca
    mixcr assemble -r log.txt alignments.vdjca clones.clns
    mixcr exportClones clones.clns clones.txt

this sequence of commands will produce a tab-delimited list of clones (`clones.txt`) assembled by their CDR3 sequences with extensive information on their abundancies, V, D and J genes etc.
this will produce a tab-delimited list of clones (`clones.txt`) assembled by their CDR3 sequences with extensive information on their abundances, V, D and J genes, mutations in germline regions, topology of VDJ junction etc.

#### Repertoire extraction from RNA-Seq

For more details see documentation.
MiXCR is equally effective in extraction of repertoire information from non-enriched data, like RNA-Seq or WGS. This example illustrates usage for RNA-Seq:

    mixcr align -p rna-seq -r log.txt input_R1.fastq.gz input_R2.fastq.gz alignments.vdjca
    mixcr assemblePartial alignments.vdjca alignment_contigs.vdjca
    mixcr assemble -r log.txt alignment_contigs.vdjca clones.clns
    mixcr exportClones clones.clns clones.txt

#### Further reading

MiXCR pipeline is very flexible, and can be applied to raw data from broad spectrum of experimental setups. For detailed description of MiXCR features and options please see documentation.

## Documentation

Detailed documentation can be found at https://mixcr.readthedocs.io/

If you haven't found the answer to your question in the docs, or have any suggestions concerning new features, feel free to create an issue here, on GitHub, or write an email to [email protected] .

## Build

Dependancy:

@@ -63,7 +112,6 @@ To build MiXCR from source:
```
./build.sh
```


## License

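The two usage examples in the README diff above differ only in the `-p rna-seq` preset and the extra `assemblePartial` step. As a rough illustration, the sketch below wraps both variants in one bash function that stops at the first failing step. The function name `run_mixcr_pipeline`, its argument order, the `enriched` / `rna-seq` mode labels and the output file naming are assumptions made for this example; only the `mixcr` commands and options that appear in the README are used.

```shell
#!/bin/bash
# Hypothetical wrapper around the README pipelines (not part of the commit).
# Usage: run_mixcr_pipeline <enriched|rna-seq> <R1.fastq.gz> <R2.fastq.gz> <output prefix>
run_mixcr_pipeline() {
    local mode=$1 r1=$2 r2=$3 prefix=$4

    case "$mode" in
        enriched)
            # Enriched RepSeq data: align, then assemble clonotypes.
            mixcr align -r "$prefix.log.txt" "$r1" "$r2" "$prefix.vdjca" || return 1
            mixcr assemble -r "$prefix.log.txt" "$prefix.vdjca" "$prefix.clns" || return 1
            ;;
        rna-seq)
            # RNA-Seq: same steps, plus the partial assembly step from the README.
            mixcr align -p rna-seq -r "$prefix.log.txt" "$r1" "$r2" "$prefix.vdjca" || return 1
            mixcr assemblePartial "$prefix.vdjca" "$prefix.contigs.vdjca" || return 1
            mixcr assemble -r "$prefix.log.txt" "$prefix.contigs.vdjca" "$prefix.clns" || return 1
            ;;
        *)
            echo "Unknown mode: $mode" >&2
            return 2
            ;;
    esac

    # Export the tab-delimited clone list in both modes.
    mixcr exportClones "$prefix.clns" "$prefix.txt"
}
```

With this sketch, `run_mixcr_pipeline rna-seq input_R1.fastq.gz input_R2.fastq.gz sample` would run the four RNA-Seq commands in order and leave the clone table in `sample.txt`, assuming `mixcr` is on the `PATH`.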
@@ -0,0 +1,76 @@
#!/bin/bash

# "Integration" tests for MiXCR
# Test standard analysis pipeline results

# Linux readlink -f alternative for Mac OS X
function readlinkUniversal() {
    targetFile=$1

    cd `dirname $targetFile`
    targetFile=`basename $targetFile`

    # iterate down a (possible) chain of symlinks
    while [ -L "$targetFile" ]
    do
        targetFile=`readlink $targetFile`
        cd `dirname $targetFile`
        targetFile=`basename $targetFile`
    done

    # compute the canonicalized name by finding the physical path
    # for the directory we're in and appending the target file.
    phys_dir=`pwd -P`
    result=$phys_dir/$targetFile
    echo $result
}

os=`uname`
delta=100

dir=""

case $os in
    Darwin)
        dir=$(dirname "$(readlinkUniversal "$0")")
    ;;
    Linux)
        dir="$(dirname "$(readlink -f "$0")")"
    ;;
    FreeBSD)
        dir=$(dirname "$(readlinkUniversal "$0")")
    ;;
    *)
        echo "Unknown OS."
        exit 1
    ;;
esac

rm -rf ${dir}/test_target
mkdir ${dir}/test_target

cp ${dir}/src/test/resources/sequences/*.fastq ${dir}/test_target/

cd ${dir}/test_target/

PATH=${dir}:${PATH}

which mixcr

mixcr -v

function go_assemble {
    mixcr assemble -r $1.clns.report $1.vdjca $1.clns || exit 1
    for c in TCR IG TRB TRA TRG TRD IGH IGL IGK ALL
    do
        mixcr exportClones -c ${c} -s $1.clns $1.clns.${c}.txt || exit 1
    done
}

for s in sample_IGH test;
do
    mixcr align -r ${s}_paired.vdjca.report ${s}_R1.fastq ${s}_R2.fastq ${s}_paired.vdjca || exit 1
    go_assemble ${s}_paired
    mixcr align -r ${s}_single.vdjca.report ${s}_R1.fastq ${s}_single.vdjca || exit 1
    go_assemble ${s}_single
done
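
The `readlinkUniversal` function in the script above leaves its expansions unquoted and changes the working directory with `cd`. Inside the script this is contained, because the function is only called within `$(...)`, but the unquoted expansions would likely misbehave on checkout paths that contain spaces. The following is a minimal sketch of the same symlink-following idea with quoted expansions and a subshell body, so the caller's directory is never affected. The name `resolve_path` is hypothetical and the function is not part of the commit.

```shell
#!/bin/bash
# Sketch of a symlink resolver in the spirit of readlinkUniversal.
# The parentheses make the body run in a subshell, so the cd calls
# below never change the caller's working directory.
resolve_path() (
    target=$1
    cd "$(dirname "$target")" || exit 1
    target=$(basename "$target")

    # Follow a (possible) chain of symlinks.
    while [ -L "$target" ]; do
        target=$(readlink "$target")
        cd "$(dirname "$target")" || exit 1
        target=$(basename "$target")
    done

    # Physical path of the current directory plus the final file name.
    printf '%s/%s\n' "$(pwd -P)" "$target"
)
```

Under these assumptions, the `Darwin` and `FreeBSD` branches could call `dir=$(dirname "$(resolve_path "$0")")` in the same way they call `readlinkUniversal` now.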
Submodule repseqio updated from 199a91 to 958e01