Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 73 changed files with 2,180 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
--- | ||
hide: | ||
- navigation | ||
- toc | ||
--- | ||
|
||
# Contact | ||
|
||
The complete Imputation Server 2 source code is available on [GitHub](https://github.com/genepi/imputationserver2) and has been developed by [Lukas Forer](https://genepi.i-med.ac.at/team/forer-lukas/) and [Sebastian Schönherr](https://genepi.i-med.ac.at/team/schoenherr-sebastian/) from the Institute of Genetic Epidemiology, Medical University of Innsbruck. | ||
|
||
Feel free to create issues and pull requests. Before contacting us, please have a look at the [FAQ page](faq) first. | ||
|
||
## Michigan Imputation Server Team | ||
|
||
Michigan Imputation Server provides a free genotype imputation service using Minimac4. You can upload phased or unphased GWAS genotypes and receive phased and imputed genomes in return. For all uploaded data sets an extensive QC is performed. | ||
|
||
* [Christian Fuchsberger](mailto:[email protected]) | ||
* [Lukas Forer](mailto:[email protected]) | ||
* [Sebastian Schönherr](mailto:[email protected]) | ||
* [Sayantan Das](mailto:[email protected]) | ||
* [Gonçalo Abecasis](mailto:[email protected]) | ||
|
||
Please contact [Christian Fuchsberger](mailto:[email protected]) in case of other problems. | ||
|
||
|
||
## TOPMed Imputation Server Team | ||
|
||
Michigan Imputation Server provides a free genotype imputation service using Minimac4. You can upload phased or unphased GWAS genotypes and receive phased and imputed genomes in return. For all uploaded data sets an extensive QC is performed. | ||
|
||
* [Albert Smith](mailto:[email protected]) | ||
* [Andy Boughton](mailto:[email protected]) | ||
|
||
Please use [this address](mailto:[email protected]) for all inquiries. | ||
|
||
|
||
## Imputation engine: [Minimac4](http://genome.sph.umich.edu/wiki/Minimac4) | ||
|
||
Minimac4 is a lower-memory and more computationally efficient implementation of the genotype imputation algorithms in minimac/minimac2/minimac3. | ||
|
||
* [Sayantan Das](mailto:[email protected]) | ||
* [Christian Fuchsberger](mailto:[email protected]) | ||
* [Gonçalo Abecasis](mailto:[email protected]) | ||
|
||
## Cloud framework: [Cloudgene](https://www.cloudgene.io/) | ||
|
||
Cloudgene is a framework to build Software As A Service (SaaS) platforms for data analysis pipelines. By connecting command-line programs, scripts or Hadoop applications to Cloudgene, a powerful web application can be created within minutes. Cloudgene supports the complete workflow including data transfer, program execution and data export. Cloudgene is developed at the Division of Genetic Epidemiology Innsbruck in cooperation with the Center for Statistical Genetics, University of Michigan. | ||
|
||
* [Lukas Forer](mailto:[email protected]) | ||
* [Sebastian Schönherr](mailto:[email protected]) | ||
|
||
## Phasing engine: [Eagle2](https://data.broadinstitute.org/alkesgroup/Eagle/) | ||
|
||
For haplotype phasing Eagle2 is used. Eagle2 attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium; HRC). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# Data Security | ||
|
||
Since data is transferred to our server located in Michigan, a wide array of security measures is in force: | ||
|
||
- The complete interaction with the server is secured with HTTPS. | ||
- Input data is deleted from our servers as soon as it is no longer needed. | ||
- We only store the number of samples and markers analyzed; we never "look" at your data in any way. | ||
- All results are encrypted with a strong one-time password - thus, only you can read them. | ||
- After imputation is finished, the data uploader has 7 days to use an encrypted connection to get results back. | ||
- The complete source code is available in a [public GitHub repository](https://github.com/genepi/imputationserver/tree/qc-refactoring). | ||
|
||
## Who has access? | ||
|
||
To upload and download data, users must register with a unique e-mail address and strong password. Each user can only download imputation results for samples that they have themselves uploaded; no other imputation server users will be able to access your data. | ||
|
||
## Cookies | ||
|
||
We value your privacy and are committed to transparency regarding the use of cookies on our website. Below, we outline our cookie policy to provide you with clarity and assurance. | ||
|
||
### What are cookies? | ||
Cookies are small text files that are placed on your device when you visit a website. They serve various purposes, including enhancing user experience, facilitating website functionality, and analyzing website traffic. | ||
|
||
### How do we use cookies? | ||
We use cookies only for the purpose of facilitating login functionality. These cookies help us recognize your device and authenticate your access to our platform securely. We do not track any personal information or analyze user activities through cookies. | ||
|
||
### Why do we use cookies? | ||
Cookies are essential for providing seamless login experiences to our users. By storing authentication information, cookies enable you to access your account efficiently without the need for repetitive login procedures. We respect your privacy and limit cookie usage exclusively to login purposes. | ||
|
||
|
||
## What security or firewalls protect access? | ||
|
||
A wide array of security measures is in force on the imputation servers: | ||
|
||
- SSH login to the servers is restricted to only systems administrators. | ||
- Direct root login via SSH is not allowed from the public Internet. | ||
- The public-facing side of the servers sits behind the School of Public Health's Checkpoint virtual firewall instance where a default-deny policy is used on inbound traffic; only explicitly allowed TCP ports are passed. | ||
- The School of Public Health also makes use of NIDS technologies such as Snort and Peakflow on its network links for traffic analysis and threat detection. | ||
- On the imputation server itself, updates are run regularly by systems administrators who follow several zero-day computer security announcement lists; the OSSEC HIDS is used for log analysis and anomaly detection; and Denyhosts is used to thwart brute-force SSH login attacks. | ||
|
||
|
||
## What encryption of the data is used while the data are present? | ||
|
||
Imputation results are encrypted with a one-time password generated by the system. The password consists of lowercase letters, uppercase letters, special characters and numbers with max. 3 duplicates. |
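
The password rule above can be illustrated with a minimal Python sketch. It is not the server's actual generator: the `SPECIALS` set, the default length of 16, and the reading of "max. 3 duplicates" as "no character occurs more than three times" are all assumptions made for illustration.

```python
import secrets
import string
from collections import Counter

# Hypothetical set of special characters; the real server may use another set.
SPECIALS = "!#$%&*+-=?@^_"
CLASSES = (string.ascii_lowercase, string.ascii_uppercase, string.digits, SPECIALS)


def is_valid(password, max_repeats=3):
    """Check that every character class is present and no character repeats too often."""
    has_all_classes = all(any(c in cls for c in password) for cls in CLASSES)
    # Short-circuits for passwords that miss a class (including the empty string).
    return has_all_classes and max(Counter(password).values()) <= max_repeats


def generate_password(length=16, max_repeats=3):
    """Draw random candidates until one satisfies the rule (rejection sampling)."""
    alphabet = "".join(CLASSES)
    if length < len(CLASSES) or length > len(alphabet) * max_repeats:
        raise ValueError("length cannot satisfy the password rule")
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if is_valid(candidate, max_repeats):
            return candidate


print(generate_password())
```

Rejection sampling keeps the sketch short and uses the `secrets` module so that the candidates come from a cryptographically strong source.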
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# Frequently Asked Questions | ||
|
||
## I did not receive a password for my imputation job | ||
Michigan Imputation Server creates a random password for each imputation job. This password is never stored on the server. If you didn't receive a password, please check your spam folder. Please note that we are not able to resend the password. | ||
|
||
## Unzip command is not working | ||
Please check the following points: (1) When selecting AES256 encryption, please use 7z to unzip your files (Debian: `sudo apt-get install p7zip-full`). For our default encryption all common programs should work. (2) If your password includes special characters (e.g. \\), please put single or double quotes around the password when extracting the archive from the command line (e.g. `7z x -p"PASSWORD" chr_22.zip`). | ||
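
If you script the extraction, one way to avoid shell quoting problems altogether is to pass the arguments as a list, so that no shell ever interprets the password. The following Python sketch assumes `7z` is on your `PATH`; the helper name `build_7z_command` and the example password are hypothetical.

```python
import shlex


def build_7z_command(archive, password):
    """Return the argument list for extracting an encrypted archive with 7z."""
    # 7z expects the password attached directly to the -p switch, without a space.
    return ["7z", "x", f"-p{password}", archive]


cmd = build_7z_command("chr_22.zip", 'pa\\ss"word')

# shlex.join shows how the same command would have to be quoted in a POSIX shell.
print(shlex.join(cmd))

# To actually run the extraction (requires 7z to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Keep in mind that a password passed on the command line may be visible to other users of the same machine through the process list.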
|
||
## Extending expiration date or reset download counter | ||
Your data is available for 7 days. In case you need an extension, please let [us](/contact) know. | ||
|
||
## How can I improve the download speed? | ||
[aria2](https://aria2.github.io/) tries to utilize your maximum download bandwidth. Remember to raise the k parameter significantly (-k, --min-split-size=SIZE); otherwise you will hit the Michigan Imputation Server download limit for each file (thanks to Anthony Marcketta for pointing this out). | ||
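
As a rough sketch, an `aria2c` call with a raised `--min-split-size` could be assembled like this in Python. The URL, the connection count and the value `100M` are placeholders chosen for illustration, not recommendations from the server team; use the private link from the results tab instead.

```python
import shlex


def build_aria2_command(url, connections=8, min_split_size="100M"):
    """Return an aria2c argument list with a raised minimum split size."""
    return [
        "aria2c",
        f"--max-connection-per-server={connections}",
        f"--split={connections}",
        # A larger minimum split size means fewer parallel pieces per file.
        f"--min-split-size={min_split_size}",
        url,
    ]


# Hypothetical link; replace it with the private link of your result file.
cmd = build_aria2_command("https://example.org/share/results/chr_22.zip")
print(shlex.join(cmd))
```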
|
||
## Can I download all results at once? | ||
We provide wget commands for all results. Please open the results tab. The last column in each row includes direct links to all files. | ||
|
||
## Can I set up Michigan Imputation Server locally? | ||
We provide a single-node Docker image that can be used to impute from HapMap 2 and 1000G Phase3 locally. Click [here](/docker) to give it a try. For usage in production, we highly recommend setting up a Hadoop cluster. | ||
|
||
## Your web service looks great. Can I set up my own web service as well? | ||
All web service functionality is provided by [Cloudgene](http://www.cloudgene.io/). Please contact us if you want to set up your own service. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,173 @@ | ||
# Getting started | ||
|
||
To use Michigan Imputation Server, a [registration](https://imputationserver.sph.umich.edu/index.html#!pages/register) is required. | ||
We send an activation mail to the provided address. Please follow the instructions in the email to activate your account. If it doesn't arrive, ensure you have entered the correct email address and check your spam folder. | ||
|
||
**After the email address has been verified, the service can be used without any costs.** | ||
|
||
Please cite this paper if you use Michigan Imputation Server in your GWAS study: | ||
|
||
> Das S, Forer L, Schönherr S, Sidore C, Locke AE, Kwong A, Vrieze S, Chew EY, Levy S, McGue M, Schlessinger D, Stambolian D, Loh PR, Iacono WG, Swaroop A, Scott LJ, Cucca F, Kronenberg F, Boehnke M, Abecasis GR, Fuchsberger C. [Next-generation genotype imputation service and methods](https://www.ncbi.nlm.nih.gov/pubmed/27571263). Nature Genetics 48, 1284–1287 (2016). | ||
|
||
## Set up your first imputation job | ||
|
||
Please [login](https://imputationserver.sph.umich.edu/index.html#!pages/login) with your credentials and click on the **Run** tab to start a new imputation job. The submission dialog allows you to specify the properties of your imputation job. | ||
|
||
![](images/submit-job01.png) | ||
|
||
The following options are available: | ||
|
||
### Reference Panel | ||
|
||
Our server offers genotype imputation from different reference panels. The most accurate and largest panel is **HRC (Version r1.1 2016)**. Please select one that fulfills your needs and supports the population of your input data: | ||
|
||
- HRC (Version r1.1 2016) | ||
- HLA Imputation Panel: two-field (four-digit) and G-group resolution | ||
- HRC (Version r1 2015) | ||
- 1000 Genomes Phase 3 (Version 5) | ||
- 1000 Genomes Phase 1 (Version 3) | ||
- CAAPA - African American Panel | ||
- HapMap 2 | ||
|
||
More details about all available reference panels can be found [here](https://imputationserver.readthedocs.io/en/latest/reference-panels/). | ||
|
||
### Upload VCF files from your computer | ||
|
||
When using the file upload, data is uploaded from your local file system to Michigan Imputation Server. By clicking on **Select Files** an open dialog appears where you can select your VCF files: | ||
|
||
![](images/upload-data01.png) | ||
|
||
Multiple files can be selected using the `ctrl`, `cmd` or `shift` keys, depending on your operating system. | ||
After you have confirmed your choice, all selected files are listed in the submission dialog: | ||
|
||
![](images/upload-data02.png) | ||
|
||
Please make sure that all files fulfill the [requirements](/prepare-your-data). | ||
|
||
|
||
!!! important | ||
Since version 1.7.2 URL-based uploads (sftp and http) are no longer supported. Please use direct file uploads instead. | ||
|
||
### Build | ||
Please select the build of your data. Currently the options **hg19** and **hg38** are supported. Michigan Imputation Server automatically updates the genome positions (liftOver) of your data. All reference panels except TOPMed are based on hg19 coordinates. | ||
|
||
### rsq Filter | ||
To minimize the file size, Michigan Imputation Server includes an r<sup>2</sup> filter option, excluding all imputed SNPs with an r<sup>2</sup>-value (= imputation quality) smaller than the specified value. | ||
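
The effect of the filter can be sketched in Python as follows. The sketch assumes a Minimac-style VCF in which the imputation quality is stored as an `R2` entry in the INFO column; the function names are hypothetical, and keeping records without an `R2` entry is an assumption, not documented server behavior.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'AF=0.1;R2=0.95;IMPUTED' into a dict."""
    fields = {}
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        fields[key] = value
    return fields


def passes_rsq(vcf_line, min_rsq):
    """Keep a record unless its R2 value is smaller than the threshold."""
    columns = vcf_line.rstrip("\n").split("\t")
    info = parse_info(columns[7])  # INFO is the eighth VCF column
    if "R2" not in info:
        return True  # assumption: records without R2 are kept
    return float(info["R2"]) >= min_rsq


def filter_vcf(lines, min_rsq):
    """Yield header lines unchanged and only those records that pass the filter."""
    for line in lines:
        if line.startswith("#") or passes_rsq(line, min_rsq):
            yield line
```

For example, `list(filter_vcf(open("chr22.dose.vcf"), 0.3))` would drop all records with an r<sup>2</sup> below 0.3 from a hypothetical uncompressed file `chr22.dose.vcf`.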
|
||
### Phasing | ||
|
||
If your uploaded data is *unphased*, Eagle v2.4 will be used for phasing. In case your uploaded VCF file already contains phased genotypes, please select the "No phasing" option. | ||
|
||
| Algorithm | Description | | ||
| ---------- |-------------| | ||
| **Eagle v2.4** | The [Eagle](https://data.broadinstitute.org/alkesgroup/Eagle/) algorithm estimates haplotype phase using the HRC reference panel. This method is also suitable for single sample imputation. After phasing or imputation you will receive phased genotypes in your VCF files. | | ||
|
||
### Population | ||
|
||
Please select the population of your uploaded samples. This information is used to compare the allele frequencies between your data and the reference panel. Please note that not every reference panel supports all sub-populations. | ||
|
||
| Population | Supported Reference Panels | | ||
| ----------- | ---------------------------| | ||
| **AFR** | all | | ||
| **AMR** | all | | ||
| **EUR** | all | | ||
| **Mixed** | all | | ||
| **AA** | CAAPA | | ||
| **ASN** | 1000 Genomes Phase 1 (Version 3) | | ||
| **EAS** | 1000 Genomes Phase 3 (Version 5) | | ||
| **SAS** | 1000 Genomes Phase 3 (Version 5) | | ||
|
||
In case your population is not listed or your samples are from different populations, please select **Mixed** to skip the allele frequency check. For mixed populations, no QC-Report will be created. | ||
|
||
### Mode | ||
|
||
Please select if you want to run **Quality Control & Imputation**, **Quality Control & Phasing Only** or **Quality Control Only**. | ||
|
||
|
||
### AES 256 encryption | ||
|
||
All Imputation Server results are encrypted by default. Please tick this checkbox if you want to use AES 256 encryption instead of the default encryption method. Please note that AES encryption does not work with standard unzip programs. We recommend using 7z instead. | ||
|
||
|
||
## Start your imputation job | ||
|
||
After confirming our *Terms of Service*, the imputation process can be started immediately by clicking on **Start Imputation**. Input Validation and Quality Control are executed immediately to give you feedback about the data format and its quality. If your data passes these steps, your job is added to our imputation queue and will be processed as soon as possible. You can check the position in the queue on the job summary page. | ||
|
||
![](images/queue01.png) | ||
|
||
We notify you by email as soon as the job is finished or if your data does not pass the Quality Control steps. | ||
|
||
### Input Validation | ||
|
||
In a first step, we check whether your uploaded files are valid and calculate some basic statistics such as the number of samples, chromosomes and SNPs. | ||
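
To get a feeling for these statistics before uploading, they can be approximated locally. The following Python sketch reads an uncompressed VCF line by line; the function name `vcf_statistics` is hypothetical, and the sketch counts every record as one variant without performing any of the server's actual validity checks.

```python
def vcf_statistics(lines):
    """Count samples, chromosomes and variant records in an uncompressed VCF."""
    samples = 0
    chromosomes = set()
    variants = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        columns = line.split("\t")
        if line.startswith("#CHROM"):
            # The first nine columns are fixed; every further column is a sample.
            samples = max(len(columns) - 9, 0)
            continue
        chromosomes.add(columns[0])
        variants += 1
    return {
        "samples": samples,
        "chromosomes": sorted(chromosomes),
        "variants": variants,
    }
```

For a bgzip-compressed file, the lines could be read with `gzip.open(path, "rt")` from the standard library instead of `open(path)`.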
|
||
![](images/input-validation01.png) | ||
|
||
After Input Validation has finished, basic statistics can be viewed directly in the web interface. | ||
|
||
![](images/input-validation02.png) | ||
|
||
If you encounter problems with your data please read this tutorial about [Data Preparation](/prepare-your-data) to ensure your data is in the correct format. | ||
|
||
### Quality Control | ||
|
||
In this step we check each variant and exclude it if any of the following applies: | ||
|
||
1. it contains invalid alleles | ||
2. it is a duplicate | ||
3. it is an indel | ||
4. it is a monomorphic site | ||
5. its alleles do not match between the reference panel and the uploaded data | ||
6. its SNP call rate is < 90% | ||
|
||
All filtered variants are listed in a file called `statistics.txt` which can be downloaded by clicking on the provided link. More information about our QC pipeline can be found [here](/pipeline). | ||
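
The exclusion criteria listed above can be sketched as a single check per variant. This Python example is a simplified illustration, not the server's QC implementation: the `Variant` class, the order of the checks, the exact definition of each criterion, and the representation of the reference panel as a dictionary are all assumptions. In particular, it does not handle strand flips or swapped alleles.

```python
from dataclasses import dataclass

VALID_BASES = set("ACGT")


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    genotypes: list  # per sample: tuple of allele indices, or None if missing


def exclusion_reason(variant, seen_positions, panel_alleles, min_call_rate=0.90):
    """Return the first matching exclusion reason, or None to keep the variant."""
    key = (variant.chrom, variant.pos)
    alleles = variant.ref + variant.alt
    if not alleles or not set(alleles) <= VALID_BASES:
        return "invalid alleles"
    if key in seen_positions:
        return "duplicate"
    seen_positions.add(key)
    if len(variant.ref) != 1 or len(variant.alt) != 1:
        return "indel"
    called = [g for g in variant.genotypes if g is not None]
    # Monomorphic: all called genotypes carry the same allele (or nothing was called).
    if len({allele for g in called for allele in g}) <= 1:
        return "monomorphic"
    expected = panel_alleles.get(key)  # (ref, alt) in the panel, if the site exists
    if expected is not None and expected != (variant.ref, variant.alt):
        return "allele mismatch"
    if len(called) / len(variant.genotypes) < min_call_rate:
        return "low call rate"
    return None
```

A caller would create one empty `seen_positions` set per file, pass it to every call, and collect the returned reasons to build a summary similar to `statistics.txt`.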
|
||
![](images/quality-control02.png) | ||
|
||
If you selected a population, we compare the allele frequencies of the uploaded data with those from the reference panel. The result of this check is available in the QC report and can be downloaded by clicking on `qcreport.html`. | ||
|
||
### Pre-phasing and Imputation | ||
|
||
Imputation is achieved with Minimac4. The progress of all uploaded chromosomes is updated in real time and visualized with different colors. | ||
|
||
![](images/imputation01.png) | ||
|
||
### Data Compression and Encryption | ||
|
||
If imputation was successful, we compress and encrypt your data and send you a random password via mail. | ||
|
||
![](images/compression01.png) | ||
|
||
This password is not stored on our server at any time. Therefore, if you lost the password, there is no way to resend it to you. | ||
|
||
## Download results | ||
|
||
The user is notified by email, as soon as the imputation job has finished. A zip archive including the results can be downloaded directly from the server. To decrypt the results, a one-time password is generated by the server and included in the email. The QC report and filter statistics can be displayed and downloaded as well. | ||
|
||
![](images/job-results.png) | ||
|
||
!!! important "All data is deleted automatically after 7 days" | ||
Be sure to download all needed data in this time period. We send you a reminder 48 hours before we delete your data. Once your job has the state **retired**, we are not able to recover your data! | ||
|
||
|
||
### Download via a web browser | ||
|
||
All results can be downloaded directly via your browser by clicking on the filename. | ||
|
||
![](images/share-data02.png) | ||
|
||
In order to download results via the command line using `wget` or `aria2`, you need to click on the **share** symbol (located to the right of the file size) to get the needed private links. | ||
|
||
![](images/share-data01.png) | ||
|
||
A new dialog appears which provides you with the private link. Click on the tab **wget command** to get a copy & paste ready command that can be used on Linux or macOS to download the file in your terminal: | ||
|
||
|
||
### Download all results at once | ||
|
||
To download all files of a folder (for example folder **Imputation Results**) you can click on the **share** symbol of the folder: | ||
|
||
![](images/share-data02.png) | ||
|
||
A new dialog appears which provides you with all private links at once. Click on the tab **wget commands** to get copy & paste ready commands that can be used on Linux or macOS to download all files. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
--- | ||
hide: | ||
- navigation | ||
- toc | ||
--- | ||
|
||
# Michigan Imputation Server<br><small>Free Next-Generation Genotype Imputation Platform</small> | ||
|
||
|
||
[Michigan Imputation Server](https://imputationserver.sph.umich.edu) provides a free genotype imputation service using [Minimac4](http://genome.sph.umich.edu/wiki/Minimac4). You can upload phased or unphased GWAS genotypes and receive phased and imputed genomes in return. Our server offers imputation from 1000 Genomes (Phase 1 and 3), CAAPA, [HRC](http://www.haplotype-reference-consortium.org/) and the [TOPMed](http://nhlbiwgs.org/) reference panel. For all uploaded datasets an extensive QC is performed. The complete source code is hosted on [GitHub](https://github.com/genepi/imputationserver2/). | ||
|
||
Please cite this paper if you use Michigan Imputation Server in your publication: | ||
|
||
> Das S, Forer L, Schönherr S, Sidore C, Locke AE, Kwong A, Vrieze S, Chew EY, Levy S, McGue M, Schlessinger D, Stambolian D, Loh PR, Iacono WG, Swaroop A, Scott LJ, Cucca F, Kronenberg F, Boehnke M, Abecasis GR, Fuchsberger C. [Next-generation genotype imputation service and methods](https://www.ncbi.nlm.nih.gov/pubmed/27571263). Nature Genetics 48, 1284–1287 (2016). | ||
--- | ||
|
||
![](images/index.png) | ||
|