- Whenever you see something in angle brackets, such as <your_name>, it is just a placeholder. Replace it with the actual value!
Create a local MS Word file and name it 024_ngs_phd_<surname>.docx. Replace <surname> with your actual surname.
Copy commands, useful information, links, etc. to this file. It should help you reproduce the steps from home after the class is over.
Connect to genepi-lehre.i-med.ac.at with your username/password using Windows PowerShell.
Try navigating in Bash: create a folder and move around the directory tree.
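As a minimal sketch of what such a navigation session could look like, the following creates and walks through a small folder tree. The folder names practice and notes are made up for illustration, and everything happens in a throwaway temporary directory.

```shell
# Remember where we started so we can return at the end.
start_dir=$(pwd)

# Throwaway location, so nothing in your home directory is touched.
base=$(mktemp -d)

# mkdir -p creates the folder and any missing parent folders in one step.
mkdir -p "$base/practice/notes"

cd "$base/practice"   # change into the new folder
pwd                   # print the current working directory
ls                    # list its contents (shows the notes folder)

cd notes              # relative path: go one level down
cd ..                 # .. always refers to the parent folder

# Name of the folder we are in now.
here=$(basename "$(pwd)")
echo "Currently in: $here"

cd "$start_dir"       # go back to where we started
```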
eGFR_SNPs.csv and HDL_SNPs.txt are located in the folder teaching/ngs/data/unix/snp_lookup.
How many lines are included in each file? (Tip: You can either navigate to the folder with cd or execute grep directly from your home directory.)
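A minimal sketch of counting lines, using a small made-up file rather than the course data. The file name demo_snps.csv and its contents are assumptions for illustration only.

```shell
# Create a throwaway working directory with a made-up demo file (not the course data).
workdir=$(mktemp -d)
printf 'SNP,chr,pos\nrs111,1,100\nrs222,2,200\n' > "$workdir/demo_snps.csv"

# wc -l counts newline characters; reading from stdin prints only the number.
# tr removes the padding spaces that some wc implementations add.
n_lines=$(wc -l < "$workdir/demo_snps.csv" | tr -d ' ')
echo "demo_snps.csv has $n_lines lines"

# grep -c '' also counts lines, because the empty pattern matches every line.
n_lines_grep=$(grep -c '' "$workdir/demo_snps.csv")
echo "grep counts $n_lines_grep lines"
```

The same commands work with a full path to a file, so changing into the data folder first is optional.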
Grep the SNP rs13326165 and the SNP rs17173637 from eGFR_SNPs.csv and write the result to a file.
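Searching for two patterns at once and saving the result could look roughly like this. The table and the rsIDs below are made up for the example; only the grep options and the redirection carry over to the real files.

```shell
# Made-up demo table (hypothetical rsIDs, not the course data).
workdir=$(mktemp -d)
printf 'rs100,1,0.5\nrs200,2,0.1\nrs300,3,0.9\n' > "$workdir/demo.csv"

# Each -e adds one pattern; a line is printed if it matches any of them.
# The > redirection writes the matching lines to a file instead of the terminal.
grep -e 'rs100' -e 'rs300' "$workdir/demo.csv" > "$workdir/hits.csv"

# Inspect the result and count the matching lines.
cat "$workdir/hits.csv"
n_hits=$(wc -l < "$workdir/hits.csv" | tr -d ' ')
echo "Lines written: $n_hits"
```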
Now grep the SNP rs133299 from eGFR_SNPs.csv. How many lines are displayed? How would you interpret the output? What happens if you use the following grep command, and what is the difference?: grep -w rs133299 teaching/ngs/data/unix/snp_lookup/eGFR_SNPs.csv
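The difference between plain grep and grep -w can be tried out on a tiny made-up table first. The rsIDs here are hypothetical and chosen so that one ID is a prefix of the others.

```shell
# Made-up table in which rs42 is a prefix of two longer IDs.
workdir=$(mktemp -d)
printf 'rs42,1,100\nrs421,1,200\nrs4210,2,300\n' > "$workdir/demo.csv"

# Plain grep matches the pattern anywhere in a line,
# so rs42 also hits the lines for rs421 and rs4210.
n_plain=$(grep -c 'rs42' "$workdir/demo.csv")

# -w only accepts matches that form a whole word, i.e. the match must not be
# directly preceded or followed by a letter, digit, or underscore.
n_word=$(grep -c -w 'rs42' "$workdir/demo.csv")

echo "without -w: $n_plain matching lines"
echo "with -w:    $n_word matching lines"
```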
Now, try to find the SNPs your boss asked you for. Use the grep command to output the matching lines from eGFR_SNPs.csv. Use HDL_SNPs.txt as the pattern file and also add the -w option.
Why do we need to add -w? (Files: eGFR_SNPs.csv and HDL_SNPs.txt.) How many SNPs did you find? Write them to a file and copy it to Windows.
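A pattern-file lookup can be sketched on toy data as follows. The file names table.csv, patterns.txt, and found.csv, as well as the rsIDs, are assumptions for illustration and not the course files.

```shell
# Made-up lookup table and pattern file (one pattern per line).
workdir=$(mktemp -d)
printf 'rs10,1,0.5\nrs101,1,0.2\nrs20,2,0.7\nrs30,3,0.1\n' > "$workdir/table.csv"
printf 'rs10\nrs30\n' > "$workdir/patterns.txt"

# -f reads the patterns from a file; -w restricts matches to whole words,
# so the pattern rs10 does not match the line for rs101.
grep -w -f "$workdir/patterns.txt" "$workdir/table.csv" > "$workdir/found.csv"

cat "$workdir/found.csv"
n_found=$(wc -l < "$workdir/found.csv" | tr -d ' ')
echo "SNPs found: $n_found"
```

Note that an empty line in a pattern file matches every input line, and Windows-style line endings in the pattern file can prevent matches, so it is worth checking the pattern file if the result looks surprising.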
Data: teaching/ngs/data/unix/snp_lookup
In the first exercise we align data with bwa mem:
- Create a folder mapping under teaching/students/<q-number> and change to this folder.
- Copy the files 4153_S13_L001_R1_001.fastq.gz and 4153_S13_L001_R2_001.fastq.gz from here: ~/teaching/ngs/data/fastq/exercises/miseq using cp <path_to_file> . (The dot at the end of the command means that the data is copied to the current location.)
- The reference is available here: ~/teaching/ngs/data/ref/kiv2_6.fasta (no need to execute bwa index).
- Update the following command from the Getting Started Guide:
./bwa mem ref.fa read1.fq read2.fq | gzip -3 > aln-pe.sam.gz
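One way the command above could be adapted is sketched here. The input paths are taken from the exercise text; the output name aligned.sam.gz and the assumption that bwa is available on the PATH (rather than as ./bwa) are mine. The guard makes the snippet safe to run anywhere: it only aligns when the tool and all inputs are present.

```shell
# Paths from the exercise text; the output name is an assumption.
ref="$HOME/teaching/ngs/data/ref/kiv2_6.fasta"
r1="4153_S13_L001_R1_001.fastq.gz"
r2="4153_S13_L001_R2_001.fastq.gz"
out="aligned.sam.gz"

# Only run the alignment if bwa and all input files are actually present.
if command -v bwa >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$r1" ] && [ -f "$r2" ]; then
    # bwa mem writes SAM to stdout; gzip -3 compresses it on the fly.
    bwa mem "$ref" "$r1" "$r2" | gzip -3 > "$out"
    status="aligned"
else
    status="skipped"
    echo "bwa or input files not found; nothing was run"
fi
echo "Alignment step: $status"
```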
Now, we convert the file to the BAM format.
- Use samtools to convert and sort a SAM file to a BAM file. Ask Google or ChatGPT for help.
- Create an index with samtools index <bam_file>. This will create an index file. Why is an index needed?
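The conversion, sorting, and indexing steps could be combined roughly as follows. The file names aligned.sam.gz and aligned.sorted.bam are assumptions, and the snippet skips the work when samtools or the input file is missing.

```shell
# Assumed file names; adapt them to the output of your own alignment.
in_sam="aligned.sam.gz"
sorted_bam="aligned.sorted.bam"

if command -v samtools >/dev/null 2>&1 && [ -f "$in_sam" ]; then
    # Decompress the SAM file and pipe it into samtools sort.
    # "-" tells samtools to read from stdin; -O bam selects BAM as output format.
    gzip -dc "$in_sam" | samtools sort -O bam -o "$sorted_bam" -

    # The index is written next to the BAM file as aligned.sorted.bam.bai.
    samtools index "$sorted_bam"
    status="converted"
else
    status="skipped"
    echo "samtools or $in_sam not found; nothing was run"
fi
echo "Conversion step: $status"
```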
Run samtools depth <aligned-file-sorted.bam> on the file and interpret the output. Learn about the -a parameter and add it to your command. Write the output to a file.
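To get a feeling for the depth output, it can help to summarise it with awk. The sketch below works on a made-up table in the three-column, tab-separated layout that samtools depth prints (reference name, 1-based position, depth); the reference name ref1 and all values are invented.

```shell
# Made-up depth table: reference name, 1-based position, depth.
workdir=$(mktemp -d)
printf 'ref1\t1\t10\nref1\t2\t20\nref1\t3\t0\nref1\t4\t30\n' > "$workdir/depth.txt"

# Mean depth over all listed positions (column 3 holds the depth).
mean_depth=$(LC_ALL=C awk '{ sum += $3 } END { if (NR > 0) printf "%.1f", sum / NR }' "$workdir/depth.txt")

# Number of positions without any coverage.
# Such rows only appear in the output when -a is used.
n_zero=$(awk '$3 == 0 { n++ } END { print n + 0 }' "$workdir/depth.txt")

echo "mean depth: $mean_depth"
echo "positions with zero depth: $n_zero"
```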
Download the file to Windows with WinSCP.
Install "Tablet" (*.exe available in the Shared Drive) and load the BAM file via Open Assembly. You also need to specify the reference; you can find the KIV_2.fasta reference in the Shared Drive.
Check out freebayes and call your variants. The aligned file (aligned.bam) is required as input. Write the output to a file ending in .vcf (freebayes > out.vcf).
You can also use a different variant caller.
Similar to our "minimal variant calling experiment" you can also combine "bcftools mpileup" with "bcftools call".
bcftools mpileup -f <ref.fa> <input.bam> | bcftools call -m -v -Ov -o <out.vcf> -
BCFtools is a set of utilities for variant calling and for manipulating VCF and BCF files. Learn about the bcftools convert command and extract a region from the VCF file.
Again, go to the bcftools website and learn about three bcftools commands from the "list of commands".
Execute the pipeline with a test profile.
git clone https://github.com/genepi/ngs-class
cd ngs-class/nf-preprocess
export NXF_SINGULARITY_CACHEDIR=/mnt/genepi-lehre/teaching/ngs/singualarity/
nextflow run main.nf -profile singularity,test
Write a config file for your project (e.g. projectXY.config). The file looks similar to this one, but you need to adapt the paths to your data.
params {
input = "test-data/*.fastq"
output = "fastp-test"
}
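If you prefer to create the config file directly on the command line, a here-document is one option. This is only a sketch: the paths /path/to/my-project and my-project-results are placeholders I made up, and the file is written to a temporary directory instead of your project folder.

```shell
# Write the config into a throwaway directory for this demo.
workdir=$(mktemp -d)

# The quoted 'EOF' prevents the shell from expanding anything inside the block.
# Both paths below are hypothetical; replace them with your own data locations.
cat > "$workdir/projectXY.config" <<'EOF'
params {
    input  = "/path/to/my-project/*.fastq"
    output = "my-project-results"
}
EOF

# Quick sanity check: both parameters should be present in the file.
n_params=$(grep -c -E '^[[:space:]]*(input|output)[[:space:]]*=' "$workdir/projectXY.config")
echo "Parameters defined: $n_params"
```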
Execute it as follows:
export NXF_SINGULARITY_CACHEDIR=/mnt/genepi-lehre/teaching/ngs/singualarity/
nextflow run main.nf -c projectXY.config -profile singularity
Go to https://nf-co.re/ and find a pipeline which could be useful in your field. Try to understand how to execute it!