A comprehensive bioinformatics tool for Loss of Heterozygosity (LOH) analysis in HLA regions, designed for parallel processing of multiple genomic samples.
- Authors: Edwin Huang + LLMs
- Categories: HLA analysis
- Source Repository: https://github.com/mskcc/lohhla
- Contact: [email protected]
- BAM List File (Text file)
  - Contains the paths to the tumor BAM files
- HLA Type List File (Text file)
  - Contains the path to the HLA type file for each tumor BAM
- Optional: Normal BAM file
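
The two list files above can be checked before a run. The following is a minimal sketch, not the tool's own code: it assumes one path per line and that the BAM list and the HLA type list are paired by line order, which the source does not state explicitly. The function names are hypothetical.

```python
from pathlib import Path


def read_path_list(list_file):
    """Return the non-empty lines of a list file as stripped path strings."""
    with open(list_file) as handle:
        return [line.strip() for line in handle if line.strip()]


def pair_inputs(bam_list_file, hla_list_file):
    """Pair tumor BAMs with HLA type files by line order (assumed convention)."""
    bams = read_path_list(bam_list_file)
    hlas = read_path_list(hla_list_file)
    if len(bams) != len(hlas):
        raise ValueError(
            f"{len(bams)} BAM paths but {len(hlas)} HLA type paths"
        )
    return list(zip(bams, hlas))


def missing_files(pairs):
    """Return every listed path that does not exist on disk."""
    return [path for pair in pairs for path in pair if not Path(path).is_file()]
```

Calling `missing_files(pair_inputs("bams.txt", "hla_types.txt"))` (hypothetical file names) returns an empty list when every listed file is present.
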
- Coverage plots
- HLA allele mapping results
- LOH analysis reports
- Temporary intermediate files (if clean-up disabled)
Parameter | Description | Default Value | Type |
---|---|---|---|
Output Directory | Specifies location for analysis results | Current working directory | File path |
Normal BAM File | Optional normal sample BAM used as a reference (FALSE when none is provided) | FALSE | File path |
HLA FASTA Location | Reference HLA sequence database | ~/lohhla/data/hla_all.fasta | File path |
HLA Exon Location | HLA exon boundary information for plotting | ~/lohhla/data/hla.dat | File path |
Minimum Coverage | Minimum read coverage at mismatch sites | 30 | Numeric |
K-mer Size | Size of genomic fragments for read mapping | 50 | Numeric |
Mismatch Tolerance | Maximum allowed mismatches in read-to-allele mapping | 1 | Numeric |
Mapping Step | Perform mapping to HLA alleles | TRUE | Boolean |
Fishing Step | Identify reads matching specific k-mers | TRUE | Boolean |
Plotting Step | Generate visualization of analysis results | TRUE | Boolean |
Coverage Step | Analyze coverage differences across regions | TRUE | Boolean |
Clean Up | Remove temporary files after analysis | TRUE | Boolean |
Ignore Warnings | Continue execution despite non-critical warnings | TRUE | Boolean |
Parallel Cores | Number of CPU cores for parallel processing | 1 | Numeric |
Novoalign Directory | Directory containing the Novoalign executable | Empty | File path |
GATK Directory | Directory containing the Genome Analysis Toolkit (GATK) executable | Empty | File path |
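
The defaults in the table can be captured in a small configuration helper. This is a sketch under stated assumptions: the key names are hypothetical (the source does not list the wrapper's actual option names), and the range checks are reasonable guesses, not documented constraints.

```python
# Defaults copied from the parameter table; the key names are hypothetical.
DEFAULTS = {
    "output_directory": ".",  # current working directory
    "normal_bam_file": False,
    "hla_fasta_location": "~/lohhla/data/hla_all.fasta",
    "hla_exon_location": "~/lohhla/data/hla.dat",
    "minimum_coverage": 30,
    "kmer_size": 50,
    "mismatch_tolerance": 1,
    "mapping_step": True,
    "fishing_step": True,
    "plotting_step": True,
    "coverage_step": True,
    "clean_up": True,
    "ignore_warnings": True,
    "parallel_cores": 1,
    "novoalign_directory": "",
    "gatk_directory": "",
}


def is_int(value):
    """True for real integers; bool is excluded even though it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def build_config(**overrides):
    """Merge overrides into the defaults and reject obviously invalid values."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise KeyError(f"unknown parameters: {unknown}")
    config = {**DEFAULTS, **overrides}
    # Assumed constraint: these numeric settings must be positive integers.
    for key in ("minimum_coverage", "kmer_size", "parallel_cores"):
        if not is_int(config[key]) or config[key] < 1:
            raise ValueError(f"{key} must be a positive integer")
    # Assumed constraint: zero mismatches is allowed, negative values are not.
    if not is_int(config["mismatch_tolerance"]) or config["mismatch_tolerance"] < 0:
        raise ValueError("mismatch_tolerance must be a non-negative integer")
    # Step toggles and flags must stay Boolean; normal_bam_file may hold a path.
    for key, default in DEFAULTS.items():
        if isinstance(default, bool) and key != "normal_bam_file":
            if not isinstance(config[key], bool):
                raise ValueError(f"{key} must be True or False")
    return config
```

For example, `build_config(parallel_cores=4, clean_up=False)` keeps every other default and retains intermediate files for inspection.
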
- R environment
- Required R libraries: optparse, parallel
- External tools: Novoalign, GATK (optional)
- Enough CPU and memory for the complexity of the input samples and the chosen Parallel Cores value
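
Part of the requirements above can be verified up front. The sketch below is illustrative only: it assumes R is invoked through `Rscript` on the `PATH`, and it only checks that the optional tool directories exist. It does not verify that the `optparse` and `parallel` R libraries are installed. The function name is hypothetical.

```python
import shutil
from pathlib import Path


def check_environment(novoalign_dir="", gatk_dir=""):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    # Assumption: the R environment is reachable as `Rscript` on PATH.
    if shutil.which("Rscript") is None:
        problems.append("Rscript not found on PATH")
    # Empty strings mirror the table's "Empty" default and are skipped.
    for label, directory in (("Novoalign", novoalign_dir), ("GATK", gatk_dir)):
        if directory and not Path(directory).expanduser().is_dir():
            problems.append(f"{label} directory does not exist: {directory}")
    return problems
```
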
- Cancer genomics research
- HLA region variation analysis
- Immunogenomics studies
- Personalized medicine investigations
- Requires a matching HLA type file for every tumor BAM file
- Performance dependent on input data quality
- Computational intensity increases with sample complexity
- Prepare input BAM and HLA type files
- Configure analysis parameters
- Select appropriate computational resources
- Execute the wrapper script
- Review generated results
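
The steps above, applied to several samples at once, can be sketched as follows. This is an illustration of the parallel-per-sample pattern, not the wrapper's implementation: `run_sample` is a hypothetical placeholder for the per-sample analysis, and a thread pool is used here on the assumption that the real work would be delegated to an external process.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sample(pair):
    """Hypothetical placeholder for one sample's analysis; returns a status record."""
    bam, hla = pair
    return {"bam": bam, "hla": hla, "status": "done"}


def run_all(pairs, cores=1, worker=run_sample):
    """Process (BAM, HLA type file) pairs with up to `cores` concurrent workers."""
    with ThreadPoolExecutor(max_workers=cores) as pool:
        # map() yields results in input order, whichever worker finishes first.
        return list(pool.map(worker, pairs))
```

Because results come back in input order, each record can be matched to its sample when reviewing the generated outputs.
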