Merge pull request #128 from torres-alexis/DEV_GeneLab_Reference_Annotations_vGL-DPPD-7110-A

[GL_RefAnnotTable] Execution fixes
bnovak32 authored Nov 5, 2024
2 parents e3dfb4b + dcdf589 commit ef2958f
Showing 5 changed files with 98 additions and 68 deletions.
@@ -187,6 +187,8 @@ lib_path <- file.path(getwd())

# Define variables associated with current pipeline and annotation table versions
GL_DPPD_ID <- "GL-DPPD-7110-A"
+workflow_version <- ""

ref_tab_path <- "https://raw.githubusercontent.com/nasa/GeneLab_Data_Processing/master/GeneLab_Reference_Annotations/Pipeline_GL-DPPD-7110_Versions/GL-DPPD-7110-A/GL-DPPD-7110-A_annotations.csv"
readme_path <- "https://github.com/nasa/GeneLab_Data_Processing/tree/master/GeneLab_Reference_Annotations/Workflow_Documentation/GL_RefAnnotTable-A/README.md"

@@ -212,6 +214,7 @@ library(rtracklayer)
**Output Data:**

- `GL_DPPD_ID` (variable specifying the GeneLab Data Processing Pipeline Document ID)
+- `workflow_version` (variable specifying the [current version of the workflow](https://github.com/nasa/GeneLab_Data_Processing/tree/DEV_GeneLab_Reference_Annotations_vGL-DPPD-7110-A/GeneLab_Reference_Annotations/Workflow_Documentation/GL_RefAnnotTable-A))
- `ref_tab_path` (variable specifying the path to the reference table CSV file)
- `readme_path` (variable specifying the path to the README file)
- `currently_accepted_orgs` (variable specifying the list of currently supported organisms)
@@ -240,7 +243,7 @@ target_info <- ref_table %>%

# Extract the relevant columns from the reference table
target_taxid <- target_info$taxon # Taxonomic identifier
-target_org_db <- target_info$annotations # org.eg.db R package
+target_org_db <- target_info$bioconductor_annotations # org.eg.db R package
gtf_link <- target_info$gtf # Path to reference assembly GTF
target_short_name <- target_info$name # PANTHER / UNIPROT short name; blank if not available
ref_source <- target_info$ref_source # Reference files source
@@ -280,7 +283,7 @@ if ( file.exists(out_table_filename) ) {
**Output Data:**

- `target_taxid` (variable specifying the taxonomic identifier for the target organism)
-- `target_org_db` (variable specifying the name of the org.db R package for the target organism)
+- `target_org_db` (variable specifying the name of the org.eg.db R package for the target organism if it is hosted by Bioconductor)
- `gtf_link` (variable specifying the URL to the GTF file for the target organism)
- `target_short_name` (variable specifying the PANTHER/UNIPROT short name for the target organism)
- `ref_source` (variable specifying the source of the reference files, e.g., "ensembl", "ensembl_plants", "ensembl_bacteria", "ncbi")
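As a rough illustration of the column rename in this hunk, the sketch below pulls the listed variables out of a one-row `target_info` using base R. It is not the workflow's actual code: the toy table contents, the `species` lookup column, and the helper `get_target_info()` are all assumptions made for the example; only the column names `taxon`, `bioconductor_annotations`, `gtf`, `name`, and `ref_source` come from the diff.

```r
# Toy stand-in for the reference table. Every value is a placeholder, and the
# `species` lookup column is an assumption (the diff does not show the filter).
ref_table <- data.frame(
  species = c("Organism one", "Organism two"),
  taxon = c(1111, 2222),
  bioconductor_annotations = c("org.One.eg.db", NA),
  gtf = c("https://example.org/one.gtf.gz", "https://example.org/two.gtf.gz"),
  name = c("ONE", ""),
  ref_source = c("ensembl", "ncbi"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: select the row for one organism and return the
# variables documented in the output list above.
get_target_info <- function(ref_table, target_organism) {
  target_info <- ref_table[ref_table$species == target_organism, , drop = FALSE]
  if (nrow(target_info) != 1) {
    stop("Expected exactly one reference table row for: ", target_organism)
  }
  list(
    target_taxid = target_info$taxon,                     # Taxonomic identifier
    target_org_db = target_info$bioconductor_annotations, # NA if no Bioconductor package
    gtf_link = target_info$gtf,                           # Reference assembly GTF
    target_short_name = target_info$name,                 # Blank if not available
    ref_source = target_info$ref_source                   # Reference files source
  )
}

info <- get_target_info(ref_table, "Organism two")
is.na(info$target_org_db)  # TRUE: this toy row lists no Bioconductor package
```

Under these assumptions, a script still reading the old `annotations` column from a data frame would get `NULL` instead of an error, so an explicit check on `target_org_db` may be worth adding wherever the value is used.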