Merge pull request #11 from theiagen/jwe-gc-count-dev
Adding GC content calculations
kevinlibuit authored Jul 22, 2024
2 parents 054e9ec + 1780629 commit 8b07859
Showing 2 changed files with 12 additions and 1 deletion.
3 changes: 2 additions & 1 deletion .github/workflows/test.yml
@@ -25,7 +25,8 @@ jobs: # Define jobs for the workflow
# Save the expected output to a file
echo "Processing FASTQ file: ./data/sample.fastq" > expected_output.txt
echo "Number of reads in ./data/sample.fastq: 2" >> expected_output.txt
echo "GC Content of reads in ./data/sample.fastq: 50%" >> expected_output.txt
# Capture the actual output of the script and save it to a file
./bin/fastq-peek.sh ./data/sample.fastq > actual_output.txt
10 changes: 10 additions & 0 deletions bin/fastq-peek.sh
@@ -25,4 +25,14 @@ LINE_COUNT=$(wc -l < "$FASTQ_FILE")
## Calculate the number of reads (4 lines per read)
READ_COUNT=$((LINE_COUNT / 4))

## Count A, C, T, and G nucleotides in reads
TOTAL_BASE_COUNT=$(grep -o "[ACTG]" "$FASTQ_FILE" | wc -l)

## Count G and C nucleotides in reads
GC_COUNT=$(grep -o "[GC]" "$FASTQ_FILE" | wc -l)

## Calculate GC content of reads (x100 for % of total nts; integer division truncates)
GC_CONTENT=$((GC_COUNT * 100 / TOTAL_BASE_COUNT))

echo "Number of reads in $FASTQ_FILE: $READ_COUNT"
echo "GC Content of reads in $FASTQ_FILE: $GC_CONTENT%"
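
Note on the counting above: the grep calls scan every line of the file, so A, C, G, and T characters that appear in header lines or in quality strings (Phred+33 quality characters include the letters A, C, and G) are counted as bases too. A minimal sketch of one way to restrict the count to sequence lines follows. It assumes a strict 4-lines-per-record FASTQ, which is the same assumption the read count already makes. The function name gc_content is hypothetical and not part of this commit, and the "NA" result for a file with no bases is also an assumption chosen here to avoid a division by zero.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): GC percentage computed
# from sequence lines only, assuming 4 lines per FASTQ record.
gc_content() {
  local fastq_file="$1"
  awk 'NR % 4 == 2 {                 # line 2 of each record is the sequence
         seq = toupper($0)
         gc += gsub(/[GC]/, "", seq) # gsub returns the number of matches removed
         at += gsub(/[AT]/, "", seq)
       }
       END {
         total = gc + at
         if (total == 0) { print "NA"; exit }   # no bases: avoid dividing by zero
         printf "%d\n", int(gc * 100 / total)   # truncate, like $(( )) in the script
       }' "$fastq_file"
}
```

It could be called as GC_CONTENT=$(gc_content "$FASTQ_FILE") in place of the two grep counts. Whether the 50% expected by the workflow test still holds would depend on the contents of ./data/sample.fastq, which is not shown here.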
