Merge pull request #21 from samhorsfield96/lowercase_gff
Converts FM-index to uppercase to avoid indexing issues.
samhorsfield96 authored Jan 10, 2024
2 parents 1d85587 + 5a69483 commit fe65d60
Showing 2 changed files with 15 additions and 0 deletions.
11 changes: 11 additions & 0 deletions src/match_string.cpp
@@ -1,6 +1,14 @@
// ggCaller header
#include "match_string.h"

// code from
// https://stackoverflow.com/questions/735204/convert-a-string-in-c-to-upper-case
char ascii_toupper_char(char c) {
    return ('a' <= c && c <= 'z')
               ? c ^ 0x20
               : c; // ^ autovectorizes to PXOR: runs on more ports than paddb
}

// index fasta files
std::pair<fm_index_coll, std::vector<size_t>> index_fasta(const std::string& fasta_file,
const bool write_idx)
@@ -41,6 +49,9 @@ std::pair<fm_index_coll, std::vector<size_t>> index_fasta(const std::string& fas
    kseq_destroy(seq);
    gzclose(fp);

    // convert string to uppercase to avoid indexing issues
    std::transform(reference_seq.begin(), reference_seq.end(), reference_seq.begin(), ::ascii_toupper_char);

    sdsl::construct_im(ref_index, reference_seq, 1); // generate index
    if (write_idx)
    {
4 changes: 4 additions & 0 deletions src/match_string.h
@@ -5,6 +5,10 @@
#include "kseq.h"

// match_strings
// code from
// https://stackoverflow.com/questions/735204/convert-a-string-in-c-to-upper-case
char ascii_toupper_char(char c);

std::pair<fm_index_coll, std::vector<size_t>> index_fasta(const std::string& fasta_file,
const bool write_idx);
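
A note on what the change does: an index built over a byte alphabet treats `a` and `A` as different symbols, so lowercase (for example soft-masked) bases in the FASTA would presumably fail to match uppercase query strings. The sketch below isolates the normalisation step from this commit. `ascii_toupper_char` mirrors the helper added above; `to_upper_ascii` is a hypothetical wrapper used only for illustration and is not part of ggCaller.

```cpp
#include <algorithm>
#include <string>

// ASCII-only uppercase conversion, mirroring the helper added in this commit.
// Lowercase and uppercase ASCII letters differ only in bit 0x20.
char ascii_toupper_char(char c) {
    return ('a' <= c && c <= 'z') ? static_cast<char>(c ^ 0x20) : c;
}

// Hypothetical wrapper (an assumption, not in the commit): returns an
// uppercased copy of a sequence, leaving non-letter characters untouched.
std::string to_upper_ascii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), ascii_toupper_char);
    return s;
}
```

Under these assumptions, `to_upper_ascii("acgtNNacgt")` yields `"ACGTNNACGT"`, and characters outside `a` to `z` pass through unchanged. Unlike `std::toupper`, the helper ignores the locale and takes a plain `char`, so it can be passed straight to `std::transform` without a cast to `unsigned char`.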
