
naiveGC's implementation details

tonyyzy edited this page May 14, 2018 · 1 revision

Currently, naiveGC reads the FASTA file (excluding the track line) one byte at a time and checks whether the character is '\n' (newline) or '' (end of file). The script then checks whether the character is 'G' or 'C'; if so, percentage_GC is increased by the unit percentage, which is calculated as (1/window_size) * 100. A local counter and a global counter track the position within the current window and the position on the whole sequence, respectively. Once the local counter reaches the specified window size, a line (global_position percentage_GC) is written to the result file.
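The loop described above can be sketched roughly as follows. This is a minimal illustration, not naiveGC's actual source: the function name `naive_gc`, the generator interface, and the choices to skip newlines, stop at end of file, reset both the local counter and percentage_GC after each window, and drop a trailing partial window are all assumptions.

```python
import io


def naive_gc(handle, window_size):
    """Yield (global_position, percentage_gc) for each full window.

    Hypothetical sketch of the byte-at-a-time approach; names and
    edge-case handling are assumptions, not the real naiveGC code.
    """
    unit = (1 / window_size) * 100   # unit percentage per G/C base
    handle.readline()                # skip the track line
    local_count = 0                  # position within the current window
    global_pos = 0                   # position on the whole sequence
    percentage_gc = 0.0

    while True:
        ch = handle.read(1)
        if ch == "":                 # end of file
            break
        if ch == "\n":               # assumed: newlines are not counted
            continue
        if ch in ("G", "C"):
            percentage_gc += unit
        local_count += 1
        global_pos += 1
        if local_count == window_size:
            yield global_pos, percentage_gc
            local_count = 0          # assumed: counters reset per window
            percentage_gc = 0.0


# Small in-memory example; a real run would open a FASTA file instead.
example = io.StringIO(">seq1\nGCAT\nGGCC\nAT\n")
for position, gc in naive_gc(example, window_size=4):
    print(position, gc)
```

In this sketch each result is yielded instead of being written to a file, so the caller decides how to format the (global_position percentage_GC) line.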

TODO: Write a more efficient implementation. Use the click Python library for better handling of command-line arguments.
