-
Notifications
You must be signed in to change notification settings - Fork 446
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix CRAM embed_ref=2 with seqs overlapping ref end.
If the sequences align off the end of the reference and we are creating consensus on the fly, then the consensus generated also steps beyond the reference length. Although this longer reference is embedded, it is trimmed back by the CRAM decoder which validates against the declared reference length in SQ LN, leading to Ns appearing in the decoder. Therefore we now validate in the encoder too, which also needed refs_from_header updates to parse the LN tag so the encoder can trim. Note we already overloaded r->length==0 for an indication that we've not parsed the fa/fai file yet, so we can't just naively fill this out from the SQ LN header. We could hold this information elsewhere via a proper flag and modify all the places that utilise that knowledge, but the simplest (and safest) fix is to have a separate variable used for this one specific case. An example of failure could be seen in: ./test/test_view -C -o embed_ref=2 test/c1#bounds.sam |\ ./test/test_view -
- Loading branch information
1 parent
2a59ad4
commit 24e4e31
Showing
4 changed files
with
18 additions
and
5 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters