issue with alignAndCallConsensus.pl script #221

Open
s-travers opened this issue Oct 25, 2023 · 0 comments

I am following the TE curation guidelines published in the 2021 Storer et al. Current Protocols paper, and I have run into an issue with the 'alignAndCallConsensus.pl' script on a test dataset (copia elements from Drosophila melanogaster). When I run the script interactively, extending the consensus in both directions (the 'x' option) works fine. However, once I hit one of the TE edges and want to keep extending only the 5' or only the 3' edge (the '5' or '3' options), the script seems to ignore the option and continues extending both edges. I get the same result if I start the extension process with '5' or '3' instead of 'x': it still extends in both directions. Dr. Storer suggested the issue is related to the H-pad length (200): she was able to reproduce it with H-pad lengths from 200 down to 100 nucleotides, whereas an H-pad of 99 nt or less extended as expected.

Reproduction steps

  1. I am running this from a Google Colab notebook, using the Anaconda installation of RepeatModeler.
  2. I run 5 iterations in interactive mode, adding 200 bp H-pads, with this command:
    "alignAndCallConsensus.pl -c copia_con.fa -e copia_elements.fa -int -ma 14 -hp 200"
  3. In each of these iterations the script appears to run as it should, adding 200 bp flanks on both ends. I accept the changes using 'x', since I have not run into any ambiguous sequence yet.
  4. After iteration 5 the consensus appears to hit some ambiguous sequence on the 3' edge (screenshot attached: 'iteration5.png'), so I try to extend only the 5' edge by entering the '5' option. As the screenshot 'iteration6.png' shows, the script still extends both edges, and the consensus grows by 400 bp. (Note: I probably tried to stop extending the 3' edge prematurely here, since the ambiguous sequence from the previous iteration resolves, but this is just to illustrate the issue.)
  5. I enter '5' again for the next iteration to see whether it responds correctly this time. I get the same result: it extends both edges and adds another 400 bases in total to the consensus ('iteration7.png').
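
For anyone who wants to confirm this without relying on the screenshots, below is a minimal sketch of how the growth at each edge could be measured between two iterations. It is not part of RepeatModeler: the function name `edge_growth`, the `anchor_len` parameter, and the toy sequences are hypothetical. It assumes the previous consensus is carried over unchanged inside the new one, which may not hold if the script also revises internal bases.

```python
def edge_growth(old_cons, new_cons, anchor_len=20):
    """Estimate how many bases were added to each edge of a consensus.

    Hypothetical helper: takes a window from the middle of the old
    consensus, locates it in the new consensus, and derives the number
    of bases added at the 5' and 3' ends from the offset.
    Returns (added_5prime, added_3prime), or None if the window is
    missing or occurs more than once in the new consensus.
    """
    old, new = old_cons.upper(), new_cons.upper()
    anchor_len = min(anchor_len, len(old))
    start = (len(old) - anchor_len) // 2
    anchor = old[start:start + anchor_len]

    pos = new.find(anchor)
    if pos == -1 or new.find(anchor, pos + 1) != -1:
        return None  # anchor not found, or ambiguous placement

    added_5 = pos - start
    added_3 = (len(new) - pos) - (len(old) - start)
    return added_5, added_3


# Toy data only: a 30 bp "consensus" padded with N to mimic 200 bp H-pads.
old = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
both_edges = "N" * 200 + old + "N" * 200   # what I observe after entering '5'
five_only = "N" * 200 + old                # what I would expect from '5'

print(edge_growth(old, both_edges))  # (200, 200)
print(edge_growth(old, five_only))   # (200, 0)
```

With real data, `old_cons` and `new_cons` would be the consensus sequences before and after an iteration; a result of (200, 200) after entering '5' would match the behaviour described above.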

Let me know if you need any other info.
Thanks!

[Screenshots: iteration5.png, iteration6.png, iteration7.png]
@s-travers s-travers added the bug label Oct 25, 2023