Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem implementing a reverse-strand specific analysis #204

Open
sknaack opened this issue Aug 12, 2023 · 2 comments
Open

Problem implementing a reverse-strand specific analysis #204

sknaack opened this issue Aug 12, 2023 · 2 comments

Comments

@sknaack
Copy link

sknaack commented Aug 12, 2023

Hi All,

I'm applying RSEM to a (reverse-strand-specific) RibTag-seq data set in mouse for the mouse 39 assembly. Such data is successfully treated as RNA-seq data in a reverse strand orientation. I've successfully aligned this data with STAR v2.7.10, and even obtained forward and reverse strand counts from this. When I apply RSEM 1.3.0 to the same data for the same 39 assembly (calling the same version of STAR for the default - internally organized - alignment per rsem-calculate-expression) with the --forward-prob 0 and/or --strandedness reverse options, the expected counts are in line with the number of + strand counts , rather than the - strand counts. A few example genes present several 10k alignments in - orientation, but a few 100 in + strand orientation. These same genes present expected counts that are closer to the numbers of + strand alignments by several factors.

In an earlier analysis of these data for the 38 assembly the --forward-prob 0 option successfully produced expected counts similar to the raw counts of reverse strand alignments. Basically the --forward-prob 0 and --strandedness reverse options are somehow insufficient to specify reverse strand alignments.

Does anyone have experience that speaks to this? with newer versions of STAR and RSEM 1.3.0? Any tips to correct this? Should I try bowtie2 in case it's a particularity of using STAR that is causing this? Thanks much in advance fro your assistance.

Best,

Sara

@cndewey
Copy link
Member

cndewey commented Aug 17, 2023

Hi Sara,

I'm not aware of any issues with these options and I use them regularly in my own analyses. Specifying "--forward-prob 0" should be sufficient for your use case. To help debug the situation, I would suggest running STAR externally (as you have done) with the "--quantMode TranscriptomeSAM" option, and then feeding the resulting transcriptome BAM to RSEM as input. This should ensure that the RSEM output is as consistent as possible with the alignments that you are getting from running STAR separately. Another aspect to look at is how you are obtaining "raw" forward and reverse strand counts from your STAR alignments. It can be easy to mistakenly flip the strands in such analyses. Hopefully, one of those two strategies will resolve your issue.

Best,
Colin

@sknaack
Copy link
Author

sknaack commented Aug 25, 2023

Thanks Colin! In fact I traced it down to this previously indicated matter: https://stackoverflow.com/questions/74971057/unable-to-run-rsem-tool-because-of-error-rsem-parse-alignments-no-such-file-or In short, it was a technical error in the indexing as it was running on my machine.

This was a bit of a OS hitch, but perhaps something useful to note. Once I recompiled RSEM with the (extremely few) line changes to buildReadIndex.cpp, PairedEndHit.h and SingleHit.h that were recommended my results made sense. In the end I technically used the --strandedness reverse argument, but the RSEM expected counts results are now plenty comparable to the raw (reverse strand) counts from STAR, and identical to results generated in a different OS. Thanks again!

Best,

Sara

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants