add rasusa wrapper #152

mbhall88 · 2020-08-06T08:02:23Z

This PR adds a rasusa wrapper (it currently works), but it should not be merged yet because it relates to work in snakemake/snakemake#532 and #150.

Note: I also reordered some of the tests to try and maintain alphabetical ordering. Not sure if this is being adhered to anymore?

vsoch · 2020-08-06T18:07:24Z

This looks great! We will want to update for any possible changes after discussion in #153. For example, if we are able to build the container from the environment.yaml, you might not need to provide it. It's also not clear if we would want container: Dockerfile, leaving it out entirely, or something else.

mbhall88 · 2020-11-13T01:09:19Z

Just wondering if this PR is still required for the container stuff? I think you had a different way of testing this out, didn't you @vsoch?

If not, I'll remove the container stuff and contribute it as a wrapper for rasusa as one doesn't exist yet.

vsoch · 2020-11-13T01:17:43Z

I had a PR in to snakemake (which would be required here) but it looks like it's gone stale. I'm not sure if this work is still a goal for snakemake, or some other strategy is going to be used. Probably we should ask @johanneskoester.

vsoch · 2020-11-13T01:18:42Z

But @mbhall88 you can probably remove the container stuff and contribute as a wrapper without it! Don't let our slowness / this feature development hold you back! :)

dlaehnemann

The recipe looks good. Just two questions:

1. There's a lot of formatting going on in `test.py`. Are you sure you are using the latest `black`? Just want to avoid unnecessary changes, in case this will be reverted with the next best merge...
2. Just out of curiosity: What can rasusa do for me that `samtools view -s` cannot?

mbhall88 · 2020-11-15T23:34:35Z

> The recipe looks good. Just two questions:
>
> 1. There's a lot of formatting going on in `test.py`. Are you sure you are using the latest `black`? Just want to avoid unnecessary changes, in case this will be reverted with the next best merge...

I'm using black, version 20.8b1 which seems to be the latest? Anyway, would the CI checks have detected if the formatting was incorrect?

> 2. Just out of curiosity: What can rasusa do for me, that [`samtools view -s`](https://www.htslib.org/doc/samtools-view.html) cannot.

The difference is that samtools/seqtk only subsample based on the number of reads, whereas rasusa subsamples to a given (theoretical) coverage and makes no assumptions about the lengths of your reads. Subsampling by read count is sort of fine if you have Illumina data, you know all your reads are the same length, and you can be bothered to calculate how many reads you would need to reach a certain coverage. But it breaks down when working with long reads, which are never all the same length.

I'm hoping to write this up as an application note in Bioinformatics or something similar soon.
I talk about this a little in the benchmark section.
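To make the distinction concrete, here is a minimal Python sketch of the two approaches. It is an illustration of the idea only, not rasusa's actual implementation; the function names, parameters, and the shuffle-then-accumulate strategy are assumptions made for this example.

```python
import math
import random


def reads_needed_fixed_length(coverage, genome_size, read_length):
    """Read-count approach: only valid when every read has the same length."""
    # coverage = (number of reads * read length) / genome size
    return math.ceil(coverage * genome_size / read_length)


def subsample_to_coverage(read_lengths, coverage, genome_size, seed=None):
    """Coverage approach (hypothetical sketch): keep randomly chosen reads
    until their combined length reaches coverage * genome_size."""
    target_bases = coverage * genome_size
    rng = random.Random(seed)

    # Visit the reads in random order so the subsample is unbiased.
    order = list(range(len(read_lengths)))
    rng.shuffle(order)

    kept, total_bases = [], 0
    for idx in order:
        if total_bases >= target_bases:
            break
        kept.append(idx)
        total_bases += read_lengths[idx]

    # Return the indices of the kept reads and the bases they contribute.
    return sorted(kept), total_bases
```

With equal-length reads the first function is all you need, for example `reads_needed_fixed_length(30, 1000, 150)` asks for 200 reads. With variable-length reads there is no single read count that gives the target coverage, so the second function tracks bases instead of reads.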

dlaehnemann · 2020-11-16T11:47:11Z

The GitHub Action does run a black check (not sure which version, would have to check the output of the GitHub Action run on this branch, I guess). And this PR passes, so it should all be fine and I'll merge.

And thanks for the elaboration on rasusa, this does sound good. It could even be used to take peaks out of crazily over-covered regions while leaving everything else alone. This could avoid slowing down downstream tools with excessive local coverage. Definitely worth a write-up!

mbhall88 · 2020-11-16T23:29:25Z

> It could even be used to take peaks out of crazily over-covered regions while leaving everything else alone.

It won't really do this. For that, you would need an alignment file and to use something like VariantBam (see here for more info).

add rasusa wrapper

3936f36

mbhall88 assigned vsoch and mbhall88 Aug 6, 2020

mbhall88 added the inprogress label Aug 6, 2020

mbhall88 mentioned this pull request Aug 6, 2020

start of work for wrapper to support --use-singularity if a container… snakemake/snakemake#532

Closed

mbhall88 added 4 commits November 13, 2020 15:01

remove container key

701ec1a

Merge branch 'master' of https://github.com/snakemake/snakemake-wrappers

9118bd3

into rasusa

Merge branch 'rasusa' of https://github.com/mbhall88/snakemake-wrappers…

cfc4ce6

… into rasusa

add rasusa test back

2a5b56b

mbhall88 added ready2merge and removed inprogress labels Nov 13, 2020

mbhall88 changed the title ~~[WIP] add rasusa wrapper~~ add rasusa wrapper Nov 13, 2020

mbhall88 requested review from dlaehnemann and johanneskoester and removed request for dlaehnemann November 13, 2020 05:49

dlaehnemann approved these changes Nov 13, 2020

dlaehnemann merged commit d8ac1e0 into snakemake:master Nov 16, 2020
