Multi-threading approach #55
Comments
Hi @Teklu67. When you say "a long time", how long are we talking? And how large is your file?
Thanks so much for the quick response. It finished subsampling to 30x from a FASTQ file of 690 Gb (60x coverage) in 2 days. Because I have the resources to run on several threads, I thought it would finish much faster if there were an option for multi-threading. Thanks!
Wow, that's a very big fastq file! Is it compressed (e.g., gzip)? How did you install rasusa?
Yes, it is for tetraploid wheat and is in compressed .gz format. I installed it through conda.
Is your data Illumina? There's not really much I can offer in the way of speeding rasusa up, sorry. At some point I will look into whether multi-threading the IO is possible (i.e. batching reads). I'll leave this open and add it to my list of things to investigate in the coming months. Sorry I can't do it faster, but I have a lot of other research projects I am trying to juggle. However, if you (or anyone else) would like to have a go at it, I would be very happy to receive a pull request.
It is ONT data. That is OK, thank you for your time.
In the meantime, I would suggest trying to split the file up into subsets and then randomly subsampling each subset.
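The splitting step could look something like the sketch below. It is only an illustration, not part of rasusa: the function name `split_fastq` and the shard naming scheme are made up here, and it assumes gzip-compressed input with strict 4-line FASTQ records (no wrapped sequence lines). Records are dealt out round-robin, so each shard ends up with roughly the same number of reads.

```python
import gzip
from contextlib import ExitStack
from itertools import islice


def split_fastq(in_path, out_prefix, n_shards):
    """Deal 4-line FASTQ records round-robin into n_shards gzip files.

    Hypothetical helper for illustration; returns the shard paths written.
    """
    paths = [f"{out_prefix}.{i}.fq.gz" for i in range(n_shards)]
    with ExitStack() as stack:
        src = stack.enter_context(gzip.open(in_path, "rt"))
        # A low compression level keeps the write side cheap.
        outs = [
            stack.enter_context(gzip.open(p, "wt", compresslevel=1))
            for p in paths
        ]
        i = 0
        while True:
            record = list(islice(src, 4))  # header, sequence, '+', quality
            if not record:
                break
            if len(record) != 4:
                raise ValueError("truncated FASTQ record at end of file")
            outs[i % n_shards].write("".join(record))
            i += 1
    return paths
```

Each shard can then be subsampled in its own process and the outputs concatenated. Since every shard holds roughly 1/n of the data, the per-shard target would be about the overall target divided by n (for example, about 7.5x per shard for 30x from 4 shards). The result should approximate, but is not identical to, a single random subsample of the whole file. Note that the split itself is still a single-threaded pass over the file.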
Another suggestion: I suspect most of the runtime is (de)compressing the data. Switching to zstd instead of gzip should drastically improve time spent on decompression.
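One way to check that suspicion before recompressing anything is to time how long it takes just to stream through the gzip file without doing any subsampling. The sketch below uses only the Python standard library; the function name `gzip_read_throughput` and its parameters are made up for illustration, and Python's gzip reader is not necessarily as fast as the one rasusa uses, so treat the number as a rough indication only.

```python
import gzip
import time


def gzip_read_throughput(path, chunk_size=1 << 20, max_bytes=None):
    """Stream through a gzip file and return (decompressed_bytes, seconds).

    Hypothetical helper for illustration. Set max_bytes to stop early
    and time only the first part of a very large file.
    """
    total = 0
    start = time.perf_counter()
    with gzip.open(path, "rb") as fh:
        while max_bytes is None or total < max_bytes:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start
```

If the time extrapolated to the full file is close to the total runtime, decompression is likely the bottleneck and a faster codec is worth trying.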
Hi,
This is a very useful program, but it is taking a long time to subsample from a large fastq file. I am running it on a server and would like to run it using multi-threading, but I am a novice at programming and not sure how to do that. Any help, please?
Thanks,