Memory consumption issue #4

Open
Mr-Milk opened this issue Dec 8, 2020 · 7 comments

Comments

@Mr-Milk

Mr-Milk commented Dec 8, 2020

I tried to scan motifs on genome regions (hg38 build) with -t 18, which matches my CPU count,

but it raised:

OSError: [Errno 12] Cannot allocate memory

Then I tried with -t 8, and the program used up to around 50 GB of my RAM. I ran it on WSL2 with Ubuntu 20.04 LTS.

image

@hongduosun
Member

Sorry, but how many regions were scanned?

@Mr-Milk
Author

Mr-Milk commented Dec 8, 2020

More than 200K

@hongduosun
Member

I'm afraid this is a temporary limitation of MotifScan: only a small part of the code has been refactored in C to speed up the motif score calculation. As a result, every single motif score is stored and passed back to Python, which requires O(n_region * length_per_region * n_motif) memory.
I'll improve this in the next update.
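To get a feel for how quickly O(n_region * length_per_region * n_motif) grows, here is a rough back-of-the-envelope sketch. Only the 200K region count comes from this thread; the region length, motif count, and 8 bytes per score are illustrative assumptions, not MotifScan's actual settings or data layout.

```python
def estimate_score_bytes(n_regions, region_len, n_motifs, bytes_per_score=8):
    """Bytes needed if one score is kept per position, per region, per motif.

    bytes_per_score=8 assumes double-precision floats (an assumption).
    """
    return n_regions * region_len * n_motifs * bytes_per_score


def to_gb(n_bytes):
    """Convert bytes to decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_bytes / 1e9


# 200K regions is from the report above; 1000 bp per region and
# 500 motifs are hypothetical values chosen only for illustration.
total = estimate_score_bytes(n_regions=200_000, region_len=1_000, n_motifs=500)
print(f"{to_gb(total):.1f} GB")
```

Under these assumed numbers, keeping every score at once is far beyond typical workstation RAM, which is consistent with the allocation failure reported above.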

@Mr-Milk
Author

Mr-Milk commented Dec 8, 2020

Thanks for your answer. Just a small suggestion: I looked at your code, and the parallelism uses Python's multiprocessing, which might be the reason for such huge memory consumption, since it basically copies the whole current Python process into each worker. It might help to implement the parallelism on the C side.
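A minimal sketch of the difference being pointed at, assuming a large read-only input: threads live in one address space and all see the same object, while data handed to a worker process as a task argument or returned as a result has to be serialized (pickled) and copied. `SHARED` is a hypothetical stand-in, not a MotifScan structure, and the sketch uses Python threads only to show the sharing; it says nothing about how the C extension should be written.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a large read-only input (e.g. a score table).
SHARED = list(range(100_000))


def worker_view(_):
    # Threads run in the same address space, so every worker
    # sees the very same object, with no copy made.
    return id(SHARED)


def thread_ids(n_workers=4):
    """Collect the object identities seen by each thread."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return set(pool.map(worker_view, range(n_workers)))


def pickled_size(obj):
    """Bytes that would be serialized to send obj to (or from) a worker process."""
    return len(pickle.dumps(obj))


print(thread_ids() == {id(SHARED)})  # one shared object across all threads
print(pickled_size(SHARED))          # per-transfer cost when using processes
```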

@hongduosun
Member

Thanks a lot for your advice!

@hongduosun
Member

This has been fixed in v1.3.0 by switching to pthread in the C extension. Thanks again!

@Mr-Milk
Author

Mr-Milk commented Jan 22, 2021

I tried it with the same dataset. At some point the program still ate up all of my RAM and caused a system exit 😥, but it's certainly better than before 🥰. Would it be possible to free unused memory and save the results to disk during the run?
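Something along these lines is what I have in mind: a minimal sketch that processes regions in batches and writes each batch to disk before moving on, so only one batch of results is held in memory at a time. `score_region`, the output columns, and the batch size are hypothetical placeholders, not MotifScan's API or output format.

```python
import csv
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def score_region(region):
    # Hypothetical placeholder for the real scoring step:
    # counts non-overlapping occurrences of a toy motif.
    name, seq = region
    return name, seq.count("ACGT")


def scan_to_disk(regions, out_path, batch_size=1000):
    """Score regions batch by batch, appending each batch to a TSV file."""
    n_written = 0
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["region", "n_hits"])
        for batch in batched(regions, batch_size):
            rows = [score_region(r) for r in batch]
            writer.writerows(rows)
            fh.flush()  # push the batch out before scoring the next one
            n_written += len(rows)
            # `rows` is rebound on the next iteration, so the previous
            # batch becomes garbage and its memory can be reclaimed.
    return n_written
```

Because `regions` can be any iterable, a generator that reads regions lazily from a file would keep the input side bounded as well.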
