Testing of pangolearn analysis mode #433

Closed
wm75 opened this issue Apr 13, 2022 · 5 comments

@wm75
Contributor

wm75 commented Apr 13, 2022

I'm trying to update our Galaxy wrapper of pangolin to v4, but our tests with --analysis-mode pangolearn are failing without a proper error message. The same tests are passing locally for me.
Could it be that the pangolearn model has just become too resource-hungry by now to be compatible with our CI?
If that's the case, is there a less challenging way to test the pangolearn mode?

Here's my currently stalled PR: galaxyproject/tools-iuc#4494
I've pulled the important parts of the CI output into the conversation to make it easier for you to help if you can.

Thanks a lot for any insight!

@aineniamh
Member

Have a read through issue #395. We've added a warning to the latest version because the random forest does take more RAM, but it actually shows greater accuracy while still being a very fast method!

@wm75
Contributor Author

wm75 commented Apr 13, 2022

Thanks @aineniamh. Had missed the issue with the actual numbers.
Is there no workaround for testing purposes then? If possible, I would like to have the pangolearn branch covered by our Galaxy wrapper tests and not just test usher exclusively.

@wm75
Contributor Author

wm75 commented Apr 13, 2022

btw, my local machine only has 8GB of memory, and if I close all other programs I can make our tests (which use just a single sequence) pass using only one thread. So at least under ideal conditions you can still get away with less than the 12GB mentioned in #395 :)

@aineniamh
Member

So I think the actual RAM usage is over 7GB, because the GitHub Actions tests fail on a small file, whereas the macOS runners have 14GB available, so they pass there. I'm not the developer of pangolearn (sadly for us all, she has moved on to a new role), so I'm not going to be much help with tuning the RAM requirements. I do know that if we decrease the number of components in the random forest, the RAM requirements decrease too, but with a big decrease in assignment accuracy as well.
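
For anyone who wants to check the memory figures discussed here on their own machine or CI runner, the following is a minimal sketch, not part of pangolin itself. It runs an arbitrary command and reports the peak resident memory of its child processes using Python's standard `resource` module (Unix only). The helper name `peak_child_rss_mb` is hypothetical, and the reported value is the peak of the largest single child process, not the sum over all of them, so it may understate total usage for multi-process runs.

```python
import resource
import subprocess
import sys


def peak_child_rss_mb(cmd):
    """Run cmd to completion; return (exit code, peak child RSS in MB).

    Hypothetical helper for illustration only. The value covers every
    child this Python process has waited for so far, so call it from a
    fresh interpreter for a clean reading.
    """
    proc = subprocess.run(cmd, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss / divisor
```

It could be called as, for example, `peak_child_rss_mb(["pangolin", "--analysis-mode", "pangolearn", "query.fasta"])`, where `query.fasta` is a placeholder for a test input; comparing the result against the runner's available memory should indicate whether a silent failure is likely an out-of-memory kill.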

@wm75
Contributor Author

wm75 commented Apr 13, 2022

Thanks for the clarifications! At least it's good to know what's going on.
