Issues arising with Pangolin v4.0.x #31

Closed
ArtPoon opened this issue Apr 14, 2022 · 6 comments

Comments

@ArtPoon
Contributor

ArtPoon commented Apr 14, 2022

  • UShER mode seems to take an enormous amount of time; switching --analysis-mode to pangolearn helps a lot
  • Beware of the upgrade process: I ran into major issues with faToVcf and ended up scrapping the entire conda environment and reinstalling from scratch
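
For anyone scripting this, here is a minimal sketch of wrapping the run from Python with the analysis mode set explicitly. The helper names and the file name `sequences.fasta` are hypothetical; the only pangolin option used is the `--analysis-mode` flag mentioned above, with the query FASTA passed as the positional argument.

```python
import shutil
import subprocess


def build_pangolin_cmd(fasta_path, analysis_mode="pangolearn"):
    """Return the argument list for a pangolin run with an explicit analysis mode."""
    # --analysis-mode is the flag discussed above; the query FASTA is positional.
    return ["pangolin", "--analysis-mode", analysis_mode, fasta_path]


def run_pangolin(fasta_path, analysis_mode="pangolearn"):
    """Run pangolin if it is on PATH; raise a readable error otherwise."""
    if shutil.which("pangolin") is None:
        raise RuntimeError("pangolin not found on PATH; activate the conda environment first")
    # check=True raises CalledProcessError if pangolin exits with a non-zero status.
    return subprocess.run(build_pangolin_cmd(fasta_path, analysis_mode), check=True)


# "sequences.fasta" is a placeholder input file, not one from this repository.
print(" ".join(build_pangolin_cmd("sequences.fasta")))
```

Building the argument list separately from running it makes it easy to log the exact command before a long job starts.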
@ArtPoon
Contributor Author

ArtPoon commented Apr 14, 2022

Needed to get mpi4py in the conda environment:
conda install -c conda-forge mpi4py openmpi

@ArtPoon
Contributor Author

ArtPoon commented Apr 14, 2022

Wow, memory consumption has skyrocketed. I used to be able to run Pangolin on 8 cores via MPI, but now just 2 processes are each consuming over 33% of my RAM (about 32 GB total, so more than 10 GB each).
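
A rough back-of-the-envelope sketch of how many MPI workers fit at that footprint, using the figures above (32 GB total, roughly 10 GB per process). The 4 GB reserve for the OS and other processes is an assumption, not a measurement, and the function name is made up for illustration.

```python
import math


def max_workers(total_ram_gb, per_worker_gb, reserve_gb=0.0):
    """Largest number of workers whose combined memory fits in the usable RAM."""
    if per_worker_gb <= 0:
        raise ValueError("per_worker_gb must be positive")
    usable = total_ram_gb - reserve_gb
    return max(0, math.floor(usable / per_worker_gb))


# Figures from the comment above: 32 GB total, about 10 GB per process.
print(max_workers(32.0, 10.0))                  # 3
# Assumed 4 GB held back for the OS and everything else on the machine.
print(max_workers(32.0, 10.0, reserve_gb=4.0))  # 2
```

By the same arithmetic, the earlier 8-process runs imply each process stayed under about 4 GB.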

@ArtPoon
Contributor Author

ArtPoon commented May 2, 2022

Rolled pangolin back to version 3.1.20

Updated data files:

(pangolin) art@orolo:~/git/duotang/data_needed$ pangolin --update-data
pangolearn updated to 2022-04-22
constellations updated to v0.1.9
pango-designation updated to v1.8

Ran into an error:

loading model 05/02/2022, 11:59:07
/home/art/miniconda3/envs/pangolin/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator DecisionTreeClassifier from version 1.0.1 when using version 0.23.1. This might lead to breaking code or invalid results. Use at your own risk.
  warnings.warn(
processing block of 752 sequences 05/02/2022, 11:59:09
[Mon May  2 11:59:10 2022]
Error in rule pangolearn:
    jobid: 0
    output: /tmp/tmpvpuwm01v/lineage_report.pass_qc.csv

RuleException:
AttributeError in line 112 of /home/art/miniconda3/envs/pangolin/lib/python3.8/site-packages/pangolin/scripts/pangolearn.smk:
'DecisionTreeClassifier' object has no attribute 'n_features_'
  File "/home/art/miniconda3/envs/pangolin/lib/python3.8/site-packages/pangolin/scripts/pangolearn.smk", line 112, in __rule_pangolearn
  File "/home/art/miniconda3/envs/pangolin/lib/python3.8/site-packages/pangolin/pangolearn/pangolearn.py", line 170, in assign_lineage
  File "/home/art/miniconda3/envs/pangolin/lib/python3.8/site-packages/sklearn/tree/_classes.py", line 922, in predict_proba
  File "/home/art/miniconda3/envs/pangolin/lib/python3.8/site-packages/sklearn/tree/_classes.py", line 395, in _validate_X_predict
  File "/home/art/miniconda3/envs/pangolin/lib/python3.8/concurrent/futures/thread.py", line 57, in run
Exiting because a job execution failed. Look above for error message
Exiting because a job execution failed. Look above for error message
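
The UserWarning in the log already names both versions involved. As a minimal sketch (helper names are hypothetical, and the version comparison assumes plain numeric versions such as 1.0.1), the mismatch can be pulled out of the warning text and compared against whatever scikit-learn is installed in the active environment:

```python
import re
from importlib import metadata

# Warning text copied from the log above.
WARNING = (
    "Trying to unpickle estimator DecisionTreeClassifier from version 1.0.1 "
    "when using version 0.23.1."
)

_VERSIONS = re.compile(
    r"from version (\d+(?:\.\d+)*) when using version (\d+(?:\.\d+)*)"
)


def parse_versions(message):
    """Return (pickled_with, running) version strings, or None if not found."""
    match = _VERSIONS.search(message)
    return (match.group(1), match.group(2)) if match else None


def version_tuple(version):
    """Convert '1.0.1' to (1, 0, 1); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(installed, required):
    """True if the installed version is older than the one the model needs."""
    return version_tuple(installed) < version_tuple(required)


def installed_sklearn_version():
    """Return the installed scikit-learn version, or None if it is absent."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


pickled_with, running = parse_versions(WARNING)
print(pickled_with, running, needs_upgrade(running, pickled_with))
```

Here the model was pickled with 1.0.1 but loaded under 0.23.1, so `needs_upgrade` returns True for the versions in the log.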

@ArtPoon
Contributor Author

ArtPoon commented May 2, 2022

Looks like we have to update scikit-learn (sklearn) to version 1.0.1, the version the model was pickled with.

@ArtPoon
Contributor Author

ArtPoon commented May 2, 2022

This did the trick:

(pangolin) art@orolo:~/git/duotang/data_needed$ pip install scikit-learn=="1.0.1"

@ArtPoon
Contributor Author

ArtPoon commented May 3, 2022

RAM consumption is now a known issue:
cov-lineages/pangolin#395

@ArtPoon ArtPoon closed this as completed May 3, 2022