SLURM HPC architecture for running AGNOSTOS #17
Comments
Hi @mhyleung
Thank you Antonio. Just to be safe, we have changed the cluster.yaml file and set the runs to a terabyte of memory. We are running it on 10 contig sequences at the moment, and so far there are no glaring errors. A related question: of the different steps within AGNOSTOS, which one is the most computationally intensive, to the point of needing MPI? We are now at the mmseq_clustering_update snakemake step (step 4 or 5?) without MPI activated, and it seems to be running fine for now. Is this simply because we are running on a small dataset, or will we still need OpenMPI because the database is now so large, even when the input set of sequences is small? Regards Marcus
If you only have one node, you should be fine removing any MPI, as you will not be able to scale up to hundreds of processes anyway. One possible bottleneck is the step where you run the all-vs-all HHblits search to get the communities.
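For reference, a single-node SLURM request without MPI could look like the minimal sketch below. It only builds the `#SBATCH` header for one task on one node, using standard `sbatch` options. The function name, the job name `agnostos_step`, and the time limit are hypothetical, and this is not the format of the AGNOSTOS cluster.yaml, which is what the workflow actually reads its resources from.

```python
def single_node_sbatch_header(job_name, cpus, mem, time_limit="24:00:00"):
    """Return an sbatch header requesting one task on one node (no MPI).

    All values are illustrative assumptions; adjust them to your cluster.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",               # keep the whole job on one node
        "#SBATCH --ntasks=1",              # one process, threads instead of MPI ranks
        f"#SBATCH --cpus-per-task={cpus}",  # threads available to that process
        f"#SBATCH --mem={mem}",             # memory per node, e.g. "1000G"
        f"#SBATCH --time={time_limit}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values matching the machine described in this thread.
header = single_node_sbatch_header("agnostos_step", cpus=96, mem="1000G")
print(header)
```

With `--nodes=1` and `--ntasks=1`, parallelism comes from threads within a single process rather than from MPI ranks spread across nodes.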
Thanks Antonio, as usual. Just out of curiosity, if I may ask: when you and your team run AGNOSTOS on multiple nodes, how much CPU and memory are used per node? Thanks Marcus
Hi Marcus!
Dear all
Our server has the following architecture: the head node and the compute node are currently the same machine. Submitting SLURM jobs as single-node jobs seems to work, but we are struggling to set up multi-node jobs on our server. We suspect that SLURM requires some sort of networking between the compute and head nodes, with open ports. We have just 96 CPUs, and they appear to be split across 2 NUMA nodes.
We were wondering whether AGNOSTOS can run via SLURM on a single node, or whether it needs a multi-node system.
Thank you very much
Regards
Marcus