XSEDE 2012 Tutorial
pradeepmantha edited this page Jun 29, 2012
·
25 revisions
Do you want to run large-scale, data- or compute-intensive applications on "distributed, heterogeneous" HPC clusters with "minimal" queue waiting time? Then this tutorial is for you.
- What software tools could help me do that?
- How and why does the software solve these problems?
- Great! Any simple examples?
- Good! Any real science examples?
- I want to try it on my own, on cyberinfrastructure or on my own cluster. Is any support available?
To-dos for the team: [ to be removed once addressed ]
We will also demo a multi-machine run. Because of the logistics involved, we will need to submit this job either:
- in the morning, and have it run for the entire day, or
- in a reserved queue.
For the queue reservation, Yaakoub is the person to contact. I will need the demo size, the username under which the demo will be submitted, and so on.
Other things to consider include queue wait times. I can and will add the training accounts to an allocation with escalated privileges and a reserved chassis. This means people using the training accounts will have hardware reserved for them to run on.
Which versions of Bliss and Pilot-API need to be used? Current roadblocks:
The latest version of BigJob has not been released, because:
AndreL: not enough testing has been done to release the package.
Ole, Melissa, Pradeep: waiting for the new package to be released before testing. (Deadlock?)
- Somewhat reluctant to test directly from source.
Solution: if an alpha package is released by AndreL and installed in a separate directory by AndreM,
- Ole, Melissa, and Pradeep test the alpha package.
- If problems are found, report all the bugs, and a new production version will be released.
- If no bugs are found, great: that's our tutorial version of Pilot-API and Bliss.