Showing 10 changed files with 843 additions and 25 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
cff-version: 1.2.0 | ||
title: RADICAL-Pilot | ||
message: >- | ||
If you use this software, please cite it using the | ||
metadata from this file. | ||
type: software | ||
authors: | ||
- given-names: Andre | ||
family-names: Merzky | ||
- given-names: Matteo | ||
family-names: Turilli | ||
- given-names: Mikhail | ||
family-names: Titov | ||
- given-names: Aymen | ||
family-names: Al-Saadi | ||
- given-names: Shantenu | ||
family-names: Jha | ||
identifiers: | ||
- type: url | ||
value: 'https://github.com/radical-cybertools/radical.pilot' | ||
description: GitHub repository | ||
- type: doi | ||
value: 10.1109/TPDS.2021.3105994 | ||
repository-code: 'https://github.com/radical-cybertools/radical.pilot' | ||
url: 'https://radicalpilot.readthedocs.io/' | ||
abstract: >- | ||
RADICAL-Pilot (RP) is a Pilot system written in Python and | ||
specialized in executing applications composed of many | ||
computational tasks on high performance computing (HPC) | ||
platforms. As a Pilot system, RP separates resource | ||
acquisition from using those resources to execute | ||
application tasks. Resources are acquired by submitting a | ||
job to the batch system of an HPC machine. Once the job is | ||
scheduled on the requested resources, RP can directly | ||
schedule and launch application tasks on those resources. | ||
Thus, tasks are not scheduled via the batch system of the | ||
HPC platform, but directly on the acquired resources. | ||
keywords: | ||
- High Performance Computing (HPC) | ||
- Pilot Job | ||
- Scientific Computing | ||
license: MIT | ||
references: | ||
- type: article | ||
scope: Cite this paper if you want to reference the general concepts of the software. | ||
authors: | ||
- family-names: Merzky | ||
given-names: Andre | ||
orcid: 'https://orcid.org/0000-0002-7228-4327' | ||
- family-names: Turilli | ||
given-names: Matteo | ||
orcid: 'https://orcid.org/0000-0003-0527-1435' | ||
- family-names: Titov | ||
given-names: Mikhail | ||
orcid: 'https://orcid.org/0000-0003-2357-7382' | ||
- family-names: Al-Saadi | ||
given-names: Aymen | ||
orcid: 'https://orcid.org/0000-0001-7491-4946' | ||
- family-names: Jha | ||
given-names: Shantenu | ||
orcid: 'https://orcid.org/0000-0002-5040-026X' | ||
title: "Design and Performance Characterization of RADICAL-Pilot on Leadership-Class Platforms" | ||
year: 2022 | ||
journal: IEEE Transactions on Parallel and Distributed Systems | ||
volume: 33 | ||
issue: 4 | ||
pages: 818-829 | ||
doi: 10.1109/TPDS.2021.3105994 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,39 +1,129 @@ | ||
# RADICAL-Pilot (RP) | ||
|
||
[![Build Status](https://github.com/radical-cybertools/radical.pilot/actions/workflows/ci.yml/badge.svg)](https://github.com/radical-cybertools/radical.pilot/actions/workflows/ci.yml) | ||
[![Documentation Status](https://readthedocs.org/projects/radicalpilot/badge/?version=stable)](http://radicalpilot.readthedocs.io/en/stable/?badge=stable) | ||
[![codecov](https://codecov.io/gh/radical-cybertools/radical.pilot/branch/devel/graph/badge.svg)](https://codecov.io/gh/radical-cybertools/radical.pilot) | ||
[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/8224/badge)](https://www.bestpractices.dev/projects/8224) | ||
|
||
RADICAL-Pilot (RP) is a Pilot system written in Python and specialized | ||
in executing applications composed of many computational tasks on high | ||
performance computing (HPC) platforms. As a Pilot system, RP separates resource | ||
acquisition from using those resources to execute application tasks. Resources | ||
are acquired by submitting a job to the batch system of an HPC machine. Once | ||
the job is scheduled on the requested resources, RP can directly schedule and | ||
launch application tasks on those resources. Thus, tasks are not scheduled via | ||
the batch system of the HPC platform, but directly on the acquired resources. | ||
RADICAL-Pilot (RP) executes heterogeneous tasks with maximum concurrency and at | ||
scale. RP can concurrently execute up to $10^5$ heterogeneous tasks, including | ||
single/multi core/GPU and MPI/OpenMP. Tasks can be stand-alone executables or | ||
Python functions and both types of task can be concurrently executed. | ||
|
||
RP is a [Pilot system](https://doi.org/10.1145/3177851), i.e., it separates | ||
resource acquisition from using those resources to execute application tasks. RP | ||
acquires resources by submitting a job to an HPC platform, and it can directly | ||
schedule and launch computational tasks on those resources. Thus, tasks are | ||
directly scheduled on the acquired resources, not via the batch system of the | ||
HPC platform. RP supports concurrently using single/multiple pilots on | ||
single/multiple | ||
[high performance computing (HPC) platforms](https://radicalpilot.readthedocs.io/en/stable/supported.html). | ||
|
||
RP is written in Python and exposes a simple yet powerful | ||
[API](https://radicalpilot.readthedocs.io/en/stable/apidoc.html). In 15 lines of | ||
code, you can execute an arbitrary number of executables with maximum | ||
concurrency on a | ||
[Linux container](https://hub.docker.com/u/radicalcybertools) | ||
or, by changing `resource`, on one of the | ||
[supported HPC platforms](https://radicalpilot.readthedocs.io/en/stable/supported.html). | ||
|
||
```python | ||
import radical.pilot as rp | ||
|
||
# Create a session | ||
session = rp.Session() | ||
|
||
# Create a pilot manager and a pilot | ||
pmgr = rp.PilotManager(session=session) | ||
pd_init = {'resource': 'local.localhost', | ||
'runtime' : 30, | ||
'cores' : 4} | ||
pdesc = rp.PilotDescription(pd_init) | ||
pilot = pmgr.submit_pilots(pdesc) | ||
|
||
# Create a task manager and describe your tasks | ||
tmgr = rp.TaskManager(session=session) | ||
tmgr.add_pilots(pilot) | ||
tds = list() | ||
for i in range(8): | ||
td = rp.TaskDescription() | ||
td.executable = 'sleep' | ||
td.arguments = ['10'] | ||
tds.append(td) | ||
|
||
# Submit your tasks for execution | ||
tmgr.submit_tasks(tds) | ||
tmgr.wait_tasks() | ||
|
||
# Close your session | ||
session.close(cleanup=True) | ||
``` | ||
|
||
## Quick Start | ||
|
||
Run RP's [quick start tutorial](https://mybinder.org/v2/gh/radical-cybertools/radical.pilot/HEAD?labpath=docs%2Fsource%2Fgetting_started.ipynb) directly on Binder. No installation needed. | ||
|
||
After going through the tutorial, install RP and start to code your application: | ||
|
||
```shell | ||
python -m venv ~/.ve/radical-pilot | ||
. ~/.ve/radical-pilot/bin/activate | ||
pip install radical.pilot | ||
``` | ||
|
||
Note that other than `venv`, you can also use | ||
[`virtualenv`](https://radicalpilot.readthedocs.io/en/stable/getting_started.html#Virtualenv), | ||
[`conda`](https://radicalpilot.readthedocs.io/en/stable/getting_started.html#Conda) | ||
or | ||
[`spack`](https://radicalpilot.readthedocs.io/en/stable/getting_started.html#Spack). | ||
|
||
For some inspiration, see our RP application | ||
[examples](https://github.com/radical-cybertools/radical.pilot/tree/devel/examples), | ||
starting from | ||
[00_getting_started.py](https://github.com/radical-cybertools/radical.pilot/blob/devel/examples/00_getting_started.py). | ||
|
||
## Documentation | ||
|
||
Full system description and usage examples are available at: | ||
https://radicalpilot.readthedocs.io/en/stable/ | ||
[RP user documentation](https://radicalpilot.readthedocs.io/en/stable/) uses Sphinx, and it is published on Read the Docs. | ||
|
||
[RP tutorials](https://mybinder.org/v2/gh/radical-cybertools/radical.pilot/HEAD) can be run via Binder. | ||
|
||
## Developers | ||
|
||
RP development uses Git and | ||
[GitHub](https://github.com/radical-cybertools/radical.pilot). RP **requires** | ||
Python3, a virtual environment and a GNU/Linux OS. Clone, install and | ||
test RP: | ||
|
||
Additional information is provided in the | ||
[wiki](https://github.com/radical-cybertools/radical.pilot/wiki) section of RP | ||
GitHub repository. | ||
```shell | ||
python -m venv ~/.ve/rp-docs | ||
. ~/.ve/rp-docs/bin/activate | ||
git clone git@github.com:radical-cybertools/radical.pilot.git | ||
cd radical.pilot | ||
pip install -r requirements-docs.txt | ||
sphinx-build -M html docs/source/ docs/build/ | ||
``` | ||
|
||
## Code | ||
RP documentation uses tutorials coded as Jupyter notebooks. `Sphinx` and | ||
`nbsphinx` run RP locally to execute those tutorials. Successful compilation of | ||
the documentation also serves as a validation of your local development | ||
environment. | ||
|
||
Generally, the `master` branch reflects the RP release published on | ||
[PyPI](https://pypi.org/project/radical.pilot/), and is considered stable: | ||
it should work 'out of the box' for the supported backends. For a list of | ||
supported backends, please refer to the documentation. | ||
## Provide Feedback | ||
|
||
The `devel` branch (and any branch other than master) may not correspond to the | ||
published documentation and, specifically, may have dependencies which need to | ||
be resolved manually. | ||
Have a question or a feature request, or found a bug? Feel free to open a | ||
[support ticket](https://github.com/radical-cybertools/radical.pilot/issues). | ||
For vulnerabilities, please draft a private | ||
[security advisory](https://github.com/radical-cybertools/radical.pilot/security/advisories). | ||
|
||
## Integration Tests status | ||
These badges show the state of the current integration tests on different HPCs RADICAL Pilot supports | ||
## Contributing | ||
|
||
[![ORNL Summit Integration Tests](https://github.com/radical-cybertools/radical.pilot/actions/workflows/summit.yml/badge.svg)](https://github.com/radical-cybertools/radical.pilot/actions/workflows/summit.yml) | ||
[![PSC Bridges2 Integration Tests](https://github.com/radical-cybertools/radical.pilot/actions/workflows/bridges.yml/badge.svg)](https://github.com/radical-cybertools/radical.pilot/actions/workflows/bridges.yml) | ||
We welcome everyone who wants to contribute to RP development. We are an open | ||
and welcoming community, committed to making participation a harassment-free | ||
experience for everyone. See our | ||
[Code of Conduct](https://radicalpilot.readthedocs.io/en/stable/process/code_of_conduct.html), | ||
relevant | ||
[technical documentation](https://radicalpilot.readthedocs.io/en/stable/process/contributing.html) | ||
and feel free to | ||
[get in touch](https://github.com/radical-cybertools/radical.pilot/issues). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
.. _branching_model: | ||
|
||
Branching Model | ||
=============== | ||
|
||
RADICAL-Pilot (RP) uses `git-flow | ||
<http://nvie.com/posts/a-successful-git-branching-model/>`__ as its branching model, | ||
with some simplifications. We follow `semantic version numbering | ||
<http://semver.org/>`__. | ||
|
||
- Release candidates and releases are tagged in the ``master`` branch (we do | ||
not use dedicated release branches at this point). | ||
|
||
- A release is prepared by: | ||
|
||
- Tagging a release candidate on ``devel`` (e.g. ``v1.23RC4``); | ||
- testing that RC; | ||
- fixing problems in ``devel``, toward a new release candidate; | ||
- once the RC is found stable, merging ``devel`` into ``master``, tagging the | ||
release on ``master`` (e.g. ``v1.23``) and shipping it to PyPI. | ||
|
||
- Urgent hotfix releases: | ||
|
||
- Branch from master to ``hotfix/problem_name``; | ||
- fix the problem in that branch; | ||
- test that branch; | ||
- merge back to master and prepare release candidate for hotfix release. | ||
|
||
- Normal bug fixes: | ||
|
||
- Branch off ``devel``, naming convention: ``fix/issue_1234`` (reference | ||
GitHub issue); | ||
- fix in that branch, and test; | ||
- create pull request toward ``devel``; | ||
- code review, then merge. | ||
|
||
- Major development activities go into feature branches: | ||
|
||
- Branch ``devel`` into ``feature/feature_name``; | ||
- work on that feature branch; | ||
- on completion, merge ``devel`` into the feature branch (do this | ||
frequently if possible, to avoid large divergence (== pain) between the | ||
branches); | ||
- test the feature branch; | ||
- create a pull request for merging the feature branch into ``devel`` (that | ||
should be a fast-forward now); | ||
- merging of feature branches into ``devel`` should be discussed with the | ||
group *before* they are performed, and only after code review. | ||
|
||
- Documentation changes are handled like fix or feature branches, depending on | ||
size and impact, similar to code changes. | ||
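
The release steps above distinguish release-candidate tags (e.g. ``v1.23RC4``) from release tags (e.g. ``v1.23``). The following is a minimal sketch of how a script could tell the two apart. The tag shape ``v<major>.<minor>[.<patch>][RC<n>]`` is an assumption inferred only from those two examples, and the names ``ReleaseTag``, ``parse_release_tag`` and ``is_release_candidate`` are hypothetical helpers, not part of RP.

```python
import re
from typing import NamedTuple, Optional


class ReleaseTag(NamedTuple):
    """Parsed form of a tag such as 'v1.23' or 'v1.23RC4' (hypothetical helper)."""
    major: int
    minor: int
    patch: int
    rc: Optional[int]  # None for a final release


# Assumed tag shape, inferred from the examples 'v1.23RC4' and 'v1.23':
#   v<major>.<minor>[.<patch>][RC<n>]
_TAG_RE = re.compile(r"^v(\d+)\.(\d+)(?:\.(\d+))?(?:RC(\d+))?$")


def parse_release_tag(tag: str) -> ReleaseTag:
    """Split a tag into its numeric parts, or raise ValueError if it does not match."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a recognized release tag: {tag!r}")
    major, minor, patch, rc = match.groups()
    return ReleaseTag(
        int(major),
        int(minor),
        int(patch) if patch is not None else 0,  # a missing patch level is treated as 0
        int(rc) if rc is not None else None,
    )


def is_release_candidate(tag: str) -> bool:
    """True for tags that carry an RC suffix."""
    return parse_release_tag(tag).rc is not None
```

Such a check could, for instance, guard a publishing script so that only tags without an ``RC`` suffix are shipped to PyPI.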
|
||
Branch Naming | ||
------------- | ||
|
||
- ``devel``, ``master``: *never* commit directly to those; | ||
- ``feature/abc``: development of new features; | ||
- ``fix/abc_123``: referring to ticket 123; | ||
- ``hotfix/abc_123``: referring to ticket 123, to be released right after merge | ||
to master; | ||
- ``experiment/sc16``: experiments toward a specific publication etc. Cannot be | ||
merged, they will be converted to tags after experiments conclude; | ||
- ``project/xyz``: branch for a dedicated group of people, usually contains | ||
unreleased features/fixes, and is not expected to be merged back; | ||
- ``tmp/abc``: temporary branch, will be deleted soon; | ||
- ``test/abc``: test some integration, like a merge of two feature branches. | ||
|
||
For the latter: assume you want to test how ``feature/a`` works in combination | ||
with ``feature/b``, then: | ||
|
||
- ``git checkout feature/a``; | ||
- ``git checkout -b test/a_b``; | ||
- ``git merge feature/b``; | ||
- do tests. | ||
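
The naming conventions listed in this section can be checked mechanically. The sketch below is illustrative only: ``BRANCH_KINDS``, ``PROTECTED`` and ``classify_branch`` are hypothetical names, and the rule that a prefix must be followed by a non-empty name is an assumption rather than a stated policy.

```python
from typing import Optional

# Branch prefixes and their purpose, taken from the naming list above.
BRANCH_KINDS = {
    "feature": "development of new features",
    "fix": "bug fix referring to a ticket",
    "hotfix": "urgent fix, released right after merge to master",
    "experiment": "experiments toward a publication, converted to tags later",
    "project": "branch for a dedicated group, not expected to be merged back",
    "tmp": "temporary branch, deleted soon",
    "test": "integration test, e.g. a merge of two feature branches",
}

# Branches that must never receive direct commits.
PROTECTED = {"devel", "master"}


def classify_branch(name: str) -> Optional[str]:
    """Return 'protected', a known prefix, or None if the name fits no convention."""
    if name in PROTECTED:
        return "protected"
    prefix, sep, rest = name.partition("/")
    # Assumption: a valid name is '<prefix>/<something>' with a non-empty suffix.
    if sep and rest and prefix in BRANCH_KINDS:
        return prefix
    return None
```

For example, ``classify_branch("test/a_b")`` yields ``"test"`` for the branch created in the steps above, while a name such as ``"my-branch"`` yields ``None`` and could be rejected by a pre-push hook.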
|
||
Branching Policies | ||
------------------ | ||
|
||
All branches are ideally short living. To support this, only a limited number of | ||
branches should be open at any point in time. For example, only ``N`` branches for | ||
fixes and ``M << N`` branches for features should be open for each developer - | ||
other features / issues have to wait. | ||
|
||
Some additional rules | ||
--------------------- | ||
|
||
- Commits, in particular for bug fixes, should be self-contained to make it | ||
easy to use ``git cherry-pick``, so that bug fixes can quickly be transferred | ||
to other branches (such as hotfixes). | ||
- Do not use ``git rebase``, unless you *really* know what you are doing. | ||
- You may not want to use the tools available for ``git-flow`` -- those have | ||
given us inconsistent results in the past, partially because they used | ||
rebase. |