
Commit

update to latest master
benclifford committed Aug 14, 2024
2 parents 0b00a7c + 2067b40 commit 3d1334b
Showing 54 changed files with 573 additions and 788 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/parsl+flux.yaml
@@ -31,12 +31,12 @@ jobs:
run: |
pytest parsl/tests/ -k "not cleannet and not unix_filesystem_permissions_required" --config parsl/tests/configs/local_threads.py --random-order --durations 10
- name: Start Flux and Test Parsl with Flux
- name: Test Parsl with Flux
run: |
flux start pytest parsl/tests/test_flux.py --config local --random-order
pytest parsl/tests/test_flux.py --config local --random-order
- name: Test Parsl with Flux Config
run: |
flux start pytest parsl/tests/ -k "not cleannet and not unix_filesystem_permissions_required" --config parsl/tests/configs/flux_local.py --random-order --durations 10
pytest parsl/tests/ -k "not cleannet and not unix_filesystem_permissions_required" --config parsl/tests/configs/flux_local.py --random-order --durations 10
2 changes: 1 addition & 1 deletion README.rst
@@ -109,7 +109,7 @@ For Developers

3. Install::

$ cd parsl
$ cd parsl # only if you didn't enter the top-level directory in step 2 above
$ python3 setup.py install

4. Use Parsl!
8 changes: 4 additions & 4 deletions docs/historical/changelog.rst
@@ -334,7 +334,7 @@ New Functionality
* New launcher: `parsl.launchers.WrappedLauncher` for launching tasks inside containers.

* `parsl.channels.SSHChannel` now supports a ``key_filename`` kwarg `issue#1639 <https://github.com/Parsl/parsl/issues/1639>`_
* ``parsl.channels.SSHChannel`` now supports a ``key_filename`` kwarg `issue#1639 <https://github.com/Parsl/parsl/issues/1639>`_

* Newly added Makefile wraps several frequent developer operations such as:

@@ -442,7 +442,7 @@ New Functionality
module, parsl.data_provider.globus

* `parsl.executors.WorkQueueExecutor`: a new executor that integrates functionality from `Work Queue <http://ccl.cse.nd.edu/software/workqueue/>`_ is now available.
* New provider to support for Ad-Hoc clusters `parsl.providers.AdHocProvider`
* New provider to support for Ad-Hoc clusters ``parsl.providers.AdHocProvider``
* New provider added to support LSF on Summit `parsl.providers.LSFProvider`
* Support for CPU and Memory resource hints to providers `(github) <https://github.com/Parsl/parsl/issues/942>`_.
* The ``logging_level=logging.INFO`` in `parsl.monitoring.MonitoringHub` is replaced with ``monitoring_debug=False``:
@@ -468,7 +468,7 @@ New Functionality
* Several test-suite improvements that have dramatically reduced test duration.
* Several improvements to the Monitoring interface.
* Configurable port on `parsl.channels.SSHChannel`.
* Configurable port on ``parsl.channels.SSHChannel``.
* ``suppress_failure`` now defaults to True.
* `parsl.executors.HighThroughputExecutor` is the recommended executor, and ``IPyParallelExecutor`` is deprecated.
* `parsl.executors.HighThroughputExecutor` will expose worker information via environment variables: ``PARSL_WORKER_RANK`` and ``PARSL_WORKER_COUNT``
@@ -532,7 +532,7 @@ New Functionality
* Cleaner user app file log management.
* Updated configurations using `parsl.executors.HighThroughputExecutor` in the configuration section of the userguide.
* Support for OAuth based SSH with `parsl.channels.OAuthSSHChannel`.
* Support for OAuth based SSH with ``parsl.channels.OAuthSSHChannel``.

Bug Fixes
^^^^^^^^^
13 changes: 3 additions & 10 deletions docs/reference.rst
@@ -38,15 +38,9 @@ Configuration
Channels
========

.. autosummary::
:toctree: stubs
:nosignatures:

parsl.channels.base.Channel
parsl.channels.LocalChannel
parsl.channels.SSHChannel
parsl.channels.OAuthSSHChannel
parsl.channels.SSHInteractiveLoginChannel
Channels are deprecated in Parsl. See
`issue 3515 <https://github.com/Parsl/parsl/issues/3515>`_
for further discussion.

Data management
===============
@@ -109,7 +103,6 @@ Providers
:toctree: stubs
:nosignatures:

parsl.providers.AdHocProvider
parsl.providers.AWSProvider
parsl.providers.CobaltProvider
parsl.providers.CondorProvider
16 changes: 9 additions & 7 deletions docs/userguide/checkpoints.rst
@@ -49,15 +49,17 @@ during development. Using app caching will ensure that only modified apps are re
App equivalence
^^^^^^^^^^^^^^^

Parsl determines app equivalence by storing the hash
of the app function. Thus, any changes to the app code (e.g.,
its signature, its body, or even the docstring within the body)
will invalidate cached values.
Parsl determines app equivalence using the name of the app function:
if two apps have the same name, then they are equivalent under this
relation.

However, Parsl does not traverse the call graph of the app function,
so changes inside functions called by an app will not invalidate
Changes inside the app, or in functions called by an app, will not invalidate
cached values.

There are lots of other ways functions might be compared for equivalence,
and `parsl.dataflow.memoization.id_for_memo` provides a hook to plug in
alternate application-specific implementations.


Invocation equivalence
^^^^^^^^^^^^^^^^^^^^^^
Expand Down Expand Up @@ -92,7 +94,7 @@ Attempting to cache apps invoked with other, non-hashable, data types will
lead to an exception at invocation.

In that case, mechanisms to hash new types can be registered by a program by
implementing the ``parsl.dataflow.memoization.id_for_memo`` function for
implementing the `parsl.dataflow.memoization.id_for_memo` function for
the new type.
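
For illustration only, here is a minimal sketch of what such a registration might
look like for a hypothetical ``Molecule`` argument type. It assumes that
`parsl.dataflow.memoization.id_for_memo` is the single-dispatch hook described
above and that returning a stable byte string is sufficient; details may vary
between Parsl versions.

.. code-block:: python

    import pickle

    from parsl.dataflow.memoization import id_for_memo


    class Molecule:
        """Hypothetical user-defined argument type."""

        def __init__(self, atoms):
            self.atoms = atoms


    @id_for_memo.register(Molecule)
    def id_for_memo_molecule(molecule, output_ref=False):
        # Reduce the object to a stable byte string capturing everything
        # relevant to deciding whether two invocations are equivalent.
        return pickle.dumps(sorted(molecule.atoms))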

Ignoring arguments
81 changes: 45 additions & 36 deletions docs/userguide/configuring.rst
@@ -15,15 +15,14 @@ queues, durations, and data management options.
The following example shows a basic configuration object (:class:`~parsl.config.Config`) for the Frontera
supercomputer at TACC.
This config uses the `parsl.executors.HighThroughputExecutor` to submit
tasks from a login node (`parsl.channels.LocalChannel`). It requests an allocation of
tasks from a login node. It requests an allocation of
128 nodes, deploying 1 worker for each of the 56 cores per node, from the normal partition.
To limit network connections to just the internal network, the config specifies the address
used by the InfiniBand interface with ``address_by_interface('ib0')``.

.. code-block:: python
from parsl.config import Config
from parsl.channels import LocalChannel
from parsl.providers import SlurmProvider
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
Expand All @@ -36,7 +35,6 @@ used by the infiniband interface with ``address_by_interface('ib0')``
address=address_by_interface('ib0'),
max_workers_per_node=56,
provider=SlurmProvider(
channel=LocalChannel(),
nodes_per_block=128,
init_blocks=1,
partition='normal',
@@ -197,22 +195,6 @@ Stepping through the following question should help formulate a suitable configu
are on a **native Slurm** system like :ref:`configuring_nersc_cori`


4) Where will the main Parsl program run and how will it communicate with the apps?

+------------------------+--------------------------+---------------------------------------------------+
| Parsl program location | App execution target | Suitable channel |
+========================+==========================+===================================================+
| Laptop/Workstation | Laptop/Workstation | `parsl.channels.LocalChannel` |
+------------------------+--------------------------+---------------------------------------------------+
| Laptop/Workstation | Cloud Resources | No channel is needed |
+------------------------+--------------------------+---------------------------------------------------+
| Laptop/Workstation | Clusters with no 2FA | `parsl.channels.SSHChannel` |
+------------------------+--------------------------+---------------------------------------------------+
| Laptop/Workstation | Clusters with 2FA | `parsl.channels.SSHInteractiveLoginChannel` |
+------------------------+--------------------------+---------------------------------------------------+
| Login node | Cluster/Supercomputer | `parsl.channels.LocalChannel` |
+------------------------+--------------------------+---------------------------------------------------+

Heterogeneous Resources
-----------------------

@@ -324,9 +306,13 @@ and Work Queue does not require Python to run.
Accelerators
------------

Many modern clusters provide multiple accelerators per compute note, yet many applications are best suited to using a single accelerator per task.
Parsl supports pinning each worker to difference accelerators using ``available_accelerators`` option of the :class:`~parsl.executors.HighThroughputExecutor`.
Provide either the number of executors (Parsl will assume they are named in integers starting from zero) or a list of the names of the accelerators available on the node.
Many modern clusters provide multiple accelerators per compute node, yet many applications are best suited to using a
single accelerator per task. Parsl supports pinning each worker to a different accelerator using the
``available_accelerators`` option of the :class:`~parsl.executors.HighThroughputExecutor`. Provide either the number of
accelerators (Parsl will assume they are numbered starting from zero) or a list of the names of the accelerators
available on the node. Parsl will limit the number of workers it launches to the number of accelerators specified;
in other words, you cannot have more workers per node than there are accelerators. By default, Parsl will launch
as many workers as there are accelerators specified via ``available_accelerators``.

.. code-block:: python
@@ -337,7 +323,6 @@ Provide either the number of executors (Parsl will assume they are named in inte
worker_debug=True,
available_accelerators=2,
provider=LocalProvider(
channel=LocalChannel(),
init_blocks=1,
max_blocks=1,
),
Expand All @@ -346,7 +331,38 @@ Provide either the number of executors (Parsl will assume they are named in inte
strategy='none',
)
For hardware that uses Nvidia devices, Parsl allows for the oversubscription of workers to GPUS. This is intended to make use of Nvidia's `Multi-Process Service (MPS) <https://docs.nvidia.com/deploy/mps/>`_ available on many of their GPUs that allows users to run multiple concurrent processes on a single GPU. The user needs to set in the ``worker_init`` commands to start MPS on every node in the block (this is machine dependent). The ``available_accelerators`` option should then be set to the total number of GPU partitions run on a single node in the block. For example, for a node with 4 Nvidia GPUs, to create 8 workers per GPU, set ``available_accelerators=32``. GPUs will be assigned to workers in ascending order in contiguous blocks. In the example, workers 0-7 will be placed on GPU 0, workers 8-15 on GPU 1, workers 16-23 on GPU 2, and workers 24-31 on GPU 3.
It is possible to bind multiple or specific accelerators to each worker by specifying a list of comma-separated strings,
each naming a group of accelerators. In the context of binding to NVIDIA GPUs, this works by setting ``CUDA_VISIBLE_DEVICES``
on each worker to a specific string in the list supplied to ``available_accelerators``.

Here's an example:

.. code-block:: python
# The following config is trimmed for clarity
local_config = Config(
executors=[
HighThroughputExecutor(
# Starts 2 workers per node, each bound to 2 GPUs
available_accelerators=["0,1", "2,3"],
# Start a single worker bound to all 4 GPUs
# available_accelerators=["0,1,2,3"]
)
],
)
GPU Oversubscription
""""""""""""""""""""

For hardware that uses Nvidia devices, Parsl allows for the oversubscription of workers to GPUs. This is intended to
make use of Nvidia's `Multi-Process Service (MPS) <https://docs.nvidia.com/deploy/mps/>`_, available on many of their
GPUs, which allows users to run multiple concurrent processes on a single GPU. The user needs to set the
``worker_init`` commands to start MPS on every node in the block (this is machine dependent). The
``available_accelerators`` option should then be set to the total number of GPU partitions run on a single node in the
block. For example, for a node with 4 Nvidia GPUs, to create 8 workers per GPU, set ``available_accelerators=32``.
GPUs will be assigned to workers in ascending order in contiguous blocks. In the example, workers 0-7 will be placed
on GPU 0, workers 8-15 on GPU 1, workers 16-23 on GPU 2, and workers 24-31 on GPU 3.
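
For illustration, a trimmed configuration for that 4-GPU, 32-worker example might
look like the sketch below. How MPS is started is machine dependent; the
``nvidia-cuda-mps-control -d`` command shown in ``worker_init`` is only one common
possibility and should be adapted to your system.

.. code-block:: python

    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor
    from parsl.providers import LocalProvider

    config = Config(
        executors=[
            HighThroughputExecutor(
                # 32 workers per node: 8 workers share each of the 4 GPUs via MPS
                available_accelerators=32,
                provider=LocalProvider(
                    # Machine dependent: start the MPS control daemon on each node
                    worker_init="nvidia-cuda-mps-control -d",
                    init_blocks=1,
                    max_blocks=1,
                ),
            )
        ],
    )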

Multi-Threaded Applications
---------------------------
@@ -372,7 +388,6 @@ Select the best blocking strategy for processor's cache hierarchy (choose ``alte
worker_debug=True,
cpu_affinity='alternating',
provider=LocalProvider(
channel=LocalChannel(),
init_blocks=1,
max_blocks=1,
),
@@ -412,18 +427,12 @@ These include ``OMP_NUM_THREADS``, ``GOMP_COMP_AFFINITY``, and ``KMP_THREAD_AFFI
Ad-Hoc Clusters
---------------

Any collection of compute nodes without a scheduler can be considered an
ad-hoc cluster. Often these machines have a shared file system such as NFS or Lustre.
In order to use these resources with Parsl, they need to set-up for password-less SSH access.

To use these ssh-accessible collection of nodes as an ad-hoc cluster, we use
the `parsl.providers.AdHocProvider` with an `parsl.channels.SSHChannel` to each node. An example
configuration follows.
Parsl's support of ad-hoc clusters of compute nodes without a scheduler
is deprecated.

.. literalinclude:: ../../parsl/configs/ad_hoc.py

.. note::
Multiple blocks should not be assigned to each node when using the `parsl.executors.HighThroughputExecutor`
See
`issue #3515 <https://github.com/Parsl/parsl/issues/3515>`_
for further discussion.

Amazon Web Services
-------------------
5 changes: 1 addition & 4 deletions docs/userguide/examples/config.py
@@ -1,4 +1,3 @@
from parsl.channels import LocalChannel
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import LocalProvider
@@ -8,9 +7,7 @@
HighThroughputExecutor(
label="htex_local",
cores_per_worker=1,
provider=LocalProvider(
channel=LocalChannel(),
),
provider=LocalProvider(),
)
],
)
3 changes: 1 addition & 2 deletions docs/userguide/execution.rst
@@ -47,8 +47,7 @@ Parsl currently supports the following providers:
7. `parsl.providers.AWSProvider`: This provider allows you to provision and manage cloud nodes from Amazon Web Services.
8. `parsl.providers.GoogleCloudProvider`: This provider allows you to provision and manage cloud nodes from Google Cloud.
9. `parsl.providers.KubernetesProvider`: This provider allows you to provision and manage containers on a Kubernetes cluster.
10. `parsl.providers.AdHocProvider`: This provider allows you manage execution over a collection of nodes to form an ad-hoc cluster.
11. `parsl.providers.LSFProvider`: This provider allows you to schedule resources via IBM's LSF scheduler.
10. `parsl.providers.LSFProvider`: This provider allows you to schedule resources via IBM's LSF scheduler.



7 changes: 7 additions & 0 deletions docs/userguide/mpi_apps.rst
@@ -60,6 +60,13 @@ An example for ALCF's Polaris supercomputer that will run 3 MPI tasks of 2 nodes
)
.. warning::
Please note that ``Provider`` options that specify per-task or per-node resources, for example
``SlurmProvider(cores_per_node=N, ...)``, should not be used with :class:`~parsl.executors.high_throughput.MPIExecutor`.
Parsl primarily uses a pilot job model, and assumptions from that context do not translate to the MPI context. For
more information, refer to
`GitHub issue #3006 <https://github.com/Parsl/parsl/issues/3006>`_.

Writing an MPI App
------------------

11 changes: 5 additions & 6 deletions docs/userguide/plugins.rst
@@ -16,8 +16,8 @@ executor to run code on the local submitting host, while another executor can
run the same code on a large supercomputer.


Providers, Launchers and Channels
---------------------------------
Providers and Launchers
-----------------------
Some executors are based on blocks of workers (for example the
`parsl.executors.HighThroughputExecutor`: the submit side requires a
batch system (eg slurm, kubernetes) to start worker processes, which then
@@ -34,10 +34,9 @@ add on any wrappers that are needed to launch the command (eg srun inside
slurm). Providers and launchers are usually paired together for a particular
system type.
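
As a sketch of that pairing (the parameter values below are placeholders, not
recommendations), a Slurm-based block might combine `parsl.providers.SlurmProvider`
with `parsl.launchers.SrunLauncher`:

.. code-block:: python

    from parsl.launchers import SrunLauncher
    from parsl.providers import SlurmProvider

    # The provider asks Slurm for a block of nodes; the launcher wraps the
    # worker command in srun so that it starts on every node of the block.
    provider = SlurmProvider(
        partition='normal',      # placeholder partition name
        nodes_per_block=2,
        init_blocks=1,
        launcher=SrunLauncher(),
    )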

A `Channel` allows the commands used to interact with an `ExecutionProvider` to be
executed on a remote system. The default channel executes commands on the
local system, but a few variants of an `parsl.channels.SSHChannel` are provided.

Parsl also has a deprecated ``Channel`` abstraction. See
`issue 3515 <https://github.com/Parsl/parsl/issues/3515>`_
for further discussion.

File staging
------------
Expand Down
8 changes: 1 addition & 7 deletions parsl/channels/__init__.py
@@ -3,16 +3,10 @@
if TYPE_CHECKING:
from parsl.channels.base import Channel
from parsl.channels.local.local import LocalChannel
from parsl.channels.oauth_ssh.oauth_ssh import OAuthSSHChannel
from parsl.channels.ssh.ssh import SSHChannel
from parsl.channels.ssh_il.ssh_il import SSHInteractiveLoginChannel

lazys = {
'Channel': 'parsl.channels.base',
'SSHChannel': 'parsl.channels.ssh.ssh',
'LocalChannel': 'parsl.channels.local.local',
'SSHInteractiveLoginChannel': 'parsl.channels.ssh_il.ssh_il',
'OAuthSSHChannel': 'parsl.channels.oauth_ssh.oauth_ssh',
}

import parsl.channels as px
@@ -33,4 +27,4 @@ def lazy_loader(name):

px.__getattr__ = lazy_loader # type: ignore[method-assign]

__all__ = ['Channel', 'SSHChannel', 'LocalChannel', 'SSHInteractiveLoginChannel', 'OAuthSSHChannel']
__all__ = ['Channel', 'LocalChannel']
4 changes: 2 additions & 2 deletions parsl/channels/oauth_ssh/oauth_ssh.py
@@ -3,7 +3,7 @@

import paramiko

from parsl.channels.ssh.ssh import SSHChannel
from parsl.channels.ssh.ssh import DeprecatedSSHChannel
from parsl.errors import OptionalModuleMissing

try:
@@ -17,7 +17,7 @@
logger = logging.getLogger(__name__)


class OAuthSSHChannel(SSHChannel):
class DeprecatedOAuthSSHChannel(DeprecatedSSHChannel):
"""SSH persistent channel. This enables remote execution on sites
accessible via ssh. This channel uses Globus based OAuth tokens for authentication.
"""
2 changes: 1 addition & 1 deletion parsl/channels/ssh/ssh.py
@@ -24,7 +24,7 @@ def _auth(self, username, *args):
return


class SSHChannel(Channel, RepresentationMixin):
class DeprecatedSSHChannel(Channel, RepresentationMixin):
''' SSH persistent channel. This enables remote execution on sites
accessible via ssh. It is assumed that the user has setup host keys
so as to ssh to the remote host. Which goes to say that the following
4 changes: 2 additions & 2 deletions parsl/channels/ssh_il/ssh_il.py
@@ -3,12 +3,12 @@

import paramiko

from parsl.channels.ssh.ssh import SSHChannel
from parsl.channels.ssh.ssh import DeprecatedSSHChannel

logger = logging.getLogger(__name__)


class SSHInteractiveLoginChannel(SSHChannel):
class DeprecatedSSHInteractiveLoginChannel(DeprecatedSSHChannel):
"""SSH persistent channel. This enables remote execution on sites
accessible via ssh. This channel supports interactive login and is appropriate when
keys are not set up.