Update checkpoints.rst

Parsl · Aug 9, 2024 · fc19a01
1 parent 573ba8a
Showing 1 changed file with 50 additions and 56 deletions.
diff --git a/docs/userguide/checkpoints.rst b/docs/userguide/checkpoints.rst
@@ -3,32 +3,31 @@
 Memoization and checkpointing
 -----------------------------
 
-When an app is invoked several times with the same parameters, Parsl can
-reuse the result from the first invocation without executing the app again.
+When an :ref:`App <app>` is invoked several times with the same parameters, Parsl can
+reuse the result from the first invocation without executing the :ref:`App <app>` again.
 
 This can save time and computational resources.
 
 This is done in two ways:
 
 * Firstly, *app caching* will allow reuse of results within the same run.
 
-* Building on top of that, *checkpointing* will store results on the filesystem
+* Building on top of that, :ref:`Checkpointing <checkpointing>` will store results on the filesystem
   and reuse those results in later runs.
 
 .. _label-appcaching:
 
 App caching
 ===========
 
-
 There are many situations in which a program may be re-executed
 over time. Often, large fragments of the program will not have changed 
-and therefore, re-execution of apps will waste valuable time and 
-computation resources. Parsl's app caching solves this problem by 
-storing results from apps that have successfully completed
+and therefore, re-execution of :ref:`Apps <app>` will waste valuable time and 
+computation resources. Parsl's :ref:`App caching <memoization>` solves this problem by 
+storing results from :ref:`Apps <app>` that have successfully completed
 so that they can be re-used. 
 
-App caching is enabled by setting the ``cache``
+:ref:`App caching <memoization>` is enabled by setting the ``cache``
 argument in the :func:`~parsl.app.app.python_app` or :func:`~parsl.app.app.bash_app` 
 decorator to ``True`` (by default it is ``False``). 
 
@@ -38,33 +37,31 @@ decorator to ``True`` (by default it is ``False``).
    def hello (msg, stdout=None):
        return 'echo {}'.format(msg)
 			
-App caching can be globally disabled by setting ``app_cache=False``
+:ref:`App caching <memoization>` can be globally disabled by setting ``app_cache=False``
 in the :class:`~parsl.config.Config`.
 
-App caching can be particularly useful when developing interactive programs such as when
-using a Jupyter notebook. In this case, cells containing apps are often re-executed
-during development. Using app caching will ensure that only modified apps are re-executed.
-
+:ref:`App caching <memoization>` can be particularly useful when developing interactive programs such as when
+using a Jupyter notebook. In this case, cells containing :ref:`Apps <app>` are often re-executed
+during development. Using :ref:`App caching <memoization>` will ensure that only modified :ref:`Apps <app>` are re-executed.
 
 App equivalence 
 ^^^^^^^^^^^^^^^
 
-Parsl determines app equivalence using the name of the app function:
-if two apps have the same name, then they are equivalent under this
+Parsl determines :ref:`App <app>` equivalence using the name of the :ref:`App <app>` function:
+if two :ref:`Apps <app>` have the same name, then they are equivalent under this
 relation.
 
-Changes inside the app, or by functions called by an app will not invalidate
+Changes inside the :ref:`App <app>`, or in functions called by an :ref:`App <app>`, will not invalidate
 cached values.
 
 There are lots of other ways functions might be compared for equivalence,
 and `parsl.dataflow.memoization.id_for_memo` provides a hook to plug in
 alternate application-specific implementations.
 
-
 Invocation equivalence 
 ^^^^^^^^^^^^^^^^^^^^^^
 
-Two app invocations are determined to be equivalent if their
+Two :ref:`App <app>` invocations are determined to be equivalent if their
 input arguments are identical.
 
 In simple cases, this follows obvious rules:
@@ -79,9 +76,8 @@ In simple cases, this follows obvious rules:
   y = 7
   f(y).result()
 
-
 Internally, equivalence is determined by hashing the input arguments, and
-comparing the hash to hashes from previous app executions.
+comparing the hash to hashes from previous :ref:`App <app>` executions.
 
 This approach can only be applied to data types for which a deterministic hash
 can be computed.
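
The hashing approach described in this hunk can be illustrated with a minimal, stdlib-only sketch. This is not Parsl's implementation: the names ``make_key``, ``cached_call`` and ``_memo`` are hypothetical, and hashing a pickle of the arguments is an assumption chosen for brevity that is only dependable for simple values such as ints and strings.

```python
import hashlib
import pickle

# Hypothetical in-memory memo table: key -> cached result.
_memo = {}


def make_key(func, args, kwargs):
    """Build a key from the function name and its input arguments."""
    # Sort keyword arguments so that f(a=1, b=2) and f(b=2, a=1) match.
    payload = (func.__name__, args, sorted(kwargs.items()))
    return hashlib.sha256(pickle.dumps(payload)).hexdigest()


def cached_call(func, *args, **kwargs):
    """Run func only if no earlier call produced the same key."""
    key = make_key(func, args, kwargs)
    if key not in _memo:
        _memo[key] = func(*args, **kwargs)
    return _memo[key]


calls = []  # records every real execution of f


def f(x):
    calls.append(x)
    return x * 2


first = cached_call(f, 7)   # executes f
y = 7
second = cached_call(f, y)  # same name and argument value: reuses the result
```

Because the key depends only on the function name and the argument values, the second call is served from the table even though it was written with a variable rather than a literal.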
@@ -90,7 +86,7 @@ By default Parsl can compute sensible hashes for basic data types:
 str, int, float, None, as well as some more complex types:
 functions, and dictionaries and lists containing hashable types.
 
-Attempting to cache apps invoked with other, non-hashable, data types will 
+Attempting to cache :ref:`Apps <app>` invoked with other, non-hashable, data types will 
 lead to an exception at invocation.
 
 In that case, mechanisms to hash new types can be registered by a program by
@@ -101,34 +97,33 @@ Ignoring arguments
 ^^^^^^^^^^^^^^^^^^
 
 On occasion one may wish to ignore particular arguments when determining
-app invocation equivalence - for example, when generating log file
+:ref:`App <app>` invocation equivalence - for example, when generating log file
 names automatically based on time or run information. 
 Parsl allows developers to list the arguments to be ignored
-in the ``ignore_for_cache`` app decorator parameter:
+in the ``ignore_for_cache`` :ref:`App <app>` decorator parameter:
 
 .. code-block:: python
 
    @bash_app(cache=True, ignore_for_cache=['stdout'])
    def hello (msg, stdout=None):
        return 'echo {}'.format(msg)
 
-
 Caveats
 ^^^^^^^
 
-It is important to consider several important issues when using app caching:
+Consider the following issues when using :ref:`App caching <memoization>`:
 
-- Determinism: App caching is generally useful only when the apps are deterministic.
-  If the outputs may be different for identical inputs, app caching will obscure
-  this non-deterministic behavior. For instance, caching an app that returns
+- Determinism: :ref:`App caching <memoization>` is generally useful only when the :ref:`Apps <app>` are deterministic.
+  If the outputs may be different for identical inputs, :ref:`App caching <memoization>` will obscure
+  this non-deterministic behavior. For instance, caching an :ref:`App <app>` that returns
   a random number will result in every invocation returning the same result.
 
-- Timing: If several identical calls to an app are made concurrently having
-  not yet cached a result, many instances of the app will be launched.
+- Timing: If several identical calls to an :ref:`App <app>` are made concurrently before
+  a result has been cached, many instances of the :ref:`App <app>` will be launched.
   Once one invocation completes and the result is cached
   all subsequent calls will return immediately with the cached result.
 
-- Performance: If app caching is enabled, there may be some performance
+- Performance: If :ref:`App caching <memoization>` is enabled, there may be some performance
   overhead especially if a large number of short duration tasks are launched rapidly.
   This overhead has not been quantified.
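
The determinism caveat can be reproduced without Parsl. The sketch below uses ``functools.lru_cache`` from the standard library as a stand-in for app caching; the function name ``cached_random`` and its ``label`` argument are hypothetical and exist only for illustration.

```python
import functools
import random


@functools.lru_cache(maxsize=None)
def cached_random(label):
    # Non-deterministic body: the result does not depend on the argument.
    return random.random()


a = cached_random("run")    # executes the function
b = cached_random("run")    # identical argument: the cached value is returned
c = cached_random("other")  # different argument: the function runs again
```

Here ``a`` and ``b`` are always equal, so the cache hides the fact that the underlying function is random.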
 
@@ -139,27 +134,27 @@ Checkpointing
 
 Large-scale Parsl programs are likely to encounter errors due to node failures, 
 application or environment errors, and myriad other issues. Parsl offers an
-application-level checkpointing model to improve resilience, fault tolerance, and
+application-level :ref:`Checkpointing <checkpointing>` model to improve resilience, fault tolerance, and
 efficiency.
 
 .. note::
-   Checkpointing builds on top of app caching, and so app caching must be
-   enabled. If app caching is disabled in the config ``Config.app_cache``, checkpointing will
+   :ref:`Checkpointing <checkpointing>` builds on top of :ref:`App caching <memoization>`, and so :ref:`App caching <memoization>` must be
+   enabled. If :ref:`App caching <memoization>` is disabled in the config (``app_cache=False``), :ref:`Checkpointing <checkpointing>` will
    not work.
 
-Parsl follows an incremental checkpointing model, where each checkpoint file contains
+Parsl follows an incremental :ref:`Checkpointing <checkpointing>` model, where each checkpoint file contains
 all results that have been updated since the last checkpoint.
 
 When a Parsl program loads a checkpoint file and is executed, it will use 
-checkpointed results for any apps that have been previously executed. 
-Like app caching, checkpoints
-use the hash of the app and the invocation input parameters to identify previously computed
-results. If multiple checkpoints exist for an app (with the same hash)
+checkpointed results for any :ref:`Apps <app>` that have been previously executed. 
+Like :ref:`App caching <memoization>`, checkpoints
+use the hash of the :ref:`App <app>` and the invocation input parameters to identify previously computed
+results. If multiple checkpoints exist for an :ref:`App <app>` (with the same hash)
 the most recent entry will be used.
 
-Parsl provides four checkpointing modes:
+Parsl provides four :ref:`Checkpointing <checkpointing>` modes:
 
-1. ``task_exit``: a checkpoint is created each time an app completes or fails
+1. ``task_exit``: a checkpoint is created each time an :ref:`App <app>` completes or fails
    (after retries if enabled). This mode minimizes the risk of losing information
    from completed tasks.
 
@@ -189,7 +184,7 @@ Parsl provides four checkpointing modes:
       from parsl.configs.local_threads import config
       config.checkpoint_mode = 'dfk_exit'
 
-4. ``manual``: in addition to these automated checkpointing modes, it is also possible
+4. ``manual``: in addition to these automated :ref:`Checkpointing <checkpointing>` modes, it is also possible
    to manually initiate a checkpoint by calling ``DataFlowKernel.checkpoint()`` in the
    Parsl program code.
 
@@ -204,16 +199,15 @@ Parsl provides four checkpointing modes:
 In all cases the checkpoint file is written out to the ``runinfo/RUN_ID/checkpoint/`` directory.
 
 .. Note:: Checkpoint modes ``periodic``, ``dfk_exit``, and ``manual`` can interfere with garbage collection.
-          In these modes task information will be retained after completion, until checkpointing events are triggered.
-
+          In these modes task information will be retained after completion, until :ref:`Checkpointing <checkpointing>` events are triggered.
 
 Creating a checkpoint
 ^^^^^^^^^^^^^^^^^^^^^
 
-Automated checkpointing must be explicitly enabled in the Parsl configuration.
-There is no need to modify a Parsl  program as checkpointing will occur transparently.
-In the following example, checkpointing is enabled at task exit. The results of
-each invocation of the ``slow_double`` app will be stored in the checkpoint file.
+Automated :ref:`Checkpointing <checkpointing>` must be explicitly enabled in the Parsl configuration.
+There is no need to modify a Parsl program as :ref:`Checkpointing <checkpointing>` will occur transparently.
+In the following example, :ref:`Checkpointing <checkpointing>` is enabled at task exit. The results of
+each invocation of the ``slow_double`` :ref:`App <app>` will be stored in the checkpoint file.
 
 .. code-block:: python
 
@@ -237,10 +231,10 @@ each invocation of the ``slow_double`` app will be stored in the checkpoint file
 
     print([d[i].result() for i in range(5)])
 
-Alternatively, manual checkpointing can be used to explictly specify when the checkpoint
-file should be saved. The following example shows how manual checkpointing can be used.
+Alternatively, manual :ref:`Checkpointing <checkpointing>` can be used to explicitly specify when the checkpoint
+file should be saved. The following example shows how manual :ref:`Checkpointing <checkpointing>` can be used.
 Here, the ``dfk.checkpoint()`` function will save the results of the prior invocations 
-of the ``slow_double`` app.
+of the ``slow_double`` :ref:`App <app>`.
 
 .. code-block:: python
 
@@ -271,14 +265,14 @@ of the ``slow_double`` app.
 Resuming from a checkpoint
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-When resuming a program from a checkpoint Parsl allows the user to select
+When resuming a program from a checkpoint, Parsl allows the user to select
 which checkpoint file(s) to use. 
 Checkpoint files are stored in the ``runinfo/RUN_ID/checkpoint`` directory.
 
 The example below shows how to resume using all available checkpoints. 
-Here, the program re-executes the same calls to the ``slow_double`` app
+Here, the program re-executes the same calls to the ``slow_double`` :ref:`App <app>`
 as above and instead of waiting for results to be computed, the values
-from the checkpoint file are are immediately returned.
+from the checkpoint file are immediately returned.
 
 .. code-block:: python
 
@@ -292,8 +286,8 @@ from the checkpoint file are are immediately returned.
 		
     # Rerun the same workflow
     d = []
-    for i in range(5):
+    for i in range(5):
         d.append(slow_double(i))
 
-    # wait for results
+    # Wait for results
     print([d[i].result() for i in range(5)])
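
The incremental checkpoint and resume behaviour documented in this file can be sketched with the standard library alone. This is an illustration under stated assumptions, not Parsl's on-disk format: the file name ``tasks.pkl``, the helpers ``write_checkpoint`` and ``load_checkpoint``, and the ``(name, argument)`` key are all hypothetical.

```python
import os
import pickle
import tempfile


def write_checkpoint(path, new_results):
    """Append only the results produced since the last checkpoint."""
    with open(path, "ab") as fh:
        for key, value in new_results.items():
            pickle.dump((key, value), fh)


def load_checkpoint(path):
    """Read every entry; a later entry for a key replaces an earlier one."""
    table = {}
    with open(path, "rb") as fh:
        while True:
            try:
                key, value = pickle.load(fh)
            except EOFError:
                break
            table[key] = value
    return table


def slow_double(x):
    return x * 2


ckpt = os.path.join(tempfile.mkdtemp(), "tasks.pkl")

# First run: two incremental checkpoints, each holding only new results.
write_checkpoint(ckpt, {("slow_double", i): slow_double(i) for i in range(3)})
write_checkpoint(ckpt, {("slow_double", i): slow_double(i) for i in range(3, 5)})

# Later run: load the checkpoint and execute only what is missing.
restored = load_checkpoint(ckpt)
executed = []  # arguments that had to be recomputed
results = []
for i in range(5):
    key = ("slow_double", i)
    if key not in restored:
        executed.append(i)
        restored[key] = slow_double(i)
    results.append(restored[key])
```

In the later run ``executed`` stays empty because all five results are found in the checkpoint, which mirrors the immediate return of values described for the resumed ``slow_double`` workflow.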