Merge pull request #1223 from datalad-handbook/mslw-its
Fix "it's" vs "its" usage
mih authored May 22, 2024
2 parents 8a5a72f + 523fa30 commit a23359d
Showing 21 changed files with 26 additions and 26 deletions.
2 changes: 1 addition & 1 deletion docs/basics/101-106-nesting.rst
@@ -94,7 +94,7 @@ we can set subdatasets to previous states, or *update* them.

.. index::
pair: temporary working directory change; with Git
-.. find-out-more:: Do I have to navigate into the subdataset to see it's history?
+.. find-out-more:: Do I have to navigate into the subdataset to see its history?

Previously, we used :shcmd:`cd` to navigate into the subdataset, and
subsequently opened the Git log. This is necessary, because a :gitcmd:`log`
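
As an aside to this hunk: the pattern it documents can be sketched as follows (the subdataset path ``midterm_project`` is a hypothetical example):

    # show a subdataset's history without changing the working directory;
    # -C makes Git act as if it had been started in that path
    git -C midterm_project log --oneline
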
2 changes: 1 addition & 1 deletion docs/basics/101-107-summary.rst
@@ -91,7 +91,7 @@ Currently, this can be considered "best-practice building": Frequent :dlcmd:`sta
commands, :dlcmd:`save` commands to save dataset modifications,
and concise :term:`commit message`\s are the main take always from this. You can already explore
the history of a dataset and you know about many types of provenance information
-captured by DataLad, but for now, its been only informative, and has not been used
+captured by DataLad, but for now, it has been only informative, and has not been used
for anything more fancy. Later on, we will look into utilizing the history
in order to undo mistakes, how the origin of files or datasets becomes helpful
when sharing datasets or removing file contents, and how to make changes to large
2 changes: 1 addition & 1 deletion docs/basics/101-110-run2.rst
@@ -396,7 +396,7 @@ Make a note of this behavior in your ``notes.txt`` file.
Save yourself the preparation time
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Its generally good practice to specify ``--input`` and ``--output`` even if your input files are already retrieved and your output files unlocked -- it makes sure that a recomputation can succeed, even if inputs are not yet retrieved, or if output needs to be unlocked.
+It's generally good practice to specify ``--input`` and ``--output`` even if your input files are already retrieved and your output files unlocked -- it makes sure that a recomputation can succeed, even if inputs are not yet retrieved, or if output needs to be unlocked.
However, the internal preparation steps of checking that inputs exist or that outputs are unlocked can take a bit of time, especially if it involves checking a large number of files.

If you want to avoid the expense of unnecessary preparation steps you can make use of the ``--assume-ready`` argument of :dlcmd:`run`.
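
For illustration, a :dlcmd:`run` call following the advice in this hunk could look like the sketch below (script and file names are hypothetical, and ``--assume-ready`` is assumed to take ``inputs``, ``outputs``, or ``both``):

    # declare inputs and outputs so a recomputation can retrieve/unlock them
    datalad run -m "compute result" \
        --input data/raw.csv \
        --output results/out.csv \
        "python code/analysis.py data/raw.csv results/out.csv"

    # if inputs are retrieved and outputs unlocked, skip the preparation checks
    datalad run --assume-ready both -m "recompute" \
        "python code/analysis.py data/raw.csv results/out.csv"
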
2 changes: 1 addition & 1 deletion docs/basics/101-121-siblings.rst
@@ -17,7 +17,7 @@ But why does this need to be a one-way street? "I want to
provide helpful information for you as well!", says your
room mate. "How could you get any insightful notes that
I make in my dataset, or maybe the results of our upcoming
-mid-term project? Its a bit unfair that I can get your work,
+mid-term project? It's a bit unfair that I can get your work,
but you cannot get mine."

.. index::
2 changes: 1 addition & 1 deletion docs/basics/101-134-summary.rst
@@ -47,7 +47,7 @@ Now what can I do with it?

For one, you will not be surprised if you ever see a subdataset being shown as
``modified`` by :dlcmd:`status`: You now know that if a subdataset
-evolves, it's most recent state needs to be explicitly saved to the superdatasets
+evolves, its most recent state needs to be explicitly saved to the superdataset's
history.

On a different matter, you are now able to capture and share analysis provenance that
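
A minimal sketch of that explicit save, assuming a subdataset at the hypothetical path ``midterm_project``:

    # record the subdataset's new state in the superdataset's history
    datalad save -d . -m "record new subdataset state" midterm_project
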
2 changes: 1 addition & 1 deletion docs/basics/101-136-filesystem.rst
@@ -1133,7 +1133,7 @@ both ``--recursive`` and ``--reckless [availability|undead|kill]`` flags are ne
to traverse into subdatasets and to remove content that does not have verified remotes.

Be aware, though, that deleting a dataset in which ever way will
-irretrievably delete the dataset, it's contents, and it's history.
+irretrievably delete the dataset, its contents, and its history.

Summary
^^^^^^^
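
Sketched out, the removal call this hunk refers to could look like the following (the dataset path is hypothetical, and this irretrievably deletes data):

    # traverse into subdatasets and remove content without verified remotes
    datalad remove -d path/to/dataset --recursive --reckless availability
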
2 changes: 1 addition & 1 deletion docs/basics/101-139-hostingservices.rst
@@ -41,7 +41,7 @@ How to add a sibling on a Git repository hosting site: The manual way

#. If you pick the :term:`SSH` URL, make sure to have an :term:`SSH key` set up. This usually requires generating an SSH key pair if you do not have one yet, and uploading the public key to the repository hosting service. The :find-out-more:`on SSH keys <fom-sshkey>` points to a useful tutorial for this.

-#. Use the URL to add the repository as a sibling. There are two commands that allow you to do that; both require you give the sibling a name of your choice (common name choices are ``upstream``, or a short-cut for your user name or the hosting platform, but its completely up to you to decide):
+#. Use the URL to add the repository as a sibling. There are two commands that allow you to do that; both require that you give the sibling a name of your choice (common name choices are ``upstream``, or a short-cut for your user name or the hosting platform, but it's completely up to you to decide):

#. ``git remote add <name> <url>``
#. ``datalad siblings add --dataset . --name <name> --url <url>``
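
For illustration, with a hypothetical SSH URL and the sibling name ``upstream``, the two equivalent calls from this hunk would be:

    git remote add upstream git@github.com:me/mydataset.git
    datalad siblings add --dataset . --name upstream --url git@github.com:me/mydataset.git
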
2 changes: 1 addition & 1 deletion docs/basics/101-146-gists.rst
@@ -10,7 +10,7 @@ This section is a selection of code snippets tuned to perform specific,
non-trivial tasks in datasets. Often, they are not limited to single commands of
the version control tools you know, but combine helpful other command line
tools and general Unix command line magic. Just like
-`GitHub gists <https://gist.github.com>`_, its a collection of lightweight
+`GitHub gists <https://gist.github.com>`_, it's a collection of lightweight
and easily accessible tips and tricks. For a more basic command overview,
take a look at the :ref:`cheat`. The
`tips collection of git-annex <https://git-annex.branchable.com/tips>`_ is also
2 changes: 1 addition & 1 deletion docs/beyond_basics/101-145-hooks.rst
@@ -141,7 +141,7 @@ And here is how to set the values for these variables:
command the hook operates on, and any key from the result evaluation can be
expanded to the respective value in the result dictionary. Curly braces need to
be escaped by doubling them.
-This is not the easiest specification there is, but its also not as hard as it
+This is not the easiest specification there is, but it's also not as hard as it
may sound. Here is how this could look like for a :dlcmd:`unlock`::
$ unlock {{"dataset": "{dsarg}", "path": "{path}"}}
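
As a hedged sketch of how such a hook could be registered (assuming the ``datalad.result-hook.<name>.call-json`` configuration item this chapter describes; the hook name ``myunlock`` is hypothetical):

    # store the hook's call specification in the dataset's configuration
    git config --file .datalad/config \
        datalad.result-hook.myunlock.call-json \
        'unlock {{"dataset": "{dsarg}", "path": "{path}"}}'
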
2 changes: 1 addition & 1 deletion docs/beyond_basics/101-160-gobig.rst
@@ -50,7 +50,7 @@ begin to see performance issues in datasets.
Bench marking in DataLad datasets with varying, but large amounts of tiny files
on different file systems and different git-annex repository versions show that
a mere :dlcmd:`save` or :dlcmd:`status` command
-can take from 15 minutes up to several hours. Its neither fun nor feasible to
+can take from 15 minutes up to several hours. It's neither fun nor feasible to
work with performance drops like this -- so how can this be avoided?

General advice: Use several subdatasets
4 changes: 2 additions & 2 deletions docs/beyond_basics/101-170-dataladrun.rst
@@ -227,7 +227,7 @@ Importantly, the ``$JOBID`` isn't hardcoded into the script but it can be given
The code snippet above uses a bash :term:`environment variable` (``$JOBID``, as indicated by the all-upper-case variable name with a leading ``$``).
It will be defined in the job submission -- this is shown and explained in detail in the respective paragraph below.

-Next, its time for the :dlcmd:`containers-run` command.
+Next, it's time for the :dlcmd:`containers-run` command.
The invocation will depend on the container and dataset configuration (both of which are demonstrated in the real-life example in the next section), and below, we pretend that the container invocation only needs an input file and an output file.
These input file is specified via a bash variables (``$inputfile``) that will be defined in the script and provided at the time of job submission via command line argument from the job scheduler, and the output file name is based on the input file name.

@@ -311,7 +311,7 @@ Here's how the full general script looks like.
# Done - job handler should clean up workspace
-Its a short script that encapsulates a complete workflow.
+It's a short script that encapsulates a complete workflow.
Think of it as the sequence of necessary DataLad commands you would need to do in order to compute a job.
You can save this script into your analysis dataset, e.g., as ``code/analysis_job.sh``, and make it executable (such that it is executed automatically by the program specified in the :term:`shebang`)using ``chmod +x code/analysis_job.sh``.

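
A heavily condensed sketch of such a job script, under the assumptions this hunk describes (all paths, the container name, and the scheduler variables are hypothetical placeholders):

    #!/bin/bash
    # clone the dataset into a job-specific, temporary workspace
    datalad clone /data/project/analysis "$TMPDIR/ds-$JOBID"
    cd "$TMPDIR/ds-$JOBID"
    # work on a job-specific branch so that concurrent jobs do not conflict
    git checkout -b "job-$JOBID"
    # run the containerized computation with provenance capture
    datalad containers-run -n mycontainer \
        --input "$inputfile" \
        --output "outputs/$(basename "$inputfile")" \
        "{inputs} {outputs}"
    # push the result branch back to the original dataset
    datalad push --to origin
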
2 changes: 1 addition & 1 deletion docs/beyond_basics/101-171-enki.rst
@@ -14,7 +14,7 @@ Walkthrough: Parallel ENKI preprocessing with fMRIprep

The previous section has been an overview on parallel, provenance-tracked computations in DataLad datasets.
While the general workflow entails a complete setup, it is usually easier to understand it by seeing it applied to a concrete usecase.
-Its even more informative if that use case includes some complexities that do not exist in the "picture-perfect" example but are likely to arise in real life.
+It is even more informative if that use case includes some complexities that do not exist in the "picture-perfect" example but are likely to arise in real life.
Therefore, the following walk-through in this section is a write-up of an existing and successfully executed analysis.

The analysis
2 changes: 1 addition & 1 deletion docs/beyond_basics/101-179-gitignore.rst
@@ -96,7 +96,7 @@ a ``tmp/`` directory in the ``DataLad-101`` dataset:

$ datalad save -m "add something to ignore" .gitignore

-This ``.gitignore`` file is very minimalistic, but its sufficient to show
+This ``.gitignore`` file is very minimalistic, but it's sufficient to show
how it works. If you now create a ``tmp/`` directory, all of its contents would be
ignored by your datasets version control. Let's do so, and add a file into it
that we do not (yet?) want to save to the dataset's history.
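
Concretely, the steps described in this hunk boil down to the following sketch:

    # ignore everything inside tmp/
    echo "tmp/" > .gitignore
    datalad save -m "add something to ignore" .gitignore
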
6 changes: 3 additions & 3 deletions docs/beyond_basics/101-181-metalad.rst
@@ -173,8 +173,8 @@ The following call would add the metadata entry to the current dataset, ``cozy-s
single: configuration item; datalad.dataset.id
.. find-out-more:: meta-add validity checks

-When adding metadata for the first time, its not uncommon to run into errors.
-Its quite easy, for example, to miss a comma or quotation mark when creating a JSON object by hand.
+When adding metadata for the first time, it is not uncommon to run into errors.
+It is quite easy, for example, to miss a comma or quotation mark when creating a JSON object by hand.
But there are also some internal checks that might be surprising.
If you want to add the metadata above to your own dataset, you should make sure to adjust the ``dataset_id`` to the ID of your own dataset, found via the command ``datalad configuration get datalad.dataset.id`` - otherwise you'll see an error [#f4]_, and likewise the ``dataset_version``.
And in case you'd supply the ``extraction_time`` as "this morning at 8AM" instead of a time stamp, the command will be unhappy as well.
@@ -407,7 +407,7 @@ As with DataLad and other Python packages, you might want to do the installation

.. [#f1] It may seem like an unnecessary duplicated effort to record the names of contained files or certain file properties as metadata in a dataset already containing these files. However, metadata can be very useful whenever the primary data can't be shared, for example due to its large size or sensitive nature, allowing consumers to, for example, derive anonymized information, aggregate data with search queries, or develop code and submit it to the data holders to be ran on their behalf.
-.. [#f2] `JSON <https://en.wikipedia.org/wiki/JSON>`_ is a language-independent, open and lightweight data interchange format. Data is represented as human readable text, organized in key-value pairs (e.g., 'name': 'Bob') or arrays, and thus easily readable by both humans and machines. A *JSON object* is a collection of key-value pairs. Its enclosed in curly brackets, and individual pairs in the object are separated by commas.
+.. [#f2] `JSON <https://en.wikipedia.org/wiki/JSON>`_ is a language-independent, open and lightweight data interchange format. Data is represented as human readable text, organized in key-value pairs (e.g., 'name': 'Bob') or arrays, and thus easily readable by both humans and machines. A *JSON object* is a collection of key-value pairs. It's enclosed in curly brackets, and individual pairs in the object are separated by commas.
.. [#f3] A Unix timestamp is widely used in computing and measures time as the number of seconds passed since January 1st, 1970. The timestamp in the example metadata entry (``1675113291.1464975``) translates to January 30th, 2023, 22:14:51.146497 with the code snippet below. Lots of software tools have the ability to generate timestamps for you, for example Python's `time <https://docs.python.org/3/library/time.html>`_ module or the command ``date +%s`` in a command line on Unix systems.
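
The "code snippet below" that footnote [#f3] refers to is not part of this hunk; one way to do the conversion in a shell is (GNU ``date``; on macOS, ``date -r`` serves the same purpose):

    # convert a Unix timestamp into a human-readable date
    date -d @1675113291
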
2 changes: 1 addition & 1 deletion docs/code_from_chapters/ABCD.rst
@@ -383,7 +383,7 @@ This allows others to very easily rerun computations, but it also spares yoursel
Computational reproducibility
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Its fantastic to have means to recompute a command automatically, but the ability to re-execute a command is often not enough.
+It's fantastic to have means to recompute a command automatically, but the ability to re-execute a command is often not enough.
If you don't have the required Python packages available, or in a wrong version, running the script and computing the results will fail.
In order to be *computationally* reproducible the run record does not only need to link code, command, and data, but also encapsulate the *software* that is necessary for a computation::

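
The literal block elided from this hunk demonstrates that software encapsulation; a sketch with a hypothetical container name and script might look like:

    # run the computation inside a tracked software container, so the
    # run record captures the software environment as well
    datalad containers-run -n mycontainer \
        --input data/raw.csv \
        --output results/out.csv \
        "python code/analysis.py data/raw.csv results/out.csv"
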
4 changes: 2 additions & 2 deletions docs/code_from_chapters/DLBasicsMPI.rst
@@ -294,7 +294,7 @@ DataLad save can in addition also attach an identifier in the form of a :term:`t

The :dlcmd:`run` command can run this script in a way that links the script to the results it produces and the data it was computed from.
In principle, the command is simple: Execute any command, save the resulting changes in the dataset, and associate them as well as all other optional information provided.
-Because each :dlcmd:`run` ends with a :dlcmd:`save`, its recommended to start with a clean dataset (see :ref:`chapter_run` for details on how to use it in unclean datasets)::
+Because each :dlcmd:`run` ends with a :dlcmd:`save`, it's recommended to start with a clean dataset (see :ref:`chapter_run` for details on how to use it in unclean datasets)::

datalad status

@@ -348,7 +348,7 @@ This allows others to very easily rerun computations, but it also spares yoursel
Computational reproducibility
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Its fantastic to have means to recompute a command automatically, but the ability to re-execute a command is often not enough.
+It's fantastic to have means to recompute a command automatically, but the ability to re-execute a command is often not enough.
If you don't have the required Python packages available, or in a wrong version, running the script and computing the results will fail.
In order to be *computationally* reproducible the run record does not only need to link code, command, and data, but also encapsulate the *software* that is necessary for a computation::

2 changes: 1 addition & 1 deletion docs/code_from_chapters/dgpa.rst
@@ -284,7 +284,7 @@ Let's share this data with our friends and collaborators.
There are many ways to do this (section :ref:`chapter_thirdparty` has all the details), but
a convenient way is `Gin <https://gin.g-node.org>`_, a free hosting service for DataLad datasets.

-First, you need to head over to `gin.g-node.org <https://gin.g-node.org>`__, log in, and upload an :term:`SSH key`. Then, under your user account, create a new repository, and copy it's SSH URL.
+First, you need to head over to `gin.g-node.org <https://gin.g-node.org>`__, log in, and upload an :term:`SSH key`. Then, under your user account, create a new repository, and copy its SSH URL.
A step by step instruction with screenshots is in the section :ref:`gin`.

.. importantnote:: The 0.16 release will have a convenience command
2 changes: 1 addition & 1 deletion docs/code_from_chapters/osoh.rst
@@ -344,7 +344,7 @@ To get an overview on publishing datasets, however, you best go to :ref:`shareth

Another convenient way is `Gin <https://gin.g-node.org>`_, a free hosting service for DataLad datasets.

-First, you need to head over to `gin.g-node.org <https://gin.g-node.org>`__, log in, and upload an :term:`SSH key`. Then, under your user account, create a new repository, and copy it's SSH URL.
+First, you need to head over to `gin.g-node.org <https://gin.g-node.org>`__, log in, and upload an :term:`SSH key`. Then, under your user account, create a new repository, and copy its SSH URL.
A step by step instruction with screenshots is in the section :ref:`gin`::

datalad create-sibling-gin \
4 changes: 2 additions & 2 deletions docs/code_from_chapters/usecase_ml_code.rst
@@ -280,7 +280,7 @@ DataLad save can in addition also attach an identifier in the form of a :term:`t

The :dlcmd:`run` command can run this script in a way that links the script to the results it produces and the data it was computed from.
In principle, the command is simple: Execute any command, save the resulting changes in the dataset, and associate them as well as all other optional information provided.
-Because each :dlcmd:`run` ends with a :dlcmd:`save`, its recommended to start with a clean dataset (see :ref:`chapter_run` for details on how to use it in unclean datasets)::
+Because each :dlcmd:`run` ends with a :dlcmd:`save`, it's recommended to start with a clean dataset (see :ref:`chapter_run` for details on how to use it in unclean datasets)::

datalad status

@@ -334,7 +334,7 @@ This allows others to very easily rerun computations, but it also spares yoursel
Computational reproducibility
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Its fantastic to have means to recompute a command automatically, but the ability to re-execute a command is often not enough.
+It's fantastic to have means to recompute a command automatically, but the ability to re-execute a command is often not enough.
If you don't have the required Python packages available, or in a wrong version, running the script and computing the results will fail.
In order to be *computationally* reproducible the run record does not only need to link code, command, and data, but also encapsulate the *software* that is necessary for a computation::

2 changes: 1 addition & 1 deletion docs/code_from_chapters/yale.rst
@@ -278,7 +278,7 @@ Let's share this data with our friends and collaborators.
There are many ways to do this (section :ref:`chapter_thirdparty` has all the details), but
a convenient way is `Gin <https://gin.g-node.org>`_, a free hosting service for DataLad datasets.

-First, you need to head over to `gin.g-node.org <https://gin.g-node.org>`__, log in, and upload an :term:`SSH key`. Then, under your user account, create a new repository, and copy it's SSH URL.
+First, you need to head over to `gin.g-node.org <https://gin.g-node.org>`__, log in, and upload an :term:`SSH key`. Then, under your user account, create a new repository, and copy its SSH URL.
A step by step instruction with screenshots is in the section :ref:`gin`.

You can register this URL as a sibling dataset to your own dataset using :dlcmd:`siblings add`::
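
A sketch of the :dlcmd:`siblings add` call mentioned in this hunk, with a hypothetical Gin SSH URL:

    datalad siblings add -d . \
        --name gin \
        --url git@gin.g-node.org:/me/our-data.git
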
2 changes: 1 addition & 1 deletion docs/usecases/ml-analysis.rst
@@ -376,7 +376,7 @@ We will use the following script for this:
It will load the trained and dumped model and use it to test its prediction performance on the yet unseen test data.
To evaluate the model performance, it calculates the accuracy of the prediction, i.e., the proportion of correctly labeled images, prints it to the terminal, and saves it into a json file in the superdataset.
As this script constitutes the last analysis step, let's save it with a :term:`tag`.
-Its entirely optional to do this, but just as commit messages are an easier way for humans to get an overview of a commits contents, a tag is an easier way for humans to identify a change than a commit hash.
+It is entirely optional to do this, but just as commit messages are an easier way for humans to get an overview of a commits contents, a tag is an easier way for humans to identify a change than a commit hash.
With this script set up, we're ready for analysis, and thus can tag this state ``ready4analysis`` to identify it more easily later.

.. runrecord:: _examples/ml-114
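
A sketch of that tagging step (the commit message is hypothetical; ``--version-tag`` is the :dlcmd:`save` option that attaches the tag):

    datalad save -m "finish analysis setup" --version-tag ready4analysis
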
