Commit

Merge branch 'main' into dvc3
adswa authored Oct 6, 2023
2 parents 98ec1f5 + 5f98ad6 commit b5c6523
Showing 162 changed files with 1,612 additions and 1,280 deletions.
2 changes: 1 addition & 1 deletion .appveyor.yml
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ shallow_clone: false
environment:
DTS: datalad_next
APPVEYOR_BUILD_WORKER_IMAGE: Ubuntu2004
INSTALL_SYSPKGS: python3-virtualenv graphicsmagick-imagemagick-compat moreutils jq uidmap
INSTALL_SYSPKGS: graphicsmagick-imagemagick-compat moreutils jq uidmap
# go with whatever is most recent to catch anything that might break
# with them
INSTALL_GITANNEX: git-annex -m snapshot
Expand Down
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ SHELL := /bin/bash
# this pattern rule lets you run "make build" (or any other target
# in docs/Makefile) in this directory as though you were in docs/
%:
cd docs && make $@
$(MAKE) -C docs $@

clean-build:
rm -rf docs/_build
Expand Down
2 changes: 1 addition & 1 deletion docs/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
#

# You can set these variables from the command line.
SPHINXOPTS = -W -T -v
SPHINXOPTS = -W -T -v -n
SPHINXBUILD = python -m sphinx
PAPER =
BUILDDIR = _build
Expand Down
Binary file added docs/_static/distribits-teaser.jpg
4 changes: 3 additions & 1 deletion docs/basics/101-101-create.rst
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,9 @@ concepts of DataLad datasets together by creating one.
Find a nice place on your computer's file system to put a dataset for ``DataLad-101``,
and create a fresh, empty dataset with the :dlcmd:`create` command.

Note the command structure of :dlcmd:`create` (optional bits are enclosed in ``[ ]``)::
Note the command structure of :dlcmd:`create` (optional bits are enclosed in ``[ ]``):

.. code-block:: bash
datalad create [--description "..."] [-c <config options>] PATH
Expand Down
14 changes: 9 additions & 5 deletions docs/basics/101-102-populate.rst
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ Below is a short list of optional readings. We decide to download them (they
are all free, in total about 15 MB), and save them in ``DataLad-101/books``.

- Additional reading about the command line: `The Linux Command Line <https://sourceforge.net/projects/linuxcommand/files/TLCL/19.01/TLCL-19.01.pdf/download>`_
- An intro to Python: `A byte of Python <https://github.com/swaroopch/byte-of-python/releases/download/v14558db59a326ba99eda0da6c4548c48ccb4cd0f/byte-of-python.pdf>`_
- An intro to Python: `A byte of Python <https://github.com/swaroopch/byte-of-python/releases/download/vadb91fc6fce27c58e3f931f5861806d3ccd1054c/byte-of-python.pdf>`_

You can either visit the links and save them in ``books/``,
or run the following commands [#f2]_ to download the books right from the terminal.
Expand All @@ -46,7 +46,9 @@ are presented here, make sure to check the :windows-wit:`on peculiarities of its

In Unix shells, ``\`` can be used to split a command into several lines, for example to aid readability.
Standard Windows terminals (including the Anaconda prompt) do not support this.
They instead use the ``^`` character::
They instead use the ``^`` character:

.. code-block:: bash
$ wget -q https://sourceforge.net/projects/linuxcommand/files/TLCL/19.01/TLCL-19.01.pdf/download ^
-O TLCL.pdf
Expand All @@ -62,7 +64,7 @@ are presented here, make sure to check the :windows-wit:`on peculiarities of its
$ cd books
$ wget -q https://sourceforge.net/projects/linuxcommand/files/TLCL/19.01/TLCL-19.01.pdf/download \
-O TLCL.pdf
$ wget -q https://homepages.uc.edu/~becktl/byte_of_python.pdf \
$ wget -q https://github.com/swaroopch/byte-of-python/releases/download/vadb91fc6fce27c58e3f931f5861806d3ccd1054c/byte-of-python.pdf \
-O byte-of-python.pdf
# get back into the root of the dataset
$ cd ../
Expand All @@ -75,12 +77,14 @@ curl <ww-curl-instead-wget>`.
:name: ww-curl-instead-wget

Many versions of Windows do not ship with the tool ``wget``.
You can install it, but it may be easier to use the pre-installed ``curl`` command::
You can install it, but it may be easier to use the pre-installed ``curl`` command:

.. code-block:: bash
$ cd books
$ curl -L https://sourceforge.net/projects/linuxcommand/files/TLCL/19.01/TLCL-19.01.pdf/download \
-o TLCL.pdf
$ curl -L https://homepages.uc.edu/~becktl/byte_of_python.pdf \
$ curl -L https://github.com/swaroopch/byte-of-python/releases/download/vadb91fc6fce27c58e3f931f5861806d3ccd1054c/byte-of-python.pdf \
-o byte-of-python.pdf
$ cd ../
Expand Down
6 changes: 4 additions & 2 deletions docs/basics/101-103-modify.rst
Original file line number Diff line number Diff line change
Expand Up @@ -62,11 +62,13 @@ root of your ``DataLad-101`` dataset:
Heredocs rely on Unix-type redirection and multi-line commands -- which are not supported on most native Windows terminals or the Anaconda prompt on Windows.
If you are using an Anaconda prompt or a Windows terminal other than Git Bash, instead of executing heredocs, please open up an editor and paste and save the text into it.

The relevant text in the snippet below would be::
The relevant text in the snippet below would be:

.. code-block:: text
One can create a new dataset with 'datalad create [--description] PATH'.
The dataset is created empty
If you are using Git Bash, however, heredocs will work just fine.

.. runrecord:: _examples/DL-101-103-101
Expand Down
8 changes: 6 additions & 2 deletions docs/basics/101-105-install.rst
Original file line number Diff line number Diff line change
Expand Up @@ -115,14 +115,18 @@ chapters in this handbook will demonstrate how useful this information can be.
important to keep in mind whenever you do not execute the :dlcmd:`clone` command
from the root of this dataset. Luckily, there is a shortcut: ``-d^`` will always
point to the root of the top-most dataset. For example, if you navigate into ``recordings``,
the command would be::
the command would be:

.. code-block:: bash
datalad clone -d^ https://github.com/datalad-datasets/longnow-podcasts.git longnow
.. find-out-more:: What if I do not install into an existing dataset?

If you do not install into an existing dataset, you only need to omit the ``-d/--dataset``
option. You can try::
option. You can try:

.. code-block:: bash
datalad clone https://github.com/datalad-datasets/longnow-podcasts.git
Expand Down
4 changes: 3 additions & 1 deletion docs/basics/101-106-nesting.rst
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,9 @@ we can set subdatasets to previous states, or *update* them.
a Git command lets you run the command as if Git was started in this path
instead of the current working directory.
Thus, from the root of ``DataLad-101``, this command would have given you the
subdataset's history as well::
subdataset's history as well:

.. code-block:: bash
$ git -C recordings/longnow log --oneline
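
The effect of ``-C`` can be checked with a minimal sketch. It assumes a throwaway repository instead of ``DataLad-101``; the directory name ``demo`` and the commit identity are made up for illustration. The history is queried once from the outside with ``-C`` and once after changing into the directory, and both calls should report the same log.

```shell
# Hypothetical scratch repository, used only for this illustration
tmp=$(mktemp -d)
git init -q "$tmp/demo"
git -C "$tmp/demo" -c user.name="Demo User" -c user.email="demo@example.com" \
    commit -q --allow-empty -m "first commit"

# query the history from outside the repository ...
from_outside=$(git -C "$tmp/demo" log --oneline)
# ... and from inside it
from_inside=$(cd "$tmp/demo" && git log --oneline)

echo "with -C:  $from_outside"
echo "after cd: $from_inside"

# clean up the scratch directory
rm -rf "$tmp"
```

Using ``-C`` avoids changing the working directory of the current shell, which is convenient when inspecting several subdatasets in a row.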
Expand Down
12 changes: 9 additions & 3 deletions docs/basics/101-107-summary.rst
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,9 @@ and making simple modifications *locally*.

* An empty dataset can be created with the :dlcmd:`create` command. It's useful to add a description
to the dataset and use the ``-c text2git`` configuration, but we will see later why.
This is the command structure::
This is the command structure:

.. code-block:: bash
datalad create --description "here is a description" -c text2git PATH
Expand All @@ -19,7 +21,9 @@ and making simple modifications *locally*.
exist in your dataset, specify the path to the precise file (change) that should be saved to history.
Remember, if you run a :dlcmd:`save` without
specifying a path, all untracked files and all file changes will be committed to the history together!
This is the command structure::
This is the command structure:

.. code-block:: bash
datalad save -m "here is a commit message" [PATH]
Expand All @@ -46,7 +50,9 @@ and experienced the concept of modular nesting datasets.

.. index:: ! datalad command; clone

* A published dataset can be installed with the :dlcmd:`clone` command::
* A published dataset can be installed with the :dlcmd:`clone` command:

.. code-block:: bash
$ datalad clone [--dataset PATH] SOURCE-PATH/URL [DESTINATION PATH]
Expand Down
4 changes: 3 additions & 1 deletion docs/basics/101-108-run.rst
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,9 @@ will write it into the script.
.. windows-wit:: Here's a script for Windows users

Please use an editor of your choice to create a file ``list_titles.sh`` inside of the ``code`` directory.
These should be the contents::
These should be the contents:

.. code-block:: bash
for i in recordings/longnow/Long_Now__Seminars*/*.mp3; do
# get the filename
Expand Down
11 changes: 8 additions & 3 deletions docs/basics/101-109-rerun.rst
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,8 @@ Let's actually take a look into this file now:
.. runrecord:: _examples/DL-101-109-101
:language: console
:workdir: dl-101/DataLad-101
:lines: 1-15
:lines: 1-3,5-7
:append: -✂--✂-
:notes: The script produced a simple list of podcast titles. Let's take a look into our output file. What's cool is that it was created in a way that the code and output are linked:
:cast: 02_reproducible_execution

Expand All @@ -33,7 +34,9 @@ with the following, fixed script:

.. windows-wit:: Here's a script adjustment for Windows users

Please use an editor of your choice to replace the contents of ``list_titles.sh`` inside of the ``code`` directory with the following::
Please use an editor of your choice to replace the contents of ``list_titles.sh`` inside of the ``code`` directory with the following:

.. code-block:: bash
for i in recordings/longnow/Long_Now*/*.mp3; do
# get the filename
Expand Down Expand Up @@ -175,7 +178,9 @@ of the dataset and the previous commit (called "``HEAD~1``" in Git terminology [
When executing this command, you will see *all* files being modified between the most recent and the second-most recent commit.
On a technical level, this is correct given the underlying file handling on Windows, and chapter :ref:`chapter_gitannex` will shed light on why that is.

For now, to get the same output as shown in the code snippet below, use the following command where ``main`` (or ``master``) is the name of your default branch::
For now, to get the same output as shown in the code snippet below, use the following command where ``main`` (or ``master``) is the name of your default branch:

.. code-block:: bash
datalad diff --from main --to HEAD~1
Expand Down
4 changes: 3 additions & 1 deletion docs/basics/101-113-summary.rst
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,9 @@ command, and discovered the concept of *locked* content.

* With any :dlcmd:`run`, specify a commit message, and whenever appropriate, specify its inputs
to the executed command (using the ``-i``/``--input`` flag) and/or its output (using the ``-o``/
``--output`` flag). The full command structure is::
``--output`` flag). The full command structure is:

.. code-block:: bash
$ datalad run -m "commit message here" --input "path/to/input/" --output "path/to/output" "command"
Expand Down
10 changes: 5 additions & 5 deletions docs/basics/101-115-symlinks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -143,17 +143,17 @@ unnecessary, superfluous thing to do, right?

The resulting symlinks that look like
your files but only point to the actual content in ``.git/annex/objects`` are
small in size. An ``ls -lah`` reveals that all of these symlinks have roughly the same,
small in size. An ``ls -lh`` reveals that all of these symlinks have roughly the same,
small size of ~130 Bytes:

.. runrecord:: _examples/DL-101-115-103
:language: console
:workdir: dl-101/DataLad-101/books
:realcommand: ls -lah --time-style=long-iso
:realcommand: ls -lh --time-style=long-iso
:notes: Symlinks are super small in size, just the amount of characters in the symlink!
:cast: 03_git_annex_basics

$ ls -lah
$ ls -lh

Here you can see the reason why content is symlinked: Small file size means that
*Git can handle those symlinks*!
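
The relation between a symlink's size and its target path can be reproduced outside of a dataset. The following is a minimal sketch, not a git-annex setup: the names ``objects/content-key`` and ``book.pdf`` are made up, and a 1 MiB file of zeros stands in for annexed content. On common Unix file systems, the size reported for the link should equal the number of characters in its target path, no matter how large the content is.

```shell
# Hypothetical stand-in for annexed content: a 1 MiB file in a scratch directory
tmp=$(mktemp -d)
mkdir -p "$tmp/objects"
head -c 1048576 /dev/zero > "$tmp/objects/content-key"

# a relative symlink that points to the content, like the files in books/
ln -s objects/content-key "$tmp/book.pdf"

# path stored in the link, size of the link itself, and size of the content
target=$(readlink "$tmp/book.pdf")
link_size=$(ls -l "$tmp/book.pdf" | awk '{print $5}')
content_size=$(wc -c < "$tmp/objects/content-key")

echo "target path:  $target (${#target} characters)"
echo "link size:    $link_size bytes"
echo "content size: $content_size bytes"

# clean up the scratch directory
rm -rf "$tmp"
```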
Expand Down Expand Up @@ -229,12 +229,12 @@ to manage the file system in a DataLad dataset (:ref:`filesystem`).
.. runrecord:: _examples/DL-101-115-104
:language: console
:workdir: dl-101/DataLad-101/books
:realcommand: ls -lah --time-style=long-iso TLCL.pdf
:realcommand: ls -lh --time-style=long-iso TLCL.pdf
:notes: how does the symlink relate to the shasum of the file?
:cast: 03_git_annex_basics

# take a look at the last part of the target path:
$ ls -lah TLCL.pdf
$ ls -lh TLCL.pdf

Let's take a closer look at the structure of the symlink.
The key from the hash function is the last part of the name of the file the symlink links to (in which the actual data content is stored).
Expand Down
8 changes: 6 additions & 2 deletions docs/basics/101-116-sharelocal.rst
Original file line number Diff line number Diff line change
Expand Up @@ -326,7 +326,9 @@ it only installed the subdataset to retrieve the meta data about file availabili

To explicitly install all potential subdatasets *recursively*, that is,
all of the subdatasets inside it as well, one can give the
``-r``/``--recursive`` option to :dlcmd:`get`::
``-r``/``--recursive`` option to :dlcmd:`get`:

.. code-block:: bash
datalad get -n -r <subds>
Expand All @@ -344,7 +346,9 @@ a few dozen levels of nested subdatasets right away.

However, there is a middle way [#f1]_: The ``--recursion-limit`` option lets
you specify how many levels of subdatasets should be installed together
with the first subdataset::
with the first subdataset:

.. code-block:: bash
datalad get -n -r --recursion-limit 1 <subds>
Expand Down
14 changes: 9 additions & 5 deletions docs/basics/101-121-siblings.rst
Original file line number Diff line number Diff line change
Expand Up @@ -22,9 +22,9 @@ but you can not get mine."

Consider, for example, that your room mate might have googled about DataLad
a bit. In the depths of the web, he might have found useful additional information, such as
a script on `dataset nesting <https://raw.githubusercontent.com/datalad/datalad.org/7e8e39b1f08d0a54ab521586f27ee918b4441d69/content/asciicast/seamless_nested_repos.sh>`_.
a script on `dataset nesting <https://raw.githubusercontent.com/datalad/datalad.org/7e8e39b1/content/asciicast/seamless_nested_repos.sh>`_.
Because he found this very helpful in understanding dataset
nesting concepts, he decided to download it `from GitHub <https://raw.githubusercontent.com/datalad/datalad.org/7e8e39b1f08d0a54ab521586f27ee918b4441d69/content/asciicast/seamless_nested_repos.sh>`_, and saved it in the ``code/`` directory.
nesting concepts, he decided to download it from GitHub, and saved it in the ``code/`` directory.

He does it using the datalad command :dlcmd:`download-url`
that you experienced in section :ref:`createDS` already: This command will
Expand All @@ -49,7 +49,7 @@ and run the following command
-d . \
-m "Include nesting demo from datalad website" \
-O code/nested_repos.sh \
https://raw.githubusercontent.com/datalad/datalad.org/7e8e39b1f08d0a54ab521586f27ee918b4441d69/content/asciicast/seamless_nested_repos.sh
https://raw.githubusercontent.com/datalad/datalad.org/7e8e39b1/content/asciicast/seamless_nested_repos.sh

Run a quick datalad status:

Expand Down Expand Up @@ -226,7 +226,9 @@ the former for a different lecture:

.. windows-wit:: Please use datalad diff --from main --to remotes/roommate/master

Please use the following command instead::
Please use the following command instead:

.. code-block:: bash
datalad diff --from main --to remotes/roommate/master
Expand All @@ -246,7 +248,9 @@ that there is a difference in ``notes.txt``! Let's ask

.. windows-wit:: Please use git diff master..remotes/roommate/master

Please use the following command instead::
Please use the following command instead:

.. code-block:: bash
git diff master..remotes/roommate/master
Expand Down
24 changes: 10 additions & 14 deletions docs/basics/101-122-config.rst
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,9 @@ Git configuration." the lecturer says.

At one point in time, you likely followed instructions such as
in :ref:`install` and configured your
*Git identity* with the commands::
*Git identity* with the commands:

.. code-block:: bash
git config --global --add user.name "Elena Piscopia"
git config --global --add user.email [email protected]
Expand Down Expand Up @@ -175,7 +177,9 @@ configuration to your repository's ``.git/config`` file, whereas ``--global``
would apply it as a user specific configuration, and ``--system`` as a
system-wide configuration.
If you would want to change this existing line in your ``.git/config``
file, you would replace ``--add`` with ``--replace-all`` such as in::
file, you would replace ``--add`` with ``--replace-all`` such as in:

.. code-block:: bash
git config --local --replace-all core.editor "vim"
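
The difference between ``--add`` and ``--replace-all`` can be tried safely in a scratch repository. This is a minimal sketch: the repository name ``demo`` and the editor values are only illustrative. A second ``--add`` appends another value for the same key, while ``--replace-all`` leaves a single value behind.

```shell
# Hypothetical scratch repository, so that no real configuration is touched
tmp=$(mktemp -d)
git init -q "$tmp/demo"

# two --add calls leave two values for the same key
git -C "$tmp/demo" config --local --add core.editor "nano"
git -C "$tmp/demo" config --local --add core.editor "vim"
after_add=$(git -C "$tmp/demo" config --local --get-all core.editor)

# --replace-all collapses them into a single value
git -C "$tmp/demo" config --local --replace-all core.editor "vim"
after_replace=$(git -C "$tmp/demo" config --local --get-all core.editor)

echo "after --add (one value per line):"
echo "$after_add"
echo "after --replace-all:"
echo "$after_replace"

# clean up the scratch directory
rm -rf "$tmp"
```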
Expand Down Expand Up @@ -206,28 +210,20 @@ it is also useful to find out which configurations are already set in
which way and where. For this, the :gitcmd:`config --list --show-origin`
is useful. It will display all configurations and their location:

.. code-block:: bash
.. code-block:: console
$ git config --list --show-origin
file:/home/bob/.gitconfig user.name=Bob McBobface
file:/home/bob/.gitconfig [email protected]
file:/home/bob/.gitconfig core.editor=vim
file:/home/bob/.gitconfig annex.security.allowed-url-schemes=http https file
file:.git/config core.repositoryformatversion=0
file:.git/config core.filemode=true
file:.git/config core.bare=false
file:.git/config core.logallrefupdates=true
file:.git/config annex.uuid=1f83595e-bcba-4226-aa2c-6f0153eb3c54
file:.git/config annex.version=5
file:.git/config annex.backends=MD5E
file:.git/config submodule.recordings/longnow.url=https://github.com/datalad-datasets/longnow-podcasts.git
file:.git/config submodule.recordings/longnow.url=https://github.com/
file:.git/config submodule.recordings/longnow.active=true
file:.git/config remote.roommate.url=../mock_user/onemoredir/DataLad-101
file:.git/config remote.roommate.fetch=+refs/heads/*:refs/remotes/roommate/*
file:.git/config remote.roommate.annex-uuid=a5ae24de-1533-4b09-98b9-cd9ba6bf303c
file:.git/config remote.roommate.annex-ignore=false
file:.git/config submodule.longnow.url=https://github.com/datalad-datasets/longnow-podcasts.git
file:.git/config submodule.longnow.url=https://github.com/✂
file:.git/config submodule.longnow.active=true
...
This example shows some configurations in the global ``.gitconfig``
file, and the configurations within ``DataLad-101/.git/config``.
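
To see only where settings come from, the origin column of this output can be separated from the key-value pairs. The sketch below assumes a scratch repository named ``demo`` (a made-up name) and relies on the tab character that Git places between the origin and the setting.

```shell
# Hypothetical scratch repository with one local setting
tmp=$(mktemp -d)
git init -q "$tmp/demo"
git -C "$tmp/demo" config --local --add core.editor "vim"

# each output line is "<origin><TAB><key>=<value>"; keep the unique origins
origins=$(git -C "$tmp/demo" config --local --list --show-origin | cut -f1 | sort -u)
echo "$origins"

# clean up the scratch directory
rm -rf "$tmp"
```

Dropping ``--local`` would also list the user-specific and system-wide configuration files, if any exist on the machine.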
Expand Down