NF: SINGULARITY_CMD=shell to record (bash) history+result of interactive sessions

**Related**

This is a prototype for functionality which might be of interest
outside of this project, e.g. related:

- regular `datalad run` to record activities in the shell.

  - [`run --interactive`](datalad/datalad#2158 (comment))
  - [`run --shell`](datalad/datalad#2275)

  so here I am "implementing" it, solely for containerized environments ATM,
  via an "over the head" communication to the shim through an environment variable

- `datalad run` for better record keeping, e.g.

  - [saving stdout/err](datalad/datalad#3385)

  so here I did not bother to establish stdout/err capture, but possibly
  could and might

- `reproman login`, or even `execute` (with or without --trace) and maybe `run`,
  where we could benefit from having an environment with a unified interface
  for interactive sessions which would also establish the record of activities

- just a regular shell environment to make a clear record of commands which were run

- might eventually absorb/meld with the "opinionated .bashrc"
  proposed for the training curriculum:
  ReproNim/module-reproducible-basics#26
  which provides assistance/docs for more efficient use of cmdline
  and establishes 'infinite bash history'.
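
The "over the head" channel mentioned in the first item can be sketched in a few lines of bash. This is only an illustration: the helper name `pick_cmd` is hypothetical, while the shim itself does the same thing inline with `cmd="${SINGULARITY_CMD:-$1}"`.

```shell
#!/bin/bash
# Hypothetical helper (not in the shim): choose the singularity subcommand.
# A non-empty SINGULARITY_CMD in the environment overrides the positional
# argument that `datalad containers-run` passes to the shim.
pick_cmd() {
    local requested="$1"
    printf '%s\n' "${SINGULARITY_CMD:-$requested}"
}
```

Because the variable travels through the environment, the recorded `datalad run` command line stays the same whether the session was interactive or not, which is also the source of the `rerun` ambiguity discussed further down.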

**reproshell???**

So it feels to me like a motivation for some kind of an independent reproshell
project which would be

- usable independently and easily installable/bindable (e.g. into a container)
- parametrizable to be invoked from the shim here and/or by datalad or reproman,
  so it could just take care of capturing all sidecar files into specified
  locations

**Could benefit from**

- knowing more about "datalad (containers-)run" invocation

Implemented now within the `singularity_cmd` shim, which could have benefited
from having additional information about how exactly it was `run`, and
from being able to instruct datalad run "upstairs" that there is now an additional file in
[extra_outputs](datalad/datalad#3094).
Hence there is datalad/datalad#3422

- [`datalad run` being able to 'cover' multiple commits](datalad/datalad#3265)

Interactivity creates ambiguity for `rerun` semantics:

- the run record ATM would say "reinvoke interactive session", which might be
  desirable on its own (e.g. to redo something manually in that original
  container)

- but for "automated reproducibility" we do have all the information (the bash
  history file, which is a list of commands to run), possibly recorded in another
  commit which ATM is not associated with the "run" record

So maybe by somehow [tagging run
commits](datalad/datalad#3371) it could be possible
to disambiguate/select specific run commits/records?
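
To make the "automated reproducibility" option concrete, here is a rough sketch of replaying a saved history file non-interactively. The function `replay_history` is hypothetical and not part of this commit; it runs the commands with plain `bash` on the host, whereas a real rerun would presumably execute them inside the original container.

```shell
#!/bin/bash
# Hypothetical helper: re-execute the commands stored in a bash history file.
# Usage: replay_history HISTFILE [WORKDIR]
replay_history() {
    local histfile workdir="${2:-$PWD}"
    # Resolve to an absolute path so the file is still found after `cd`
    histfile=$(readlink -f "$1")
    [ -r "$histfile" ] || { echo "E: cannot read history file $1" >&2; return 1; }
    # Subshell: directory changes made by the replay do not leak to the caller.
    # -e stops at the first failing command instead of carrying on blindly.
    ( cd "$workdir" && bash -e "$histfile" )
}
```

Commands that were interactive themselves (an editor, a pager) would not replay cleanly this way, so such a replay is only as good as the recorded session.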

<details>
<summary>**Example**</summary>

	(dev) 1 13348.....................................:Wed 15 May 2019 06:12:24 PM EDT:.
	(git-annex)hopa:~/proj/repronim/containers[enh-shell]git-annex
	$> SINGULARITY_CMD=shell datalad containers-run -n repronim-reproin
	[INFO   ] Making sure inputs are available (this may take some time)
	[INFO   ] == Command start (output follows) =====
	<ome/yoh/proj/repronim/containers$ echo "I will do something useful today"
	I will do something useful today
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers$ touch my-results
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers$ cd images/
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers/images$ ls
	bids  README.md  repronim
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers/images$ cd ../
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers$ ls
	binds  images  LICENSE	my-results  README.md  scripts
	<pa:/home/yoh/proj/repronim/containers$ rm LICENSE ; echo 'nobody needs those'
	nobody needs those
	singularity:repronim-reproin > yoh@hopa:/home/yoh/proj/repronim/containers$ exit
	add(ok): .repronim/bash_histories/0.1-3-ge25c927-2019-05-15T18:12:37-04:00 (file)
	save(ok): . (dataset)
	action summary:
	  add (ok: 1)
	  save (ok: 1)
	[INFO   ] == Command exit (modification check follows) =====
	delete(ok): LICENSE (file)
	add(ok): my-results (file)
	save(ok): . (dataset)
	action summary:
	  add (ok: 1)
	  delete (ok: 1)
	  get (notneeded: 1)
	  save (ok: 1)
	SINGULARITY_CMD=shell datalad containers-run -n repronim-reproin  3.42s user 1.74s system 9% cpu 54.068 total

	$> git log --stat HEAD^^..
	commit 89fed08617418e5ddb88ae11ee2c14db699acf31 (HEAD -> enh-shell)
	Author: Yaroslav Halchenko <[email protected]>
	Date:   Wed May 15 18:13:28 2019 -0400

		[DATALAD RUNCMD] ./scripts/singularity_cmd run images/rep...

		=== Do not change lines below ===
		{
		 "chain": [],
		 "cmd": "./scripts/singularity_cmd run images/repronim/repronim-reproin--0.5.4.sing ",
		 "dsid": "b02e63c2-62c1-11e9-82b0-52540040489c",
		 "exit": 0,
		 "extra_inputs": [],
		 "inputs": [
		  "images/repronim/repronim-reproin--0.5.4.sing"
		 ],
		 "outputs": [],
		 "pwd": "."
		}
		^^^ Do not change lines above ^^^

	 LICENSE    | 201 ---------------------------------------------------------------------------------------------
	 my-results |   1 +
	 2 files changed, 1 insertion(+), 201 deletions(-)

	commit 5aa3b3383c2746f7c1d07ecdcc73852eb0a30f17
	Author: Yaroslav Halchenko <[email protected]>
	Date:   Wed May 15 18:13:28 2019 -0400

		[REPRONIM/CONTAINERS]: bash history for the interactive session

		Actual changes might (or not, depending on the invocation) get committed in the next commit

	 .repronim/bash_histories/0.1-3-ge25c927-2019-05-15T18:12:37-04:00 | 7 +++++++
	 1 file changed, 7 insertions(+)

	$> cat .repronim/bash_histories/0.1-3-ge25c927-2019-05-15T18:12:37-04:00
	echo "I will do something useful today"
	touch my-results
	cd images/
	ls
	cd ../
	ls
	rm LICENSE ; echo 'nobody needs those'

</details>

**Additional possible features which might come here into a prototype**

- color info/error messages from the shim
- improve PS1 (probably multiline -- too much in a single line to still be
  able to edit commands)
- indicate being under [reproman --trace](ReproNim/reproman#416)
- provide a 'reactive' PS1 to alert the user when they leave the initial directory
  (thus ending up outside of the original dataset), possibly resulting in outputs which
  would not be recorded
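
The 'reactive' PS1 idea could look roughly like the sketch below. Nothing here exists in the shim yet: the variable `REPRO_TOPDIR` and the functions `is_inside_topdir` and `repro_prompt` are made-up names, and the snippet assumes it is sourced by the interactive bash started inside the container.

```shell
#!/bin/bash
# Hypothetical sketch of a prompt that reacts to leaving the start directory.
# REPRO_TOPDIR is an assumed variable: the directory the session started in.
REPRO_TOPDIR="${REPRO_TOPDIR:-$PWD}"

# Succeeds if the given directory (default: $PWD) is REPRO_TOPDIR or below it.
is_inside_topdir() {
    local dir="${1:-$PWD}"
    case "$dir/" in
        "$REPRO_TOPDIR"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Rebuild PS1 before every prompt; two lines keep the command line short.
repro_prompt() {
    if is_inside_topdir; then
        PS1='\u@\h:\w\n\$ '
    else
        # \[ \] tell readline that the color codes take no width
        PS1='\[\e[1;31m\][outside '"$REPRO_TOPDIR"': changes here are not recorded]\[\e[0m\]\n\u@\h:\w\n\$ '
    fi
}
PROMPT_COMMAND=repro_prompt
```

Appending `/` before matching keeps a sibling such as `/data/dsx` from being mistaken for a subdirectory of `/data/ds`.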
yarikoptic committed May 15, 2019
1 parent 2b6f83c commit a271e14
Showing 1 changed file with 95 additions and 1 deletion.
96 changes: 95 additions & 1 deletion scripts/singularity_cmd
@@ -37,19 +37,113 @@ function info() {
: # echo -e "I: $@" >&2
}

function error() {
echo -e "E: $@" >&2
exit 1
}

function has_changes() {
git status -s | grep -q .
}

function singularity_version() {
singularity --version | sed -e 's,^[^0-9]*,,g'
}

# https://stackoverflow.com/a/24067243
function version_gt() {
test "$(printf '%s\n' "$@" | sort -V | head -n 1)" != "$1";
}

thisdir=$(dirname $0| xargs readlink -f)
updir=$(dirname "$thisdir")

cmd="${SINGULARITY_CMD:-$1}"; shift

# We might need to expand list of arguments
args=("$@")

#
# Pass other useful variables inside the container
#
if [ ! -z "${DATALAD_CONTAINER_NAME:-}" ]; then
export SINGULARITYENV_DATALAD_CONTAINER_NAME="$DATALAD_CONTAINER_NAME"
fi

#
# Prepare bind mounts
#

# singularity bind mounts system /tmp, which might result in side-effects
# Create a dedicated temporary directory to be removed upon completion
tmpdir=$(mktemp -d --suffix=singtmp)
info "created temp dir $tmpdir"
trap "rm -fr '$tmpdir' && info 'removed temp dir $tmpdir'" exit

-singularity "$cmd" -e -c -W "$tmpdir" -H "$updir/binds/HOME" -B $PWD --pwd "$PWD" "$@"
#
# Prepare for storing bash history in cmd='shell' mode
#
# Will be non-empty if some post-run handling is needed
FINAL_BASH_HISTORY=
TEMP_BASH_HISTORY_LOCAL=
if [ "$cmd" = "shell" ]; then
# should be outside of $tmpdir so we could copy it there before
# trap cleans things up
histstamp=$(git describe --always)-$(date -Iseconds)
TEMP_BASH_HISTORY_LOCAL=$(mktemp -t bash_history.$histstamp.XXXXXXXXX)
TEMP_BASH_HISTORY_FILENAME=$(basename "$TEMP_BASH_HISTORY_LOCAL")
TEMP_BASH_HISTORY="$tmpdir/tmp/$TEMP_BASH_HISTORY_FILENAME"
# singularity 2.x seems to mess with HISTFILE - cannot pass through!
if version_gt 3 "$(singularity_version)"; then
error "Can manipulate bash history only with singularity >= 3"
fi
# Expose it to singularity environment
export SINGULARITYENV_HISTFILE="/tmp/$TEMP_BASH_HISTORY_FILENAME"
# We will copy it only if it was clean and new changes emerged
# Handle (save) protocol of interactive sessions
if ! has_changes ; then
# TODO: place at the top of the dataset!?
FINAL_BASH_HISTORY=".repronim/bash_histories/$histstamp"
# TODO: cleanup TEMP_BASH_HISTORY in case of crash?
else
echo "W: uncommitted changes present, 'shell' mode will NOT commit bash history."
echo " You will find stored history at $TEMP_BASH_HISTORY_LOCAL"
fi
if [ "$#" -gt 1 ]; then
error "for 'shell' mode - do not provide any custom command. Got options: $@"
fi
cmd="exec"
args+=(bash)
fi

#
# The actual invocation
#
singularity "$cmd" -e -c -W "$tmpdir" -H "$updir/binds/HOME" -B "$PWD" --pwd "$PWD" "${args[@]}"


#
# Handle possible digital objects to save/be added to be saved
#
if [ ! -z "$FINAL_BASH_HISTORY" ]; then
if ! has_changes ; then
# TODO: someone might want to just record their wanderings around, so
# might be worth an option to force saving history only
echo "I: no changes to the tree detected. Bash history will not be saved."
echo " You will find stored history at $TEMP_BASH_HISTORY_LOCAL"
else
mkdir -p "$(dirname "$FINAL_BASH_HISTORY")"
mv "$TEMP_BASH_HISTORY" "$FINAL_BASH_HISTORY"
# due to https://github.com/datalad/datalad/issues/3421 saving entire directory of histories
datalad save \
-m "[REPRONIM/CONTAINERS]: bash history for the interactive session

Actual changes might (or not, depending on the invocation) get committed in the next commit" \
"$(dirname "$FINAL_BASH_HISTORY")"
fi
fi

if [ ! -z "$TEMP_BASH_HISTORY_LOCAL" ] && [ -e "$TEMP_BASH_HISTORY" ]; then
# So we did create it but did not move to be saved, so let's expose locally before it is wiped out
mv "$TEMP_BASH_HISTORY" "$TEMP_BASH_HISTORY_LOCAL"
fi
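
The version guard in the `shell` branch depends on how `version_gt` and the `sed` expression in `singularity_version` behave, so a small standalone check may help. `version_gt` is copied from the shim; `strip_version_prefix` is a hypothetical variant of `singularity_version` that takes the banner as an argument, so it can be tried without singularity installed. The sample banner strings are assumptions about what different releases print. `sort -V` requires GNU coreutils.

```shell
#!/bin/bash
# Copied from the shim: succeeds when $1 is strictly newer than $2.
version_gt() {
    test "$(printf '%s\n' "$@" | sort -V | head -n 1)" != "$1"
}

# Hypothetical stand-in for singularity_version: drop everything before
# the first digit of a version banner passed as $1.
strip_version_prefix() {
    printf '%s\n' "$1" | sed -e 's,^[^0-9]*,,'
}

# Mirrors the guard in the shim: refuse anything older than 3.
supports_histfile() {
    ! version_gt 3 "$(strip_version_prefix "$1")"
}
```

With these definitions, `supports_histfile "singularity version 3.1.0"` succeeds and `supports_histfile "2.6.1-dist"` fails, matching the shim's "singularity >= 3" requirement.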
