Computing Environment

This tutorial is conducted in a Linux/Unix terminal session. In other words, you must already be connected to a remote Linux machine at MGHPCC to continue.

The computer cluster at MGHPCC is made up of hundreds or thousands of Linux machines connected by a network, with shared storage and software. To use this environment for "big" computing tasks, we first need to learn some special tools (Linux commands, shell scripting, and LSF commands) to "communicate" with the cluster.

Typesetting Conventions

Command-line examples that you are meant to type into a terminal window will be shown indented in a constant width font, e.g.

echo $USER

Sometimes the accompanying text will include a reference to a Unix command. Any such text will also be shown in a constant-width, boxed font, e.g. type the ls command again.

Unix/Linux Shell

A Unix/Linux shell is a command-line interpreter that provides a user interface to the Unix/Linux operating system. Users control the computer by entering single commands or by running one or more commands from a shell script. Whatever you type at the command line is read and interpreted by a program, which executes your command and then gives you its output. That program is called the shell. Several common shells are available on MGHPCC:

Shell	Description
bash	a Bourne-shell (sh) compatible shell with many newer advanced features as well
tcsh	an advanced variant of csh with many of the features of modern shells
zsh	an advanced shell which incorporates many features of bash, tcsh, and ksh
csh	the original C-style shell

The default shell provided to MGHPCC users is the bash shell. To discover your current shell:

echo $SHELL
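Note that $SHELL holds the login shell recorded for your account, which can differ from the shell you are typing into if you started another one by hand. A minimal sketch for comparing the two, assuming a ps that accepts the -p and -o options (as on most Linux systems); the variable names are illustrative only:

```shell
# Login shell recorded for the account (may be empty in unusual environments)
login_shell=$SHELL

# Name of the process running this shell; falls back to "unknown" if ps
# is missing or does not accept these options
running_shell=$(ps -p $$ -o comm= 2>/dev/null || echo "unknown")

echo "login shell:   $login_shell"
echo "running shell: $running_shell"
```

If the two lines disagree, the second one describes the shell that is actually interpreting your commands.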

Environment Variables

Environment variables are a set of named, dynamic values that can control the way running processes behave on a computer. Many Unix commands and tools require certain environment variables to be set. Many of these are set automatically for users when they log in or load applications via the module command. To view your current set of environment variables, run the following command:

env

Environment variables provide a way to influence the behaviour of software on the system. For example, the "LANG" environment variable determines the language in which software programs communicate with the user.

Environment variables consist of names that have values assigned to them. For example, on a typical system in the US we would have the value "en_US.UTF-8" assigned to the "LANG" variable.
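As a rough illustration of a name carrying a value, the sketch below assigns a variable and then exports it so that child processes can see it. DEMO_GREETING is a made-up name used only for this demonstration, and sh -c is used here simply as a convenient way to start a child process:

```shell
# Start from a clean slate in case the name is already in use
unset DEMO_GREETING

# A plain assignment creates a shell variable, visible only in this shell
DEMO_GREETING="hello"
seen_before=$(sh -c 'echo "${DEMO_GREETING:-unset}"')

# export places it in the environment, so child processes inherit it
export DEMO_GREETING
seen_after=$(sh -c 'echo "${DEMO_GREETING:-unset}"')

echo "child process before export: $seen_before"
echo "child process after export:  $seen_after"

# Clean up the demo variable
unset DEMO_GREETING
```

The child process reports "unset" before the export and "hello" after it, which is why tools that read their settings from the environment need the variable to be exported rather than merely assigned.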

To print the value of a variable:

echo $<NAME>   # e.g. echo $HOME

Some Examples:

$ echo $USER
ml23a
$ echo $HOME
/home/ml23a
$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

The last example shows the content of the $PATH environment variable: a colon-separated list of directories that the shell searches for programs you can run. This includes the standard Unix commands, e.g. ls, cp, and mv. These commands are ordinary executable files that live in one of those directories (e.g. ls is just an executable file, typically in /bin or /usr/bin).
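To make the lookup concrete, here is a small sketch that prints the $PATH directories one per line and then asks the shell where it finds ls. The variable name ls_location is illustrative:

```shell
# Print each directory in PATH on its own line, in search order
printf '%s\n' "$PATH" | tr ':' '\n'

# Ask the shell which file it would run for ls
ls_location=$(command -v ls)
echo "ls resolves to: $ls_location"
```

The shell searches the directories in the order listed and runs the first match. In an interactive session where ls is aliased, command -v reports the alias instead of a file path.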

Knowing how to change your $PATH to include custom directories can be necessary sometimes (e.g. if you install some new bioinformatics software in a non-standard location).
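A minimal sketch of extending $PATH, assuming a throwaway directory stands in for the non-standard install location. The directory, the hello_tool script, and the variable names are all hypothetical and created only for this demonstration:

```shell
# Hypothetical install location, created here just for the demo
tool_dir=$(mktemp -d)

# A tiny stand-in for a newly installed program
cat > "$tool_dir/hello_tool" <<'EOF'
#!/bin/sh
echo "hello from a custom tool"
EOF
chmod +x "$tool_dir/hello_tool"

# Prepend the directory so the shell searches it first
export PATH="$tool_dir:$PATH"

# The tool can now be run by name, without typing its full path
tool_output=$(hello_tool)
echo "$tool_output"
command -v hello_tool
```

The change lasts only for the current session. Adding the same export line to ~/.bashrc is the usual way to make it persist for bash users.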

Modules

A module manages the environment variables needed to use a particular piece of software.

To see a list of modules that are currently loaded:

module list

To see a list of modules that are available to be loaded:

module avail

To see what environment variables would be set/changed if you load a specific module:

module show <module_name>

To load a module:

module load <module_name>

To unload a module:

module unload <module_name>
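The commands above can be combined into a short session. The sketch below is hedged in two ways: "samtools" is only a hypothetical search term, not a module confirmed to exist on MGHPCC, and the guard is there because the module command is usually defined only in cluster login sessions. On many installations module writes its listings to standard error, which is why the output is redirected before filtering:

```shell
# Hypothetical search term; substitute a name shown by `module avail`
search_term="samtools"

if command -v module >/dev/null 2>&1; then
    module_status="available"
    # Modules currently loaded in this session
    module list
    # Merge stderr into stdout so the listing can be filtered with grep
    module avail 2>&1 | grep -i "$search_term" || echo "no module matching '$search_term'"
else
    module_status="missing"
    echo "module command not found; try this on a cluster login node"
fi
```

Once a matching name appears, module show and module load can be run against it exactly as described above.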

Modules/Software/Tools on HPC

MGHPCC servers already have many bioinformatics programs installed. The cluster uses 'modules' to systematically organize, version, and load software and libraries.

Try the module command on its own to see all of the options the tool offers.

Try the command module avail to see all of the modules available on the server.

A complete list of available modules and module names is also published online.