OVIS is a modular system for HPC data collection, transport, storage, log message exploration, visualization, and analysis.
LDMS is a low-overhead, low-latency framework for collecting, transferring, and storing metric data on a large distributed computer system.
The framework includes:
- a public API with a reference implementation
- tools for collecting, aggregating, transporting, and storing metric values
- collectors for several common types of metrics
- data transport over socket, RDMA (IB/iWarp/RoCE), Cray Gemini, and Cray Aries
The API lets vendors expose system information in a uniform manner without requiring them to provide the source code for accessing that information, which might reveal proprietary methods or information (although we advise that the source be included).
Metric information can be updated by a kernel module that runs only when applications yield the processor, and it can be transported using RDMA-like operations, resulting in minimal jitter during collection. LDMS has been run on 10,000 cores collecting over 100,000 metric values per second with less than 0.2% overhead.
You may obtain the source code by downloading an official release tarball or by cloning the ovis-hpc/ovis Git repository on GitHub.
Official release tarballs are available from the GitHub releases page:
https://github.com/ovis-hpc/ovis/releases
The tarball is available in the "Assets" section of each release. Be sure to download the tarball whose name has the form "ovis-ldms-X.X.X.tar.gz".
The links named "Source code (zip)" and "Source code (tar.gz)" are automatic GitHub links that we are unable to remove. They lack the configure script because they are raw source from the Git repository, not the official release tarball distribution.
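Since the two kinds of archive are easy to confuse, a quick check before building can help. The following is a minimal sketch, not part of OVIS: is_release_tarball is a hypothetical helper name, and it assumes the official tarball unpacks into a single top-level directory that contains the configure script.

```shell
#!/bin/sh
# Sketch: sanity-check a downloaded archive before building.
# is_release_tarball is a hypothetical helper, not part of OVIS.

is_release_tarball() {
    file=$1
    # The official tarball is named ovis-ldms-X.X.X.tar.gz.
    case $(basename "$file") in
        ovis-ldms-*.tar.gz) ;;
        *) echo "unexpected name: $file" >&2; return 1 ;;
    esac
    # Assumption: the official tarball has a configure script at the top of
    # the archive; the automatic "Source code" archives do not.
    if tar -tzf "$file" | grep -Eq '^(\./)?[^/]+/configure$'; then
        return 0
    fi
    echo "no top-level configure script in $file" >&2
    return 1
}
```

Calling is_release_tarball with the path of the downloaded file returns success only when both the name and the contents look like an official release.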
To clone the source code, go to https://github.com/ovis-hpc/ovis and click on the "Code" button, or use the following command:
git clone https://github.com/ovis-hpc/ovis.git -b OVIS-4
- autoconf (>=2.63)
- automake
- libtool
- make
- bison
- flex
- libreadline
- openssl development library (for OVIS, LDMS Authentication)
- libmunge (for Munge LDMS Authentication plugin)
- Python >= 3.6 and Cython >= 0.25 (for the LDMS Python API and ldmsd_controller)
- doxygen (for the OVIS documentation)
Some LDMS plug-ins have dependencies on additional libraries.
For Cray-related LDMS sampler plug-in dependencies, please see the man page of the plug-in in ldms/man/.
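The minimum versions listed above (autoconf >= 2.63, Python >= 3.6) can be checked from the shell. This is a rough sketch under stated assumptions: version_ge and report are hypothetical helper names, and the comparison relies on "sort -V" from GNU coreutils, which may not exist on every platform.

```shell
#!/bin/sh
# Sketch: compare installed tool versions against the documented minimums.
# version_ge and report are hypothetical helpers, not part of OVIS.

version_ge() {
    # Succeeds when version $1 >= version $2 (uses GNU "sort -V").
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

report() {
    # $1 = tool name, $2 = detected version (may be empty), $3 = minimum
    if [ -n "$2" ] && version_ge "$2" "$3"; then
        echo "$1 $2: ok (>= $3)"
    else
        echo "$1 ${2:-not found}: need >= $3"
    fi
}

# Detect versions; an empty result means the tool was not found.
autoconf_ver=$(autoconf --version 2>/dev/null | head -n 1 | awk '{print $NF}')
python_ver=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null) || true

report autoconf "$autoconf_ver" 2.63
report python3 "$python_ver" 3.6
```

A version sort is used rather than a plain string or numeric comparison so that, for example, Python 3.10 is correctly treated as newer than 3.6.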
RHEL7/CentOS7 systems require the following packages at a minimum:
- autoconf
- automake
- libtool
- make
- bison
- flex
- openssl-devel
Additionally, the Python API and the ldmsd_controller command require Python and Cython. One way to obtain those packages is from EPEL (install the epel-release package, and then "yum update"). The packages from EPEL are:
- python3-devel
- python36-Cython
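After installing the packages above, it can be useful to confirm that the build tools are actually on the PATH. The sketch below is only an illustration: missing_tools is a hypothetical helper name, and it assumes each package provides a command of the same name (true for the RHEL7/CentOS7 packages listed, but not guaranteed on other distributions).

```shell
#!/bin/sh
# Sketch: list required build commands that are not on PATH.
# missing_tools is a hypothetical helper, not part of OVIS.

missing_tools() {
    # Print each named command that is not found; fail if any is missing.
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}

if missing=$(missing_tools autoconf automake libtool make bison flex); then
    echo "all build tools found"
else
    echo "missing build tools:"
    echo "$missing"
fi
```

This checks commands only; development libraries such as openssl-devel would still need to be verified through the package manager.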
If you are interested in storing LDMS data in SOS, first follow the instructions at https://github.com/ovis-hpc/sos to obtain, build, and install SOS before proceeding.
cd <ovis source directory>
sh autogen.sh
./configure [--prefix=<installation prefix>] [other options]
make
make install
Run configure --help for a full list of configure options.
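The build steps above can be wrapped in a small script that stops at the first failing step. This is a minimal sketch, not an official OVIS tool: build_ovis, run, and the DRY_RUN variable are hypothetical names introduced here, and only the --prefix option from the steps above is passed to configure.

```shell
#!/bin/sh
# Sketch: run the documented build steps in order, stopping on first failure.
# build_ovis, run, and DRY_RUN are hypothetical names, not part of OVIS.

run() {
    # Print the command; execute it unless DRY_RUN is set to 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

build_ovis() {
    # $1 = OVIS source directory, $2 = installation prefix
    src=$1
    prefix=$2
    run cd "$src" &&
    run sh autogen.sh &&
    run ./configure --prefix="$prefix" &&
    run make &&
    run make install
}
```

With DRY_RUN set to 1, calling build_ovis with a source directory and a prefix only prints the five commands, each prefixed with "+ ", which is a convenient way to review them before a real build.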
- Ubuntu and friends
- CentOS and friends
- Cray XE6, Cray XK, Cray XC
The following LDMS sampler plugins are considered unsupported. Use at your own risk:
- perfevent sampler
- hweventpapi sampler
- switchx