HW-SD is a library and daemon for the discovery and announcement of hardware resources using ZeroConf. It enables auto-configuration of ad-hoc GPU clusters and multi-GPU machines.
The source code is hosted on GitHub and documented on eyescale.github.io.
The HW-SD library uses modules which implement discovery using different protocols. Each module is a separate library, which can be selectively linked by applications to limit dependencies. Currently available are:
- gpu_dns_sd: Remote ZeroConf (Bonjour) discovery for GPUs announced by the daemon
- gpu_cgl: Local discovery of Carbon displays (Mac OS X only)
- gpu_glx: Local discovery of X11 servers and screens
- gpu_wgl: Local discovery of WGL_NV_gpu_affinity, WGL_AMD_gpu_association or Windows displays (Windows only)
- net_dns_sd: Remote ZeroConf (Bonjour) discovery for network interfaces announced by the daemon
- net_sys: Local discovery of network interfaces
When an application runs through VirtualGL, hwsd detects this and sets FLAG_VIRTUALGL on all local GPUs, and additionally FLAG_VIRTUALGL_DISPLAY on the GPU that VirtualGL uses for redirection. So far this is implemented only for GLX.
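Testing for these flags is a plain bitmask check. The following is a minimal sketch only: the `sketch` namespace, the helper names and the bit values are placeholders chosen for illustration, since the real constants are defined in gpuInfo.h.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch
{
// Placeholder bit values for illustration; the real constants are defined
// in hwsd's gpuInfo.h and may differ.
const int32_t FLAG_VIRTUALGL = 0x1;
const int32_t FLAG_VIRTUALGL_DISPLAY = 0x2;

// True if the flags mark a GPU of an application run through VirtualGL.
bool runsThroughVirtualGL(const int32_t flags)
{
    return (flags & FLAG_VIRTUALGL) != 0;
}

// True if the flags mark the GPU that VirtualGL uses for redirection.
bool isRedirectionGPU(const int32_t flags)
{
    return (flags & FLAG_VIRTUALGL_DISPLAY) != 0;
}
}

// Usage: the redirection GPU carries both flags, other local GPUs only one.
void demoFlags()
{
    using namespace sketch;
    const int32_t redirectionGPU = FLAG_VIRTUALGL | FLAG_VIRTUALGL_DISPLAY;
    const int32_t otherGPU = FLAG_VIRTUALGL;
    assert(runsThroughVirtualGL(otherGPU) && !isRedirectionGPU(otherGPU));
    assert(runsThroughVirtualGL(redirectionGPU));
    assert(isRedirectionGPU(redirectionGPU));
}
```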
The daemon uses all available local modules to query local GPUs and network interfaces, and announces them on the local network using ZeroConf. The service type names are "_gpu-sd" and "_net-sd". The dns_sd discovery modules gather the information announced by all daemons on the local network. The following protocol is used by the daemon:
- Session=default | <string>
- NodeID=<UUID>
- Hostname=<string> // optional, hostname for connections
- GPU Count=<integer>
- GPU<integer> Type=GLX | WGL | WGLn | WGLa | CGL
- GPU<integer> Port=<integer> // X11 display number, 0 otherwise
- GPU<integer> Device=<integer> // X11 screen number, wglEnumGpusNV index, CGDirectDisplayID
- GPU<integer> Width=<integer>
- GPU<integer> Height=<integer>
- GPU<integer> X=<integer>
- GPU<integer> Y=<integer>
- GPU<integer> Flags=<integer> // optional flags (see gpuInfo.h)
- Net Count=<integer>
- Net<integer> Type=TYPE_ETHERNET | TYPE_INFINIBAND | TYPE_LOOPBACK | TYPE_UNKNOWN
- Net<integer> Name=<string>
- Net<integer> Hostname=<string>
- Net<integer> MAC=<string> // ':' as separator
- Net<integer> IPv4=<string> // ':' as separator
- Net<integer> IPv6=<string> // ':' as separator
- Net<integer> Linkspeed=<integer> // in Megabits per second
- Net<integer> Up=<bool>
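To illustrate how the indexed keys of this protocol fit together, here is a minimal, self-contained sketch that turns a list of "key=value" entries into a map and reads integer fields from it. It does not use the HW-SD API: `Record`, `parseEntries`, `getInt` and `demoParse` are hypothetical names, and the sketch assumes that each entry arrives as one "key=value" string.

```cpp
#include <cassert>
#include <exception>
#include <map>
#include <string>
#include <vector>

// Hypothetical container, not an hwsd type: maps a protocol key such as
// "GPU0 Width" to its announced value.
using Record = std::map<std::string, std::string>;

// Splits each entry at the first '=' and stores key and value.
// Entries without '=' or with an empty key are skipped.
Record parseEntries(const std::vector<std::string>& entries)
{
    Record record;
    for (const std::string& entry : entries)
    {
        const std::string::size_type pos = entry.find('=');
        if (pos == std::string::npos || pos == 0)
            continue;
        record[entry.substr(0, pos)] = entry.substr(pos + 1);
    }
    return record;
}

// Reads an integer field such as "GPU Count". Returns the fallback if the
// key is missing or its value does not start with a number.
int getInt(const Record& record, const std::string& key, const int fallback = 0)
{
    const Record::const_iterator it = record.find(key);
    if (it == record.end())
        return fallback;
    try
    {
        return std::stoi(it->second);
    }
    catch (const std::exception&)
    {
        return fallback;
    }
}

// Usage: "GPU Count" gives the number of GPU<integer> key groups to read.
void demoParse()
{
    const Record record = parseEntries({"Session=default", "GPU Count=1",
                                        "GPU0 Type=GLX", "GPU0 Width=1920"});
    const int nGPUs = getInt(record, "GPU Count");
    assert(nGPUs == 1);
    const std::string prefix = "GPU" + std::to_string(nGPUs - 1) + " ";
    assert(record.at(prefix + "Type") == "GLX");
    assert(getInt(record, prefix + "Width") == 1920);
}
```

Optional keys such as Hostname or GPU<integer> Flags may be absent, which is why the lookup takes a fallback value.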
HW-SD is a cross-platform library, designed to run on any modern operating system, including all Unix variants and the Windows operating system. ZeroConf support in Lunchbox is required for the dns_sd modules and applications. The following platforms and build environments are tested:
- Linux: Ubuntu 16.04, RHEL 6.8 (Makefile, Ninja)
- Windows: 7 (Visual Studio 2012)
- Mac OS X: 10.9 (Makefile, Ninja)
The build system uses CMake and follows the standard CMake build process:
git clone --recursive https://github.com/Eyescale/hwsd.git
mkdir hwsd/build
cd hwsd/build
cmake -GNinja .. -DCLONE_SUBPROJECTS=ON
ninja
A ZeroConf implementation is required for the dns_sd module and the daemon. On Mac OS X it is part of the operating system. On Linux, Avahi is tested ('sudo apt-get install libavahi-compat-libdnssd-dev' on Ubuntu). On Windows, use the Bonjour SDK. If no ZeroConf implementation is found, HW-SD is compiled with the local discovery modules only.
Please file a Bug Report if you find any issue with this software.
An application can use the discovery by linking the relevant module libraries, instantiating the modules in the code and then querying the instantiated modules. The following will find all remote and local GPUs and local network interfaces on Windows:
#include <hwsd/hwsd.h>
// Instantiate the local WGL and the remote ZeroConf GPU modules, then query
hwsd::gpu::wgl::Module::use();
hwsd::gpu::dns_sd::Module::use();
const hwsd::GPUInfos& gpus = hwsd::discoverGPUInfos();
// Instantiate the local network module, then query
hwsd::net::sys::Module::use();
const hwsd::NetInfos& nets = hwsd::discoverNetInfos();
Filters are chainable functors which can be passed to the query function to discard information. The following filters are provided:
- DuplicateFilter eliminates duplicates, e.g., when one announcement is seen on multiple interfaces
- MirrorFilter eliminates the same GPU with a different type, e.g., when enabling both the cgl and glx modules on Mac OS X
- SessionFilter discards all resources not belonging to a given session
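The idea of chaining filters can be sketched without the library. The code below is an illustration only and not the hwsd filter API: `Resource`, `Filter`, `chain`, `sessionFilter`, `duplicateFilter` and `applyFilter` are hypothetical stand-ins. A filter sees the resources accepted so far plus one candidate, and a chained filter keeps the candidate only if both of its parts do.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a discovered resource; not an hwsd type.
struct Resource
{
    std::string session;
    std::string node;
    int device;
};

// A filter returns true to keep the candidate, given what was accepted so far.
using Filter =
    std::function<bool(const std::vector<Resource>&, const Resource&)>;

// Chains two filters: the candidate is kept only if both accept it.
Filter chain(Filter first, Filter second)
{
    return [first, second](const std::vector<Resource>& accepted,
                           const Resource& candidate) -> bool {
        return first(accepted, candidate) && second(accepted, candidate);
    };
}

// Keeps only resources that belong to the given session.
Filter sessionFilter(const std::string& session)
{
    return [session](const std::vector<Resource>&,
                     const Resource& candidate) -> bool {
        return candidate.session == session;
    };
}

// Drops a candidate whose node and device were already accepted.
Filter duplicateFilter()
{
    return [](const std::vector<Resource>& accepted,
              const Resource& candidate) -> bool {
        for (const Resource& resource : accepted)
            if (resource.node == candidate.node &&
                resource.device == candidate.device)
                return false;
        return true;
    };
}

// Runs the filter over all candidates in order and returns the survivors.
std::vector<Resource> applyFilter(const std::vector<Resource>& input,
                                  const Filter& filter)
{
    std::vector<Resource> accepted;
    for (const Resource& candidate : input)
        if (filter(accepted, candidate))
            accepted.push_back(candidate);
    return accepted;
}

// Usage: one announcement seen twice plus a resource of another session.
void demoFilters()
{
    const std::vector<Resource> seen = {{"default", "node-a", 0},
                                        {"default", "node-a", 0},
                                        {"other", "node-b", 0},
                                        {"default", "node-b", 1}};
    const Filter filter = chain(sessionFilter("default"), duplicateFilter());
    const std::vector<Resource> result = applyFilter(seen, filter);
    assert(result.size() == 2);
}
```

Passing the already accepted resources to each filter is one possible design; it lets a duplicate check work without keeping state inside the filter itself.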