A more expressive and most importantly, more efficient system for distributed data analytics.

Husky

Husky is a distributed computing system designed to handle mixed jobs of coarse-grained transformations, graph computing and machine learning. The core of Husky is written in C++ so as to leverage the performance of native runtime. For machine learning, Husky supports relaxed consistency level and asynchronous computing in order to exploit higher network/CPU throughput.

For more details about Husky, please check our Wiki.

For bugs in Husky, please file an issue on GitHub.

For further discussion, please send an email to [email protected].

Dependencies

Husky has the following minimal dependencies:

  • CMake (version >= 3.0.2; with version >= 3.6.0, set CMAKE_PREFIX_PATH first if CMake fails to find the dependencies below)
  • ZeroMQ (including both libzmq and cppzmq)
  • Boost (Version >= 1.58)
  • A working C++ compiler (clang/gcc Version >= 4.9/icc/MSVC)
  • TCMalloc (in gperftools)
  • GLOG (latest version; it will be included automatically)
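
If CMake cannot locate the dependencies, CMAKE_PREFIX_PATH can be passed on the command line when configuring. The following is a minimal sketch, not the project's official procedure: the install prefixes are hypothetical and should be replaced with wherever ZeroMQ, Boost and gperftools actually live on your machine. It only prints the configure command so that it can be inspected first.

```shell
# Hypothetical install prefixes (assumption); adjust to your machine.
PREFIXES="/usr/local /opt/zeromq /opt/gperftools"

# CMAKE_PREFIX_PATH is a semicolon-separated list of install prefixes.
PREFIX_PATH=$(echo "$PREFIXES" | tr ' ' ';')

# Print the configure command; drop the leading "echo" to actually run it
# from the build directory.
echo cmake "-DCMAKE_PREFIX_PATH=$PREFIX_PATH" -DCMAKE_BUILD_TYPE=Release ..
```

Keep the quotes around the -D argument when running the command, since an unquoted semicolon would otherwise be treated by the shell as a command separator.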

Some optional dependencies:

Build

Download the latest source code of Husky:

$ git clone https://github.com/husky-team/husky.git

Or download the latest release from the Release Notes.

We assume the root directory of Husky is $HUSKY_ROOT. Go to $HUSKY_ROOT and do an out-of-source build using CMake:

$ cd $HUSKY_ROOT
$ mkdir release
$ cd release
$ cmake -DCMAKE_BUILD_TYPE=Release ..  # CMAKE_BUILD_TYPE: Release, Debug, RelWithDebInfo
$ make help                            # List all build targets
$ make -j{N} Master                    # Build the Husky master
$ make $ApplicationName                # Build the Husky application

Husky can also be compiled as a static or shared library for projects that are based on it.

$ make -j{N} husky                     # Build the static library (the default)
$ cmake -DBUILD_SHARED_LIBRARY=ON ..   # Enable the shared library target
$ make -j{N} husky-shared              # Build shared library

We also provide a Docker image of Husky with the dependencies already installed. Try it on Docker Hub.

Configuration

Husky is intended to run on any platform. Configuration can be stored in a configuration file (INI format) or passed as command-line arguments when running Husky. An example configuration file looks like this:

# Required
master_host=xxx.xxx.xxx.xxx
master_port=yyyyy
comm_port=yyyyy

# Optional
log_dir=path/to/log
hdfs_namenode=xxx.xxx.xxx.xxx
hdfs_namenode_port=yyyyy

# For Master
serve=1

# Section for worker information
[worker]
info=master:3

For a single-machine environment, use the hostname of the machine as both the master and the (only) worker.
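
As an illustration of the single-machine case, the sketch below writes a configuration file in which the local hostname serves as both master and worker. The file name, the port numbers and the value 4 after the colon are assumptions chosen for the example; this README does not say what the number after the colon means (it is presumably the thread count on that worker), so check the Wiki before relying on it.

```shell
# Use the local hostname for both the master and the only worker.
MACHINE=$(hostname 2>/dev/null || uname -n)

# Ports and the ":4" suffix are illustrative assumptions, not defaults.
cat > husky-single.conf <<EOF
# Required
master_host=${MACHINE}
master_port=10086
comm_port=12306

# For Master
serve=1

# Section for worker information
[worker]
info=${MACHINE}:4
EOF
```

The resulting file can then be passed with --conf to both the master and the application.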

For a distributed environment, first copy $HUSKY_ROOT/scripts/exec.sh and modify it according to the actual configuration. scripts/exec.sh depends on pssh.
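
For a cluster, the configuration file itself also needs to describe every worker. The sketch below generates one under a loudly stated assumption: that the [worker] section accepts one info= line per worker machine. This README only shows a single info= line, so verify the format against the Wiki. The hostnames, ports, file name and per-worker number are all hypothetical.

```shell
# Hypothetical cluster layout (assumption); replace with real hostnames.
MASTER="master-node"
WORKERS="worker1 worker2 worker3"
PER_WORKER=4   # number written after the colon in each info= line

{
  echo "# Required"
  echo "master_host=${MASTER}"
  echo "master_port=10086"   # arbitrary example port
  echo "comm_port=12306"     # arbitrary example port
  echo
  echo "# For Master"
  echo "serve=1"
  echo
  echo "[worker]"
  # Assumed format: one info= line per worker machine.
  for w in $WORKERS; do
    echo "info=${w}:${PER_WORKER}"
  done
} > husky-cluster.conf
```

Generating the file from a host list keeps the worker entries consistent with whatever host list is used to launch the workers.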

Run a Husky Program

Run ./Master --help for help. Check the examples in the examples directory.

First make sure that the master is running. Use the following to start the master:

$ ./Master --conf /path/to/your/conf

In the single-machine environment, use the following:

$ ./<executable> --conf /path/to/your/conf

In the distributed environment, use the following to execute workers on all machines:

$ cp $HUSKY_ROOT/scripts/exec.sh .
$ ./exec.sh <executable> --conf /path/to/your/conf

If MPI is installed in the distributed environment, you may use the following instead:

$ cp $HUSKY_ROOT/scripts/mpi-exec.sh .
$ ./mpi-exec.sh <executable> --conf /path/to/your/conf

Run Husky Unit Test

Husky provides a set of unit tests (based on gtest 1.7.0) in core/. Run them with:

$ make HuskyUnitTest
$ ./HuskyUnitTest

Documentation

Do the following to generate the API documentation:

$ doxygen doxygen.config

Or use the provided script:

$ ./scripts/doxygen.py --gen

Then go to html/ for the HTML documentation, and latex/ for the LaTeX documentation.

Start an HTTP server to view the documentation in a browser:

$ ./scripts/doxygen.py --server

License

Copyright 2016-2017 Husky Team

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
