YANG Fan edited this page Sep 23, 2016 · 3 revisions

Skeleton

The skeleton of a C++ Husky program is as follows:

#include "core/engine.hpp"

using namespace husky;

void job() {
    // work here ...
}

int main(int argc, char** argv) {
    init_with_args(argc, argv);
    run_job(job);
    return 0;
}

A Simple Config

The following shows a very simple configuration file for a single-machine setting.

master_host=localhost
master_port=10086
comm_port=12306

[worker]
info=localhost:4

The file is in INI format. It first defines the master host, the port that the master should use (master_port), and the port that the workers should use (comm_port). The [worker] section then says to run four Husky worker threads on the same machine (localhost:4).
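To make the `info=localhost:4` entry concrete, here is a minimal sketch of how such a value splits into a host name and a thread count. `WorkerInfo` and `parse_worker_info` are hypothetical names introduced for illustration only; they are not part of Husky, and Husky's own config parsing may differ.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper type, not part of Husky: one parsed `info=` entry.
struct WorkerInfo {
    std::string host;  // machine that runs the worker threads
    int num_threads;   // number of worker threads on that machine
};

// Splits a "host:threads" value at the last ':'.
// Throws std::invalid_argument if either part is missing or the count is not a number.
WorkerInfo parse_worker_info(const std::string& value) {
    const std::size_t pos = value.rfind(':');
    if (pos == std::string::npos || pos == 0 || pos + 1 == value.size()) {
        throw std::invalid_argument("expected host:threads, got '" + value + "'");
    }
    return WorkerInfo{value.substr(0, pos), std::stoi(value.substr(pos + 1))};
}
```

Calling `parse_worker_info("localhost:4")` yields the host `localhost` and a thread count of 4, which matches the single-machine setup described above.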

A Simple Program

Let's start by defining two kinds of objects.

class Staff {
   public:
    using KeyT = int;
    Staff() = default;
    Staff(int key) : key_(key) {}
    const KeyT& id() const { return key_; }
   protected:
    KeyT key_;
};

class Boss {
   public:
    using KeyT = int;
    Boss() = default;
    Boss(int key) : key_(key) {}
    const KeyT& id() const { return key_; }
   protected:
    KeyT key_;
};

Then we are going to read from HDFS and create some Staff objects.

void job() {
    HDFSLineInputFormat infmt;
    infmt.set_input("/path/to/input");

    auto& staff_list = ObjListFactory::create_objlist<Staff>();
    load(infmt, [&](boost::string_ref& chunk) {
        staff_list.add_object(std::stoi(chunk.to_string()));
    });
    globalize(staff_list);
    // work continues...
}

Notice that we globalize the staff_list after we load data from HDFS. This will make the objects globally visible, which allows global object interaction.
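One way to picture global visibility: every worker must be able to agree on which worker owns a given key. The sketch below uses plain hash partitioning to show the idea. This is an assumption made for illustration; `owner_of` is a hypothetical helper, and the page does not say how Husky actually places objects.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper, not a Husky API: maps an object key to the index of
// the worker assumed to own it. Every worker that evaluates this for the same
// key and worker count gets the same answer, so any worker can address any object.
// Precondition: num_workers > 0.
template <typename KeyT>
std::size_t owner_of(const KeyT& key, std::size_t num_workers) {
    return std::hash<KeyT>{}(key) % num_workers;
}
```

With four worker threads, `owner_of(staff.id(), 4)` would always return the same index in the range 0 to 3 for a given Staff key.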

Next, we will let the staff talk to the boss. We need to create a channel between two object lists so that they can communicate.

void job() {
    // load data ...
    auto& boss_list = ObjListFactory::create_objlist<Boss>();
    auto& ch = ChannelFactory::create_push_combined_channel<int, SumCombiner<int>>(staff_list, boss_list);
    list_execute(staff_list, [&](Staff& staff) { ch.push(1, 0); }); // push(message, key): send `1` to the Boss object with ID 0
    list_execute(boss_list, [&](Boss& boss) {
        int count = ch.get(boss);
        log_msg("I'm boss. My ID is "+std::to_string(boss.id())+". I saw "+std::to_string(count)+" staff.");
    });
}

The channel we create applies SumCombiner to sum up all the numbers sent to the same destination, so in the second list_execute the Boss object receives only one number. Notice also that we never added a Boss object manually. The object is created upon receipt of an incoming message, using the constructor that takes the key as its only parameter.
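The two behaviors described above (summing messages per destination, and creating the destination from its key on first receipt) can be imitated in a few lines of standalone C++. This is only a sketch of the observable behavior, not Husky's implementation; `BossSketch`, `SumChannelSketch`, and `send` are hypothetical names.

```cpp
#include <cstddef>
#include <unordered_map>

// Same shape as the Boss class above: a KeyT alias, a key constructor, and id().
struct BossSketch {
    using KeyT = int;
    BossSketch() = default;
    explicit BossSketch(KeyT key) : key_(key) {}
    const KeyT& id() const { return key_; }
    KeyT key_ = 0;
};

// Hypothetical stand-in for a sum-combined channel; NOT Husky's implementation.
template <typename ObjT>
class SumChannelSketch {
   public:
    using KeyT = typename ObjT::KeyT;

    // Deliver `value` to the object with key `dst`.
    void send(const KeyT& dst, int value) {
        // Create the destination from its key the first time it is addressed;
        // emplace does nothing if the key is already present.
        objects_.emplace(dst, ObjT(dst));
        // Fold the message into a running sum instead of storing it separately.
        sums_[dst] += value;
    }

    // The single combined value that `obj` receives (0 if nothing was sent).
    int get(const ObjT& obj) const {
        auto it = sums_.find(obj.id());
        return it == sums_.end() ? 0 : it->second;
    }

    // Destination objects created so far, indexed by key.
    const std::unordered_map<KeyT, ObjT>& objects() const { return objects_; }

   private:
    std::unordered_map<KeyT, ObjT> objects_;
    std::unordered_map<KeyT, int> sums_;
};
```

If five staff members each call `send(0, 1)`, `objects()` holds exactly one `BossSketch` with ID 0, and `get` on that object returns 5: the boss sees one combined number rather than five separate messages.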
