forked from rcrowley/bashreduce
Quick Start
jesusabdullah edited this page Sep 13, 2010 · 6 revisions
- br somewhere handy in your path
- vanilla Unix shell tools: sort, awk, ssh, netcat (see note 1 below)
- password-less ssh to each machine you plan to use
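The requirements above can be checked before a first run with a small script. This is a minimal sketch, not part of bashreduce: the function names are made up for illustration, and it assumes netcat is installed under the name `nc`.

```shell
#!/bin/sh
# Hypothetical helpers for checking the requirements listed above.

# Print, one per line, each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed only if key-based (password-less) login to the host works.
# BatchMode=yes makes ssh fail instead of prompting for a password.
can_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null
}

# Usage: check_prereqs host1 host2 ...
check_prereqs() {
    # Assumption: the netcat binary is called "nc" on this machine.
    for tool in $(missing_tools br sort awk ssh nc); do
        echo "missing tool: $tool"
    done
    for host in "$@"; do
        can_ssh "$host" || echo "no password-less ssh to $host"
    done
}
```

Running `check_prereqs host1 host2 host3` prints nothing when every tool is present locally and each host accepts a key-based login.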
Installing bashreduce is straightforward: download and unpack the tarball, or clone this repository using git. Then build the optional performance-boosting utilities it comes with:
$ cd bashreduce/brutils
$ make
cc -O3 -Wall -c -o brp.o brp.c
cc -o brp brp.o
cc -O3 -Wall -c -o brm.o brm.c
cc -o brm brm.o
$ sudo make install
install -c brp /usr/local/bin
install -c brm /usr/local/bin
For convenience, put br somewhere on your PATH.
Edit /etc/br.hosts and enter the machines you wish to use as workers, one host per line. Or specify your machines at runtime:
$ br -h "host1 host2 host3"
To take advantage of multiple cores, repeat the host name.
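Repeating host names by hand gets tedious with many cores, so the list can be generated. The helper below is a hypothetical sketch (it is not shipped with bashreduce) that takes `host:cores` pairs and prints each host name once per core, ready to pass to `-h`.

```shell
# Print each host name repeated once per core, space-separated.
# Every argument must be a host:cores pair, e.g. "host1:2".
expand_hosts() {
    out=""
    for spec in "$@"; do
        host=${spec%%:*}    # text before the first colon
        cores=${spec##*:}   # text after the last colon
        i=0
        while [ "$i" -lt "$cores" ]; do
            out="${out:+$out }$host"
            i=$((i + 1))
        done
    done
    printf '%s\n' "$out"
}
```

For example, `br -h "$(expand_hosts host1:2 host2:4)" < input > output` would list host1 twice and host2 four times.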
Sort:
$ br < input > output
Word count:
$ br -r "uniq -c" < input > output
Big join:
$ LC_ALL='C' br -r "join - /tmp/join_data" < input > output
Grep:
$ br -m "grep pattern" < input > output
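To see what a reducer such as `uniq -c` produces, it can help to try the same pipeline on one machine first. The sketch below is a local stand-in, not how br is implemented; it assumes the input has one word per line and that a reducer receives its share of the input already sorted. The function name is made up for illustration.

```shell
# Single-machine stand-in for: br -r "uniq -c" < input > output
# sort groups identical lines together; uniq -c then counts each group.
# awk strips the padding that uniq -c puts in front of each count.
local_word_count() {
    LC_ALL=C sort | uniq -c | awk '{ print $1, $2 }'
}
```

Running `printf 'b\na\nb\n' | local_word_count` prints `1 a` followed by `2 b`.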
1 There are several versions of netcat. Ubuntu/Debian has two variants: "openbsd" and "traditional". br only works with netcat traditional, so every machine you wish to use as a worker must have netcat traditional installed.
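One way to tell which variant a Debian or Ubuntu machine uses is to resolve the `nc` symlink, which those systems manage through the alternatives mechanism. The sketch below assumes the usual binary names `nc.traditional` and `nc.openbsd` and a `readlink` that supports `-f`; the function names are made up for illustration.

```shell
# Classify a resolved netcat path by the Debian/Ubuntu binary naming.
nc_variant() {
    case "$1" in
        *nc.traditional) echo traditional ;;
        *nc.openbsd)     echo openbsd ;;
        *)               echo unknown ;;
    esac
}

# Resolve the nc found on PATH through any alternatives symlinks.
local_nc_variant() {
    nc_path=$(command -v nc) || { echo "not installed"; return 1; }
    nc_variant "$(readlink -f "$nc_path")"
}
```

To check a worker, the same resolution can be run over ssh, for example `nc_variant "$(ssh host1 'readlink -f "$(command -v nc)"')"`. On Debian and Ubuntu, `sudo update-alternatives --config nc` switches between installed variants.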