Trade Sequence

This project demonstrates how to process time-series data using Apache Crunch using the simple example of sequencing trades for each stock by time.

Running

The compiled program can be run on a Hadoop cluster with:

hadoop jar target/tradesequence-0.0.1-SNAPSHOT-job.jar /your/hdfs/input/directory /your/hdfs/output/directory

Test data

A small test data JSON file is provided in src/main/avro. On a CDH5 cluster it can be converted to an Avro file using src/main/avro/create_test_avro.sh. On another Hadoop distribution you can alter the script to point to your avro-tools location. The Avro data file can be used as the input for the job.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Trade Sequence

Running

Test data

Files

README.md

Latest commit

History

README.md

File metadata and controls

Trade Sequence

Running

Test data