Skip to content

Almaz-KG/one-billion-row-challenge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

one-billion-row-challenge

Initial source: 1brc by gunnarmorling

The challenge

Initially, The One Billion Row Challenge (1BRC) was devoted to fun exploration of how fast Java programs can handle 1 billion rows. And the challenge goes far beyond the java-only state, with the implementations in a variety of languages and frameworks. Here is the list of all implementations

The Task

The task is to write a Java program which reads the file, calculates the min, mean, and max temperature value per weather station, and emits the results on stdout like this (i.e. sorted alphabetically by station name, and the result values per station in the format <min>/<mean>/<max>, rounded to one fractional digit)

Input lines example

Hamburg;12.0
Bulawayo;8.9
Palembang;38.8
St. John's;15.2
Cracow;12.6
Bridgetown;26.9
Istanbul;6.2
Roseau;34.4
Conakry;31.2
Istanbul;23.0

Output line:

{Abha=-23.0/18.0/59.2, Abidjan=-16.2/26.0/67.3, Abéché=-10.0/29.4/69.0, ...}

How to test your implementation

  1. Generate test data (1 billion rows)

$> python3 create_measurements.py 1_000_000_000

It will take about ~2 minutes to complete and produce a ~15Gb size file in data/ folder (of course depending on your machine)

  1. Implement your code in any prefered language/framework

  2. Measure the performance of your implementation

time $your_programm_command data/measurements.txt

About

The 1BRC implementations in different languages/frameworks

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published