After watching a YouTube video reacting to another developer's experience re-creating Gunnar Morling's Java coding challenge in Go, I decided to attempt it myself. The challenge is to process one billion rows of simply formatted data and print each weather station's name along with its min, max, and average temperature, in alphabetical order, to STDOUT. The data is read from a file in which each row is formatted as `<name of observation point>;<temperature in [-99.9, 99.9]>`, and there are no more than 10,000 unique locations.
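To make the input format and expected output concrete, here is a minimal, unoptimized Python sketch of the aggregation step. The file name `measurements.txt` and the `name=min/average/max` output formatting are my assumptions for illustration, not part of the spec above.

```python
# Naive single-pass aggregation sketch. The file name and the exact
# output formatting are assumptions for illustration only.
from collections import defaultdict


def aggregate(path: str = "measurements.txt") -> None:
    # station name -> [min, max, running sum, count]
    stats: dict[str, list[float]] = defaultdict(
        lambda: [float("inf"), float("-inf"), 0.0, 0]
    )
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            name, _, raw = line.rstrip("\n").partition(";")
            temp = float(raw)
            entry = stats[name]
            entry[0] = min(entry[0], temp)
            entry[1] = max(entry[1], temp)
            entry[2] += temp
            entry[3] += 1
    for name in sorted(stats):
        lo, hi, total, count = stats[name]
        print(f"{name}={lo:.1f}/{total / count:.1f}/{hi:.1f}")


if __name__ == "__main__":
    aggregate()
```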
I am also looking to use this as an introductory project for learning the finer points of Mojo after a few years of writing Python professionally. The topics I am most interested in are SIMD, concurrency, Mojo's data ownership model, and how Mojo interoperates with CPython.
- Tooling to help automate iteration and validation
  - Generate test file (see the sketch after this list)
  - Timing
    - Python
    - Mojo
  - Profiling
    - Python
    - Mojo
  - Validation
    - Python
    - Mojo
  - Logging performance across commits
- Initial naive Python implementation
- Iterate, profile, and validate
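A minimal sketch of the test-file generator in Python, assuming a small hard-coded station list and uniformly distributed temperatures; the official challenge's data generator differs.

```python
# Sketch of a test-data generator. The station names and the uniform
# temperature distribution are assumptions for illustration only.
import random

STATIONS = ["Hamburg", "Toronto", "Abidjan", "Lagos", "Reykjavik"]


def generate(path: str = "measurements.txt", rows: int = 1_000_000) -> None:
    with open(path, "w", encoding="utf-8") as f:
        for _ in range(rows):
            name = random.choice(STATIONS)
            temp = random.uniform(-99.9, 99.9)
            f.write(f"{name};{temp:.1f}\n")


if __name__ == "__main__":
    generate()
```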
Below is a list of what I expect will help decrease the total runtime of the script; a sketch of the file-reading and output-writing ideas follows the list.
- Converting to Mojo data structures
- Generators
- Interactions with the file
- Data typing and ownership
- Concurrency
- Removing unneeded validation
- Efficiently writing to STDOUT
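As one illustration of the last two items, here is a hedged Python sketch that reads the file in large binary chunks rather than line by line and writes the final report to STDOUT in a single call. The 1 MiB chunk size and the `process_line` callback are my own choices for illustration, not measured optima.

```python
# Sketch: chunked binary reads plus a single buffered write to STDOUT.
# The chunk size and the process_line callback are illustrative assumptions.
import sys
from typing import Callable

CHUNK_SIZE = 1 << 20  # 1 MiB


def for_each_line(path: str, process_line: Callable[[bytes], None]) -> None:
    remainder = b""
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            chunk = remainder + chunk
            lines = chunk.split(b"\n")
            remainder = lines.pop()  # the last piece may be a partial line
            for line in lines:
                process_line(line)
    if remainder:
        process_line(remainder)


def write_report(lines: list[str]) -> None:
    # One write call instead of one print() per station.
    sys.stdout.write("\n".join(lines) + "\n")
```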
| Short Commit Id | Row Count | Timestamp | Average Run Time | Runs | Note |
|---|---|---|---|---|---|
| example link to commit | | | 00.0 sec | | Relevant goal reached or implementation made |
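A minimal sketch of how a row like the one above could be appended automatically, assuming the implementation is run as a subprocess and results are logged to a `RESULTS.md` file; the command, commit id, run count, and file name are all assumptions.

```python
# Sketch: time a command over several runs and append a markdown table row.
# The command, commit id, and RESULTS.md path are illustrative assumptions.
import datetime
import statistics
import subprocess
import time


def benchmark(command: list[str], runs: int = 3) -> float:
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def log_row(commit: str, row_count: int, avg_seconds: float, runs: int, note: str) -> None:
    timestamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open("RESULTS.md", "a", encoding="utf-8") as f:
        f.write(
            f"| {commit} | {row_count} | {timestamp} "
            f"| {avg_seconds:.1f} sec | {runs} | {note} |\n"
        )


if __name__ == "__main__":
    average = benchmark(["python", "naive.py"])
    log_row("abc1234", 1_000_000_000, average, 3, "naive Python implementation")
```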
- Install pyenv: `curl https://pyenv.run | bash`
- Follow the instructions printed to STDOUT to add pyenv to your `$PATH`
- Follow this link for instructions to install all build requirements for your machine
- Install Python 3.12.2: `pyenv install 3.12.2`