- Scala and Hadoop application for answering interesting questions about large data sets
- Uses the following technologies
  - Hadoop MapReduce
  - YARN
  - HDFS
  - Scala
  - Hive
  - Git + GitHub
- Dataset
  - [All Analytics](https://dumps.wikimedia.org/other/analytics/)
- Questions
  - Which English Wikipedia article got the most traffic on October 20?
  - Analyze how many users see the average vandalized Wikipedia page before the offending edit is reverted.
- CLI (low)(easy)
  - Menu offers users a set of predefined queries on the data, described at a high level rather than in actual Hive syntax (low)(easy)
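
The first question and the menu idea can be illustrated with a minimal in-memory Scala sketch. It is not the project's implementation, which would run as a Hive query or MapReduce job over HDFS. It assumes each input line follows the layout `<domain_code> <page_title> <count_views> <total_response_size>` and that `en` selects English Wikipedia desktop traffic only; the names `TopArticle`, `PageView`, `parse`, `topArticle`, and `menu`, as well as the sample lines, are hypothetical.

```scala
// Sketch only: the record layout and every name here are assumptions for illustration.
object TopArticle {
  // One parsed pageview record (assumed layout:
  // "<domain_code> <page_title> <count_views> <total_response_size>").
  final case class PageView(domain: String, title: String, views: Long)

  // Parse one line; malformed records yield None instead of throwing.
  def parse(line: String): Option[PageView] =
    line.split(" ") match {
      case Array(domain, title, views, _) =>
        scala.util.Try(views.toLong).toOption.map(PageView(domain, title, _))
      case _ => None
    }

  // Sum views per title for one domain code, then pick the title with the largest total.
  def topArticle(lines: Iterator[String], domain: String = "en"): Option[(String, Long)] = {
    val totals = lines
      .map(parse)
      .collect { case Some(pv) if pv.domain == domain => pv }
      .foldLeft(Map.empty[String, Long]) { (acc, pv) =>
        acc.updated(pv.title, acc.getOrElse(pv.title, 0L) + pv.views)
      }
    if (totals.isEmpty) None else Some(totals.maxBy(_._2))
  }

  // Hypothetical menu: plain-language labels mapped to query functions,
  // so users choose a question rather than typing Hive syntax.
  val menu: Vector[(String, Iterator[String] => String)] = Vector(
    ("Most-viewed English Wikipedia article",
      (lines: Iterator[String]) =>
        topArticle(lines).fold("no data") { case (title, views) => s"$title ($views views)" })
  )

  def main(args: Array[String]): Unit = {
    // Made-up sample records, not real traffic figures.
    val sample = List("en Main_Page 10 0", "en Scala 7 0", "de Scala 99 0", "en Main_Page 5 0", "garbage")
    menu.zipWithIndex.foreach { case ((label, query), i) =>
      println(s"${i + 1}. $label: ${query(sample.iterator)}")
    }
  }
}
```

The per-title sum mirrors what the reduce step of a MapReduce job, or a `GROUP BY` in Hive, would do at scale. Restricting the input to files for a single day, such as October 20, is left to the caller.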