MapReduce Program

Overview

A Hadoop MapReduce program that finds the name of the most viewed Wiki page and its pageview count.
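As a rough illustration of the idea, the map and reduce steps can be sketched with plain Scala collections instead of the Hadoop API. This is not the code in this repo. The object name `PageViewSketch`, its helper functions, and the sample data are all hypothetical, and the sketch assumes each input line has the form `<domain> <page title> <views> <bytes>` with English pages marked by the domain `en`.

```scala
// Sketch only: plain Scala collections standing in for Hadoop's map and reduce phases.
// All names here are hypothetical and are not taken from this repo.
object PageViewSketch {

  // "Map" step: turn one input line into a (pageTitle, views) pair.
  // ASSUMPTION: lines look like "<domain> <title> <views> <bytes>";
  // only rows whose domain is exactly "en" are kept, malformed rows are dropped.
  def mapLine(line: String): Option[(String, Long)] =
    line.trim.split("\\s+") match {
      case Array("en", title, views, _*) => views.toLongOption.map(v => (title, v))
      case _                             => None
    }

  // "Reduce" step: sum the view counts emitted for each page title.
  def reduceCounts(pairs: Seq[(String, Long)]): Map[String, Long] =
    pairs.groupMapReduce(_._1)(_._2)(_ + _)

  // Final step: pick the title with the largest total, if there is any input.
  def mostViewed(lines: Seq[String]): Option[(String, Long)] =
    reduceCounts(lines.flatMap(mapLine)).maxByOption(_._2)

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, for illustration only.
    val sample = Seq(
      "en Main_Page 100 0",
      "en Scala 40 0",
      "en Main_Page 50 0",
      "de Hauptseite 900 0"
    )
    // Prints the title and its total views, separated by a tab.
    mostViewed(sample).foreach { case (title, views) => println(s"$title\t$views") }
  }
}
```

With the sample rows above, `Main_Page` wins with 150 views, because the `de` row is filtered out before the totals are summed. In a real Hadoop job the same logic would be split across a Mapper and a Reducer class.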

To run this program properly, you will need to meet the following prerequisites:

Once all of the prerequisites above are met, clone this repo with the command below:

git clone https://github.com/revature-scalawags/MapReduce_Program.git

In order to create a jar file, use the command below:

sbt clean compile assembly

Once the jar file is created, copy it from the /target/scala-2.13 directory to the environment where Hadoop runs (a local JVM or a local cluster).

In order to run the jar file in Hadoop, use the command below:

hadoop jar WordCount-assembly-1.0.jar mapreducer.EnglishPageViewCount input output

spark131008