MapReduce Program

Overview

A Hadoop MapReduce program that finds the most viewed Wiki page and its pageview count.

Technologies

sbt
HDFS
YARN
MapReduce
Hadoop
Hive
Docker

Features

Create a jar file.
Find the most viewed English Wiki page on October 20, 2020.
Sort the output by value in either descending or ascending order.
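
The counting and sorting features above can be sketched with plain Scala collections standing in for Hadoop's map and reduce phases. This is only an illustration, not the code in this repo: the object name `PageViewSketch`, the helper names, the sample data, and the assumed input line format (`<domain> <title> <views> <bytes>`, space separated) are all hypothetical.

```scala
// Hypothetical sketch: plain collections stand in for Hadoop's Mapper and Reducer.
object PageViewSketch {
  // Map phase: emit (pageTitle, views) for English ("en") lines only.
  // Assumed line format: "<domain> <title> <views> <bytes>".
  def mapLine(line: String): Option[(String, Long)] =
    line.split(" ") match {
      case Array("en", title, views, _*) => views.toLongOption.map(v => (title, v))
      case _                             => None
    }

  // Reduce phase: sum the view counts emitted for each page title.
  def reduce(pairs: Seq[(String, Long)]): Map[String, Long] =
    pairs.groupMapReduce(_._1)(_._2)(_ + _)

  def main(args: Array[String]): Unit = {
    // Made-up sample lines, not real pageview data.
    val sample = Seq(
      "en Main_Page 120 0",
      "en Scala 30 0",
      "de Hauptseite 500 0",
      "en Main_Page 80 0"
    )
    val totals = reduce(sample.flatMap(mapLine))
    // Sort by value in descending order; drop the minus sign for ascending order.
    val descending = totals.toSeq.sortBy { case (_, views) => -views }
    descending.foreach { case (title, views) => println(s"$title\t$views") }
    println(s"Most viewed: ${descending.head._1}")
  }
}
```

In a real Hadoop job the same two steps would live in a Mapper and a Reducer class, and the sort would be applied to the reducer output.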

Getting Started / Usage

Before running this program, make sure the following prerequisites are met:

Have Apache Hadoop installed in your JVM or on a local cluster.
Create a plugins.sbt file under the project folder; it is needed to build the jar file.
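
For reference, a minimal project/plugins.sbt might look like the fragment below. It assumes the jar is built with the sbt-assembly plugin, which provides the `assembly` task used in the build command in this README. The version number is an assumption; pick the release that matches your sbt version.

```scala
// project/plugins.sbt
// Assumption: sbt-assembly supplies the `assembly` task; the version shown is illustrative.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.2.0")
```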

Once the prerequisites are met, clone this repo with the command below:

git clone https://github.com/revature-scalawags/MapReduce_Program.git

To create the jar file, use the command below:

sbt clean compile assembly

Once the jar file is created, copy it from the /target/scala-2.13 directory to your JVM or local cluster.

To run the jar file in Hadoop, use the command below:

hadoop jar WordCount-assembly-1.0.jar mapreducer.EnglishPageViewCount input output

Contributors

spark131008
