Michael-s-Project1

Project Description

Project 1 uses big data tools to answer a series of basic analysis questions about the MovieLens datasets. The questions are answered with Hive and MapReduce, and the tool for each question is chosen based on that question's context. The output of the analysis includes MapReduce jar files, so the analysis is a repeatable process that also works on a larger dataset, not just an ad hoc calculation.

Technologies Used

Scala - version 2.13.3
sbt - version 1.4.4
Apache Hive - version 3.1.2
HDFS
Yarn
MapReduce
Hadoop

Features

List of features ready

Given the dataset, finds the number of movies rated 5 stars.
Displays list of movies rated 5 stars.
Given the dataset, finds the number of movies rated 1 star or less.
Displays list of movies rated 1 star or less.
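
The four features above boil down to a filter followed by a distinct count. The following is a minimal sketch of that logic in plain Scala collections, not the project's actual Hive or MapReduce code. It assumes the ratings file uses the userId,movieId,rating,timestamp CSV layout, and it assumes "rated 5 stars" means a movie received at least one 5.0 rating; the README does not specify either. The names RatingCounts, Rating, parse, and moviesWhere are hypothetical, and the sample rows are made up.

```scala
// Sketch only: plain Scala 2.13 collections, no Hadoop or Hive dependency.
object RatingCounts {
  // One parsed rating row. Field names are hypothetical.
  final case class Rating(movieId: Int, stars: Double)

  // Parse a "userId,movieId,rating,timestamp" line (assumed layout).
  // Returns None for the header row or any malformed line.
  def parse(line: String): Option[Rating] =
    line.split(",") match {
      case Array(_, movieId, rating, _) =>
        for {
          id    <- movieId.trim.toIntOption
          stars <- rating.trim.toDoubleOption
        } yield Rating(id, stars)
      case _ => None
    }

  // Distinct movie IDs that have at least one rating satisfying `keep`.
  def moviesWhere(lines: Iterable[String])(keep: Double => Boolean): Set[Int] =
    lines.flatMap(parse).filter(r => keep(r.stars)).map(_.movieId).toSet

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, not real MovieLens data.
    val sample = Seq(
      "userId,movieId,rating,timestamp",
      "1,10,5.0,0",
      "2,10,4.0,0",
      "1,20,1.0,0",
      "3,30,0.5,0",
      "2,30,5.0,0",
      "3,40,3.5,0"
    )

    val fiveStar = moviesWhere(sample)(_ == 5.0)
    val oneOrLess = moviesWhere(sample)(_ <= 1.0)

    println(s"Movies rated 5 stars: ${fiveStar.size}")
    println(s"IDs: ${fiveStar.toSeq.sorted.mkString(", ")}")
    println(s"Movies rated 1 star or less: ${oneOrLess.size}")
    println(s"IDs: ${oneOrLess.toSeq.sorted.mkString(", ")}")
  }
}
```

Under this reading, a movie can appear in both lists if different users gave it a 5.0 and a rating of 1.0 or lower. If the intended meaning is an average rating instead, the filter would have to run after grouping by movie ID.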

Getting Started

To run this program, Apache Hive must be installed and running on a JVM or local cluster. Once this prerequisite is met, clone this repo using the following command:

git clone https://github.com/revature-scalawags/Michael-s-Project1.git

Usage

Create an executable jar file using the following command:

sbt assembly

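The executable jar file is created in the target/scala-2.13 directory under the project root. Copy the .jar file to the JVM or local cluster, then execute it using the following command, where input and output are the input and output paths for the job. If the jar that sbt produced has a different file name, use that name instead: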

hadoop jar word-count-assembly.jar input output

After the .jar file executes, you can read the output using:

hdfs dfs -cat output/part-r-00000
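
If the results need further processing, the part file can be parsed line by line. The sketch below assumes the job writes its results with Hadoop's default text output, where each line is a key and a value separated by a tab; the README does not confirm which output format the job uses. The names PartFileReader and parseLine are hypothetical.

```scala
// Sketch only: parses "key<TAB>value" lines read from standard input.
object PartFileReader {
  // Split one line at the first tab. Returns None if the line has no tab.
  def parseLine(line: String): Option[(String, String)] =
    line.split("\t", 2) match {
      case Array(key, value) => Some((key, value))
      case _                 => None
    }

  def main(args: Array[String]): Unit =
    scala.io.Source.stdin
      .getLines()
      .flatMap(parseLine)
      .foreach { case (key, value) => println(s"$key -> $value") }
}
```

The program reads from standard input, so the output of the hdfs dfs -cat command above can be piped straight into it.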
