Movie Answers

A CLI application that uses Hive to query a YARN cluster running in a local container and answer four questions about popular and not-so-popular movies.

Table of contents

Description

Screen Grab

Tech Used

Usage

Project Info

Issues

Description

GroupLens Research has collected ratings datasets and made them available at their MovieLens website. The MovieLens 25M dataset, a stable benchmark dataset of movie ratings, describes 5-star rating and free-text tagging activity. It contains 25,000,095 ratings and 1,093,360 tag applications across 62,423 movies, created by 162,541 users. It also includes tag genome data with 15 million relevance scores across 1,129 tags. The data was generated between January 1995 and November 2019 and was released in 11/2019.

This application uses Hive on top of a YARN cluster to query the MovieLens dataset and answer the following questions:

What are the most popular movies ever?
What are the 'worst' popular movies?
What are some good but unpopular movies?
What movies correlate closely to their tag descriptions?
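
The first three questions come down to two per-movie aggregates: how many ratings a movie has (popularity) and its average rating (quality). As a rough illustration only, the sketch below computes those aggregates over a small in-memory sample in plain Scala. The application itself does this work in Hive; the `Rating` and `MovieStats` case classes, the helper names, and every threshold parameter are assumptions made for this example, not values taken from the project.

```scala
// Hypothetical in-memory model of one row of ratings.csv (timestamp omitted).
final case class Rating(userId: Int, movieId: Int, rating: Double)

// Per-movie aggregates: number of ratings and mean rating.
final case class MovieStats(movieId: Int, count: Int, avg: Double)

object MovieQuestions {
  // Group ratings by movie, then compute count and mean for each group.
  def stats(ratings: Seq[Rating]): Seq[MovieStats] =
    ratings.groupBy(_.movieId).map { case (id, rs) =>
      MovieStats(id, rs.size, rs.map(_.rating).sum / rs.size)
    }.toSeq.sortBy(_.movieId)

  // "Most popular": the n movies with the most ratings (ties broken by id).
  def mostPopular(s: Seq[MovieStats], n: Int): Seq[Int] =
    s.sortBy(m => (-m.count, m.movieId)).take(n).map(_.movieId)

  // "Worst popular": many ratings, low average. Thresholds are illustrative.
  def worstPopular(s: Seq[MovieStats], minCount: Int, badAvg: Double): Seq[Int] =
    s.filter(m => m.count >= minCount && m.avg <= badAvg).map(_.movieId)

  // "Good but unpopular": few ratings, high average. Thresholds are illustrative.
  def goodUnpopular(s: Seq[MovieStats], maxCount: Int, goodAvg: Double): Seq[Int] =
    s.filter(m => m.count <= maxCount && m.avg >= goodAvg).map(_.movieId)
}
```

In HiveQL the same shape would typically be a `GROUP BY movieId` with `COUNT(*)` and `AVG(rating)`, filtered with `HAVING`.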

Screen:

Tech Used and Required

Scala and SBT: https://www.scala-lang.org/download/2.12.8.html
JDK (v11): https://jdk.java.net/15/
Hive-jdbc driver via library dependency: v3.1.2
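
For orientation, a `build.sbt` along these lines would pull in the pieces listed above. This is a minimal sketch, not a copy of the project's build file: the Scala and driver versions come from the list, while the project name is a placeholder.

```scala
// build.sbt (sketch). The project name below is hypothetical.
name := "movie-answers"

// Scala version taken from the download link above.
scalaVersion := "2.12.8"

// Hive JDBC driver, resolved as a regular library dependency.
libraryDependencies += "org.apache.hive" % "hive-jdbc" % "3.1.2"
```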

Usage

The datasets can be downloaded from MovieLens.
The dataset used here is the 25M Dataset. The README for this data can be viewed here.

Files needed in your HDFS:

ratings.csv

tags.csv

genome-scores.csv (required for question 4 only)

genome-tags.csv (required for question 4 only)
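
Hive needs a table definition over each file before it can be queried. One possible way to generate such statements from Scala is sketched below. The column order for ratings.csv follows the MovieLens 25M README (userId, movieId, rating, timestamp); the `externalTableDdl` helper, the table name, the `ts` column name (used to avoid clashing with Hive's `TIMESTAMP` keyword), and the `/user/hive/movielens/ratings` directory are all assumptions for illustration and may not match how the project sets up its tables.

```scala
object HiveTables {
  // Builds a CREATE EXTERNAL TABLE statement over an HDFS directory of
  // comma-separated files whose first line is a header row.
  def externalTableDdl(table: String,
                       columns: Seq[(String, String)],
                       hdfsDir: String): String = {
    val cols = columns.map { case (name, tpe) => s"$name $tpe" }.mkString(", ")
    s"""CREATE EXTERNAL TABLE IF NOT EXISTS $table ($cols)
       |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
       |STORED AS TEXTFILE
       |LOCATION '$hdfsDir'
       |TBLPROPERTIES ('skip.header.line.count'='1')""".stripMargin
  }

  // Hypothetical layout: ratings.csv sits alone in its own HDFS directory,
  // because LOCATION points at a directory rather than a single file.
  val ratingsDdl: String = externalTableDdl(
    "ratings",
    Seq("userId" -> "INT", "movieId" -> "INT", "rating" -> "DOUBLE", "ts" -> "BIGINT"),
    "/user/hive/movielens/ratings"
  )
}
```

Plain delimited parsing is a simplification: free-text files such as tags.csv can contain quoted commas and would likely need a CSV-aware SerDe instead.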

This application will look for your Hive cluster running on the default http://localhost:10000.
No username or password is required.
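
JDBC clients usually address HiveServer2 with a `jdbc:hive2://` URL built from the host and port. The following is a minimal sketch of opening such a connection with empty credentials, assuming the hive-jdbc driver is on the classpath. The `HiveConnection` object, its helper names, and the `default` database are assumptions for illustration, not the project's actual code.

```scala
import java.sql.{Connection, DriverManager}

object HiveConnection {
  // Builds a HiveServer2 JDBC URL; defaults mirror the host and port above.
  def jdbcUrl(host: String = "localhost",
              port: Int = 10000,
              database: String = "default"): String =
    s"jdbc:hive2://$host:$port/$database"

  // Opens a connection with an empty username and password, runs `f`,
  // and closes the connection even if `f` throws.
  def withConnection[A](url: String = jdbcUrl())(f: Connection => A): A = {
    val conn = DriverManager.getConnection(url, "", "")
    try f(conn) finally conn.close()
  }
}

// Example use (requires a running Hive server):
//   HiveConnection.withConnection() { conn =>
//     val rs = conn.createStatement().executeQuery("SHOW TABLES")
//     while (rs.next()) println(rs.getString(1))
//   }
```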

Project:

Repo
My Github
Email: [email protected]

Known Issues:

None known at the moment.
If any are discovered, please feel free to contact me. Cheers. 😄
