This project takes Wikipedia pageview data files, retrievable from https://dumps.wikimedia.org/other/pageviews/, which record how many views each page received.
It combines the views for each page across all regions and discards any page with fewer than 100 views.
This layer of the project returns the top 10 pages by combined view count, in descending order.
- Hive
- Big-Data-Europe's Hive docker container - https://github.com/big-data-europe/docker-hadoop
- More details in the first part, at the link at the bottom of this README
- Combine the view counts from each region
- Discard any pages with fewer than 100 views
- Store the results in an easily retrievable format
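The combine/filter/top-10 logic above can be sketched in the shell. The sample file below assumes the pageviews dump's space-separated `domain_code page_title count_views total_response_size` line layout (an assumption based on the dump's documentation, not this project's code):

```shell
# Sample input in the assumed pageviews dump layout:
# domain_code page_title count_views total_response_size
cat > sample_pageviews.txt <<'EOF'
en Main_Page 70 0
en.m Main_Page 60 0
de Main_Page 20 0
en Obscure_Page 40 0
EOF

# Sum views per page across all domains/regions, keep only pages with
# at least 100 combined views, then print the top 10 by view count.
awk '{ views[$2] += $3 }
     END { for (p in views) if (views[p] >= 100) print views[p], p }' \
    sample_pageviews.txt | sort -rn | head -n 10
# prints: 150 Main_Page
```

Here `Main_Page` survives with 70 + 60 + 20 = 150 views, while `Obscure_Page` (40 views) is discarded by the 100-view threshold.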
To-do list:
- Implement an easy-to-understand script that runs the steps below
Set up Big Data Europe's Hive container:

```shell
git clone https://github.com/big-data-europe/docker-hive && cd docker-hive
docker-compose up -d
```
Copy the output from your previous layer's namenode container to the host, then into the new Hive stack's namenode container, and load it into HDFS (find the container names/IDs with `docker ps`):

```shell
docker cp {YOUR HADOOP NAMENODE CONTAINER ID}:output output
docker cp output {YOUR docker-hive_namenode}:output
docker exec -it {YOUR docker-hive_namenode} bash
hadoop fs -put output output
```
Now you can leave the namenode container and run:

```shell
sbt run
```