This repo contains my submitted and evaluated homework assignments for the Big Data Computing course at UniPD - DEI, academic year 2021/22.
Group members:
- Stefano Binotto, ID: 2052421
- Edoardo Bastianello, ID: 2053077
- Giulia Pisacreta
Perform a market basket analysis exploiting the MapReduce algorithm explained in class. All the details are reported here.
Homework 1: folder
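To give a feel for the map/shuffle/reduce pattern behind this kind of analysis, here is a minimal plain-Python sketch. It is not the code in the folder. It assumes each transaction is a list of item names and counts in how many baskets each item appears. The function names (`map_phase`, `shuffle`, `reduce_phase`, `item_support`) and the sample data are made up for illustration.

```python
from collections import defaultdict

def map_phase(basket):
    # Emit one (item, 1) pair per distinct item in the basket,
    # so repeated items in the same basket are counted once.
    return [(item, 1) for item in set(basket)]

def shuffle(pairs):
    # Group the emitted values by key, as the framework would do between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Sum the partial counts for one item.
    return key, sum(values)

def item_support(baskets):
    pairs = [pair for basket in baskets for pair in map_phase(basket)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Hypothetical toy data set of three baskets.
baskets = [["milk", "bread"], ["milk", "eggs"], ["bread", "milk", "milk"]]
support = item_support(baskets)
```

In Spark the same three steps would typically map onto a `flatMap` followed by a key-based aggregation, but the exact transformations used in the homework are not shown here.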
Solve the weighted variant of the k-center problem with outliers by implementing the kcenterOUT algorithm described during the lecture and reported here.
Homework 2: folder
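As a rough illustration of the idea, the sketch below follows one common formulation of a weighted k-center-with-outliers heuristic: guess a radius, greedily pick the point whose small ball covers the most uncovered weight, discard everything in a larger ball around it, and double the radius until the uncovered weight is at most z. The function name `kcenter_out`, the ball radii `(1 + 2*alpha) * r` and `(3 + 4*alpha) * r`, and the initial radius guess are assumptions of this sketch, not taken from the code in the folder. It also assumes the first k+z+1 points are distinct, so the initial radius is positive.

```python
import math
from itertools import combinations

def kcenter_out(points, weights, k, z, alpha=0.0):
    # Initial guess (assumption): half the minimum pairwise distance
    # among the first k+z+1 points.
    head = points[:k + z + 1]
    r = min(math.dist(p, q) for p, q in combinations(head, 2)) / 2
    while True:
        uncovered = set(range(len(points)))
        uncovered_weight = sum(weights)
        centers = []
        while len(centers) < k and uncovered_weight > 0:
            best, best_weight = None, -1
            for i, x in enumerate(points):
                # Weight of uncovered points inside the small ball around x.
                ball_weight = sum(
                    weights[j] for j in uncovered
                    if math.dist(x, points[j]) <= (1 + 2 * alpha) * r
                )
                if ball_weight > best_weight:
                    best, best_weight = i, ball_weight
            centers.append(points[best])
            # Remove everything inside the large ball around the new center.
            covered = {
                j for j in uncovered
                if math.dist(points[best], points[j]) <= (3 + 4 * alpha) * r
            }
            uncovered -= covered
            uncovered_weight -= sum(weights[j] for j in covered)
        if uncovered_weight <= z:
            return centers, r
        r *= 2  # guess was too small: double it and retry

# Hypothetical toy input: two clusters plus one far-away point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
weights = [1] * len(points)
centers, radius = kcenter_out(points, weights, k=2, z=1)
```

On this toy input the far-away point `(50, 50)` is left uncovered and treated as the single allowed outlier.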
Solve the k-center problem with z outliers by implementing the 2-round coreset-based MapReduce algorithm described during the lecture. All the details are reported here.
Homework 3: folder
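The first round of such a coreset-based approach can be sketched in plain Python as follows. This is an illustration under assumptions, not the code in the folder: each partition selects k+z+1 representatives with a farthest-first traversal and weights each one by the number of partition points closest to it. The second round would then run a sequential weighted solver on the union of these weighted representatives. All function names and the round-robin partitioning are hypothetical.

```python
import math

def farthest_first(points, m):
    # Greedy traversal: start from the first point, then repeatedly add
    # the point farthest from the centers chosen so far.
    centers = [points[0]]
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(m, len(points)):
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers

def weighted_coreset(partition, m):
    # Weight each representative by how many partition points are closest to it.
    centers = farthest_first(partition, m)
    weights = [0] * len(centers)
    for p in partition:
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        weights[nearest] += 1
    return list(zip(centers, weights))

def round_one(points, num_partitions, k, z):
    # Round-robin split (assumption); a cluster framework would partition the data itself.
    partitions = [points[i::num_partitions] for i in range(num_partitions)]
    coreset = []
    for part in partitions:
        if part:  # skip empty partitions
            coreset.extend(weighted_coreset(part, k + z + 1))
    return coreset

# Hypothetical toy input.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50), (51, 50), (0, 2), (10, 12)]
coreset = round_one(points, num_partitions=2, k=1, z=0)
```

The total weight of the coreset equals the number of input points, which is what lets the second round treat the weighted representatives as a stand-in for the full data set.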
This folder contains the guides I used to set up the Apache Spark framework on my machine and on the CloudVeneto cluster, along with the tutorial for programming in Spark.