Saukkoriipi/PySpark_Project

PySpark Project

Author:

Mikko Saukkoriipi

Project done for the course:

Distributed Data Infrastructure
University of Helsinki
Fall 2020

The given project questions were:

  1. For the first data set (data-1.txt), provide the value of the median of the data set. You should provide the exact median, not an approximation, and sorting the complete data set is not an acceptable solution.

  2. The second data set (data-2.txt) contains the matrix A. Your task is to calculate A*A^{T}*A and provide the resulting matrix as your answer in the same format as the input matrix.
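
The two questions above can be sketched in plain Python to show the underlying ideas. This is not the repository's actual solution, only a minimal illustration under stated assumptions. For question 1 it uses a pivot-based selection (quickselect style) that repeatedly filters and counts instead of sorting; in PySpark each filter and count would presumably correspond to `RDD.filter` and `RDD.count`. For question 2 it assumes A has many more rows than columns, so the small matrix A^{T}*A is built first and then each row of A is multiplied by it. The function names `select_kth`, `exact_median`, `gram`, and `product_a_at_a` are hypothetical, and the even-length median is assumed to be the mean of the two middle values.

```python
import random


def select_kth(values, k):
    """Return the k-th smallest value (0-indexed) without sorting everything.

    Each round only filters and counts, which is the part that would map
    to distributed filter/count operations in a Spark job.
    """
    candidates = list(values)
    while True:
        pivot = random.choice(candidates)
        smaller = [v for v in candidates if v < pivot]
        equal_count = sum(1 for v in candidates if v == pivot)
        if k < len(smaller):
            # The target lies among the values below the pivot.
            candidates = smaller
        elif k < len(smaller) + equal_count:
            # The target is the pivot itself.
            return pivot
        else:
            # The target lies above the pivot; shift the rank accordingly.
            k -= len(smaller) + equal_count
            candidates = [v for v in candidates if v > pivot]


def exact_median(values):
    """Exact median; for an even count, the mean of the two middle values
    (an assumed convention, not stated in the assignment)."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        return select_kth(values, n // 2)
    return (select_kth(values, n // 2 - 1) + select_kth(values, n // 2)) / 2


def gram(rows):
    """Compute A^T * A as a sum of per-row outer products."""
    n = len(rows[0])
    g = [[0.0] * n for _ in range(n)]
    for row in rows:
        for i in range(n):
            for j in range(n):
                g[i][j] += row[i] * row[j]
    return g


def product_a_at_a(rows):
    """Compute A * A^T * A as A * (A^T * A), one output row per input row."""
    g = gram(rows)
    n = len(g)
    return [[sum(row[k] * g[k][j] for k in range(n)) for j in range(n)]
            for row in rows]
```

Grouping the product as A*(A^{T}*A) rather than (A*A^{T})*A avoids forming a matrix whose size is the square of the row count, which matters only if the assumption about the shape of A holds. For example, `exact_median([5, 1, 4, 2, 3])` returns `3`, and `product_a_at_a([[1, 2], [3, 4]])` returns `[[38.0, 54.0], [86.0, 122.0]]`.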
