Implementation of 3 different techniques to solve the Spotify Million Playlist Dataset Challenge, hosted on AIcrowd.

DomizianoScarcelli/spotify-recommender

Spotify Automatic Playlist Continuation

The final project for the Big Data course (A.Y. 2022/2023) at the University of Rome La Sapienza.

The project implements 3 different techniques to solve the Spotify Million Playlist Dataset Challenge, hosted on AIcrowd.

The methods are implemented in PySpark so that the data can be processed on a distributed system.

There is also a re-implementation of MMCF: Multimodal Collaborative Filtering for Automatic Playlist Continuation, by the "Hello World" team, which placed 2nd in the challenge. The re-implementation ports the neural network from TensorFlow v1 to PyTorch and uses Petastorm to create a PyTorch DataLoader from a PySpark DataFrame, so that the data stays distributed.
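
As a rough illustration of the Petastorm step described above, the following minimal sketch wraps a PySpark DataFrame in Petastorm's Spark dataset converter and iterates over it as PyTorch batches. It is not taken from the repository: the function name `iter_torch_batches`, the cache directory, and the batch size are assumptions made for the example, and the DataFrame is assumed to contain only column types that Petastorm can convert to tensors.

```python
from typing import Any, Dict, Iterator


def iter_torch_batches(
    spark: Any,
    df: Any,
    cache_dir_url: str = "file:///tmp/petastorm_cache",  # hypothetical cache location
    batch_size: int = 256,  # hypothetical batch size
) -> Iterator[Dict[str, Any]]:
    """Yield batches (dict: column name -> torch.Tensor) from a PySpark DataFrame."""
    # Imported lazily so this module loads even where Petastorm is not installed.
    from petastorm.spark import SparkDatasetConverter, make_spark_converter

    # Petastorm materializes the DataFrame as Parquet files under this directory.
    spark.conf.set(SparkDatasetConverter.PARENT_CACHE_DIR_URL_CONF, cache_dir_url)
    converter = make_spark_converter(df)
    try:
        # One pass over the data; the context manager closes the reader on exit.
        with converter.make_torch_dataloader(
            batch_size=batch_size, num_epochs=1
        ) as loader:
            for batch in loader:
                yield batch
    finally:
        # Remove the cached Parquet copy once iteration ends.
        converter.delete()
```

A training loop would then iterate with `for batch in iter_torch_batches(spark, df)` and read tensors by column name. The column names depend on the DataFrame schema, which this README does not specify.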

The folder structure is the following:

  • core: contains the notebooks and other files that make up the core algorithms of the recommender systems;
  • slides: contains the source code for the presentation, made using Slidev;
  • webapp: contains the code for a demo app, built with Vite + React + FastAPI, that showcases the usage of the system.

Demo of the web app

demo.mp4

Tech Stack

Python • PySpark • PyTorch • Petastorm • TypeScript • MinIO • React • Tailwind • FastAPI • Vite • MongoDB • Docker • Slidev
