- Presentation Video: https://youtu.be/VeprPsJZU0Y
- Presentation ppt: https://github.com/Avenirrr/Practical-Data-Science-Project/blob/master/388Final_projectppt.pdf
- Detailed Report: https://github.com/Avenirrr/Practical-Data-Science-Project/blob/master/final_report.ipynb

Using IMDb data, we examine how the trending topics of US films have changed since 1950.
To determine whether features such as genre, release date, and movie description are significant, we examine how each correlates with audience votes.
We scraped a dataset of 72,000 US movie records from IMDb by parsing its HTML pages.
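
As a rough illustration of the HTML-parsing step, here is a minimal sketch using only Python's standard library. The markup in `SAMPLE_HTML`, the class names `movie`, `title`, and `rating`, and the helper `parse_movies` are all hypothetical; IMDb's real page structure is different and changes over time, and the project's actual scraper may work quite differently.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; this is NOT IMDb's real page layout.
SAMPLE_HTML = """
<div class="movie"><span class="title">Film A</span><span class="rating">7.5</span></div>
<div class="movie"><span class="title">Film B</span><span class="rating">6.1</span></div>
"""


class MovieParser(HTMLParser):
    """Collect title/rating pairs from the hypothetical markup above."""

    def __init__(self):
        super().__init__()
        self.movies = []     # one dict per <div class="movie">
        self._field = None   # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "movie":
            self.movies.append({})
        elif tag == "span" and cls in ("title", "rating"):
            self._field = cls

    def handle_data(self, data):
        # Only keep text that appears inside a recognised <span>.
        if self._field and self.movies:
            self.movies[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_movies(html):
    """Return a list of {'title': str, 'rating': float} records."""
    parser = MovieParser()
    parser.feed(html)
    return [
        {"title": m["title"], "rating": float(m["rating"])}
        for m in parser.movies
        if "title" in m and "rating" in m
    ]


movies = parse_movies(SAMPLE_HTML)
```

In a real run, the HTML string would come from a downloaded page rather than an inline sample, and the selectors would need to match the live markup.
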
The statistical and machine learning models used in this project are:
- Run regressions to see whether other variables (e.g., GDP, Engel's coefficient) affect the popularity of a topic.
- Run sentiment analysis on the reviews of popular/unpopular movies to see what was most loved/hated by the audiences.
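
The regression item above can be sketched with a single predictor and ordinary least squares. This is only an assumed, simplified form: the helper names `fit_line` and `pearson_r` are hypothetical, the numbers are toy placeholders rather than project data, and the report may use a different model or library.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Toy placeholder values, NOT project data: an external indicator
# (e.g. a GDP-like index) and the share of films on some topic.
indicator = [1.0, 2.0, 3.0, 4.0]
topic_share = [0.10, 0.15, 0.20, 0.25]

slope, intercept = fit_line(indicator, topic_share)
r = pearson_r(indicator, topic_share)
```

A slope far from zero together with a large `|r|` would suggest the indicator moves with topic popularity, though correlation alone does not establish that one drives the other.
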