This project demonstrates the use of SparkSQL to determine key metrics about home sales data.
Using the home sales data, the analysis answers the following questions:
- What is the average price of a four-bedroom house sold in each year?
- For homes with 3 bedrooms and 3 bathrooms, what is the average price for each year the home was built?
- For homes with 3 bedrooms, 3 bathrooms, and two floors that are at least 2,000 square feet, what is the average price for each year the home was built?
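As a rough illustration of the last question, the sketch below expresses it as a SQL aggregation. It is not the notebook's code. The table name `home_sales`, the column names (`date_built`, `price`, `bedrooms`, `bathrooms`, `floors`, `sqft_living`), and the sample rows are all assumptions made for the example. To keep the sketch runnable without a Spark installation, Python's built-in `sqlite3` stands in for SparkSQL.

```python
import sqlite3

# Assumed table and column names; the actual CSV header may differ.
QUERY = """
SELECT date_built,
       ROUND(AVG(price), 2) AS avg_price
FROM home_sales
WHERE bedrooms = 3
  AND bathrooms = 3
  AND floors = 2
  AND sqft_living >= 2000
GROUP BY date_built
ORDER BY date_built
"""


def average_price_by_year_built(conn):
    """Run the aggregation and return (year_built, avg_price) rows."""
    return conn.execute(QUERY).fetchall()


# Made-up sample rows, for illustration only (not taken from the dataset):
# (date_built, price, bedrooms, bathrooms, floors, sqft_living)
rows = [
    (2010, 300000.0, 3, 3, 2, 2500),  # matches every filter
    (2010, 500000.0, 3, 3, 2, 2000),  # matches (exactly 2,000 sq ft)
    (2010, 900000.0, 3, 3, 2, 1999),  # excluded: under 2,000 sq ft
    (2011, 250000.0, 3, 3, 1, 2200),  # excluded: one floor
    (2011, 350000.0, 3, 3, 2, 3000),  # matches every filter
    (2012, 400000.0, 4, 3, 2, 2600),  # excluded: four bedrooms
]

# sqlite3 is a stand-in for Spark here so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE home_sales ("
    "date_built INTEGER, price REAL, bedrooms INTEGER, "
    "bathrooms INTEGER, floors INTEGER, sqft_living INTEGER)"
)
conn.executemany("INSERT INTO home_sales VALUES (?, ?, ?, ?, ?, ?)", rows)

result = average_price_by_year_built(conn)
for year_built, avg_price in result:
    print(year_built, avg_price)
```

With PySpark, a similar query string could presumably be passed to `spark.sql(QUERY)` after registering the loaded DataFrame with `createOrReplaceTempView("home_sales")`, provided the column names match the CSV header.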
The Jupyter notebook was created in Google Colaboratory. A link to the notebook is located in the Home_Sales.ipynb file.
The dataset was provided by Rutgers University:
https://2u-data-curriculum-team.s3.amazonaws.com/dataviz-classroom/v1.2/22-big-data/home_sales_revised.csv