anthonybpino/home-sales

home-sales

Demonstrate the use of SparkSQL to determine key metrics about home sales data

Repo Outline:

The notebook uses the home sales data to answer the following questions:

  • What is the average price for a four bedroom house sold in each year?
  • What is the average price, for each year built, of homes that have 3 bedrooms and 3 bathrooms?
  • What is the average price, for each year built, of homes that have 3 bedrooms, 3 bathrooms, two floors, and at least 2,000 square feet?
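As a rough illustration of how the questions above could be phrased in SparkSQL, here is a minimal sketch. It is not taken from the notebook: the temp view name `home_sales`, the helper `run_queries`, and the column names (`date`, `date_built`, `price`, `bedrooms`, `bathrooms`, `floors`, `sqft_living`) are all assumptions about the CSV and should be checked against the actual file.

```python
# Sketch only. The view name and every column name below are assumptions
# about home_sales_revised.csv, not something this README confirms.
QUERIES = {
    # Average price of four-bedroom homes, grouped by the year they were sold.
    "avg_price_4bed_by_year_sold": """
        SELECT YEAR(`date`) AS year_sold, ROUND(AVG(price), 2) AS avg_price
        FROM home_sales
        WHERE bedrooms = 4
        GROUP BY YEAR(`date`)
        ORDER BY year_sold
    """,
    # Average price of 3-bedroom, 3-bathroom homes, grouped by year built.
    "avg_price_3bed_3bath_by_year_built": """
        SELECT date_built, ROUND(AVG(price), 2) AS avg_price
        FROM home_sales
        WHERE bedrooms = 3 AND bathrooms = 3
        GROUP BY date_built
        ORDER BY date_built
    """,
    # Same as above, restricted to two floors and at least 2,000 square feet.
    "avg_price_3bed_3bath_2floor_2000sqft_by_year_built": """
        SELECT date_built, ROUND(AVG(price), 2) AS avg_price
        FROM home_sales
        WHERE bedrooms = 3 AND bathrooms = 3
          AND floors = 2 AND sqft_living >= 2000
        GROUP BY date_built
        ORDER BY date_built
    """,
}


def run_queries(spark, csv_path):
    """Register the CSV as a temp view and run each query.

    `spark` is an existing SparkSession and `csv_path` is a local path to
    the CSV. Returns a dict mapping query name to a Spark DataFrame.
    """
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    df.createOrReplaceTempView("home_sales")
    return {name: spark.sql(sql) for name, sql in QUERIES.items()}
```

With a SparkSession in hand, something like `run_queries(spark, "home_sales_revised.csv")["avg_price_4bed_by_year_sold"].show()` would print the first result table, assuming the columns match.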

The Jupyter notebook was created in Google Colaboratory. A link to the notebook is located in the Home_Sales.ipynb file.

Notes & Resources:

The dataset was provided by Rutgers University:
https://2u-data-curriculum-team.s3.amazonaws.com/dataviz-classroom/v1.2/22-big-data/home_sales_revised.csv
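One possible way to pull this CSV into Spark is sketched below. Spark does not read `https://` URLs directly through `spark.read.csv`, so the sketch first fetches the file with `SparkContext.addFile` and then resolves its local path with `SparkFiles.get`. The helper names `csv_name_from_url` and `load_home_sales`, and the view name `home_sales`, are hypothetical and not taken from the notebook.

```python
import os
from urllib.parse import urlparse

DATA_URL = (
    "https://2u-data-curriculum-team.s3.amazonaws.com/"
    "dataviz-classroom/v1.2/22-big-data/home_sales_revised.csv"
)


def csv_name_from_url(url):
    """Return the file name at the end of the URL path."""
    return os.path.basename(urlparse(url).path)


def load_home_sales(spark, url=DATA_URL, view_name="home_sales"):
    """Download the CSV via Spark and register it as a temp view.

    `spark` is an existing SparkSession. The view name is an assumption
    made for this sketch.
    """
    # Imported here so csv_name_from_url can be used without PySpark installed.
    from pyspark import SparkFiles

    spark.sparkContext.addFile(url)
    local_path = SparkFiles.get(csv_name_from_url(url))
    df = spark.read.csv(local_path, header=True, inferSchema=True)
    df.createOrReplaceTempView(view_name)
    return df
```

In a Colab session this would typically be called once after creating the SparkSession, for example `df = load_home_sales(spark)`, followed by `df.printSchema()` to confirm the real column names before writing queries.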
