House Prices: Advanced Regression Techniques

Overview

This project is a solution to the Kaggle competition "House Prices: Advanced Regression Techniques". It was completed in collaboration with fellow students under the guidance of Prof. Vered Aharonson and received a grade of 90%.

Project Description

This notebook is a practical project for an introductory machine learning course. It predicts the sale price of each home in Ames, Iowa, using 79 explanatory variables that describe various aspects of residential homes. The dataset is the Ames Housing dataset, compiled by Dean De Cock for use in data science education.

Key Features

  • Data Cleaning and Pre-processing: Handling missing values, encoding categorical variables, and feature scaling.
  • Feature Engineering: Creating new features from existing ones to improve model performance.
  • Model Selection: Implementing and comparing linear regression, K-Nearest Neighbours (KNN), Random Forest, regularized linear models (Ridge and Lasso), and blended models.
  • Model Evaluation: Using Root Mean Square Error (RMSE) as the metric to evaluate and compare model performance.
  • Visualization: Employing various visualization techniques to explore and present findings.
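
The pre-processing, model comparison, and RMSE evaluation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is a small synthetic stand-in for the Ames dataset, and the column names, hyperparameters, and the helper `cv_rmse` are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the Ames data (assumption: NOT the real dataset).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "GrLivArea": rng.normal(1500, 400, n),
    "OverallQual": rng.integers(1, 11, n).astype(float),
    "Neighborhood": rng.choice(["A", "B", "C"], n),
})
y = 50 * X["GrLivArea"] + 10000 * X["OverallQual"] + rng.normal(0, 5000, n)
X.loc[::10, "GrLivArea"] = np.nan  # inject missing values after computing y

numeric = ["GrLivArea", "OverallQual"]
categorical = ["Neighborhood"]

# Impute and scale numeric columns; one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# Candidate models; hyperparameters are illustrative, not tuned.
models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=1.0, max_iter=10000),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "RandomForest": RandomForestRegressor(n_estimators=100, random_state=0),
}


def cv_rmse(model, X, y, folds=5):
    """Mean cross-validated RMSE of `model` behind the shared pre-processing."""
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    scores = cross_val_score(
        pipe, X, y, cv=folds, scoring="neg_root_mean_squared_error"
    )
    return -scores.mean()  # scikit-learn returns negated RMSE


results = {name: cv_rmse(model, X, y) for name, model in models.items()}
for name, rmse in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:>12}: RMSE = {rmse:,.0f}")
```

Wrapping the pre-processing and the model in one pipeline means imputation and scaling are refit inside each cross-validation fold, so no statistics from the validation fold leak into training.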

Technologies Used

  • Python
  • Jupyter Notebook
  • Pandas
  • NumPy
  • Scikit-Learn
  • Matplotlib
  • Seaborn
  • SciPy
  • os (Python standard library module)

Results

The final model ranked 550th in the Kaggle competition at the time of submission, demonstrating effective teamwork and a sound understanding and application of machine learning techniques.