This project aims to predict the prices of properties located in Buenos Aires and listed on Properati.
The company's property appraisers value properties in the traditional way, which means the result depends on each appraiser's subjective judgment.
The current process is slow and carries a risk of under- or over-valuing a property, which in turn causes customer dissatisfaction.
Objective: Create a model based on advanced Machine Learning techniques to predict property prices based on their attributes.
- Descriptive statistics
- Data visualization
- Feature engineering
- Machine learning
- Python
- NumPy, pandas, SciPy
- Matplotlib, Seaborn
- scikit-learn
- XGBoost
In this project I studied the predictor variables and their relationships with the target variable in detail. The exploratory analysis showed that the variables that best predict the price of a property are the covered surface area and the number of bathrooms. The best model (XGBoost) achieved an average error of 49k USD, which is equivalent to an average error of 16.8%.
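As a rough illustration of how an average error in USD relates to an average error in percent, here is a minimal sketch. It assumes the two figures are a mean absolute error (MAE) and a mean absolute percentage error (MAPE); the text above does not say which metrics were used. The prices are made up and are not the project's data.

```python
import numpy as np

# Made-up prices in USD, for illustration only; not the project's data.
actual = np.array([100_000.0, 200_000.0, 400_000.0])
predicted = np.array([110_000.0, 180_000.0, 400_000.0])

# Absolute error per listing, in USD.
abs_errors = np.abs(actual - predicted)

# Assumption: "average error" means mean absolute error (MAE).
mae_usd = abs_errors.mean()

# Assumption: the percentage figure is the mean absolute percentage error (MAPE).
mape_pct = (abs_errors / actual).mean() * 100

print(f"MAE: {mae_usd:,.0f} USD")
print(f"MAPE: {mape_pct:.1f}%")
```

Because MAPE divides each error by the listing's own price, the same error in USD weighs more on cheap properties than on expensive ones, so the two figures can rank models differently.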
To improve the performance of the model the following steps could be taken (ordered by ascending complexity):
- Train each model on data from a single city only. This reduces variability and should therefore decrease the error.
- Add the neighborhood variable to the training data. This increases the computational cost but can yield a modest yet worthwhile gain.
- Create one model per property type (one for apartments and another for houses). Although both are habitable properties, their prices behave very differently, and separate models let you work on reducing each error independently.
- Complement the dataset with more specific data on the property's environment, such as the crime rate by area, socioeconomic status, and the number of businesses in the vicinity, among others.
- Combine geolocation data with a map API to automatically obtain the number of stores and points of interest around each property, as well as the prices of nearby houses.
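Two of the steps above, adding the neighborhood variable and training one model per property type, could be sketched as follows. This is only an illustration under stated assumptions: the column names (`property_type`, `neighborhood`, `surface_covered`, `bathrooms`, `price`) and all values are hypothetical, and scikit-learn's `LinearRegression` is swapped in for XGBoost to keep the example small.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy listings; every value and column name here is hypothetical.
listings = pd.DataFrame({
    "property_type": ["apartment", "apartment", "apartment", "house", "house", "house"],
    "neighborhood": ["Palermo", "Caballito", "Palermo", "Caballito", "Palermo", "Caballito"],
    "surface_covered": [45.0, 60.0, 80.0, 120.0, 200.0, 150.0],
    "bathrooms": [1, 1, 2, 2, 3, 2],
    "price": [120_000.0, 130_000.0, 210_000.0, 250_000.0, 480_000.0, 300_000.0],
})

# One-hot encode the neighborhood so a model can use it as a feature.
features = pd.get_dummies(
    listings[["neighborhood", "surface_covered", "bathrooms"]],
    columns=["neighborhood"],
    dtype=float,
)

# Fit one model per property type instead of a single shared model.
# LinearRegression is a stand-in here, not the model used in the project.
models = {}
for property_type, group in listings.groupby("property_type"):
    X = features.loc[group.index]
    y = group["price"]
    models[property_type] = LinearRegression().fit(X, y)

# Route a listing to the model that matches its property type.
first = listings.iloc[[0]]
predicted = models[first["property_type"].iloc[0]].predict(features.loc[first.index])
```

With real data, each per-type model would be evaluated on a held-out set so that the errors for apartments and houses can be tracked separately.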
You can visit my Personal Website, follow me on Twitter, connect with me on LinkedIn, or check out the rest of my projects on my GitHub.