This project was developed on data collected from InsideAirBnB, which is separated into listings and reviews (comments on the listings).
- data_analysis: contains notebooks with the analysis of the data.
- dataset: contains the data and some custom classes to work with them.
- listings: contains the data related to the listings.
- comments: contains the data related to the listings' reviews (comments).
- embeddings: contains the processed data, stored as pickles, produced in the intermediate steps of the data processing.
- model/models: contains the custom neural network models developed for the project.
- processing: contains notebooks used to preprocess the data, cleaning it or generating embeddings using sentence models.
- utils: contains various utilities to process the data; of special note are the amenities, a special field present in the listings that is processed using clustering (see the sketch after this list).
- visualization: contains custom modules to visualize the data.
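For reference, the amenities clustering mentioned in utils could look roughly like the sketch below. This is a minimal illustration assuming sentence-transformers and scikit-learn; the model name, the amenity strings, and the number of clusters are placeholders, not the values used in the actual code.

```python
# Minimal sketch: group similar amenity strings by clustering their
# sentence embeddings (illustrative; the real logic lives in utils).
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

amenities = ["Wifi", "Fast wifi", "Kitchen", "Full kitchen"]  # placeholder values

model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical model choice
embeddings = model.encode(amenities)

kmeans = KMeans(n_clusters=2, n_init="auto", random_state=0)
labels = kmeans.fit_predict(embeddings)
print(dict(zip(amenities, labels)))  # e.g. wifi variants vs. kitchen variants
```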
This code has been developed on Python 3.11.3; we recommend using a matching major version.
Environment setup
- Using Pip
pip install -r requirements.txt
- Using Conda
conda env create -f environment.yml
Place the datasets downloaded from InsideAirBnB inside the dataset folders:
- The listings.csv must be placed inside the dataset/listings folder; you can place more than one, since all the CSV files in the folder will be used (already provided in the folder).
- The reviews.csv must be placed inside the dataset/comments folder; you can place more than one, since all the CSV files in the folder will be used (already provided in the folder). See the loading sketch below.
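Since every CSV in each folder is picked up, loading the raw data amounts to something like this sketch (pandas assumed; folder names as above):

```python
# Sketch: read and concatenate every CSV found in the dataset folders.
from pathlib import Path
import pandas as pd

listings = pd.concat(
    (pd.read_csv(p) for p in Path("dataset/listings").glob("*.csv")),
    ignore_index=True,
)
reviews = pd.concat(
    (pd.read_csv(p) for p in Path("dataset/comments").glob("*.csv")),
    ignore_index=True,
)
```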
Run the processing steps in order (illustrative sketches follow the list):
- step1_merge_listings_comments.ipynb: joins the listings and reviews together.
- step2_process_columns.ipynb: generates the processed ordinal and numeric data, e.g., prices or listing type (Apartment, Home, etc.), as well as embeddings for the listings' textual columns.
- step3_process_comments.ipynb: generates the embeddings for the reviews (comments) of the listings.
- step4_extraction_of_test_set.ipynb: merges the embeddings and the pre-processed data and generates the train and test dataset files.
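Conceptually, the first three steps join each review to its listing and embed the free text with a sentence model. The sketch below is an outline under stated assumptions, not the notebooks' exact code: the column names follow the public InsideAirBnB schema (id, listing_id, comments), and the model choice and pickle path are hypothetical.

```python
# Sketch of steps 1-3: merge reviews with listings, embed review text,
# and pickle the intermediate result for later steps.
import pandas as pd
from sentence_transformers import SentenceTransformer

listings = pd.read_csv("dataset/listings/listings.csv")
reviews = pd.read_csv("dataset/comments/reviews.csv")

# step 1: connect each review (comment) to its listing.
merged = reviews.merge(
    listings, left_on="listing_id", right_on="id", suffixes=("_review", "")
)

# steps 2-3: embed the review text with a sentence model.
model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical choice
embeddings = model.encode(merged["comments"].fillna("").tolist())
pd.to_pickle(embeddings, "dataset/embeddings/comment_embeddings.pkl")  # hypothetical path
```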
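Step 4's extraction of a test set is conceptually a standard hold-out split. A minimal sketch with scikit-learn follows; the file names and split ratio are illustrative:

```python
# Sketch of step 4: split the assembled dataset into train and test files.
import pandas as pd
from sklearn.model_selection import train_test_split

dataset = pd.read_pickle("dataset/embeddings/merged_dataset.pkl")  # hypothetical file
train, test = train_test_split(dataset, test_size=0.2, random_state=42)
train.to_pickle("dataset/embeddings/train.pkl")
test.to_pickle("dataset/embeddings/test.pkl")
```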
Simply open any notebook in data_analysis and run it; just remember that the analysis requires the embeddings to be computed, so you need to run the data pre-processing first.
You can simply run any notebook to evaluate our models on the data provided; the experiments are separated as follows: