The crime incidents and tweets have already been preprocessed, and the results are saved in this repo. If you would like to run the preprocessing yourself (it takes about 2 hours), follow these steps:
- Make sure that the directory `data/processed/` exists and is empty.
- Place all the JSON files from Raw Collected Tweets in `data/raw/tweets/`.
- Export Chicago Crimes (2001 to present) as a CSV file, and place it in `data/raw/`.
- Run the Pipeline Jupyter Notebook.
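
Because the run takes about two hours, it may be worth checking the inputs before starting the notebook. The following is a minimal sketch of such a check, not part of the repo: the function name `check_pipeline_inputs` is hypothetical, and it assumes the tweet files end in `.json` and the crimes export ends in `.csv`. It creates `data/processed/` if it is missing, but it never deletes anything.

```python
from pathlib import Path


def check_pipeline_inputs(root="."):
    """Return a list of problems; an empty list means the inputs look ready.

    Hypothetical helper: the file extensions checked here are assumptions.
    """
    root = Path(root)
    processed = root / "data" / "processed"
    raw = root / "data" / "raw"
    tweets = raw / "tweets"
    problems = []

    # Step 1: data/processed/ must exist and be empty.
    # Create it if missing; leave existing files alone and report them instead.
    processed.mkdir(parents=True, exist_ok=True)
    if any(processed.iterdir()):
        problems.append(f"{processed} is not empty")

    # Step 2: the raw tweet JSON files belong in data/raw/tweets/.
    if not tweets.is_dir() or not any(tweets.glob("*.json")):
        problems.append(f"no JSON files found in {tweets}")

    # Step 3: the exported crimes CSV belongs in data/raw/.
    if not raw.is_dir() or not any(raw.glob("*.csv")):
        problems.append(f"no CSV file found in {raw}")

    return problems
```

Calling `check_pipeline_inputs()` from the repo root returns a list of messages describing whatever is missing. If the list is empty, the three preparation steps above appear to be satisfied and the notebook can be run.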