Team: David Gamez ([email protected]) and Ernesto Del Valle ([email protected])
This project explores NYC collision data from July 2012 through mid-November 2018. The dataset contains roughly 1.3 million collision records, which we analyze along three dimensions: (1) location, (2) contributing factors, and (3) time series.
We present an exploration of the above three dimensions through four central questions that formed our proposal and subsequent analysis:
- Which borough has the most injuries and fatalities in absolute terms? How does the ranking change if we normalize by population?
- Which zip codes, or even specific intersections, stand out as the ones to be aware of?
- What are the most commonly identified contributing factors and vehicles? Do these suggest potential policy adjustments?
- What seasonality, if any, does the data show from a time-series perspective? Is there anything to highlight by type of injured person (pedestrian/cyclist/motorist) or by contributing factor (e.g. speed)?
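As a rough illustration of two of the questions above (normalizing by population, and looking for seasonality), the sketch below uses only the Python standard library. The function names `rate_per_100k` and `collisions_by_month` are hypothetical and are not taken from the project notebook, and the numbers are placeholders rather than values from the dataset.

```python
from collections import Counter
from datetime import date


def rate_per_100k(counts, population):
    """Return events per 100,000 residents for each area with a known, non-zero population."""
    return {
        area: counts[area] * 100_000 / population[area]
        for area in counts
        if population.get(area)
    }


def collisions_by_month(dates):
    """Count collisions per calendar month (1-12), pooling all years together."""
    tally = Counter(d.month for d in dates)
    return {month: tally.get(month, 0) for month in range(1, 13)}


# Placeholder inputs for illustration only; NOT values from the NYC dataset.
example_injuries = {"A": 500, "B": 300}
example_population = {"A": 1_000_000, "B": 200_000}
example_rates = rate_per_100k(example_injuries, example_population)

example_dates = [date(2017, 1, 5), date(2018, 1, 9), date(2017, 7, 4)]
example_months = collisions_by_month(example_dates)
```

In this toy input, area "A" has more injuries in absolute terms while area "B" has the higher per-capita rate, which is why the first question asks for both views. Pooling years by calendar month is only one simple way to look for seasonality; the analysis in the notebook may use a different approach.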
None in particular.

This repository contains the following documents summarizing our work and results, listed from most recent to earliest:
- Final written report titled "W200 Project 2 Written Report (12-14-2018).pdf"
- Presentation slides used in class, containing the main graphics (a higher-level summary) and the full bibliography of sources consulted, titled "W200 Project 2 Presentation (12-13-2018).pdf"
- Supporting Jupyter notebook as "Project_2_Team_2_NYC_Collisions.ipynb"
- Original proposal as "W200 Project 2 Proposal (11-28-2018).pdf"