For self-driving vehicles, object detection and tracking are essential tasks that allow a vehicle to identify obstacles in its path. We aim to develop an AI system that performs object detection and tracking based on state-of-the-art research.
- Develop an AI system to recognize obstacles
- Determine distances to the identified objects and give collision warnings when needed
Access the application through this web-based UI
The Flask app is deployed on Google Cloud Platform. (It is not always accessible, because funding to keep a GPU VM instance running on GCP is limited.)
Detection is performed with the CenterNet model, pretrained on the KITTI dataset (3DOP train-test split). CenterNet localizes objects as their center points and estimates all other properties (e.g. width and height of the bounding box) with regression heads. See the CenterNet paper for details.
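To illustrate the center-point idea, here is a minimal sketch in plain Python. It is not the CenterNet code: the function names, the 3x3 neighborhood, the 0.3 score threshold, and the toy heatmap are all assumptions made for this example. It picks local maxima in a per-class heatmap and reads the regressed width and height at each peak to form a box.

```python
from typing import List, Tuple

def find_center_peaks(heatmap: List[List[float]],
                      threshold: float = 0.3) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for cells that are maxima of their 3x3 neighborhood."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:  # hypothetical confidence cutoff
                continue
            neighbors = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            if score >= max(neighbors):
                peaks.append((r, c, score))
    return peaks

def decode_boxes(peaks, size_map):
    """Turn peaks into (x1, y1, x2, y2, score) using the (w, h) regressed at each peak."""
    boxes = []
    for r, c, score in peaks:
        w, h = size_map[r][c]
        boxes.append((c - w / 2, r - h / 2, c + w / 2, r + h / 2, score))
    return boxes

# Toy 4x4 heatmap with two object centers (values are made up).
heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.1, 0.3, 0.9, 0.2],
    [0.0, 0.1, 0.2, 0.1],
    [0.5, 0.1, 0.0, 0.0],
]
size_map = [[(2.0, 4.0)] * 4 for _ in range(4)]  # constant (w, h) for simplicity
peaks = find_center_peaks(heatmap)
boxes = decode_boxes(peaks, size_map)
```

In the real model the heatmap and size map are network outputs at a reduced resolution, and further heads predict 3D properties such as depth; this sketch only covers the 2D decoding step.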
We applied the pre-trained CenterNet model (ddd_3dop) to ~11k images from the validation split of the Waymo perception data. The model achieved 0.15 mAP (IoU 0.5) on this independent test data, and 0.31 mAP on large objects.
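For readers unfamiliar with the "IoU 0.5" criterion, the sketch below shows how a predicted 2D box can be compared with a ground-truth box. It only illustrates the matching criterion, not the full mAP computation or the evaluation code used for the numbers above; the function names and the `(x1, y1, x2, y2)` box layout are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extents are clamped at zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match when its IoU with the ground truth reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

For example, two 10x10 boxes shifted by half their width overlap with an IoU of 1/3, so that prediction would not count as a match at the 0.5 threshold.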
The detection and tracking pipeline is generated using CenterTrack, pretrained on COCO for 80-category tracking. CenterTrack is a joint detection and tracking algorithm. It relies on CenterNet for detection, then associates the same objects across adjacent frames. The model was pretrained on nuScenes, which contains 700 image sequences, and validated on 150 sequences from the nuScenes test data. The model achieved ~28% AMOTA@0.2 over 7 categories on the nuScenes test set. See the CenterTrack paper for details.
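The association step can be sketched as a greedy nearest-center match. This is a simplified illustration, not the CenterTrack implementation: the dictionary keys (`center`, `offset`, `score`), the fixed `max_dist` radius, and the ID assignment scheme are all assumptions. Each detection carries an offset that points back to where the object was in the previous frame, and it inherits the ID of the closest unmatched previous track.

```python
import math

def associate(prev_tracks, detections, max_dist=50.0):
    """Greedy matching of current detections to previous-frame tracks.

    prev_tracks: dict of track_id -> (x, y) center in the previous frame.
    detections:  list of dicts with 'center', 'offset' (displacement to the
                 previous frame), and 'score'.
    Returns a dict of detection index -> track_id.
    """
    next_id = max(prev_tracks, default=-1) + 1
    unmatched = dict(prev_tracks)
    assigned = {}
    # Handle the most confident detections first.
    order = sorted(range(len(detections)), key=lambda i: -detections[i]["score"])
    for i in order:
        det = detections[i]
        # Project the current center back into the previous frame.
        px = det["center"][0] + det["offset"][0]
        py = det["center"][1] + det["offset"][1]
        best_id, best_dist = None, max_dist
        for tid, (tx, ty) in unmatched.items():
            dist = math.hypot(px - tx, py - ty)
            if dist < best_dist:
                best_id, best_dist = tid, dist
        if best_id is None:
            # Nothing close enough: start a new track.
            best_id = next_id
            next_id += 1
        else:
            del unmatched[best_id]
        assigned[i] = best_id
    return assigned

# Toy example: two existing tracks, three detections (values are made up).
prev_tracks = {0: (100.0, 100.0), 1: (200.0, 100.0)}
detections = [
    {"center": (110.0, 100.0), "offset": (-8.0, 0.0), "score": 0.9},
    {"center": (205.0, 102.0), "offset": (-4.0, -1.0), "score": 0.8},
    {"center": (400.0, 300.0), "offset": (0.0, 0.0), "score": 0.7},
]
assigned = associate(prev_tracks, detections)
```

The first two detections keep track IDs 0 and 1, and the third, which is far from any previous center, starts a new track.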
We applied the pre-trained CenterTrack 3D model (nuScenes_3Dtracking) to train4 in the Argoverse 3D tracking data. We used the depth estimates in the model output to set the color of the bounding boxes, so that objects within a close range are marked in red.
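The depth-based coloring can be sketched as follows. The 10 m warning distance, the green color for distant objects, the BGR color order (as used by OpenCV-style drawing), and the `track_id`/`depth` keys are assumptions for illustration; the values used in this repo may differ.

```python
RED = (0, 0, 255)    # BGR order, as in OpenCV-style drawing (assumption)
GREEN = (0, 255, 0)  # hypothetical color for objects that are far enough away

def box_color(depth_m, warn_distance_m=10.0):
    """Pick the bounding-box color from the estimated depth in meters."""
    return RED if depth_m <= warn_distance_m else GREEN

def collision_warnings(detections, warn_distance_m=10.0):
    """Return the track IDs of objects inside the warning distance."""
    return [d["track_id"] for d in detections if d["depth"] <= warn_distance_m]

# Toy detections with made-up depths.
detections = [
    {"track_id": 1, "depth": 4.2},
    {"track_id": 2, "depth": 27.5},
    {"track_id": 3, "depth": 9.9},
]
colors = [box_color(d["depth"]) for d in detections]
warnings = collision_warnings(detections)
```

A single fixed threshold is the simplest choice; a speed-dependent threshold would be a natural refinement for collision warnings.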
- Build an API for the model that takes video inputs and outputs detection, tracking, and distance monitoring results, as in the illustration above.
- Build a web-based UI for the API that takes video uploads and displays detection results.
- detra/ - scripts for model API and Flask web app
- demo_track.py, run_CenterTrack_3D_api.sh: code adapted from the CenterTrack module; takes video/image input and outputs a list of images with detection and tracking results
- app.py, detimg2git.py, static, templates: files for Flask web app
- run_CenterNet.sh, run_CenterTrack_3D.sh: scripts to run experiments with CenterNet and CenterTrack open-source code
- Dockerfile, .dockerignore, environment.yml - files to build the Docker image and conda environment for reproduction
- exploration/ - data collection and wrangling, experiment prototypes, exploration of LiDAR data
- examples/ - for detection result showcase, and visualization in the README
In addition to using monocular camera images, we explored popular detection algorithms that use 3D LiDAR point cloud data. This directory contains the experiments using Point R-CNN on point cloud data from the Argoverse 3D tracking dataset.