This repository is the PyTorch implementation of the paper: Asynchronous Multi-Modal Fusion based 3D Object Detection.
The code includes 4 parts:
PART0: 00_3DMotionEstimation:
Based on Self-Supervised Monocular Scene Flow Estimation by Junhwa Hur and Stefan Roth.
Features: Scene flow estimation from two temporally consecutive monocular images.
PART1: 01_AF3D:
Features: Asynchronous multi-modal fusion that generates a LiDAR point cloud for the asynchronous frame.
PART2: 02_Unified3DDetection:
MMDetection3D is an open source object detection toolbox based on PyTorch, towards the next-generation platform for general 3D detection. It is a part of the OpenMMLab project developed by MMLab.
Supported methods and backbones from the Model Zoo are shown in the table below.
Method | ResNet | ResNeXt | SENet | PointNet++ | HRNet | RegNetX | Res2Net |
SECOND | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
PointPillars | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
FreeAnchor | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
VoteNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
H3DNet | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
3DSSD | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
Part-A2 | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
MVXNet | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
CenterPoint | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
SSN | ☐ | ☐ | ☐ | ✗ | ☐ | ✓ | ☐ |
Features: Because the image modality is converted to the LiDAR modality, any LiDAR-based or fusion-based 3D object detector can be applied.
PART3: 03_eAP
Follow the installation instructions of SceneFlow and mm3DDet to set up the required environment.
Download the following datasets from KITTI Benchmark.
(1) KITTI Raw Data
(2) KITTI 3D Object Detection Dataset
(3) KITTI Scene Flow Dataset (only needed to train the scene flow estimation model; can be skipped otherwise)
Retrieve the previous frame for the 3D object detection dataset
python3 01_AF3D/
# The following parameters need to be modified:
# dKITTI3D = '../data/kitti_3d'
# dKITTIRAW = '../data/kitti_raw/data_raw'
# dsavepath = '../data/kitti_asyn3d'
# dsaveimage2 = os.path.join(dsavepath,'image_2')
# dsaveimage3 = os.path.join(dsavepath,'image_3')
# dsavevelodyne = os.path.join(dsavepath,'velodyne')
# dsavecalibcam = os.path.join(dsavepath,'calib_cam')
# dsavecalibvelo = os.path.join(dsavepath,'calib_velo')
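The output sub-directories listed above all hang off `dsavepath`. As a minimal sketch, assuming the script expects these folders to exist before it writes to them (the repository may already handle this), they could be created up front. The helper name `build_output_dirs` is hypothetical and not part of the repository.

```python
import os


def build_output_dirs(dsavepath):
    """Create the output sub-directories named in the parameter list.

    Hypothetical helper: it mirrors the dsaveimage2, dsaveimage3,
    dsavevelodyne, dsavecalibcam and dsavecalibvelo settings.
    """
    names = ["image_2", "image_3", "velodyne", "calib_cam", "calib_velo"]
    paths = {name: os.path.join(dsavepath, name) for name in names}
    for path in paths.values():
        # exist_ok=True makes the call safe to repeat on an existing tree.
        os.makedirs(path, exist_ok=True)
    return paths
```

Calling `build_output_dirs('../data/kitti_asyn3d')` returns a dictionary mapping each folder name to its full path.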
# Estimate disparity and scene flow
./00_3DMotionEstimation/scripts/ --save_disp=True --save_disp2=True --save_flow=True
# Generate the LiDAR point cloud for the asynchronous frame
python3 01_AF3D/
# The following parameters need to be modified:
# ddisp0 = '../data/monosf_selfsup_kitti_3ddet/disp_0'
# ddisp1 = '../data/monosf_selfsup_kitti_3ddet/disp_1'
# dflow = '../data/monosf_selfsup_kitti_3ddet/flow'
# dasyn3d = '../data/kitti_asyn3d'
# dbinsave = '../data/kitti_asyn3d_recbin'
# dtxtlabel = '../data/kitti_3d/training/label_2'
# dtxtcalib = '../data/kitti_3d/training/calib'
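To illustrate how a disparity map can become a LiDAR-style `.bin` file, here is a rough sketch of the general idea. It is not the repository's actual method: it assumes a pinhole camera with known focal lengths, principal point and stereo baseline, back-projects each valid pixel to a 3D point in the camera frame, and writes the result as float32 `(x, y, z, intensity)` rows, the layout used by KITTI velodyne files. The function names are hypothetical. The scene-flow warp and the camera-to-LiDAR calibration transform are omitted.

```python
import numpy as np


def disparity_to_points(disp, fx, fy, cx, cy, baseline):
    """Back-project a disparity map (in pixels) to 3D camera-frame points."""
    h, w = disp.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disp > 0  # zero or negative disparity carries no depth
    z = fx * baseline / disp[valid]  # depth from stereo geometry
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def save_kitti_bin(points_xyz, path, intensity=0.0):
    """Write points as float32 rows of (x, y, z, intensity)."""
    out = np.full((points_xyz.shape[0], 4), intensity, dtype=np.float32)
    out[:, :3] = points_xyz
    out.tofile(path)


def load_kitti_bin(path):
    """Read a float32 (x, y, z, intensity) file back into an (N, 4) array."""
    return np.fromfile(path, dtype=np.float32).reshape(-1, 4)
```

The intensity channel is filled with a constant because an image-derived point has no measured reflectance. A detector that relies on intensity may need a different placeholder.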
# Train on the generated LiDAR or on the real LiDAR
python3 tools/ ${CONFIG_FILE} --work_dir ${YOUR_WORK_DIR} [optional arguments]
# Validation
python3 tools/ ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]
Follow the instructions of mm3DDet.
# Evaluation
python3 04_eAP/