
MMF: Multi-Task Multi-Sensor Fusion for 3D Object Detection

June 2019

tl;dr: Use auxiliary tasks (ground estimation and depth completion) and sensor fusion to boost 3D object detection. MMF is fast, running at 13 FPS.
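
Below is a minimal sketch of the multi-task training idea in the tl;dr: the detection loss is combined with auxiliary ground-estimation and depth-completion losses. The weights and function name are illustrative assumptions, not values from the paper.

```python
import torch

def multi_task_loss(det_loss: torch.Tensor,
                    ground_loss: torch.Tensor,
                    depth_loss: torch.Tensor,
                    w_ground: float = 1.0,
                    w_depth: float = 1.0) -> torch.Tensor:
    """Weighted sum of the main detection loss and the two auxiliary losses.

    The auxiliary tasks are only used to shape the shared features; at
    inference time only the detection head matters.
    """
    return det_loss + w_ground * ground_loss + w_depth * depth_loss
```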

Overall impression

This paper builds on ContFuse and two-stage sensor fusion methods such as MV3D and AVOD. MMF and ContFuse are similar to AVOD in that they use fused features for proposal generation, and both are anchor-free. However, MMF improves on ContFuse by using depth completion to generate a dense pseudo-lidar point cloud.
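
A sketch of the pseudo-lidar idea mentioned above, assuming a completed depth map in meters and pinhole camera intrinsics (`fx`, `fy`, `cx`, `cy` are assumed inputs, not symbols from the paper): each pixel is back-projected into a 3D point to densify the sparse lidar sweep.

```python
import numpy as np

def depth_to_pseudo_lidar(depth: np.ndarray, fx: float, fy: float,
                          cx: float, cy: float) -> np.ndarray:
    """Back-project an HxW completed depth map into an Nx3 point cloud
    in the camera frame (pseudo-lidar)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no valid depth
```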

The paper is also influenced by HDNet, which exploits HD maps and estimates ground height for 3D detection.
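
A hedged sketch of how the HDNet-style ground estimate is typically used: subtract the predicted per-cell ground height from each point's z before BEV voxelization, so object heights are measured relative to the road surface. The grid ranges, resolution, and lookup scheme here are illustrative assumptions.

```python
import numpy as np

def normalize_to_ground(points: np.ndarray, ground_height: np.ndarray,
                        x_range=(0.0, 70.4), y_range=(-40.0, 40.0),
                        resolution: float = 0.1) -> np.ndarray:
    """Subtract the estimated ground height under each lidar point (Nx3).

    `ground_height` is a BEV grid of predicted ground elevations with shape
    (num_x_cells, num_y_cells).
    """
    xi = ((points[:, 0] - x_range[0]) / resolution).astype(int)
    yi = ((points[:, 1] - y_range[0]) / resolution).astype(int)
    xi = np.clip(xi, 0, ground_height.shape[0] - 1)
    yi = np.clip(yi, 0, ground_height.shape[1] - 1)
    normalized = points.copy()
    normalized[:, 2] -= ground_height[xi, yi]
    return normalized
```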

MV3D -> AVOD -> ContFuse -> MMF

This boosts 2D detection AP on the hard split by more than 8 points (from 80 to 88) among real-time models. (RRC from SenseTime performs very well for 2D object detection but runs at 3.6 s/frame.)

Key ideas

Technical details

Notes