Multi-View 3D Object Detection Network for Autonomous Driving

Mar 2019

tl;dr: A sensor fusion framework that takes a lidar point cloud and RGB images as input and predicts oriented 3D bboxes. The 3D point cloud is encoded into a multi-view (bird's eye view and front view) representation.
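
To make the bird's eye view encoding concrete, here is a minimal sketch of projecting a point cloud onto a 2D grid. It is a simplified illustration, not the paper's exact encoding: the function name `points_to_bev`, the ranges, the 0.1 m resolution, the two-channel layout (max height and density), and the log(64) normalization constant are all assumptions made for this example.

```python
import numpy as np


def points_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0), resolution=0.1):
    """Project an (N, 3+) array of lidar points (x, y, z, ...) onto a BEV grid.

    Returns an (H, W, 2) array: channel 0 is the max height per cell,
    channel 1 is a log-normalized point density per cell.
    All defaults are hypothetical values chosen for illustration.
    """
    points = np.asarray(points, dtype=np.float64)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    # Keep only points inside the (assumed) region of interest.
    keep = (x >= x_range[0]) & (x < x_range[1]) & (y >= y_range[0]) & (y < y_range[1])
    x, y, z = x[keep], y[keep], z[keep]

    h = int(round((x_range[1] - x_range[0]) / resolution))
    w = int(round((y_range[1] - y_range[0]) / resolution))

    # Discretize metric coordinates into cell indices; clip guards float round-off.
    rows = np.clip(np.floor((x - x_range[0]) / resolution).astype(np.int64), 0, h - 1)
    cols = np.clip(np.floor((y - y_range[0]) / resolution).astype(np.int64), 0, w - 1)

    # Count points per cell (unbuffered add handles repeated indices).
    counts = np.zeros((h, w), dtype=np.float64)
    np.add.at(counts, (rows, cols), 1.0)

    # Max height per cell; empty cells are set to 0.
    height = np.full((h, w), -np.inf)
    np.maximum.at(height, (rows, cols), z)
    height[counts == 0] = 0.0

    # Assumed normalization: saturate density at 1 once a cell holds 63+ points.
    density = np.minimum(1.0, np.log(counts + 1.0) / np.log(64.0))
    return np.stack([height, density], axis=-1)
```

The resulting array can be fed to a 2D CNN like an ordinary image, which is the main appeal of a view-based encoding over processing raw points.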

Overall impression

The paper is one of the pioneering works to integrate RGB and lidar point cloud data. It sets a good baseline for the tasks of 3D proposal generation, 3D detection and 3D localization. Its performance has since been surpassed by AVOD, F-PointNet, EdgeConv, PointRCNN, etc.

MV3D uses the point cloud for 3D proposal generation and then uses sensor fusion to refine the 3D bboxes. AVOD, by contrast, uses the fused feature map for proposal generation.

Key ideas

Technical details