MVF: End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds

November 2020

tl;dr: Improve point embedding with dynamic voxelization and multiview fusion.

Overall impression

This paper is from the 1st author of VoxelNet.

Both VoxelNet and PointPillars use PointNet to learn point embeddings, then scatter them into a pseudo-3D volume (VoxelNet) or a pseudo-2D image (PointPillars) so that 3D or 2D convolutions can be applied. This paper improves the point embedding process by aggregating features from multiple views, and it is a plug-and-play module that can be integrated into PointPillars.
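
To make the tl;dr concrete, here is a minimal sketch of dynamic voxelization plus point-level fusion of two views. It is not the paper's implementation. The learned parts (per-point fully connected layers and the per-view convolution towers) are left out and replaced by a plain max-pool, and all function names, bin counts, and voxel sizes are hypothetical choices for illustration. The two views assumed here are a Cartesian bird's-eye view and a spherical perspective view.

```python
import math
from collections import defaultdict


def bev_voxel_id(point, voxel_size=1.0):
    """Bird's-eye-view cell: index of the (x, y) grid cell (hypothetical size)."""
    x, y, _z = point
    return (math.floor(x / voxel_size), math.floor(y / voxel_size))


def perspective_voxel_id(point, az_bins=8, el_bins=4):
    """Perspective-view cell: (azimuth, elevation) bin in spherical coordinates."""
    x, y, z = point
    azimuth = math.atan2(y, x)                    # in [-pi, pi]
    elevation = math.atan2(z, math.hypot(x, y))   # in [-pi/2, pi/2]
    a = min(int((azimuth + math.pi) / (2 * math.pi) * az_bins), az_bins - 1)
    e = min(int((elevation + math.pi / 2) / math.pi * el_bins), el_bins - 1)
    return (a, e)


def dynamic_voxelize(points, voxel_id_fn):
    """Map every point to its voxel. Nothing is dropped or padded:
    each voxel holds however many points fall into it."""
    voxels = defaultdict(list)
    for i, p in enumerate(points):
        voxels[voxel_id_fn(p)].append(i)
    return voxels


def view_features(points, feats, voxel_id_fn):
    """Max-pool point features inside each voxel, then gather the pooled
    voxel feature back to every point in that voxel."""
    dim = len(feats[0])
    out = [None] * len(points)
    for idxs in dynamic_voxelize(points, voxel_id_fn).values():
        pooled = [max(feats[i][d] for i in idxs) for d in range(dim)]
        for i in idxs:
            out[i] = pooled
    return out


def fuse_views(points, feats):
    """Point-level fusion: concatenate each point's own feature with the
    context it gathered from the two views."""
    bev = view_features(points, feats, bev_voxel_id)
    persp = view_features(points, feats, perspective_voxel_id)
    return [f + b + p for f, b, p in zip(feats, bev, persp)]


if __name__ == "__main__":
    pts = [(0.2, 0.3, 0.1), (0.7, 0.1, 0.5), (5.0, 2.0, 1.0)]
    fts = [[1.0, 2.0], [3.0, 0.0], [0.5, 4.0]]
    for row in fuse_views(pts, fts):
        print(row)
```

The key point is that the same point can have different neighbors in each view: two points that share a bird's-eye-view cell may land in different perspective cells, so the fused embedding carries context that a single view would miss.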

Note that PointPillars and its successor MVF both still use anchors for prediction. The entire procedure is not well described; see Pillar OD for a better description.

Key ideas

Technical details