BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird’s-Eye View Representation

June 2022

tl;dr: Early camera-lidar fusion in BEV space.

Overall impression

BEVFusion challenges the long-standing convention that point-level fusion is the best choice for multi-sensor perception systems.

BEVFusion’s main contribution seems to be an efficient implementation of the voxel pooling operation. The speed bottleneck in voxel pooling is also noted and solved in BEVDepth.
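For reference, a minimal NumPy sketch of what voxel pooling computes: each lifted camera feature is assigned to a BEV grid cell, and features landing in the same cell are summed. This is only an illustration of the operation, not the optimized kernel from BEVFusion or BEVDepth. The function name `voxel_pool`, its parameters, and the choice of sum as the aggregation are assumptions made for this sketch.

```python
import numpy as np

def voxel_pool(points_xy, feats, grid_min, cell_size, grid_shape):
    """Sum-pool per-point features into a BEV grid (illustrative sketch).

    points_xy : (N, 2) float array, x/y positions of lifted features
    feats     : (N, C) float array, one feature vector per point
    grid_min  : (2,) lower x/y bound of the grid
    cell_size : edge length of one square BEV cell
    grid_shape: (H, W) number of cells along x and y
    """
    H, W = grid_shape
    points_xy = np.asarray(points_xy, dtype=np.float64)
    feats = np.asarray(feats, dtype=np.float64)

    # Quantize continuous coordinates to integer cell indices.
    idx = np.floor((points_xy - np.asarray(grid_min)) / cell_size).astype(np.int64)

    # Drop points that fall outside the grid.
    keep = (idx[:, 0] >= 0) & (idx[:, 0] < H) & (idx[:, 1] >= 0) & (idx[:, 1] < W)
    idx, feats = idx[keep], feats[keep]

    # Flatten 2D cell indices so one scatter-add handles all cells.
    flat = idx[:, 0] * W + idx[:, 1]
    bev = np.zeros((H * W, feats.shape[1]), dtype=feats.dtype)
    np.add.at(bev, flat, feats)  # unbuffered add: repeated indices accumulate
    return bev.reshape(H, W, -1)


# Toy input: two points share cell (0, 0), one lands in (1, 0), one is out of range.
pts = [[0.5, 0.5], [0.7, 0.2], [1.5, 0.5], [5.0, 5.0]]
fts = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
bev = voxel_pool(pts, fts, grid_min=(0.0, 0.0), cell_size=1.0, grid_shape=(2, 2))
```

In a real pipeline the number of lifted points is very large (every pixel of every camera times every depth bin), which is why this aggregation step becomes the speed bottleneck mentioned above.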

BEVFusion is single-frame based and has no temporal module.

Key ideas

Technical details