What You See is What You Get: Exploiting Visibility for 3D Object Detection

June 2020

tl;dr: Visibility augmented deep voxel representation, with occupancy grid feature map.

Overall impression

Representing point cloud as xyz fundamentally destroys the difference of free space and uncertainty space (both contains no lidar points). Convolution cannot differentiate such difference based on such a representation.

WYSIWYG adds an occupancy grid feature map to the existing feature map, very much like coord conv and cam conv.

Key ideas

Technical details