Rethinking Pseudo-LiDAR Representation

August 2020

tl;dr: Reformulating the depth information in an RGB-like image format makes a huge difference.

Overall impression

The paper builds on the Frustum-PointNet version of pseudo-lidar, which is a two-step process. First, a 2D detector finds the car and its bbox is used to crop a frustum; then a PointNet places a 3D bbox around the point cloud inside that frustum. PatchNet starts from this cropped patch.
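The frustum-cropping step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's code: it assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a dense depth map stored as nested lists, and a 2D box already produced by some detector. The function names are hypothetical, and plain Python lists are used only to keep the sketch dependency-free.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject(u: float, v: float, z: float,
                fx: float, fy: float, cx: float, cy: float) -> Point:
    """Pinhole back-projection of pixel (u, v) with depth z into camera coordinates."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


def frustum_points(depth: Sequence[Sequence[float]],
                   box: Tuple[int, int, int, int],
                   fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Collect 3D points for every valid-depth pixel inside a 2D box.

    box is (u_min, v_min, u_max, v_max) with the max bounds exclusive.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    u_min, v_min, u_max, v_max = box
    points: List[Point] = []
    for v in range(v_min, v_max):
        for u in range(u_min, u_max):
            z = depth[v][u]
            if z > 0:
                points.append(backproject(u, v, z, fx, fy, cx, cy))
    return points


# Toy 4x4 depth map (assumed values), with one invalid pixel inside the box.
depth = [[10.0] * 4 for _ in range(4)]
depth[1][1] = 0.0
points = frustum_points(depth, box=(1, 1, 3, 3), fx=2.0, fy=2.0, cx=2.0, cy=2.0)
```

The result is an unordered list of 3D points, which is the kind of input a PointNet-style network consumes; the pixel neighborhood structure of the image is no longer explicit.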

The idea is simple: instead of RGB values, fill the pixels of the original image with XYZ values. It is similar to the idea of CoordConv and CamConv, and perhaps should have been called 3DCoordConv. Among lidar object detection work, it is similar to rendering a lidar point cloud into a raster image in spherical or cylindrical view.
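A minimal sketch of what "filling in XYZ instead of RGB" could look like for one cropped patch is given here. It is an illustration under assumptions rather than the paper's implementation: a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a depth map stored as nested lists, and a hypothetical helper name `xyz_patch`. Invalid (zero-depth) pixels simply map to zeros in this sketch.

```python
from typing import List, Sequence, Tuple


def xyz_patch(depth: Sequence[Sequence[float]],
              box: Tuple[int, int, int, int],
              fx: float, fy: float, cx: float, cy: float) -> List[List[List[float]]]:
    """Build an H x W x 3 patch whose channels are camera-frame X, Y, Z.

    box is (u_min, v_min, u_max, v_max) with the max bounds exclusive.
    The pixel layout of the image is preserved; only the channel
    contents change from (R, G, B) to (x, y, z).
    """
    u_min, v_min, u_max, v_max = box
    patch: List[List[List[float]]] = []
    for v in range(v_min, v_max):
        row: List[List[float]] = []
        for u in range(u_min, u_max):
            z = depth[v][u]
            x = (u - cx) * z / fx  # pinhole back-projection
            y = (v - cy) * z / fy
            row.append([x, y, z])
        patch.append(row)
    return patch


# Toy 3x4 depth map (assumed values) and a 2x2 box.
depth = [
    [4.0, 4.0, 8.0, 8.0],
    [4.0, 4.0, 8.0, 8.0],
    [4.0, 4.0, 8.0, 8.0],
]
patch = xyz_patch(depth, box=(1, 0, 3, 2), fx=2.0, fy=2.0, cx=2.0, cy=1.0)
```

Because the output keeps the image grid, it can be fed to an ordinary 2D CNN, in the same spirit as the coordinate channels appended by CoordConv.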

Key ideas

Technical details