RVNet: Deep Sensor Fusion of Monocular Camera and Radar for Image-based Obstacle Detection in Challenging Environments

January 2020

tl;dr: Fuse radar with camera, using a sparse radar pseudo-image as input and two output branches for small and large object detection.

Overall impression

This paper converts radar pins into a pseudo-image using a method similar to the one in distant object detection. The result is called a “sparse radar image”.
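To make the idea of a sparse radar pseudo-image concrete, here is a minimal sketch. It is not the paper's implementation: the function name `sparse_radar_image`, the two channels (depth and radial velocity), the pinhole intrinsics `K`, and the nearest-point-wins rule are all assumptions made for illustration.

```python
import numpy as np

def sparse_radar_image(points_cam, velocities, K, height, width):
    """Rasterize radar points into a sparse 2-channel pseudo-image.

    Hypothetical helper, not from the paper.
    points_cam: (N, 3) radar points in the camera frame (x right, y down, z forward).
    velocities: (N,) one velocity value per point (assumed channel).
    K: (3, 3) pinhole camera intrinsics (assumed known).
    Returns a (height, width, 2) array: channel 0 is depth, channel 1 is velocity.
    """
    points_cam = np.asarray(points_cam, dtype=np.float64).reshape(-1, 3)
    velocities = np.asarray(velocities, dtype=np.float64).reshape(-1)
    image = np.zeros((height, width, 2), dtype=np.float32)

    # Visit far points first so a nearer point overwrites a farther one
    # when both land on the same pixel.
    for idx in np.argsort(-points_cam[:, 2]):
        point = points_cam[idx]
        if point[2] <= 0:
            continue  # behind the camera, cannot be projected
        uvw = K @ point
        u = int(round(uvw[0] / uvw[2]))  # pixel column
        v = int(round(uvw[1] / uvw[2]))  # pixel row
        if 0 <= u < width and 0 <= v < height:
            image[v, u, 0] = point[2]
            image[v, u, 1] = velocities[idx]
    return image
```

Almost all pixels stay zero, which is what makes the image "sparse": only the few pixels hit by a radar return carry values.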

Critique: The “dense radar image” does not make sense to me, as it reshapes a 169-dim feature into a 13x13 image and applies 2D conv. There is no guaranteed order in the 169-d feature, so neighboring cells in the 13x13 image are not meaningfully adjacent, and the 2D conv seems a bit arbitrary.
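The ordering concern can be illustrated with a small sketch. It does not reproduce the paper's network. It only shows that the 3x3 neighborhood a 2D conv would see around a given feature depends entirely on how the 169 entries happen to be ordered before the reshape. The helper names and the particular permutation are assumptions chosen for illustration.

```python
import numpy as np

def reshape_to_image(feature, side=13):
    """Reshape a flat feature of length side*side into a (side, side) grid, row-major."""
    return np.asarray(feature).reshape(side, side)

def neighbors_3x3(image, row, col):
    """Values inside the 3x3 window centred at (row, col), clipped at the borders."""
    r0, r1 = max(row - 1, 0), min(row + 2, image.shape[0])
    c0, c1 = max(col - 1, 0), min(col + 2, image.shape[1])
    return set(image[r0:r1, c0:c1].ravel().tolist())

# Label each of the 169 feature entries by its index.
feature_ids = np.arange(169)

# An arbitrary but equally valid ordering of the same 169 entries
# (multiplying by 5 modulo 169 is a permutation because gcd(5, 169) = 1).
reordered_ids = (5 * feature_ids) % 169

original = reshape_to_image(feature_ids)
reordered = reshape_to_image(reordered_ids)

# Entry 84 sits at the centre (6, 6) of the original grid.
row, col = (int(i) for i in np.argwhere(reordered == 84)[0])
original_neighbors = neighbors_3x3(original, 6, 6)
reordered_neighbors = neighbors_3x3(reordered, row, col)

print(sorted(original_neighbors))
print(sorted(reordered_neighbors))
```

Both grids contain exactly the same 169 entries, yet entry 84 ends up with different neighbors in each. Unless the feature has a learned or imposed spatial order, a 3x3 kernel mixes an arbitrary subset of entries.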

Key ideas

Technical details