VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition

Feb 2019

tl;dr: Convert range data such as point cloud and RGBD image to volumetric data, and perform 3D recognition.

Overall impression

Converting the point cloud data (sparse) into 3D volumetric data (dense) may benefit from the progress people have made in 3D image process, including video recognition and medical image analysis. Computationally it may not make sense. The 3D CNN is quite simple. It is very effective in classification but may not be ideal for semantic segmentation and object detection tasks.

Key ideas

Technical details