KM3D-Net: Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training

September 2020

tl;dr: Work the geometric reasoning with pseudo-inverse optimization into the neural network.

Overall impression

KM3D-Net builds on RTM3D, previous work from the same author. It is highly practical: it works the 3D geometric reasoning module into the neural network to speed things up, whereas the geometric constraint modules in Deep3DBox, FQNet and RTM3D are time-consuming.
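To make the pseudo-inverse idea concrete, here is a minimal sketch, not the paper's exact formulation. It assumes the network has already predicted the 2D keypoints, the object dimensions and the yaw, and that the camera intrinsics `K` are known. Under those assumptions the projection equations are linear in the 3D position `T`, so `T` has a closed-form least-squares solution through the pseudo-inverse. The helper names (`box_corners`, `project`, `solve_position`), the corner ordering, and the use of only the 8 box corners as keypoints are illustrative assumptions.

```python
import numpy as np

def box_corners(dims, yaw):
    """Eight corners of a box centered at the origin, rotated by yaw about the camera Y axis.

    dims = (h, w, l); the corner ordering is an arbitrary illustrative choice.
    """
    h, w, l = dims
    x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * l / 2
    y = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * h / 2
    z = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * w / 2
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    return (R @ np.stack([x, y, z])).T  # shape (8, 3)

def project(points, K):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels."""
    uvw = (K @ points.T).T
    return uvw[:, :2] / uvw[:, 2:3]

def solve_position(kps_2d, dims, yaw, K):
    """Least-squares 3D position T from 2D keypoints via the pseudo-inverse.

    For a corner (X, Y, Z) in the rotated object frame:
        u - cx = fx * (X + Tx) / (Z + Tz)
        v - cy = fy * (Y + Ty) / (Z + Tz)
    Multiplying out gives two equations per keypoint that are linear in T.
    """
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    corners = box_corners(dims, yaw)
    u = kps_2d[:, 0] - cx
    v = kps_2d[:, 1] - cy
    n = len(corners)
    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    A[0::2, 0] = fx          # fx * Tx - u * Tz = u * Z - fx * X
    A[0::2, 2] = -u
    A[1::2, 1] = fy          # fy * Ty - v * Tz = v * Z - fy * Y
    A[1::2, 2] = -v
    b[0::2] = u * corners[:, 2] - fx * corners[:, 0]
    b[1::2] = v * corners[:, 2] - fy * corners[:, 1]
    return np.linalg.pinv(A) @ b

# Usage: project a box at a known position, then recover that position.
K = np.array([[721.5, 0.0, 609.5], [0.0, 721.5, 172.8], [0.0, 0.0, 1.0]])
dims, yaw = (1.5, 1.6, 3.9), 0.3
T_true = np.array([2.0, 1.5, 20.0])
kps = project(box_corners(dims, yaw) + T_true, K)
T_est = solve_position(kps, dims, yaw, K)
```

Because the solve is a fixed sequence of matrix operations, it can be written with differentiable tensor ops and run inside the network instead of as a separate iterative optimizer, which is the speed-up the note refers to.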

The semi-supervised learning approach is quite interesting and shows that it is possible to get meaningful results from as few as 500 labeled images. It may be a good direction to dig into, together with the self-consistency cues in UR3D and MoVi-3D. The self-supervised learning is done on

Removing direct depth prediction from the neural network makes it possible to apply geometric data augmentation and to introduce a self-supervised loss.

This is currently the SOTA, much better than the previous SOTA, M3D-RPN.

A quick summary of CenterNet monocular 3D object detection.

Key ideas

Technical details