Shift R-CNN: Deep Monocular 3D Object Detection with Closed-Form Geometric Constraints

October 2019

tl;dr: Extend the work of deep3Dbox by regressing residual center positions.

Overall impression

The introduction gives a good summary of monocular 3D object detection (mono 3DOD).

The geometric constraints are solved in closed form. This is similar to deep3Dbox but slightly different (over-constrained vs. exactly constrained).
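
To make the closed-form idea concrete, here is a minimal sketch of a deep3Dbox-style constraint, not the paper's exact formulation. It assumes the box dimensions, yaw, and camera intrinsics are known, and that we already know which 3D corner touches each side of the 2D box (deep3Dbox enumerates these assignments; here they are simply passed in). Each side of the 2D box gives one linear equation in the translation T, so there are 4 equations for 3 unknowns, solved by least squares. All function names and the yaw-about-Y convention are assumptions made for illustration.

```python
import math

def rot_y(yaw):
    """Rotation about the camera Y axis (assumed KITTI-like yaw convention)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def box_corners(dims, yaw):
    """Eight box corners, rotated into camera axes, centered at the origin."""
    l, h, w = dims
    R = rot_y(yaw)
    corners = []
    for sx in (-1, 1):
        for sy in (-1, 1):
            for sz in (-1, 1):
                p = [sx * l / 2, sy * h / 2, sz * w / 2]
                corners.append([sum(r * x for r, x in zip(row, p)) for row in R])
    return corners

def project(K, corner, T):
    """Pinhole projection of one rotated corner shifted by translation T."""
    X = [c + t for c, t in zip(corner, T)]
    return (K[0][0] * X[0] / X[2] + K[0][2], K[1][1] * X[1] / X[2] + K[1][2])

def solve3(M, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    A = [row[:] + [rhs] for row, rhs in zip(M, v)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            for c in range(i, 4):
                A[r][c] -= f * A[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (A[i][3] - sum(A[i][c] * x[c] for c in range(i + 1, 3))) / A[i][i]
    return x

def solve_translation(K, corners, box2d, touching):
    """Least-squares translation from the four 2D box-side constraints.

    box2d is (xmin, ymin, xmax, ymax); touching holds, in the same order,
    the index of the corner assumed to lie on each side.
    """
    rows, rhs = [], []
    for side, (value, idx) in enumerate(zip(box2d, touching)):
        k = K[side % 2]  # row 0 of K for x sides, row 1 for y sides
        # (k - value * K[2]) . (corner + T) = 0 is linear in T
        a = [k[j] - value * K[2][j] for j in range(3)]
        rows.append(a)
        rhs.append(-sum(ai * ci for ai, ci in zip(a, corners[idx])))
    # Normal equations (A^T A) T = A^T b for the 4x3 system
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    return solve3(AtA, Atb)
```

With a noise-free 2D box and the correct corner assignment, the least-squares solution recovers T exactly; with a noisy box or a wrong assignment it only gives an approximate first guess, which is the kind of estimate a refinement stage would then correct.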

The ideas behind Shift R-CNN and FQNet are quite similar. Both build on deep3Dbox and refine its first guess. But FQNet passively samples densely around the GT and trains a regressor to tell how far each sample is from the GT, whereas Shift R-CNN actively learns to regress the difference. The follow-up work to FQNet is RAR-Net, which also actively predicts the offset, but does so iteratively with a DRL agent.

Key ideas

Technical details