FQNet: Deep Fitting Degree Scoring Network for Monocular 3D Object Detection

September 2019

tl;dr: Train a network to score the 3D IOU of a projected 3D wireframe with GT.

Overall impression

This paper extends the idea of deep3dbox beyond tight fitting, since deep3dbox depends heavily on the performance of the 2D object detector. If the 2D detector is inaccurate, 3D box accuracy suffers greatly.

The idea is to add a refinement stage to deep3dbox by densely sampling around the 3D seed location (obtained from tight 2D/3D constraints), then scoring the 2D patches with rendered 3D wireframes.
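The sample-then-score loop can be sketched as follows. This is a minimal illustration, not the paper's implementation: the uniform cube sampling, the `radius` and `n` parameters, and the function names `sample_candidates` and `refine` are all assumptions, and `toy_score` is a hypothetical stand-in for the learned scoring network (which would score an image patch with the candidate's wireframe rendered on it).

```python
import numpy as np

def sample_candidates(seed, radius=1.0, n=256, rng=None):
    """Draw n candidate 3D centers around the seed (uniform cube: an assumption)."""
    rng = np.random.default_rng(0) if rng is None else rng
    seed = np.asarray(seed, dtype=float)
    offsets = rng.uniform(-radius, radius, size=(n, 3))
    # Keep the seed itself as candidate 0 so refinement can never do worse than it.
    return np.vstack([seed, seed + offsets])

def refine(seed, score_fn, radius=1.0, n=256, rng=None):
    """Score every candidate and return the highest-scoring one."""
    candidates = sample_candidates(seed, radius=radius, n=n, rng=rng)
    scores = np.array([score_fn(c) for c in candidates])
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])

# Hypothetical stand-in for the scoring network: higher score = closer to the
# (normally unknown) true location. The real scorer would see image patches.
target = np.array([1.0, 1.5, 20.0])

def toy_score(center):
    return -float(np.linalg.norm(center - target))

seed = np.array([1.4, 1.5, 20.6])  # e.g. a seed from the tight 2D/3D constraints
best, best_score = refine(seed, toy_score, radius=1.0, n=512)
```

Because the seed is kept among the candidates, the selected location scores at least as well as the seed under whatever scorer is plugged in; the quality of the result then depends entirely on how well that scorer tracks the true 3D IoU.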

The idea is quite clever, but the optimization step that generates the 3D seed location is very time-consuming, which makes the method not very practical.

The ideas of shift RCNN and FQNet are quite similar: both build on deep3dbox and refine its first guess. The difference is that FQNet passively samples densely around the GT and trains a regressor to score how far each sample is from the GT, whereas shift RCNN actively learns to regress the difference.

Key ideas

Technical details