Learning-Deep-Learning

MonoPSR: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction

July 2019

tl;dr: 3DOD by generating a 3D proposal first and then reconstructing a local point cloud of the dynamic object.

Overall impression

This is from the authors of AVOD. The centroid proposal stage is quite accurate (average absolute error ~1.5 m).

This paper is heavily influenced by deep3dbox, in particular its use of the 2D bbox and its estimation of orientation and dimension as a first step.
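To make the "2D bbox plus estimated dimension" idea concrete, here is a minimal sketch of how a centroid proposal could be formed under a pinhole camera model. It assumes depth is recovered from the ratio of the estimated object height to the 2D box height, and that the box center back-projects to the object center; the notes above do not confirm that this is exactly what the paper does. The function name `centroid_proposal` and the intrinsics `K` are hypothetical.

```python
import numpy as np

def centroid_proposal(box_2d, obj_height_m, K):
    """Rough 3D centroid from a 2D box and an estimated object height.

    Assumption (not from the notes): depth follows from similar triangles,
    z = fy * H / h, and the 2D box center is the projected 3D center.
    """
    x1, y1, x2, y2 = box_2d
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # Depth from the ratio of metric height to pixel height.
    z = fy * obj_height_m / (y2 - y1)

    # Back-project the 2D box center to camera coordinates at depth z.
    u = 0.5 * (x1 + x2)
    v = 0.5 * (y1 + y2)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Hypothetical intrinsics and box, for illustration only.
K = np.array([[700.0, 0.0, 600.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])
proposal = centroid_proposal((550.0, 150.0, 650.0, 220.0), 1.5, K)
```

A proposal built this way is sensitive to errors in the estimated height and in the 2D box, so an error on the order of a meter at typical driving ranges is plausible for this kind of estimate.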

The reconstruction branch regresses a local point cloud of the object and compares it with the GT both in point cloud space and in the camera image (after projection). The paper does not report the incremental boost from this branch, which seems to be just a fancy regularization branch (multi-task).
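The two comparisons described above can be sketched as a two-term loss. This is only an illustration: it assumes predicted and GT points are in one-to-one correspondence, uses squared L2 distance for both terms, and shifts both clouds by the same centroid before projection. The names `reconstruction_loss` and `w_proj` are hypothetical, and the paper's actual loss formulation may differ.

```python
import numpy as np

def project(points, K):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates."""
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def reconstruction_loss(pred_local, gt_local, centroid, K, w_proj=1.0):
    """Point-cloud term plus image-space term (assumed squared L2 for both)."""
    # Term 1: compare the local point clouds directly in 3D.
    l_pc = np.mean(np.sum((pred_local - gt_local) ** 2, axis=1))

    # Term 2: move both clouds to the object location, project, compare in pixels.
    pred_uv = project(pred_local + centroid, K)
    gt_uv = project(gt_local + centroid, K)
    l_proj = np.mean(np.sum((pred_uv - gt_uv) ** 2, axis=1))

    return l_pc + w_proj * l_proj
```

Because the projection term is measured in pixels and the point-cloud term in meters, a weight such as `w_proj` would be needed in practice to balance the two.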

Key ideas

Technical details

Notes