GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving

October 2019

tl;dr: Get 3D bbox proposal (guidance) from 2D bbox + prior knowledge, then refine 3D bbox.

Overall impression

This paper also regresses a 2D bbox and orientation with a conventional 2DOD architecture, then gets a coarse 3D position from them, then refines it. The approach to generating the initial 3D location is similar to FQNet and MonoPSR.
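To make the "2D bbox + prior knowledge" step concrete, here is a minimal sketch of how a coarse 3D center could be recovered under a plain pinhole camera model. It is not the paper's exact formulation: it assumes the 2D box height approximates the projected object height and that a class-level height prior is available. The function name `coarse_3d_center` and all parameter names are hypothetical.

```python
def coarse_3d_center(bbox_2d, prior_height_m, fx, fy, cx, cy):
    """Estimate a coarse 3D center (camera frame) from a 2D box.

    Assumptions (not taken from the paper): pinhole camera with
    intrinsics fx, fy, cx, cy; the 2D box height in pixels roughly
    equals the projected object height; prior_height_m is a
    class-average object height in meters.
    """
    u1, v1, u2, v2 = bbox_2d
    h_px = v2 - v1
    if h_px <= 0:
        raise ValueError("2D box must have positive height")

    # Similar triangles: h_px / fy = prior_height_m / z
    z = fy * prior_height_m / h_px

    # Back-project the 2D box center to the estimated depth.
    u_c = 0.5 * (u1 + u2)
    v_c = 0.5 * (v1 + v2)
    x = (u_c - cx) * z / fx
    y = (v_c - cy) * z / fy
    return x, y, z
```

For example, a 100-pixel-tall box with a 1.5 m height prior and `fy = 700` gives a depth of 10.5 m. A box dimension prior plus the regressed orientation would then complete the coarse 3D box that the refinement stage starts from.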

The depth estimation method is practical. The quality-aware loss is also easier to implement than IoU-Net for predicting the quality of bboxes. However, the usefulness of the surface feature extraction is doubtful.

That the paper still uses Caffe in 2019 is a bit of a shocker.

Key ideas

Technical details