6D-VNet: End-to-end 6DoF Vehicle Pose Estimation from Monocular RGB Images

January 2020

tl;dr: Directly regress 3D distance and orientation (as a quaternion) from RoI-pooled features.

Overall impression

This work builds on Mask R-CNN: the mask head is extended to also regress the fine-grained vehicle model (such as Audi Q5), the orientation quaternion, and the distance.
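To make the idea concrete, here is a minimal NumPy sketch of a per-RoI head with three outputs: vehicle-model logits, a unit quaternion, and a translation. It is not the paper's architecture. The function name `pose_head`, the single linear layers, the feature sizes, and the reading of "3D distance" as a 3-vector translation are all assumptions made for illustration.

```python
import numpy as np

def pose_head(roi_feat, w_model, w_quat, w_trans):
    """Hypothetical head: one linear layer per output, applied to pooled RoI features.

    roi_feat: (N, D) array, one pooled feature vector per detected vehicle.
    """
    model_logits = roi_feat @ w_model                 # (N, num_models) fine-grained class scores
    quat = roi_feat @ w_quat                          # (N, 4) raw quaternion
    norm = np.linalg.norm(quat, axis=1, keepdims=True)
    quat = quat / np.maximum(norm, 1e-8)              # project onto unit quaternions
    trans = roi_feat @ w_trans                        # (N, 3) translation (assumed 3-vector)
    return model_logits, quat, trans

# Random stand-ins for features and weights; sizes are arbitrary.
rng = np.random.default_rng(0)
n_rois, feat_dim, n_models = 5, 16, 10
roi_feat = rng.normal(size=(n_rois, feat_dim))
w_model = rng.normal(size=(feat_dim, n_models))
w_quat = rng.normal(size=(feat_dim, 4))
w_trans = rng.normal(size=(feat_dim, 3))

model_logits, quat, trans = pose_head(roi_feat, w_model, w_quat, w_trans)
distance = np.linalg.norm(trans, axis=1)              # scalar distance per vehicle
```

Normalizing the raw 4-vector is one common way to keep the regressed output a valid rotation; whether the paper does exactly this is not stated in these notes.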

Key ideas

Technical details