Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving

June 2020

tl;dr: Object SLAM that uses 3D object proposals from each frame.

Overall impression

The demo is quite impressive, almost as good as CubeSLAM. The 3D object proposal step is simple yet effective for objects with strong shape priors, such as sedans, and is even simpler than the Deep3DBox method.

Many insightful comments from the paper:

End-to-end 3D regression needs lots of training data and requires a heavy workload to precisely label all the object bboxes in 3D. Instance 3D detection produces frame-independent results, which are not consistent enough for continuous perception in autonomous driving.

Relying purely on the instance 2D bbox limits performance when predicting the pose of truncated objects.

The paper proposes a novel object bundle adjustment (BA). The method can track 3D objects and recover the dynamic sparse point cloud with instance accuracy and temporal consistency.

The method can track the object continuously even for the extremely truncated case where object pose is hard for instance inference.
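
To make the truncation comment concrete, here is a minimal sketch of why a 2D bbox alone becomes a weak pose cue once the object leaves the image. It is an illustration only, not the paper's implementation. The pinhole intrinsics `K`, the KITTI-like image size, the sedan dimensions, and the helper names `box_corners`, `project_bbox`, and `clip_bbox` are all assumptions made for this example.

```python
import numpy as np

def box_corners(center, dims, yaw):
    """Return the 8 corners (3x8) of a 3D box in camera coordinates.

    Assumed camera frame: x right, y down, z forward; yaw rotates about y.
    dims = (length, height, width) in meters.
    """
    l, h, w = dims
    x = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    y = np.array([h, h, h, h, -h, -h, -h, -h]) / 2.0
    z = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return R @ np.vstack([x, y, z]) + np.asarray(center, dtype=float).reshape(3, 1)

def project_bbox(corners, K):
    """Project 3D corners with a pinhole model and return the tight
    2D bbox [x1, y1, x2, y2]. Assumes all corners have z > 0."""
    uv = K @ corners
    uv = uv[:2] / uv[2]
    return np.array([uv[0].min(), uv[1].min(), uv[0].max(), uv[1].max()])

def clip_bbox(bbox, width, height):
    """Clip a bbox to the image, mimicking what a 2D detector can observe."""
    x1, y1, x2, y2 = bbox
    return np.array([
        np.clip(x1, 0, width), np.clip(y1, 0, height),
        np.clip(x2, 0, width), np.clip(y2, 0, height),
    ])

# Hypothetical KITTI-like camera and a sedan-sized shape prior.
K = np.array([[700.0, 0.0, 620.0], [0.0, 700.0, 190.0], [0.0, 0.0, 1.0]])
WIDTH, HEIGHT = 1240, 380
SEDAN_DIMS = (4.0, 1.5, 1.8)

# Car ahead of the camera: its full projection lies inside the image.
visible_full = project_bbox(box_corners((0.0, 1.0, 15.0), SEDAN_DIMS, 0.0), K)
visible_seen = clip_bbox(visible_full, WIDTH, HEIGHT)

# Car close by on the right: its projection extends past the image border.
truncated_full = project_bbox(box_corners((5.0, 1.0, 6.0), SEDAN_DIMS, 0.0), K)
truncated_seen = clip_bbox(truncated_full, WIDTH, HEIGHT)

print("visible   full vs seen:", visible_full, visible_seen)
print("truncated full vs seen:", truncated_full, truncated_seen)
```

For the visible car the observed bbox equals the projection of the 3D box, so the box edges constrain the pose. For the truncated car the right and bottom edges of the observed bbox are image borders rather than object extents, so those edges carry no pose information, which is the gap that tracking over time is meant to fill.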

Key ideas

Technical details