Joint Monocular 3D Vehicle Detection and Tracking [Notes]

October 2019

tl;dr: Add LSTM-based 3D tracking on top of monocular 3D object detection.

Overall impression

An autonomous driving system has to estimate: 1) 2D detection, 2) distance, orientation and 3D size, 3) association across frames, and 4) prediction of future movement.

The authors build on monocular 3D object detection (mono 3DOD) and perform tracking on top of it. The 3D bbox estimation largely extends Deep3DBox. It additionally regresses inverse distance (1/d, or disparity) directly, and is trained on RoIAligned features rather than on cropped image patches as in Deep3DBox.
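To make the inverse-distance idea concrete, here is a minimal sketch of how a regressed 1/d value could be turned back into a 3D position. It assumes a plain pinhole camera, treats d as depth along the optical axis, and assumes the network also gives the projected object center (u, v) in pixels. The function names and the intrinsics fx, fy, cx, cy are hypothetical and not taken from the paper.

```python
def inverse_depth_to_depth(inv_d, eps=1e-6):
    """Convert a regressed inverse depth (1/d) to depth d.

    eps clamps the prediction so a near-zero output does not blow up.
    """
    return 1.0 / max(inv_d, eps)


def backproject_center(u, v, inv_d, fx, fy, cx, cy):
    """Lift a projected object center (u, v) to camera coordinates.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); all of these are illustrative parameters.
    """
    z = inverse_depth_to_depth(inv_d)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


if __name__ == "__main__":
    # Hypothetical intrinsics and prediction, for illustration only.
    print(backproject_center(u=700.0, v=400.0, inv_d=0.05,
                             fx=1000.0, fy=1000.0, cx=600.0, cy=400.0))
```

One reason to regress 1/d rather than d: for a fixed relative depth error, far objects contribute smaller absolute errors in 1/d than near ones, so distant vehicles are less likely to dominate the loss.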

Key ideas

Technical details