Learning-Deep-Learning

3D-LaneNet: End-to-End 3D Multiple Lane Detection

March 2020

tl;dr: First paper on monocular 3D lane line detection.

Overall impression

3D LaneNet is the first work on 3D lane line detection. This is very close to Tesla’s 3D lane line detection.

3D LaneNet does not need fragile assumptions such as the flat ground assumption; it only assumes zero camera roll w.r.t. the local road surface. The network estimates the camera height and pitch, which together with the known camera intrinsics are all that is needed to determine the homography between BEV and perspective views. --> This is different from the direct prediction of H in LaneNet.
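To make the geometry concrete, here is a minimal sketch (not the paper's code) of how height, pitch, and intrinsics alone determine the road-plane-to-image homography under the zero-roll assumption. The frame conventions (x right, y forward, z up; image y down) are assumptions for illustration.

```python
import numpy as np

def ground_to_image_homography(K, pitch, height):
    """Homography mapping road-plane (BEV) points (x, y, 1) to image pixels.

    Under zero roll w.r.t. the local road surface (as assumed in 3D-LaneNet),
    camera height, pitch, and intrinsics K fully determine this mapping.
    World frame (illustrative convention): x right, y forward, z up;
    the road plane is z = 0 and the camera sits at (0, 0, height).
    """
    c, s = np.cos(pitch), np.sin(pitch)
    # World -> camera at zero pitch: x stays right, world-forward becomes
    # camera z (optical axis), world-up becomes camera -y (image y is down).
    R0 = np.array([[1, 0, 0],
                   [0, 0, -1],
                   [0, 1, 0]], dtype=float)
    # Pitch the camera down by `pitch` radians about its x-axis.
    Rp = np.array([[1, 0, 0],
                   [0, c, -s],
                   [0, s, c]], dtype=float)
    R = Rp @ R0
    # For road points (x, y, 0): P_cam = R @ (x, y, -height),
    # i.e. the plane-induced homography is K [r1, r2, -height * r3].
    H = K @ np.column_stack([R[:, 0], R[:, 1], -height * R[:, 2]])
    return H / H[2, 2]
```

A point on the camera's centerline (x = 0) projects to the principal column regardless of pitch, and points farther ahead move up toward the horizon row, which is a quick sanity check for the sign conventions.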

The network architecture is a dual-pathway backbone that translates between image and BEV space. This is similar to Qualcomm's deep radar detector. However, the transformation parameters are estimated on the fly by a localization network, similar to SfM-Learner, which is essentially a special case of a Spatial Transformer Network. This is an alternative way to lift features to BEV compared with Orthographic Feature Transform.
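The projective feature lift can be sketched as an STN-style inverse warp: for each cell of a metric BEV grid, project its road-plane location through the estimated homography and sample the image feature map there. This is a simplified numpy illustration (nearest-neighbor sampling; grid ranges and shapes are assumptions), whereas a real Spatial Transformer uses differentiable bilinear sampling so gradients flow back to the localization network.

```python
import numpy as np

def warp_to_bev(feat, H, bev_h=64, bev_w=32, x_range=(-10, 10), y_range=(3, 80)):
    """Lift image-space features onto a metric BEV grid via a homography.

    feat: (C, Hi, Wi) image feature map.
    H: homography mapping road-plane coords (x, y, 1) to image pixels.
    Each BEV cell samples the image at its projected location
    (nearest neighbor here for brevity; an STN uses bilinear sampling).
    """
    C, Hi, Wi = feat.shape
    xs = np.linspace(*x_range, bev_w)   # lateral positions (m)
    ys = np.linspace(*y_range, bev_h)   # longitudinal positions (m)
    X, Y = np.meshgrid(xs, ys)
    pts = np.stack([X.ravel(), Y.ravel(), np.ones(X.size)])  # homogeneous
    uvw = H @ pts
    u = np.round(uvw[0] / uvw[2]).astype(int)
    v = np.round(uvw[1] / uvw[2]).astype(int)
    valid = (u >= 0) & (u < Wi) & (v >= 0) & (v < Hi)
    bev = np.zeros((C, bev_h * bev_w))
    bev[:, valid] = feat[:, v[valid], u[valid]]  # cells off-image stay zero
    return bev.reshape(C, bev_h, bev_w)
```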

The system also works for 2D lane line detection, reaching near-SOTA performance on the TuSimple dataset. Regressing the lateral position of each lane at preset longitudinal positions (y locations in the perspective image) is widely used in industry.
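The anchor-based output representation can be sketched as follows: each lateral anchor predicts, at a fixed set of longitudinal positions, a lateral offset and a height, plus one confidence score. The function and tensor shapes below are illustrative assumptions, not the paper's exact head.

```python
import numpy as np

def decode_lane_anchors(anchor_x, y_steps, pred_dx, pred_z, pred_conf,
                        conf_thresh=0.5):
    """Decode anchor-based lane predictions into 3D polylines.

    anchor_x:  (N,)   lateral position of each anchor column (m).
    y_steps:   (K,)   preset longitudinal sample positions (m).
    pred_dx:   (N, K) predicted lateral offsets from the anchor line.
    pred_z:    (N, K) predicted heights above the road plane.
    pred_conf: (N,)   per-anchor lane confidence.
    Returns a list of (K, 3) arrays of (x, y, z) points, one per kept lane.
    """
    lanes = []
    for i in range(len(anchor_x)):
        if pred_conf[i] < conf_thresh:
            continue                          # anchor not assigned a lane
        x = anchor_x[i] + pred_dx[i]          # lateral position per y step
        lanes.append(np.stack([x, y_steps, pred_z[i]], axis=1))
    return lanes
```

Fixing the y positions turns lane detection into a fixed-size regression problem per anchor, which is what makes the representation attractive for deployment.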

Key ideas

Technical details

Notes