Learning Object-specific Distance from a Monocular Image

October 2019

tl;dr: Train a neural network to learn distance from patches.

Overall impression

The use of CNN to directly predict distance for given objects in the image. Extract features from patches should be effective in predicting distance.

The paper also demonstrated that the vehicle category is key to the good performance, which makes perfect sense as the size of the same type of vehicle is small.

Dense depth estimation is too costly in both memory footprint and processing time for autonomous driving and object level distance should be good enough.

Refer to DisNet for a much simpler and practical way to estimate depth.

Key ideas

Technical details