MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships

June 2020

tl;dr: mono3D with pairwise relations and nonlinear optimization.

Overall impression

This work is inspired by CenterNet. It predicts the 3D bbox from the center of the bbox (similar to RTM3D, but without predicting the eight points directly). It is similar to the popular solutions to the Kaggle mono3D competition.

The main idea is to predict the distance of each instance, the relative distance between neighboring pairs, and their corresponding uncertainties, then jointly optimize them with nonlinear optimization (using g2o). This refines the detection results based on spatial relationships. The mining of pairwise relationships is similar to MoNet-3D.
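
To make the joint optimization concrete, here is a minimal sketch, not the paper's implementation. It assumes a simplified 1D setting where only the depth of each object is refined, which makes the problem linear and solvable with NumPy instead of g2o. The function name `refine_depths` and all of its parameters are hypothetical. Each residual is divided by its predicted uncertainty, so confident predictions pull harder on the solution.

```python
import numpy as np

def refine_depths(z_unary, sigma_unary, pairs, d_pair, sigma_pair):
    """Fuse per-object depths with pairwise relative depths (hypothetical helper).

    Minimizes  sum_i ((z_i - z_unary_i) / sigma_i)^2
             + sum_(i,j) ((z_j - z_i - d_ij) / sigma_ij)^2
    as a weighted linear least-squares problem.
    """
    n = len(z_unary)
    rows, rhs = [], []
    # Unary terms: one row per object, scaled by 1 / sigma_i.
    for i in range(n):
        row = np.zeros(n)
        row[i] = 1.0 / sigma_unary[i]
        rows.append(row)
        rhs.append(z_unary[i] / sigma_unary[i])
    # Pairwise terms: one row per neighboring pair, scaled by 1 / sigma_ij.
    for (i, j), d, s in zip(pairs, d_pair, sigma_pair):
        row = np.zeros(n)
        row[i] = -1.0 / s
        row[j] = 1.0 / s
        rows.append(row)
        rhs.append(d / s)
    A = np.vstack(rows)
    b = np.asarray(rhs, dtype=float)
    z, *_ = np.linalg.lstsq(A, b, rcond=None)
    return z

# Object 0 is visible (low uncertainty); object 1 is occluded (high uncertainty).
z = refine_depths(
    z_unary=[10.0, 14.0], sigma_unary=[0.5, 2.0],
    pairs=[(0, 1)], d_pair=[3.0], sigma_pair=[0.5],
)
print(z)
```

In this toy example the uncertain depth of the occluded object is pulled toward the value implied by its confident neighbor plus the pairwise offset, which is the intuition behind the gains in occluded scenes. The full method optimizes over 3D quantities, where the problem is no longer linear and a solver such as g2o is used.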

MonoPair improves accuracy dramatically, especially in heavily occluded scenarios.

Key ideas

Technical details

Notes