ObjMotionNet: Self-supervised Object Motion and Depth Estimation from Video

August 2020

tl;dr: Train a PoseNet-like network, ObjMotionNet, to predict the 6 DoF pose change of each object within a self-supervised monocular depth (monodepth) pipeline.

Overall impression

The idea of predicting object-level motion is similar to Struct2Depth. The paper focuses on the depth of foreground objects, similar to ForeSeE, and predicts a 6 DoF pose change per object, similar to VelocityNet.

The paper only deals with objects with rigid motion, such as cars and trucks, and does not deal with non-rigidly moving objects such as pedestrians.
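
To make the "6 DoF pose change per rigid object" idea concrete, here is a minimal sketch of how such a prediction could be applied to the 3D points of one object. It is not the paper's implementation. The layout of the pose vector (translation first, then an axis-angle rotation), the per-object boolean mask, and the function names `pose_vec_to_mat` and `move_object_points` are all assumptions made for illustration.

```python
import numpy as np


def pose_vec_to_mat(pose):
    """Convert a 6 DoF vector [tx, ty, tz, rx, ry, rz] into a 4x4 rigid transform.

    Assumption: the rotation part is an axis-angle vector (Rodrigues formula).
    """
    pose = np.asarray(pose, dtype=np.float64)
    t, r = pose[:3], pose[3:]
    theta = np.linalg.norm(r)
    # Skew-symmetric matrix of the rotation vector.
    K = np.array([[0.0, -r[2], r[1]],
                  [r[2], 0.0, -r[0]],
                  [-r[1], r[0], 0.0]])
    if theta < 1e-8:
        # First-order approximation for very small rotations.
        R = np.eye(3) + K
    else:
        K = K / theta
        R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def move_object_points(points, mask, pose):
    """Apply one object's rigid motion to the points selected by its mask.

    points: (N, 3) array of 3D points back-projected from the depth map.
    mask:   (N,) boolean array, True for points belonging to the object.
    pose:   6 DoF pose change predicted for that object (hypothetical layout).
    """
    T = pose_vec_to_mat(pose)
    moved = np.array(points, dtype=np.float64)
    moved[mask] = moved[mask] @ T[:3, :3].T + T[:3, 3]
    return moved


# Example: rotate one object 90 degrees about the z-axis and shift it 1 unit along x.
points = np.array([[1.0, 0.0, 0.0],
                   [5.0, 5.0, 5.0]])
mask = np.array([True, False])  # only the first point belongs to the object
pose = [1.0, 0.0, 0.0, 0.0, 0.0, np.pi / 2]
moved = move_object_points(points, mask, pose)
```

Because a single rigid transform is shared by every point under the mask, this kind of model fits cars and trucks but cannot describe articulated motion such as a walking pedestrian, which matches the limitation noted above.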

Key ideas

Technical details